GitHub - aloth/cred-1: CRED-1: An Open Multi-Signal Domain Credibility Dataset (2,672 domains)
xlth · 2026-05-25 · via Hacker News: Show HN

CRED-1 is an open, reproducible domain-level credibility dataset combining multiple openly-licensed source lists with computed enrichment signals. It provides credibility scores for 2,672 domains known to publish mis/disinformation, conspiracy theories, or other unreliable content.

🎓 Presented at ACM WebSci 2026 (Braunschweig). Landing page: aloth.github.io/agentic-ai-information-integrity/cred-1. First production integration: Trackless Links for iOS and macOS, with free codes for readers and attendees: gutscheinhub.de/ratgeber/trackless-links-cred-1-acm-websci-2026.

Paper: A. Loth, M. Kappes, and M.-O. Pahl, "CRED-1: An Open Multi-Signal Domain Credibility Dataset for Automated Pre-Bunking of Online Misinformation," Preprint, 2026. doi:10.2139/ssrn.6448466

Key Features

  • 2,672 domains with credibility scores (0.0–1.0)
  • Fully reproducible — Python pipeline rebuilds the dataset from scratch
  • Multi-signal scoring combining source labels, domain age, web popularity, fact-check frequency, and threat intelligence
  • Privacy-preserving — designed for on-device client-side deployment (no server calls needed)
  • Two openly-licensed sources — no proprietary data dependencies

Quick Start

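import json

# Load the full dataset: a JSON object keyed by domain name.
with open("data/cred1_current.json", encoding="utf-8") as f:
    cred = json.load(f)

domain = "infowars.com"
if domain in cred:
    score = cred[domain]["credibility_score"]  # 0.0 (least credible) to 1.0 (most credible)
    print(f"{domain}: credibility = {score}")
else:
    # Absence means the domain was never flagged by a source list, not that it is credible.
    print(f"{domain}: not in dataset (neutral)")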
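
In practice the input is usually a full URL rather than a bare domain. The following is a minimal sketch of that lookup, not part of the repository: `lookup_credibility` is a hypothetical helper, and it assumes that dataset keys are lowercase domains without a leading `www.`. It does not map other subdomains to their registered domain.

```python
from urllib.parse import urlsplit


def lookup_credibility(url, cred):
    """Return (domain, score); score is None when the domain is not listed."""
    # urlsplit() only fills in .hostname when the netloc is introduced by "//".
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower()
    # Assumption: dataset keys carry no leading "www.".
    if host.startswith("www."):
        host = host[4:]
    entry = cred.get(host)
    return host, (entry["credibility_score"] if entry else None)


# Tiny in-memory stand-in for data/cred1_current.json.
sample = {"infowars.com": {"credibility_score": 0.14}}
print(lookup_credibility("https://www.infowars.com/posts/example", sample))
print(lookup_credibility("example.org", sample))
```

A `None` score corresponds to the "not in dataset (neutral)" branch of the Quick Start snippet.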

Dataset Schema

JSON Format (cred1_current.json)

{
  "infowars.com": {
    "category": "fake",
    "credibility_score": 0.14,
    "domain_age_years": 26.4,
    "domain_registered": "1999-10-04T04:00:00Z",
    "iffy_factual": "VL",
    "iffy_bias": "FN",
    "iffy_score": 0.1,
    "factcheck_claims": 52,
    "safe_browsing_flagged": false,
    "score_age": 0.2,
    "score_cat": 0.05,
    "score_factcheck": 0.0,
    "score_iffy": 0.1,
    "score_safebrowsing": 0.05,
    "score_tranco": 0.1,
    "sources": 2,
    "tranco_rank": 4382
  }
}

| Field | Description |
|---|---|
| category | Full category name: fake, unreliable, mixed, conspiracy, satire, reliable |
| credibility_score | Credibility score (0.0-1.0, lower = less credible) |
| sources | Number of independent source lists flagging this domain |
| tranco_rank | Tranco rank (optional, absent if not ranked) |
| domain_registered | Domain registration date from RDAP, ISO 8601 (optional) |
| domain_age_years | Domain age in years, computed from domain_registered (optional) |
| iffy_factual | MBFC factual reporting rating (optional) |
| iffy_bias | MBFC political bias rating (optional) |
| iffy_score | Iffy.news credibility score, 0.0-1.0 (optional) |
| factcheck_claims | Number of fact-check claims from Google Fact Check Tools API (optional) |
| safe_browsing_flagged | Google Safe Browsing threat flag (optional) |
| score_cat | Category score component |
| score_iffy | Iffy.news score component |
| score_tranco | Tranco rank score component |
| score_age | Domain age score component |
| score_factcheck | Fact-check frequency score component |
| score_safebrowsing | Safe Browsing score component |
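
The schema says that domain_age_years is computed from domain_registered but does not give the formula. The sketch below shows one plausible way to derive it. The function name, the 365.25-day year, and the reference date are all assumptions, not the pipeline's actual code. With the hypothetical reference date used here, the result rounds to the 26.4 shown in the sample record.

```python
from datetime import datetime, timezone


def domain_age_years(registered_iso, as_of):
    """Age in years between an ISO 8601 registration timestamp and `as_of`."""
    # Python 3.10's fromisoformat() rejects a trailing "Z", so spell out the UTC offset.
    registered = datetime.fromisoformat(registered_iso.replace("Z", "+00:00"))
    elapsed_days = (as_of - registered).total_seconds() / 86400
    return elapsed_days / 365.25  # assumption: average year length of 365.25 days


# Hypothetical reference date; the pipeline presumably uses its own run date.
as_of = datetime(2026, 3, 1, tzinfo=timezone.utc)
print(round(domain_age_years("1999-10-04T04:00:00Z", as_of), 1))
```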

CSV Format (cred1_current.csv)

Same fields as JSON, in tabular format with 18 columns. Sorted by credibility_score ascending (least credible first).
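
Because the file is already sorted, the least credible domains can be read from the top without loading everything. This is a minimal sketch under two assumptions: the header uses the JSON field names, and the domain column is called `domain` (the column name is not stated above). The rows in the example are made up.

```python
import csv
import io
from itertools import islice


def least_credible(csv_file, limit=3):
    """Return the first `limit` rows as (domain, score) pairs.

    Relies on the file being sorted by credibility_score ascending.
    """
    reader = csv.DictReader(csv_file)
    # Assumption: the domain column is named "domain".
    return [(row["domain"], float(row["credibility_score"]))
            for row in islice(reader, limit)]


# Made-up rows standing in for cred1_current.csv (only two of the 18 columns).
sample_csv = io.StringIO(
    "domain,credibility_score\n"
    "a.example,0.05\n"
    "b.example,0.14\n"
    "c.example,0.31\n"
)
print(least_credible(sample_csv, limit=2))
```

For the real file, pass an open file handle instead of the `StringIO` object, opened with `newline=""` as the `csv` module recommends.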

Compact Format (cred1_compact.json)

Minimal format for on-device embedding (e.g., browser extensions). Short keys, no whitespace, ~168KB.

| Key | Field |
|---|---|
| s | credibility_score |
| c | category code (f, u, m, c, s, r) |
| n | sources |
| r | tranco_rank (optional) |
| d | domain registration date as YYYY-MM-DD (optional) |
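
A consumer of the compact file may want the verbose field names back. The sketch below expands one compact record. It assumes that each category code is the first letter of the corresponding category name, which the key table suggests but does not state. `expand_record` is a hypothetical helper, and the sample values are taken from the infowars.com record shown earlier.

```python
# Assumption: each code is the first letter of the full category name.
CATEGORY_CODES = {
    "f": "fake", "u": "unreliable", "m": "mixed",
    "c": "conspiracy", "s": "satire", "r": "reliable",
}
KEY_NAMES = {
    "s": "credibility_score",
    "n": "sources",
    "r": "tranco_rank",
    "d": "domain_registered",  # date only (YYYY-MM-DD) in the compact format
}


def expand_record(compact):
    """Map a compact record's short keys to the verbose field names."""
    expanded = {KEY_NAMES[k]: v for k, v in compact.items() if k in KEY_NAMES}
    if "c" in compact:
        # Unknown codes pass through unchanged rather than raising.
        expanded["category"] = CATEGORY_CODES.get(compact["c"], compact["c"])
    return expanded


print(expand_record({"s": 0.14, "c": "f", "n": 2, "r": 4382, "d": "1999-10-04"}))
```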

Scoring Model

CRED-1 computes credibility scores as a weighted blend of five independent signals:

| Signal | Weight | Source |
|---|---|---|
| Source category | 50% | OpenSources.co + Iffy.news consensus label |
| Iffy.news score | 15% | Iffy.news credibility rating (when available) |
| Fact-check frequency | 15% | Google Fact Check Tools API (number of claims) |
| Web popularity | 5% | Tranco Top-1M rank (log-normalized) |
| Domain age | 5% | WHOIS/RDAP registration date |
| Google Safe Browsing | Override | Hard cap at 0.05 if flagged as malware/social engineering |

Remaining weight (when signals are unavailable) defaults to the source category score.
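
The blending rule can be illustrated with a short sketch. This is not the pipeline's scoring code and is not guaranteed to reproduce the published scores. It only shows the mechanics described above: a weighted sum, a fallback of unused weight to the category score, and the Safe Browsing cap. The function and dictionary names are hypothetical. It also assumes that all weight not claimed by an available signal goes to the category score, including any weight the table does not assign.

```python
# Weights for the optional signals, taken from the table above.
OPTIONAL_WEIGHTS = {"iffy": 0.15, "factcheck": 0.15, "tranco": 0.05, "age": 0.05}
SAFE_BROWSING_CAP = 0.05


def blend(category_score, signals, safe_browsing_flagged=False):
    """Blend per-signal scores (each in 0.0-1.0) into one credibility score.

    `signals` maps a signal name to its score; missing or None means unavailable.
    """
    score = 0.0
    used_weight = 0.0
    for name, weight in OPTIONAL_WEIGHTS.items():
        value = signals.get(name)
        if value is not None:
            score += weight * value
            used_weight += weight
    # Assumption: all weight not used by an available signal falls back
    # to the source category score.
    score += (1.0 - used_weight) * category_score
    if safe_browsing_flagged:
        score = min(score, SAFE_BROWSING_CAP)  # hard cap from the override row
    return score


print(blend(0.5, {"iffy": 1.0}))
print(blend(0.9, {"iffy": 0.8}, safe_browsing_flagged=True))
```

With no optional signals available, the result is simply the category score.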

Data Sources

| Source | Domains | License | Type |
|---|---|---|---|
| OpenSources.co | 825 | CC BY 4.0 | Curated mis/disinformation domain list |
| Iffy.news Index | 2,040 | MIT | MBFC-derived unreliable source index |
| Tranco Top-1M | 1,000,000 | Free to use | Aggregated web popularity ranking |
| RDAP | N/A | Public protocol | Domain registration data |
| Google Fact Check Tools API | N/A | Free (attribution) | Fact-check claim database |
| Google Safe Browsing API | N/A | Free (attribution) | Threat intelligence |

Reproduce the Dataset

# 1. Build base dataset (fetch + merge sources)
python3 pipeline/build_dataset.py

# 2. Enrich with signals (requires Google Cloud API key)
export GOOGLE_API_KEY="your-key-here"  # or macOS Keychain
python3 pipeline/enrich_dataset.py

# Individual enrichment steps:
python3 pipeline/enrich_dataset.py --step tranco
python3 pipeline/enrich_dataset.py --step rdap
python3 pipeline/enrich_dataset.py --step factcheck
python3 pipeline/enrich_dataset.py --step safebrowsing
python3 pipeline/enrich_dataset.py --step score

Requirements: Python 3.10+, no external dependencies (stdlib only).

Category Distribution

| Category | Count | % |
|---|---|---|
| Mixed | 1,335 | 50.0% |
| Unreliable | 589 | 22.0% |
| Fake | 493 | 18.5% |
| Conspiracy | 153 | 5.7% |
| Satire | 94 | 3.5% |
| Reliable | 8 | 0.3% |
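
The distribution can be recomputed from the loaded JSON as a sanity check. This is a minimal sketch that assumes the dataset has been loaded into a dict as in Quick Start. `category_distribution` is a hypothetical helper, and the four-entry sample is made up.

```python
from collections import Counter


def category_distribution(cred):
    """Return (category, count, percent) tuples, most frequent first."""
    counts = Counter(entry["category"] for entry in cred.values())
    total = sum(counts.values())
    return [(cat, n, round(100 * n / total, 1)) for cat, n in counts.most_common()]


# Made-up entries standing in for the full dataset.
sample = {
    "a.example": {"category": "mixed"},
    "b.example": {"category": "mixed"},
    "c.example": {"category": "mixed"},
    "d.example": {"category": "fake"},
}
for cat, n, pct in category_distribution(sample):
    print(f"{cat:<12}{n:>6}{pct:>7}%")
```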

Applications

CRED-1 is designed for:

  • Browser extensions — on-device pre-bunking at the content delivery stage
  • Misinformation research — ground truth for domain-level credibility studies
  • Content moderation — automated flagging of low-credibility sources
  • Education — media literacy tools and curricula

Citation

If you use CRED-1 in your research, please cite the paper:

@article{loth2026cred1,
  title     = {{CRED-1}: An Open Multi-Signal Domain Credibility Dataset for Automated Pre-Bunking of Online Misinformation},
  author    = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
  year      = {2026},
  doi       = {10.2139/ssrn.6448466},
  url       = {https://ssrn.com/abstract=6448466},
  note      = {Preprint available at SSRN}
}

To cite the dataset archive directly:

@dataset{loth2026cred1data,
  title     = {{CRED-1}: An Open Multi-Signal Domain Credibility Dataset},
  author    = {Loth, Alexander},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18769460}
}

Contributing

Found a misclassified domain? Want to propose a new credibility signal? We welcome community input.

License

This repository (code and data) is licensed under CC BY 4.0.

Acknowledgments

This dataset builds on the work of:

  • Melissa Zimdars and the OpenSources.co project
  • The Iffy.news team at the Reynolds Journalism Institute
  • Google Fact Check Tools and Safe Browsing APIs

Powered by Google Fact Check Tools and Google Safe Browsing.