Building a News Aggregator Without an Engagement Algorithm
Bryan Leonar · 2026-05-04 · via DEV Community

I have been building a project called WeSearch:

https://wesearch.press

It is a free news aggregator that pulls from hundreds of sources, keeps discovery mostly chronological, adds source/bias context where available, preserves permanent daily archives, and allows anonymous discussion on stories.

The project started from a simple frustration:

Most news discovery products are too personalized, too noisy, too opaque, too socially distorted, or locked behind a paywall.

I wanted something closer to this:

  • a wide source feed
  • no account required
  • no paywall
  • no tracking
  • chronological discovery
  • source context
  • permanent archives
  • anonymous discussion
  • less algorithmic manipulation

That sounds simple, but once you start building it, the hard part is not fetching headlines.

The hard part is trust.


The problem with modern news discovery

There are several existing models for news discovery.

Google News is broad, but opaque. You get a feed, but you do not always know why certain stories are ranked or why certain sources are emphasized.

Reddit and X are fast, but socially distorted. Stories become memes, outrage cycles, or identity signals before they become information.

RSS readers are powerful, but require setup and source selection. They are great for people who already know what they want to follow. They are less useful for broad public discovery.

Ground News, AllSides, and similar products are useful because they introduce comparison and bias context, but some of the most useful features are often gated behind subscriptions or limited interfaces.

Hacker News is extremely high signal for technical and startup-related topics, but it is not a general-purpose news aggregator.

So the question I kept coming back to was:

What would a news aggregator look like if it tried to be less addictive, less opaque, and more useful for comparing coverage?

That is the question behind WeSearch.


Why chronological discovery still matters

A lot of modern feeds are optimized around engagement.

That usually means the system decides what you should see based on some mixture of clicks, dwell time, reactions, shares, prior behavior, and predicted interest.

That can be useful, but it creates a problem:

The feed stops being a window into what is happening and becomes a mirror of what the system thinks will keep you engaged.

For news, that is dangerous.

A chronological feed is not perfect. It can be noisy. It can be overwhelming. It can miss importance. But it has one major advantage:

It is legible.

You can understand why something appears.

It appeared because it was published or discovered recently.

That does not solve ranking, source quality, duplication, or bias. But it gives the user a clean baseline. From that baseline, you can add filtering, clustering, search, source context, and archive views without turning the whole thing into a black box.

That is why WeSearch leans chronological first.
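
As a rough illustration of what "chronological first" means in practice, here is a minimal sketch. It is not WeSearch's actual code; the `Story` type, its fields, and the `chronological_feed` function are names I made up for the example. The point is that the only ordering signal is the publication time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Story:
    # Hypothetical record shape, assumed for this sketch only.
    title: str
    source: str
    url: str
    published_at: datetime  # expected to be timezone-aware


def chronological_feed(stories, limit=50):
    """Return the newest stories first, deduplicated by URL.

    No clicks, dwell time, or user history are involved, so a reader
    can always explain why an item sits where it does.
    """
    seen = set()
    unique = []
    for story in stories:
        if story.url not in seen:  # keep the first copy of each URL
            seen.add(story.url)
            unique.append(story)
    unique.sort(key=lambda s: s.published_at, reverse=True)
    return unique[:limit]
```

Filtering, clustering, and search can then be layered on top of this list without changing the baseline order.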


Source context is useful, but bias labels are not enough

One of the obvious features for a news aggregator is source labeling.

People want to know:

  • where the article came from
  • whether the outlet has a known political tendency
  • whether the source is reliable
  • whether the article is reporting, opinion, analysis, or commentary
  • how other outlets are covering the same event

But a simple left / center / right label is dangerously incomplete.

Two articles can both come from “left” sources and still be completely different in quality.

One may be careful reporting with primary sources.

Another may be mostly emotional framing.

The same is true for “right” sources.

And “center” does not always mean “truthful” or “neutral.” Sometimes it means careful. Sometimes it means bland. Sometimes it means institutionally cautious. Sometimes it means avoiding claims that should actually be made.

So the long-term goal should not be:

Put a political label next to every article and call it solved.

The better goal is:

Show source tendency, article framing, sourcing depth, factual density, tone, and coverage asymmetry separately.

That is much harder, but it is also much more honest.


The difference between source bias and article framing

This distinction matters.

Source bias is about the outlet over time.

For example:

  • What stories does it usually emphasize?
  • What language does it tend to use?
  • Which political or institutional assumptions does it carry?
  • What audience does it appear to serve?
  • How often does it correct mistakes?
  • How close is it to primary-source material?

Article framing is about one specific article.

For example:

  • What facts does the headline emphasize?
  • What facts are buried?
  • What words carry emotional weight?
  • Who is quoted?
  • Who is ignored?
  • Is the piece written as reporting, analysis, advocacy, or outrage?
  • Does it separate claims from interpretation?

A serious news aggregator should not collapse those into one score.

An outlet can have a general bias while still publishing a fair article.

A generally reliable outlet can still publish a weak or misleading article.

A low-reputation source can sometimes surface a real story before institutions do.

That is why the interface needs to preserve nuance.
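
One way to keep the two apart is to store them as separate records and never merge them into a single number. The sketch below is only an assumption about how that could look; the class names, fields, and label values are hypothetical and do not describe an existing WeSearch schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceProfile:
    # Describes the outlet over time (hypothetical fields).
    name: str
    tendency: Optional[str]  # e.g. "left", "center", "right"; None if unknown


@dataclass(frozen=True)
class ArticleFraming:
    # Describes one specific article (hypothetical fields).
    kind: str  # e.g. "reporting", "analysis", "opinion"
    cites_primary_sources: bool


@dataclass(frozen=True)
class LabeledArticle:
    title: str
    source: SourceProfile
    framing: ArticleFraming


def describe(article):
    """Return source-level and article-level context as separate lines."""
    tendency = article.source.tendency or "unknown"
    if article.framing.cites_primary_sources:
        sourcing = "cites primary sources"
    else:
        sourcing = "no primary sources cited"
    return [
        f"Source: {article.source.name} (tendency: {tendency})",
        f"Article: {article.framing.kind}, {sourcing}",
    ]
```

Because `describe` returns two lines instead of one score, an outlet with a known tendency can still be shown publishing a carefully sourced piece.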


Permanent daily archives

One design choice I care about is permanent daily archives.

A normal feed disappears as it updates. Yesterday’s information gets buried. Last week’s framing is hard to reconstruct. The user sees the present feed, but not the shape of coverage over time.

Permanent daily archives solve part of that.

Each day becomes a stable page.

That makes it easier to answer questions like:

  • What was being covered on a specific day?
  • Which stories dominated?
  • Which topics disappeared quickly?
  • Which sources covered an event early?
  • How did the language around a story change?
  • What did the news environment look like before later context emerged?

This is useful for users, but it is also useful structurally.

A news aggregator should not only be a live feed. It should become a public memory layer.
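
A small sketch of the "each day becomes a stable page" idea follows. The `/archive/YYYY-MM-DD` path format and both function names are assumptions for illustration, not the URLs the site actually uses. Days are computed in UTC so that the same story always lands on the same page regardless of the server's or reader's time zone.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def archive_path(published_at):
    """Map a timezone-aware datetime to a stable per-day path (UTC day)."""
    day = published_at.astimezone(timezone.utc).date()
    return f"/archive/{day.isoformat()}"  # hypothetical URL scheme


def group_by_day(items):
    """Group (title, published_at) pairs under their daily archive path."""
    buckets = defaultdict(list)
    for title, published_at in items:
        buckets[archive_path(published_at)].append(title)
    return dict(buckets)
```

Once a day has passed, its bucket no longer changes, which is what makes the page safe to link to and compare against later coverage.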


Anonymous discussion: useful or dangerous?

WeSearch currently allows anonymous discussion.

That decision is controversial.

The upside is obvious:

People can comment without creating an account, building a profile, or turning every opinion into part of a permanent identity graph.

That lowers friction.

It also makes the product feel less like a social network and more like a public annotation layer.

But anonymity has risks:

  • spam
  • abuse
  • low-quality comments
  • astroturfing
  • drive-by political noise
  • reduced accountability
  • lower trust

The challenge is designing anonymous discussion so it does not become anonymous garbage.

Some possible approaches:

  • rate-limit comments
  • add lightweight moderation
  • separate “questions” from “opinions”
  • let users mark comments as useful, misleading, or low-effort
  • encourage source-backed replies
  • show discussion quality signals instead of identity signals
  • avoid follower counts and personality-driven posting
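
The first item, rate limiting, is the easiest to sketch. The example below is a minimal in-memory sliding-window limiter, written under my own assumptions: the class name, the default limits, and the idea of hashing a client identifier with a salt (so the raw identifier is not stored) are illustrative choices, not a description of what WeSearch runs.

```python
import hashlib
from collections import defaultdict, deque


class CommentRateLimiter:
    """Allow at most `max_comments` per client within `window_seconds`."""

    def __init__(self, max_comments=3, window_seconds=60.0, salt="change-me"):
        self.max_comments = max_comments
        self.window_seconds = window_seconds
        self.salt = salt  # hypothetical secret, assumed to be rotated periodically
        self._history = defaultdict(deque)  # hashed client -> recent timestamps

    def _key(self, client_id):
        # Store only a salted hash, never the raw identifier.
        return hashlib.sha256((self.salt + client_id).encode("utf-8")).hexdigest()

    def allow(self, client_id, now):
        """Return True and record the comment if the client is under the limit."""
        history = self._history[self._key(client_id)]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_comments:
            return False
        history.append(now)
        return True
```

Here `now` is passed in as a number of seconds (for example from `time.monotonic()`), which keeps the limiter easy to test. A real deployment would need shared storage and a policy for clients behind the same address.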

The key design question is whether discussion should be social or analytical.

For a news product, I think discussion should be closer to annotation than performance.


The trust problem

A news aggregator has a harder trust problem than most products.

If you build a todo app, users ask:

Does it work?

If you build a news aggregator, users ask:

Why should I trust what this thing chooses to show me?

That means the product needs visible trust signals.

Not fake authority. Real transparency.

Examples:

  • source list
  • source policy
  • correction policy
  • ranking methodology
  • bias-label methodology
  • explanation of what is automated
  • explanation of what is human-reviewed
  • clear distinction between source labels and article labels
  • visible date/time metadata
  • no pretending that the system is perfectly objective

The worst thing a news product can do is imply neutrality while hiding all the decisions that shape what people see.

A better approach is to expose the machinery.


What I would avoid

If I were designing a serious news comparison system, I would avoid a few traps.

1. Do not pretend one bias score explains an article

A single label can help orient the user, but it should not be the whole analysis.

Bias is multi-dimensional.

2. Do not over-personalize the feed

Personalization is convenient, but it quietly narrows perception.

For news, user control is better than hidden behavioral targeting.

3. Do not hide the source list

If a product claims to aggregate many sources, users should be able to see what those sources are.

4. Do not turn discussion into another social network

Follower mechanics, clout loops, and identity performance can damage the informational value of a news product.

5. Do not index thousands of empty pages

This is more of a technical SEO point, but it matters.

If a site creates source pages, tag pages, archive pages, and story pages, it needs to avoid exposing too many thin or empty URLs. Search engines and users both interpret that as low quality.
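
One common way to handle this is to emit a `noindex` robots meta tag on pages that have too little content, while still letting crawlers follow their links. The sketch below assumes a simple item-count threshold; the function names and the `min_items` default are hypothetical, and a real rule would probably look at more than a count.

```python
def robots_directive(item_count, min_items=3):
    """Pick a robots directive based on how much content a page has."""
    if item_count >= min_items:
        return "index, follow"
    # Thin or empty page: keep it reachable, but ask not to be indexed.
    return "noindex, follow"


def robots_meta_tag(item_count, min_items=3):
    """Render the directive as an HTML meta tag for the page head."""
    directive = robots_directive(item_count, min_items)
    return f'<meta name="robots" content="{directive}">'
```

The same check can be used to leave thin pages out of the sitemap.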


What I am still figuring out

The project is still early, and several hard questions are unresolved.

Story clustering

When ten outlets cover the same event, should those articles be grouped together automatically?

Probably yes.

But clustering can go wrong. Similar headlines do not always mean identical stories. Different angles may deserve separation.
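
To make the failure mode concrete, here is a deliberately naive sketch: greedy clustering by Jaccard overlap of headline words. This is a stand-in technique I picked for illustration, not the method WeSearch uses; the stopword list, the `threshold` value, and the function names are all assumptions.

```python
import re

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "as", "at", "is"}


def tokens(headline):
    """Lowercase a headline and return its set of non-stopword tokens."""
    words = re.findall(r"[a-z0-9]+", headline.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_headlines(headlines, threshold=0.5):
    """Greedily attach each headline to the first cluster whose seed is similar."""
    clusters = []  # list of (seed token set, member headlines)
    for headline in headlines:
        toks = tokens(headline)
        for seed, members in clusters:
            if jaccard(seed, toks) >= threshold:
                members.append(headline)
                break
        else:
            clusters.append((toks, [headline]))
    return [members for _, members in clusters]
```

For example, "Senate passes budget bill" and "Budget bill passes Senate vote" share four of five distinct words and end up together, while "Storm hits coastal towns" starts its own cluster. Word overlap alone cannot tell a news report from an opinion piece about the same event, which is exactly the kind of case where different angles may deserve separation.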

Source weighting

Should a more reliable source receive stronger visibility?

Probably yes.

But if weighting is too aggressive, the system becomes another hidden ranking engine.
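
One possible middle ground is a weight that is capped and that always comes with an explanation the interface can show. The formula below is purely an assumption for discussion: the recency curve, the `max_boost` cap, and the `reliability` input in the range 0 to 1 are hypothetical, and nothing like this is confirmed to exist in WeSearch.

```python
def visibility_score(age_hours, reliability, max_boost=0.25):
    """Return (score, explanation) for a story.

    Recency dominates; reliability can raise the score by at most
    `max_boost` (25% by default), and the breakdown is returned so it
    can be displayed instead of hidden.
    """
    recency = 1.0 / (1.0 + age_hours)
    clamped = max(0.0, min(1.0, reliability))  # keep the input within [0, 1]
    boost = 1.0 + max_boost * clamped
    explanation = f"recency={recency:.3f} x reliability_boost={boost:.2f}"
    return recency * boost, explanation
```

Because the boost is bounded, a fresh story from an unrated source still outranks an old story from a highly rated one, and the explanation string keeps the weighting visible.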

Bias display

Should bias labels be visible immediately, or should users first see the article/source and then open a deeper comparison panel?

I am not sure yet.

Immediate labels are useful, but they can also prime users before they read.

Anonymous discussion

Should anonymous comments be central to the product, or should they be secondary to source comparison?

This is still an open product question.

Search vs feed vs comparison

A news aggregator can become several different products:

  • live feed
  • searchable archive
  • RSS replacement
  • media-bias comparison tool
  • anonymous news discussion layer
  • research tool

Trying to be all of them at once can make the product confusing.

The hard part is choosing the primary job.


The current direction

Right now, I think the strongest direction is:

A chronological news aggregator with source context, permanent archives, and lightweight anonymous discussion.

Then, over time, add stronger comparison features:

  • related coverage clusters
  • source diversity views
  • article-level framing analysis
  • factuality/source-depth indicators
  • topic timelines
  • left/right/center coverage maps
  • correction and update tracking
  • “what is missing?” indicators

The product should not just answer:

What happened?

It should also help answer:

Who is covering it?
How are they framing it?
What context is missing?
Which claims are confirmed?
Which parts are interpretation?
How did coverage change over time?

That is where a news aggregator can become more than a headline feed.


Why I think this matters

The internet does not have an information shortage.

It has a context shortage.

There are endless headlines, feeds, posts, clips, takes, screenshots, and reactions.

But it is still hard to see the shape of coverage across sources.

It is hard to know which parts of a story are factual, which parts are framing, and which parts are omission.

It is hard to compare coverage without manually opening ten tabs.

It is hard to discuss news without the conversation becoming identity performance.

That is the space I am trying to explore with WeSearch.

Not a perfect truth machine.

Not another engagement feed.

Not another paywalled dashboard.

Just a clearer way to scan, compare, archive, and discuss what is being published.

The site is here:

https://wesearch.press

It is still rough in places, but the core structure is live.

I would be interested in criticism from people who care about search, RSS, journalism, media bias, recommendation systems, moderation, or information retrieval.

The question I keep coming back to is:

What would a news aggregator need to show before you would actually trust it?