Building a News Aggregator Without an Engagement Algorithm

I have been building a project called WeSearch.

It is a free news aggregator that pulls from hundreds of sources, keeps discovery mostly chronological, adds source/bias context where available, preserves permanent daily archives, and allows anonymous discussion on stories.

The project started from a simple frustration:

Most news discovery products are either too personalized, too paywalled, too noisy, too opaque, or too socially distorted.

I wanted something closer to this:

a wide source feed
no account required
no paywall
no tracking
chronological discovery
source context
permanent archives
anonymous discussion
less algorithmic manipulation

That sounds simple, but once you start building it, the hard part is not fetching headlines.

The hard part is trust.

The problem with modern news discovery

There are several existing models for news discovery.

Google News is broad, but opaque. You get a feed, but you do not always know why certain stories are ranked or why certain sources are emphasized.

Reddit and X are fast, but socially distorted. Stories become memes, outrage cycles, or identity signals before they become information.

RSS readers are powerful, but require setup and source selection. They are great for people who already know what they want to follow. They are less useful for broad public discovery.

Ground News, AllSides, and similar products are useful because they introduce comparison and bias context, but some of the most useful features are often gated behind subscriptions or limited interfaces.

Hacker News is extremely high signal for technical and startup-related topics, but it is not a general-purpose news aggregator.

So the question I kept coming back to was:

What would a news aggregator look like if it tried to be less addictive, less opaque, and more useful for comparing coverage?

That is the question behind WeSearch.

Why chronological discovery still matters

A lot of modern feeds are optimized around engagement.

That usually means the system decides what you should see based on some mixture of clicks, dwell time, reactions, shares, prior behavior, and predicted interest.

That can be useful, but it creates a problem:

The feed stops being a window into what is happening and becomes a mirror of what the system thinks will keep you engaged.

For news, that is dangerous.

A chronological feed is not perfect. It can be noisy. It can be overwhelming. It can miss importance. But it has one major advantage:

It is legible.

You can understand why something appears.

It appeared because it was published or discovered recently.

That does not solve ranking, source quality, duplication, or bias. But it gives the user a clean baseline. From that baseline, you can add filtering, clustering, search, source context, and archive views without turning the whole thing into a black box.
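
A minimal sketch of that baseline, in Python. The `Item` fields and the `chronological_feed` name are illustrative assumptions, not WeSearch's actual code. The only ranking signal is the publication time, and the only filtering is dropping a URL that was discovered twice.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Item:
    # Hypothetical item shape; a real feed would carry more metadata.
    url: str
    title: str
    source: str
    published: datetime  # expected to be timezone-aware


def chronological_feed(items):
    """Return items newest first, keeping one entry per URL."""
    seen = set()
    feed = []
    for item in sorted(items, key=lambda i: i.published, reverse=True):
        if item.url in seen:
            continue  # URL already listed; the newer entry was kept
        seen.add(item.url)
        feed.append(item)
    return feed


if __name__ == "__main__":
    items = [
        Item("https://example.com/a", "A", "s1",
             datetime(2024, 5, 1, 8, tzinfo=timezone.utc)),
        Item("https://example.com/b", "B", "s2",
             datetime(2024, 5, 1, 9, tzinfo=timezone.utc)),
    ]
    for item in chronological_feed(items):
        print(item.published.isoformat(), item.source, item.title)
```

Everything a user sees can be explained by one field, which is the point: filtering, clustering, and context can be layered on top without changing why an item is in the list.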

That is why WeSearch leans chronological first.

Source context is useful, but bias labels are not enough

One of the obvious features for a news aggregator is source labeling.

People want to know:

where the article came from
whether the outlet has a known political tendency
whether the source is reliable
whether the article is reporting, opinion, analysis, or commentary
how other outlets are covering the same event

But a simple left / center / right label is dangerously incomplete.

Two articles can both come from “left” sources and still be completely different in quality.

One may be careful reporting with primary sources.

Another may be mostly emotional framing.

The same is true for “right” sources.

And “center” does not always mean “truthful” or “neutral.” Sometimes it means careful. Sometimes it means bland. Sometimes it means institutionally cautious. Sometimes it means avoiding claims that should actually be made.

So the long-term goal should not be:

Put a political label next to every article and call it solved.

The better goal is:

Show source tendency, article framing, sourcing depth, factual density, tone, and coverage asymmetry separately.

That is much harder, but it is also much more honest.

The difference between source bias and article framing

This distinction matters.

Source bias is about the outlet over time.

For example:

What stories does it usually emphasize?
What language does it tend to use?
Which political or institutional assumptions does it carry?
What audience does it appear to serve?
How often does it correct mistakes?
How close is it to primary-source material?

Article framing is about one specific article.

For example:

What facts does the headline emphasize?
What facts are buried?
What words carry emotional weight?
Who is quoted?
Who is ignored?
Is the piece written as reporting, analysis, advocacy, or outrage?
Does it separate claims from interpretation?

A serious news aggregator should not collapse those into one score.

An outlet can have a general bias while still publishing a fair article.

A generally reliable outlet can still publish a weak or misleading article.

A low-reputation source can sometimes surface a real story before institutions do.

That is why the interface needs to preserve nuance.
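
One way to keep the two apart is to store them as separate records and never merge them into a single number. The sketch below is an assumption about how that could look, not a description of the current WeSearch data model; every class and field name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceProfile:
    """Outlet-level context, built up over time."""
    name: str
    tendency: Optional[str]  # e.g. "left", "center", "right"; None if unrated


@dataclass(frozen=True)
class ArticleFraming:
    """Article-level context, specific to one piece."""
    kind: str  # e.g. "reporting", "analysis", "opinion"
    primary_sources_cited: int


@dataclass(frozen=True)
class LabeledArticle:
    title: str
    source: SourceProfile
    framing: Optional[ArticleFraming] = None  # None until the article is analyzed


def context_lines(article):
    """Render source-level and article-level context as separate lines."""
    lines = [
        f"Source: {article.source.name}",
        f"Source tendency: {article.source.tendency or 'not rated'}",
    ]
    if article.framing is None:
        lines.append("Article framing: not analyzed")
    else:
        lines.append(f"Article type: {article.framing.kind}")
        lines.append(
            f"Primary sources cited: {article.framing.primary_sources_cited}"
        )
    return lines
```

Because the article record only points at the source record, a careful article from an outlet with a strong tendency shows both facts side by side instead of averaging them away.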

Permanent daily archives

One design choice I care about is permanent daily archives.

A normal feed disappears as it updates. Yesterday’s information gets buried. Last week’s framing is hard to reconstruct. The user sees the present feed, but not the shape of coverage over time.

Permanent daily archives solve part of that.

Each day becomes a stable page.

That makes it easier to answer questions like:

What was being covered on a specific day?
Which stories dominated?
Which topics disappeared quickly?
Which sources covered an event early?
How did the language around a story change?
What did the news environment look like before later context emerged?

This is useful for users, but it is also useful structurally.

A news aggregator should not only be a live feed. It should become a public memory layer.
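
As a rough sketch of the "each day becomes a stable page" idea: bucket items by their UTC publication date and derive the page path from that date alone. The `/archive/YYYY-MM-DD` path format and the function names are assumptions for illustration, not the site's actual URL scheme.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def archive_path(published):
    """Map a timezone-aware datetime to a stable per-day path (UTC day)."""
    day = published.astimezone(timezone.utc).date()
    return f"/archive/{day.isoformat()}"


def group_by_day(items):
    """Group (published, title) pairs into {archive_path: [titles]}."""
    pages = defaultdict(list)
    for published, title in items:
        pages[archive_path(published)].append(title)
    return dict(pages)


if __name__ == "__main__":
    eastern = timezone(timedelta(hours=-5))
    items = [
        (datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc), "Morning story"),
        (datetime(2024, 5, 1, 23, 30, tzinfo=eastern), "Late story"),
    ]
    for path, titles in sorted(group_by_day(items).items()):
        print(path, titles)
```

Normalizing to one time zone before taking the date matters here: the same article should always land on the same day page, regardless of where the publisher or the reader is.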

Anonymous discussion: useful or dangerous?

WeSearch currently allows anonymous discussion.

That decision is controversial.

The upside is obvious:

People can comment without creating an account, building a profile, or turning every opinion into part of a permanent identity graph.

That lowers friction.

It also makes the product feel less like a social network and more like a public annotation layer.

But anonymity has risks:

spam
abuse
low-quality comments
astroturfing
drive-by political noise
reduced accountability
lower trust

The challenge is designing anonymous discussion so it does not become anonymous garbage.

Some possible approaches:

rate-limit comments
add lightweight moderation
separate “questions” from “opinions”
let users mark comments as useful, misleading, or low-effort
encourage source-backed replies
show discussion quality signals instead of identity signals
avoid follower counts and personality-driven posting
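
The first item on that list, rate limiting, is the easiest to sketch. Below is a small in-memory sliding-window limiter. It is an illustration under assumptions, not the mechanism WeSearch uses: the class name, the limits, and especially `client_key` are hypothetical. What to use as a key without an account system (and without contradicting a no-tracking promise) is its own design problem that this sketch does not solve.

```python
import time
from collections import defaultdict, deque


class CommentRateLimiter:
    """Allow at most max_comments per client_key within window_seconds."""

    def __init__(self, max_comments=3, window_seconds=60.0):
        self.max_comments = max_comments
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # client_key -> accepted timestamps

    def allow(self, client_key, now=None):
        """Return True and record the comment if the client is under the limit."""
        now = time.monotonic() if now is None else now
        recent = self._recent[client_key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_comments:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = CommentRateLimiter(max_comments=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("example-key", now=t))
```

Rejected attempts are not recorded, so a client that keeps retrying is not locked out for longer than the window itself.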

The key design question is whether discussion should be social or analytical.

For a news product, I think discussion should be closer to annotation than performance.

The trust problem

A news aggregator has a harder trust problem than most products.

If you build a todo app, users ask:

Does it work?

If you build a news aggregator, users ask:

Why should I trust what this thing chooses to show me?

That means the product needs visible trust signals.

Not fake authority. Real transparency.

Examples:

source list
source policy
correction policy
ranking methodology
bias-label methodology
explanation of what is automated
explanation of what is human-reviewed
clear distinction between source labels and article labels
visible date/time metadata
no pretending that the system is perfectly objective

The worst thing a news product can do is imply neutrality while hiding all the decisions that shape what people see.

A better approach is to expose the machinery.

What I would avoid

If I were designing a serious news comparison system, I would avoid a few traps.

1. Do not pretend one bias score explains an article

A single label can help orient the user, but it should not be the whole analysis.

Bias is multi-dimensional.

2. Do not over-personalize the feed

Personalization is convenient, but it quietly narrows perception.

For news, user control is better than hidden behavioral targeting.

3. Do not hide the source list

If a product claims to aggregate many sources, users should be able to see what those sources are.

4. Do not turn discussion into another social network

Follower mechanics, clout loops, and identity performance can damage the informational value of a news product.

5. Do not index thousands of empty pages

This is more of a technical SEO point, but it matters.

If a site creates source pages, tag pages, archive pages, and story pages, it needs to avoid exposing too many thin or empty URLs. Search engines can read a large number of near-empty pages as a sign of low quality, and a user who lands on one draws the same conclusion.
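
One common way to handle this is to mark thin pages with a robots meta tag until they have enough content. The sketch below assumes a simple item-count threshold; the function name and the default of three items are arbitrary choices for illustration, not a recommendation from any search engine.

```python
def robots_meta(item_count, min_items=3):
    """Return a robots meta tag: index the page only if it has enough items."""
    if item_count >= min_items:
        return '<meta name="robots" content="index, follow">'
    # Thin page: keep it reachable and let links be followed, but ask
    # crawlers not to index it.
    return '<meta name="robots" content="noindex, follow">'


if __name__ == "__main__":
    print(robots_meta(0))
    print(robots_meta(12))
```

The same check can be used to decide whether a URL belongs in the sitemap at all.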

What I am still figuring out

The project is still early, and several hard questions are unresolved.

Story clustering

When ten outlets cover the same event, should those articles be grouped together automatically?

Probably yes.

But clustering can go wrong. Similar headlines do not always mean identical stories. Different angles may deserve separation.
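
To make the failure mode concrete, here is a deliberately naive clustering sketch: greedy grouping by Jaccard similarity between headline word sets. This is not how WeSearch clusters stories; the stopword list, the 0.5 threshold, and the function names are all assumptions, and a real system would likely need more than headline overlap.

```python
import re

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and",
             "as", "at", "by", "with", "after", "over"}


def tokens(headline):
    """Lowercase the headline and return its set of non-stopword tokens."""
    words = re.findall(r"[a-z0-9]+", headline.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_headlines(headlines, threshold=0.5):
    """Greedily attach each headline to the first cluster whose seed is similar."""
    clusters = []  # list of (seed_tokens, member_headlines)
    for headline in headlines:
        current = tokens(headline)
        for seed, members in clusters:
            if jaccard(seed, current) >= threshold:
                members.append(headline)
                break
        else:
            clusters.append((current, [headline]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    for group in cluster_headlines([
        "Senate passes budget bill after long debate",
        "Senate passes budget bill",
        "Storm hits coast overnight",
    ]):
        print(group)
```

The weakness is visible in the code itself: two headlines about the same bill from opposite angles would merge, while two reports of the same event that use different vocabulary would not.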

Source weighting

Should a more reliable source receive stronger visibility?

Probably yes.

But if weighting is too aggressive, the system becomes another hidden ranking engine.
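
One possible middle ground is a small, capped, and disclosed boost: a reliable source can move up by at most a fixed amount of time, and nothing else affects order. The sketch below is a thought experiment rather than a description of WeSearch's ranking; the `reliability` score in [0, 1], the 30-minute cap, and the function names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cap: reliability can never move an item by more than this.
MAX_BOOST = timedelta(minutes=30)


def effective_time(published, reliability):
    """Publication time plus a boost proportional to reliability, capped."""
    clamped = min(max(reliability, 0.0), 1.0)
    return published + MAX_BOOST * clamped


def rank(items):
    """Sort (title, published, reliability) tuples by effective time, newest first."""
    return sorted(items, key=lambda it: effective_time(it[1], it[2]), reverse=True)


if __name__ == "__main__":
    noon = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    items = [
        ("Reliable, older", noon, 1.0),
        ("Unrated, newer", noon + timedelta(minutes=20), 0.0),
        ("Unrated, newest", noon + timedelta(hours=1), 0.0),
    ]
    for title, published, reliability in rank(items):
        print(title, effective_time(published, reliability).isoformat())
```

Because the cap is a single published constant, the feed stays explainable: an item is where it is because of when it appeared, adjusted by at most a known number of minutes.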

Bias display

Should bias labels be visible immediately, or should users first see the article/source and then open a deeper comparison panel?

I am not sure yet.

Immediate labels are useful, but they can also prime users before they read.

Anonymous discussion

Should anonymous comments be central to the product, or should they be secondary to source comparison?

This is still an open product question.

Search vs feed vs comparison

A news aggregator can become several different products:

live feed
searchable archive
RSS replacement
media-bias comparison tool
anonymous news discussion layer
research tool

Trying to be all of them at once can make the product confusing.

The hard part is choosing the primary job.

The current direction

Right now, I think the strongest direction is:

A chronological news aggregator with source context, permanent archives, and lightweight anonymous discussion.

Then, over time, add stronger comparison features:

related coverage clusters
source diversity views
article-level framing analysis
factuality/source-depth indicators
topic timelines
left/right/center coverage maps
correction and update tracking
“what is missing?” indicators

The product should not just answer:

What happened?

It should also help answer:

Who is covering it?
How are they framing it?
What context is missing?
Which claims are confirmed?
Which parts are interpretation?
How did coverage change over time?

That is where a news aggregator can become more than a headline feed.

Why I think this matters

The internet does not have an information shortage.

It has a context shortage.

There are endless headlines, feeds, posts, clips, takes, screenshots, and reactions.

But it is still hard to see the shape of coverage across sources.

It is hard to know which parts of a story are factual, which parts are framing, and which parts are omission.

It is hard to compare coverage without manually opening ten tabs.

It is hard to discuss news without the conversation becoming identity performance.

That is the space I am trying to explore with WeSearch.

Not a perfect truth machine.

Not another engagement feed.

Not another paywalled dashboard.

Just a clearer way to scan, compare, archive, and discuss what is being published.

The site is here:

https://wesearch.press

It is still rough in places, but the core structure is live.

I would be interested in criticism from people who care about search, RSS, journalism, media bias, recommendation systems, moderation, or information retrieval.

The question I keep coming back to is:

What would a news aggregator need to show before you would actually trust it?
