AI Workflows Need Topological Sort

Arpit Bhayani · 2026-06-01 · via Hacker News - Newest: "AI"

Every AI workflow is a dependency problem. You have steps that produce outputs, other steps that consume those outputs, and a hard constraint: consumers cannot run before their producers finish. Get the order wrong and you read stale data, call a tool with missing context, or trigger an agent before its inputs are ready.

Directed acyclic graphs (DAGs) are the right model for this. Topological sort turns a DAG into an execution order. Together they form a core primitive for executing applied AI systems, and understanding them from first principles pays off when you design, debug, and scale workflows.

The Dependency Problem in AI Workflows

Take a realistic document processing pipeline:

1. Fetch raw documents from storage
2. Chunk and clean the text
3. Embed each chunk
4. Store embeddings in a vector index
5. Run a retrieval query against the index
6. Pass retrieved chunks into a prompt template
7. Call the LLM
8. Parse and validate the output

Each step depends on the one before it. If you run step 3 before step 2 finishes, you embed dirty text. If you run step 5 before step 4, you query an incomplete index. The dependencies are not optional constraints - they are correctness constraints.

In a simple linear pipeline you can just run steps 1 through 8 in order. But real workflows branch. Some steps are independent of each other. Some steps fan out into parallel subtasks and fan back in. A linear list breaks down fast.

This is where a DAG becomes the right abstraction.

Modeling Workflows as DAGs

A DAG represents a workflow as a set of nodes (tasks) and directed edges (dependencies). An edge from A to B means “A must complete before B starts.” The acyclicity constraint means there are no circular dependencies - a task cannot transitively depend on itself.

Here is a branching RAG pipeline modeled as a DAG:

[Figure: Workflow-RAG]

keyword_filter and retrieve are independent of each other once store_index finishes. They can run in parallel. merge_context cannot start until both finish. A linear list cannot express this. A DAG can.
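One way to write such a graph down in code is an adjacency mapping from each task to the tasks that consume its output. The sketch below is an illustration only: the node names fetch_documents, store_index, keyword_filter, retrieve, merge_context, and call_llm come from the text, while chunk_and_clean, embed_chunks, the exact edges between them, and the helper upstream_of are assumptions made for the example.

```python
# Adjacency mapping: each key lists the tasks that consume its output.
# chunk_and_clean and embed_chunks are assumed names for the intermediate
# steps, and the exact edge set is an assumption, not the original figure.
rag_dag: dict[str, list[str]] = {
    "fetch_documents": ["chunk_and_clean"],
    "chunk_and_clean": ["embed_chunks"],
    "embed_chunks": ["store_index"],
    "store_index": ["keyword_filter", "retrieve"],  # fan-out
    "keyword_filter": ["merge_context"],
    "retrieve": ["merge_context"],                  # fan-in at merge_context
    "merge_context": ["call_llm"],
    "call_llm": [],
}


def upstream_of(graph: dict[str, list[str]], task: str) -> list[str]:
    """Return the direct producers of `task`, i.e. what it must wait for."""
    return sorted(src for src, consumers in graph.items() if task in consumers)


print(upstream_of(rag_dag, "merge_context"))  # ['keyword_filter', 'retrieve']
```

The fan-out and fan-in are visible directly in the data: store_index has two consumers, and merge_context has two producers.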

Topological Sort

A topological ordering of a DAG is a sequence of all nodes such that for every edge $u \to v$, node $u$ appears before $v$. Producers always precede consumers. Kahn’s algorithm computes this in $O(V + E)$ time.

Applied to the pipeline above, this produces an order where fetch_documents runs first, call_llm runs last, and keyword_filter and retrieve appear before merge_context but without any ordering constraint between each other. That freedom is what enables parallelism.
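A minimal sketch of Kahn’s algorithm follows, assuming the graph is a dict that maps each node to the nodes depending on it (edges point from producer to consumer). The function name topological_sort and the small example graph are illustrative choices, not taken from a particular library.

```python
from collections import deque


def topological_sort(graph: dict[str, list[str]]) -> list[str]:
    """Kahn's algorithm. graph[u] lists the nodes v with an edge u -> v."""
    # Collect every node, including ones that only appear as a consumer.
    nodes = set(graph)
    for consumers in graph.values():
        nodes.update(consumers)

    # in_degree[v] = number of producers v is still waiting on.
    in_degree = {node: 0 for node in nodes}
    for consumers in graph.values():
        for v in consumers:
            in_degree[v] += 1

    # Start with the nodes that have no producers (sorted for determinism).
    queue = deque(sorted(n for n in nodes if in_degree[n] == 0))
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph.get(u, []):
            in_degree[v] -= 1
            if in_degree[v] == 0:  # all producers of v are already in `order`
                queue.append(v)

    # Nodes on a cycle never reach in-degree zero, so they are never emitted.
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_sort(example))  # ['a', 'b', 'c', 'd']
```

Each node is enqueued once and each edge is examined once, which is where the linear running time comes from.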

Topological Sort Beyond Ordering

Parallelism for Free

Nodes at the same “level” of the topological order have no dependency between them. They can run concurrently. A smarter scheduler groups nodes by their earliest possible start time:

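from collections import defaultdict


def execution_levels(graph: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into batches; graph[node] lists the tasks that depend on node."""
    # Count how many producers each task is still waiting on.
    in_degree = defaultdict(int)
    for node in graph:
        for dependent in graph[node]:
            in_degree[dependent] += 1

    levels = []
    # Tasks with no producers can start immediately.
    ready = [n for n in graph if in_degree[n] == 0]

    while ready:
        levels.append(ready)
        next_ready = []
        for node in ready:
            # .get() tolerates tasks that appear only as a dependent.
            for dependent in graph.get(node, []):
                in_degree[dependent] -= 1
                if in_degree[dependent] == 0:
                    next_ready.append(dependent)
        ready = next_ready

    return levels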

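Each list in levels is a batch of tasks that can execute in parallel. Workflow orchestrators such as Prefect and Airflow rely on the same principle: a task becomes eligible to run once all of its upstream tasks have completed, which keeps executors busy.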

Cycle Detection Before Execution

If a user or a config declares a circular dependency, Kahn’s algorithm never reduces the in-degree of the nodes on the cycle to zero, so the resulting order contains fewer nodes than the graph. A check such as len(order) != len(graph) catches this before a single task runs. This is not a nice-to-have. A cycle means a deadlock: A waits for B, B waits for A, nothing makes progress. Detecting it at definition time rather than runtime is the difference between a clear error message and a hung pipeline at 2am.

Multi-Agent Systems Are DAGs

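Multi-agent orchestration frameworks like LangGraph model agent interactions as graphs for the same reason: to enforce correct execution order across agents that produce and consume each other’s outputs. (LangGraph also allows cycles for iterative loops; the ordering guarantees discussed here apply when the agent graph is acyclic.)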

Consider a research workflow with four agents:

[Figure: Workflow - Multi-agent]

The orchestrator cannot dispatch writer_agent until critic_agent finishes, and cannot dispatch critic_agent until summarizer_agent finishes. Topological sort produces this order automatically from the dependency declarations. The orchestrator does not need to hardcode sequencing logic.

Now add a parallel branch:

[Figure: Workflow - Pipeline]

search_agent and retrieval_agent are independent. They can run simultaneously. summarizer_agent waits on both. The topological sort respects this: search_agent and retrieval_agent appear in the same execution level, and summarizer_agent appears in the next level, because its in-degree drops to zero only after both have finished.

This scales. Add ten agents, add conditional branches, add fan-outs - the DAG model and topological sort handle the complexity. Hardcoded sequential dispatch does not.
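To make the orchestration idea concrete, here is a small sketch of an orchestrator that derives dispatch order from the declared edges and runs each level concurrently with asyncio. The agent names and edges follow the text; run_agent is a hypothetical stand-in for a real agent call, and levels_of and orchestrate are names chosen for this example. It assumes every agent appears as a key in the graph.

```python
import asyncio

# Edges point from producer to consumer, as described in the text.
AGENT_DAG: dict[str, list[str]] = {
    "search_agent": ["summarizer_agent"],
    "retrieval_agent": ["summarizer_agent"],
    "summarizer_agent": ["critic_agent"],
    "critic_agent": ["writer_agent"],
    "writer_agent": [],
}


def levels_of(graph: dict[str, list[str]]) -> list[list[str]]:
    """Group agents into batches whose members have no pending producers."""
    in_degree = {name: 0 for name in graph}
    for consumers in graph.values():
        for consumer in consumers:
            in_degree[consumer] += 1
    ready = sorted(name for name in graph if in_degree[name] == 0)
    levels = []
    while ready:
        levels.append(ready)
        next_ready = []
        for name in ready:
            for consumer in graph[name]:
                in_degree[consumer] -= 1
                if in_degree[consumer] == 0:
                    next_ready.append(consumer)
        ready = next_ready
    return levels


async def run_agent(name: str, log: list[str]) -> None:
    # Hypothetical placeholder: a real agent would call a model or a tool here.
    await asyncio.sleep(0)
    log.append(name)


async def orchestrate(graph: dict[str, list[str]]) -> list[str]:
    """Dispatch agents level by level; agents within a level run concurrently."""
    log: list[str] = []
    for level in levels_of(graph):
        await asyncio.gather(*(run_agent(name, log) for name in level))
    return log


print(levels_of(AGENT_DAG))
completed = asyncio.run(orchestrate(AGENT_DAG))
```

Dispatching level by level is the simplest policy but it inserts a barrier between levels. A scheduler that starts each agent as soon as its own producers finish can do better when tasks within a level have uneven durations.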

Incremental Re-execution

When an upstream task fails or an input changes, you do not need to re-run the entire pipeline. Traverse the graph forward from the affected node and recompute only the nodes in its subgraph. Every node outside that subgraph has valid cached output.

This is standard in build systems (Bazel, Buck) and is increasingly common in AI pipeline frameworks. The DAG structure is what makes it tractable. Without explicit dependency edges you cannot know which downstream nodes are affected.
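The forward traversal described above can be sketched as a breadth-first search over consumer edges. The function name invalidated_by and the example graph are assumptions for illustration; graph[u] is taken to list the tasks that consume u’s output.

```python
from collections import deque


def invalidated_by(graph: dict[str, list[str]], changed: str) -> set[str]:
    """Return `changed` plus every task reachable from it: the set to re-run."""
    stale = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for consumer in graph.get(node, []):
            if consumer not in stale:  # visit each downstream task once
                stale.add(consumer)
                queue.append(consumer)
    return stale


# Hypothetical graph: "b" changed, so only "b" and its consumer "d" are stale.
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["d"]}
print(sorted(invalidated_by(example, "b")))  # ['b', 'd']
```

Everything outside the returned set can keep its cached output, provided the cache is keyed on the task’s inputs.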

What to Watch Out For

Fan-in bottlenecks are real. If ten parallel tasks all feed into one merge node, the merge node cannot start until the slowest of the ten finishes. Topological sort tells you the order correctly but does not automatically balance work. Profile the critical path - the longest chain of dependent tasks - to find where parallelism gains are actually limited.
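As a rough illustration of critical-path profiling, the sketch below computes the longest duration-weighted chain by relaxing edges in topological order. The task names and durations are invented for the example, and critical_path is a hypothetical helper; it assumes a non-empty graph in which every task appears as a key.

```python
from collections import deque


def critical_path(
    graph: dict[str, list[str]], duration: dict[str, float]
) -> tuple[float, list[str]]:
    """Return (total time, tasks) of the longest chain. graph[u] lists consumers of u."""
    in_degree = {n: 0 for n in graph}
    for consumers in graph.values():
        for c in consumers:
            in_degree[c] += 1

    finish: dict[str, float] = {}   # earliest possible finish time per task
    best_pred: dict[str, str | None] = {}  # predecessor on the longest chain
    queue = deque(n for n in graph if in_degree[n] == 0)
    for n in queue:
        finish[n] = duration[n]
        best_pred[n] = None

    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v in graph[u]:
            candidate = finish[u] + duration[v]
            if candidate > finish.get(v, float("-inf")):
                finish[v] = candidate   # v cannot finish before its slowest producer
                best_pred[v] = u
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    if processed != len(graph):
        raise ValueError("graph contains a cycle")

    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return total, path[::-1]


tasks = {"load": ["fast", "slow"], "fast": ["merge"], "slow": ["merge"], "merge": []}
seconds = {"load": 1, "fast": 2, "slow": 10, "merge": 1}
print(critical_path(tasks, seconds))  # (12, ['load', 'slow', 'merge'])
```

In this made-up example, speeding up fast changes nothing; only work on slow shortens the end-to-end time.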

Cycles in configuration are a user error, not a framework bug. Build validation that catches them at workflow definition time, surfaces a clear error with the offending cycle identified, and rejects the workflow before any execution begins.
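For definition-time validation, Python’s standard library (3.9+) ships graphlib.TopologicalSorter, which raises CycleError and reports the offending cycle. The sketch below wraps it; validate_workflow is a hypothetical name, and note the orientation: the mapping here goes from each task to the tasks it depends on, which is what graphlib expects.

```python
from graphlib import CycleError, TopologicalSorter


def validate_workflow(depends_on: dict[str, set[str]]) -> list[str]:
    """Return a valid execution order, or raise ValueError naming the cycle.

    depends_on[task] holds the tasks that `task` waits for (its producers).
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # The second argument is the cycle as a list of nodes, with the
        # first and last entries being the same node.
        cycle = exc.args[1]
        message = "circular dependency: " + " -> ".join(map(str, cycle))
        raise ValueError(message) from exc


print(validate_workflow({"b": {"a"}, "c": {"b"}}))  # ['a', 'b', 'c']
```

Calling this when the workflow is registered, rather than when it is first run, keeps the failure close to the configuration change that caused it.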

The Mental Model to Keep

A DAG is a contract. It says: here are the tasks, here are their dependencies, and here is the guarantee that there are no circular waits. Topological sort is the mechanism that converts that contract into an actionable execution schedule.

When you design an AI workflow, draw the dependency graph first. Identify which tasks are truly sequential and which are independent. That graph is your specification. The topological ordering is your scheduler’s input. Everything else - parallelism, cycle safety, incremental recomputation - falls out of the structure you have already declared.

Footnote: DAGs model AI workflows and multi-agent systems as dependency graphs where edges encode execution order constraints. Topological sort converts this graph into a valid task schedule in $O(V + E)$ time, detects circular dependencies before execution begins, and reveals which tasks can run in parallel. For multi-agent orchestration, this means agents are dispatched only when their inputs are ready, with no hardcoded sequencing logic required.
