惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
G
GRAHAM CLULEY
P
Privacy & Cybersecurity Law Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
宝玉的分享
宝玉的分享
P
Proofpoint News Feed
H
Help Net Security
V
Visual Studio Blog
阮一峰的网络日志
阮一峰的网络日志
C
Cisco Blogs
人人都是产品经理
人人都是产品经理
Know Your Adversary
Know Your Adversary
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Recorded Future
Recorded Future
I
Intezer
罗磊的独立博客
T
The Exploit Database - CXSecurity.com
Blog — PlanetScale
Blog — PlanetScale
Malwarebytes
Malwarebytes
Spread Privacy
Spread Privacy
T
Tor Project blog
V
Vulnerabilities – Threatpost
云风的 BLOG
云风的 BLOG
腾讯CDC
B
Blog RSS Feed
Stack Overflow Blog
Stack Overflow Blog
F
Future of Privacy Forum
MyScale Blog
MyScale Blog
Latest news
Latest news
IT之家
IT之家
MongoDB | Blog
MongoDB | Blog
The Hacker News
The Hacker News
S
Securelist
博客园 - 【当耐特】
C
CXSECURITY Database RSS Feed - CXSecurity.com
T
Threat Research - Cisco Blogs
Jina AI
Jina AI
Cisco Talos Blog
Cisco Talos Blog
B
Blog
博客园 - 三生石上(FineUI控件)
Last Week in AI
Last Week in AI
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
M
MIT News - Artificial intelligence
V
V2EX
D
Darknet – Hacking Tools, Hacker News & Cyber Security
The Cloudflare Blog
The GitHub Blog
The GitHub Blog
博客园 - 聂微东
F
Full Disclosure
C
CERT Recently Published Vulnerability Notes

DEV Community

I built a CLI that eliminates README reading forever Measuring AI Gateway Failover: 30 Days of Production Data The Folly of Global AI Platforms: Or How We Built a System That Actually Works in Cameroon Week 9 The 10-Minute Race: Scaling the "Cancel Order" Button to 100K+ Requests Per Second SQL Performance: Indexing, Query Tuning & Explain Plans (Developer Guide) Why I Got Tired of Class-Heavy UI Code and Started Building Around Attributes GitHub Is No Longer a Place for Serious Work Build an AI-Powered Developer Portal with Backstage and .NET Updates to developer experience on Setapp Node.Js Express CRUD template Lint Your Phishing Templates Like You Lint Your Code From Code to Cloud: 3 Labs for Deploying Your AI Agent I built Voice2Sub: a local AI subtitle generator for video and audio The OCR Rabbit Hole Built a 100k-Document RAG System by Hand. Hermes Read the Architecture in 47 Seconds. I tried monetizing my MCP server with x402 — production needs more than npm install Understanding Tracking Dimensions in Accounting Integrations I Ran My Local, NOT AI, AI Code Auditor on Its Own Source Code Agent Surface Map: Gemma 4 review before you install an MCP Stop Being Nice, Start Being Right": The Day My User Reconfigured My Reward Function Building a Database Performance Testing Tool With AI: The Honest Breakdown Hot To Run LLMs Locally Research blockchain with post-quantum Dilithium and custom zk-STARKs from scratch AI agents do not just need tool access. They need execution control. The CTO’s Blueprint for Governing Multi-Agent AI Systems in the Enterprise I audited our CMS and 86% of our articles were invisible. A Sanity gotcha. Upselling Explained Industry-Specific Tactics for EC Owners 2026 I Keep Hermes Agent's Self-Improvement OFF For the First 14 Days — Here's What Happens When I Don't I Built the Hermes + Claude Code Dual-Stack: Orchestrator Meets Coder — Here's the Full Architecture Stop Using .iterrows(). Here's What Actually Fast Looks Like I Built a SaaS to Stop the Awkward "Hey, Did You Get My Invoice?" Conversation I Renamed a Hot Postgres Table Without Dropping a Request How to Build a Self-Hosted AI Gateway With LiteLLM and Open WebUI What is a Webhook? A Complete Guide for Beginners Headless BI: How a Universal Semantic Layer Replaces Tool-Specific Models Beyond Translation: A Developer's Guide to App Localization (i18n & l10n) Aegis: Designing an Offline Ambient Co-Working Companion for High-Burnout Medical and STEM Grinds Local LLM Code Completion Showdown: Zed AI vs Continue vs Cursor (Honest 2026 Review) The Agentic Payment Protocol Wars Your No-Code AI Agent Has a Memory Problem The Agentic Payment Protocol Wars How to Bypass LinkedIn Commercial Use Limit in 2026 (Without Paying $150/mo) We built a statechart hosting platform where two actors in the same state can migrate to different versions — here's why that matters Playwright vs TWD: A Frontend Developer's Honest Comparison Claude Code's skillListingBudgetFraction: The Undocumented Setting Silently Killing Half Your Skills O GitHub pode mudar sua carreira mais do que você imagina Just redesigned and launched my developer portfolio 🚀 Would genuinely love some honest feedback from the dev community 👨‍💻 Data Virtualization and the Semantic Layer: Query Without Copying Launching opub: donated compute for open-source maintainers Four iteration rounds on a security scanner I run, all of them visible. Here is what the loop actually looks like. Why Good Abstractions Make Debugging Harder Found a Coordinated Inauthentic Network on GitHub: 24 Accounts, Fabricated History, and a Generator That Left Its PID in Three READMEs Cursor Just Released Composer 2.5. Here's What Actually Changed for AI Coding Agents. What Wrong Docs Cost Test Automation Teams Export Your DeepSeek Chats to Word, PDF, Google Docs, Markdown & Notion in One Click When the Docs Lie OpenShift Observability: Built-in vs. Bring-Your-Own If your AI initiative is pending for 6 months, the bottleneck is probably not technology Hermes Agent Under the Hood: The Open-Source Runtime for Autonomous AI Systems Expert Systems -The AI That Existed Before AI Was Cool AI-generated accessibility, an update — frontier models still fail, but skills change the game My HTML Learning Journey 🚀 The Day PayPal Failed and the Rust Rewrite Saved the Product Launch Google Sheets CRM: 4 Ways I've Actually Done It (with Apps Script Code) BrontoScope: AI-Powered Error Investigations The job of an AI engineer inside a 40-person company is not what most CEOs think it is Building a Clinical Speech-Therapy App With a Real SLP: 4 Lessons From PhoenixSteps 7 overlooked .Net features How Stripe Took 48 Hours and 3 API Calls to Break My Freelance Income Stream in Lagos Pretty normal Both Camps in the 'Left Behind' Argument Are Right About Each Other Flutter MCP Toolkit v3 Google Just Shipped Gemini 3.5 Flash. Here's What Developers Actually Need to Know. 🔐 Working with Private Symfony Recipes Rate limiting in web apps: what to protect before picking a library Rate limiting en aplicaciones web: qué proteger antes de elegir una librería What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg What It Really Takes to Become a Senior Software Engineer Microservices Were Never About Technology JS Crime Scene: The Misleading Array Project-as-code for a Directus v9 backend When the API literally burned your database after a typo COOKIES DPRK Hacking Trends 2026: AI‑Powered Supply Chain and Developer Environment Attacks Phone control for AI coding sessions is not a tiny terminal PayPal and Crypto Are Not Equals: How I Built a Gumroad Alternative for Restricted Countries Exploring Tech as a Content Writer I Raised Gemma 4's Token Cap. The Dense Model Stopped Refusing. React Server Components Don't Make Your App Fast by Default Multi-Stage Builds for a Next.js App — Reduce Image Size by 70% I Built a Chrome Extension That Teaches Vocabulary While You Browse Why I Walked Back from Next.js and RSC to a Plain SPA and a Separate Backend NeuralPocket: Private On-Device AI with Gemma 4 — Android & Web Github Speckit: Revolucionando o Desenvolvimento com SDD Cloud Cost Elasticity I Built a Payment System for Bangladesh—Heres Why Stripe Failed Us Polyglot Persistence in Microservices: Choosing the Right Database for Each Service Centralized Authentication for a Multi-Brand Laravel Ecosystem How I made a perfect recording button. Simple yet complex thing.
Tutorial: This AI Now Tells You if a Meeting Could Be an Email
Andrew Dugan · 2026-05-22 · via DEV Community

Key Takeaways

  • DigitalOcean's Inference Router semantically routes prompts to the most appropriate model based on custom instructions. The setup process is 'point-and-click', with no hardcoded "if/else" logic required.
  • The router is built directly into the inference pipeline. Users can make inference requests normally, and the router automatically handles the workflow.
  • In our workflow, it determines the nature of the task and routes the request to a cheaper, faster model to write an email or a larger, more advanced model to write a meeting agenda. This architecture can scale beyond meetings and can be used for support tickets, code reviews, legal documents, and more.

Think back to the last time you received a calendar invite with no agenda, 12 attendees, and a title that says "Quick Sync". We've all either held or attended meetings that "could have been an email" at some point, but what if there was a way to have a gentle nudge built straight into your workflow that only leads us into a meeting when the task requires it. Instead of defaulting to a meeting, one could describe the details of the task that needs to be addressed, and immediately either an email is written for you to send out or a meeting agenda is written ready to attach to your calendar invites. To take it a step further, emails and meeting agendas require different levels of depth and consideration, and ultimately different LLMs to write them.

We've built exactly this using DigitalOcean's new Inference Router, a policy-driven routing layer that matches each incoming prompt to the right model based on task complexity without hardcoded "if/else" logic required. In this tutorial, we will cover the "Could have been an email" router that we built using this new feature, how it works, and how to build your own custom router with DigitalOcean's tools.

How the router works with DigitalOcean

Traditional LLM (large language model) inference involves sending a request to a single model and getting a response. The better or worse the model, the better or worse the response. LLM routers are a layer in between you and a group of models that takes your request, identifies the best model for the request, and has that specific model handle it. Routers can be customized to choose models based on speed, price, specific task, or any other optimization you are looking for. It allows teams to set up a single endpoint for a wide range of needs while getting the best possible price and speed for each request.

In our case, we built a router with two tasks. The first task we made is write_email. It is backed by a cheap, fast model (Llama 3.3 Instruct 70B) for writing a simple email. The second task is write_meeting_agenda. It is backed by a frontier model (Anthropic Claude Opus 4.7) to create a detailed meeting plan to discuss decisions that genuinely require talking to each other. In the request, you describe what you need done, the topic, the stakeholders, and any agenda items, and the router reads that description, matches it against the task definitions, and routes it to whichever model fits. If the request lands on the write_email task, the router delivers a verdict of "this could be an email" and generates a ready-to-send email draft. If it lands on write_meeting_agenda, the app confirms the meeting is warranted and produces a structured agenda with talking points and action items. The routing decision itself is the verdict. No additional classification logic is needed.

Step 1. Build the router

The first step to building a router is to log in to your DigitalOcean cloud account, or create an account if you don't have one already. Navigate to the router page and select "Create Router".

The DigitalOcean Create a Router page showing the name and description fields

On the Create a Router page, give the router a unique name and a description. That description is not just metadata. It serves as a routing prompt, giving the router overall context so it can identify the most appropriate task for each incoming request. From there you define the tasks that make up the router's logic. Each task combines a name, a description, and a model pool with a selection policy. You can either add pre-configured tasks that DigitalOcean has already benchmarked and optimized, or define fully custom tasks that specify exactly which models to use and how to rank them, whether by cost efficiency, speed (Time To First Token), or a manual ranking you control.

The Add Custom Task dialog showing task name, description, and model pool fields

Once your tasks are in place, the last piece is specifying fallback models. Fallback models catch any request that does not cleanly match one of your configured tasks, and they are tried in the priority order you set. This gives the router a safety net so that even if the incoming prompt is ambiguous or outside the scope of your named tasks, a response is still generated rather than failing silently. For our email/meeting router, that means a borderline "is this a meeting or an email?" input never goes unanswered.

If you prefer automation over the control panel, you can also create the router with a single POST request to https://api.digitalocean.com/v2/gen-ai/models/routers, passing in the same names, task definitions, selection policies, and fallback models as a JSON body, which is also useful for version-controlling your router alongside your application code.

Step 2. Build the app

With the router created, integrating it into an application is straightforward because the router is a drop-in replacement for any direct model call. You use the same Chat Completions endpoint (https://inference.do-ai.run/v1/chat/completions) and the same request shape, but instead of naming a specific model you prefix your router's name with router: in the model field. For this app, the field would look like "model": "router:meeting-or-email". Authentication works the same way. You generate a Model Access Key from the DigitalOcean Control Panel, export it as MODEL_ACCESS_KEY, and pass it as a Bearer token in your request header. The user's meeting description, agenda, and attendee list become the message content, and the router takes it from there.

[label meeting_or_email.py]
import requests

def meeting_or_email(user_input):
    url = "https://inference.do-ai.run/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer <^>YOUR_MODEL_ACCESS_KEY<^>",
    }
    data = {
        "model": "router:meeting-or-email",
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are a workplace productivity assistant that evaluates whether a task "
                    "requires a live meeting or can be handled asynchronously via email. "
                    "If the request involves a straightforward update, announcement, or single-topic "
                    "communication with no real-time decision-making needed, write a concise, "
                    "professional email draft and state that this could have been an email. "
                    "If the request requires discussion, real-time collaboration, debate, or "
                    "coordination among multiple stakeholders with competing priorities, produce "
                    "a structured meeting agenda with talking points and action items, and confirm "
                    "that a meeting is warranted. Always begin your response by clearly stating "
                    "your verdict: 'This could be an email.' or 'This warrants a meeting.'"
                ),
            },
            {
                "role": "user",
                "content": user_input,
            }
        ]
    }

    response = requests.post(url, headers=headers, json=data)
    response_body = response.json()
    print(f"Model: {response_body['model']}")
    print(f"Message: {response_body['choices'][0]['message']['content']}")

Enter fullscreen mode Exit fullscreen mode

The model field in the response body tells you exactly which model the router selected for that request. Requests the router judged as routine land on the cheaper, faster model, while requests it judged as genuinely complex land on the frontier model. The x-model-router-selected-route response header tells you which task was matched, for example write_email vs write_meeting_agenda, or fallback if none of the tasks matched. The app does not need any if/else logic to decide what kind of meeting it is. It reads the header the router already populated and maps it to a verdict message for the user.

meeting_or_email("I need to plan a large event with multiple stakeholders that will all be involved.")

Enter fullscreen mode Exit fullscreen mode

[secondary_label Output]
Model: anthropic-claude-opus-4.7
Message: This warrants a meeting.

Coordinating a large event with multiple stakeholders involves competing priorities, real-time negotiation of responsibilities, and collaborative decision-making that simply cannot be handled efficiently via email threads. Below is a structured agenda to make the meeting productive.

---

## Event Planning Kickoff Meeting
...

Enter fullscreen mode Exit fullscreen mode

You can see above that with a large project the task is routed to Opus 4.7. With a smaller task that just warrants an email, below, the task is routed to Llama3.3.

meeting_or_email("I have some metrics I want to share with my team.")

Enter fullscreen mode Exit fullscreen mode

[secondary_label Output]
Model: llama3.3-70b-instruct
Message: This could be an email. 

Here's a draft email you could send to your team:

Subject: Update on Key Metrics

Dear Team,
...

Enter fullscreen mode Exit fullscreen mode

Step 3. Deploy to DigitalOcean App Platform

Before deploying your own router, it is worth spending a few minutes in the Inference Router playground to validate that the router is routing the way you expect. From the My Routers tab, click the menu next to your router and select a model to compare it against. The Playground opens in a split view where you can type a meeting description and see both the router's response and the comparison model's response side by side. Each result shows the cost difference, end-to-end latency, the specific model the router selected, and the task that was matched for that query. This is a useful check to confirm that your task descriptions are correctly discriminating between routine syncs and complex-coordination requests before any real traffic hits the router.

Once deployed, the Analyze tab gives you a live view of how the router is performing in production. You can see aggregate metrics across all your routers or drill into a specific one, including total requests, total token usage, model match rate, and fallback rate. Model match rate is the percentage of requests matched to a configured task, and fallback rate is the percentage that fell through to the fallback models instead. For accuracy evaluation, the Router Evaluation tool in the Playground tab lets you upload a labeled dataset and run an LLM-as-a-Judge evaluation that scores responses on completeness, correctness, token usage, and latency. Together these two views give you what you need to iterate on your task descriptions and model pools after launch as you accumulate real meeting data.

Conclusion

The meeting app we built is a thin wrapper around a genuinely powerful idea. You do not have to choose which model handles a request, you just have to describe the conditions under which each model makes sense and let the router enforce those conditions at runtime. The router does not just save money on tokens. It changes how you think about designing for complexity. Instead of building one prompt that works adequately for everything, you build narrow, well-described task buckets and let semantic matching handle the dispatch.

The broader lesson here extends well beyond meetings and emails. The same pattern applies anywhere you have a mix of requests hitting a single endpoint. This could include a customer support queue where most tickets are simple FAQs but a few require nuanced reasoning, a code review pipeline where style fixes and architecture feedback warrant very different models, or a legal document classifier where boilerplate and novel clauses should not cost the same to process. Once you have written a router description and a pair of task definitions, you have infrastructure that scales horizontally without adding branching logic to your application code. DigitalOcean's platform keeps that infrastructure on one bill and one security model, which removes the operational overhead that typically discourages teams from adopting multi-model strategies in the first place.

Related links