From 9 Tiles to 900: Scaling Computer Vision Pipelines
Eric D Johnson · 2026-06-05 · via DEV Community

The scale wall

A computer vision pipeline that works on one image at one resolution isn't a pipeline. It's a prototype. The moment you move beyond controlled inputs, you hit the reality of production images: a 4K video frame, a satellite capture, a whole-slide pathology image, a high-resolution document scan. These images don't fit in a single model call. They're too large, too detailed, and too information-dense for one inference pass to handle well.

So you tile it. You divide the image into a grid of regions and run inference on each region independently. A 3×3 grid means 9 inference calls. An 8×8 grid means 64. A whole-slide pathology image at diagnostic resolution? Tens of thousands of tiles.

The orchestration problem scales directly with the image.

And as the tile count grows, so do the failure modes. Nine concurrent inference calls might all succeed. Sixty-four concurrent calls will occasionally hit a throttle limit or a timeout. At hundreds of tiles, partial failures aren't edge cases. They're expected. Needing orchestration for your CV pipeline is a given. The real requirement is that the orchestration scales with your image.

The pattern you already use

Tiled inference isn't a niche technique. It's the industry standard for any image that exceeds a model's input constraints. SAHI (Slicing Aided Hyper Inference) has over 35,000 stars on GitHub. It partitions images into overlapping slices, runs detection on each slice, and stitches results together. Digital pathology pipelines routinely tile gigapixel whole-slide images into thousands of patches for parallel inference. Satellite imagery processing architectures on AWS all involve the same core pattern: tile, infer in parallel, aggregate.
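To make "overlapping slices" concrete, here's a minimal sketch of how slice start offsets along one image axis could be computed. This is not SAHI's actual implementation. The function name `sliceOffsets` and its parameters are my own illustration, and I'm assuming the overlap is expressed as a fraction of the slice size.

```typescript
// Hypothetical helper: start offsets (in pixels) for overlapping slices along
// one axis. The last slice is placed flush with the edge so nothing is cut off.
function sliceOffsets(length: number, sliceSize: number, overlapRatio: number): number[] {
  if (sliceSize >= length) return [0]; // one slice covers the whole axis
  // Distance between consecutive slice starts; never less than 1 pixel.
  const stride = Math.max(1, Math.floor(sliceSize * (1 - overlapRatio)));
  const offsets: number[] = [];
  for (let start = 0; start + sliceSize < length; start += stride) {
    offsets.push(start);
  }
  offsets.push(length - sliceSize); // final slice ends exactly at the edge
  return offsets;
}

// A 1000 px axis with 400 px slices and 25% overlap yields starts at 0, 300, 600.
const exampleOffsets = sliceOffsets(1000, 400, 0.25);
```

Running this once per axis and taking the cross product of the two offset lists gives the full set of slices. The overlap is what lets an object sitting on a tile boundary appear whole in at least one slice.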

The pattern is well-established. What's missing is the orchestration layer that makes it durable at scale. SAHI runs on a single machine. Production pathology pipelines require custom coordinator services, worker pools, and explicit failure handling infrastructure. Everyone builds the same glue differently.

AWS Lambda durable functions introduce an operation called context.map() that maps directly onto this pattern. It fans out an array of items as independent concurrent invocations, each independently checkpointed, with a configurable concurrency cap. One failed tile retries only that tile, not the entire image. The same line of code handles 9 tiles or 900.

What I built

In this post, I walk through an image analysis pipeline I built using durable functions to demonstrate this pattern concretely. The application accepts an image and divides it into an N×N grid of regions. It runs concurrent Amazon Bedrock inferences across the grid, synthesizes the results into a scene description with per-object bounding boxes, and streams progress to a real-time dashboard via WebSocket.

The request flow:

  1. Upload: The browser requests a presigned S3 URL and uploads the image directly to Amazon S3.
  2. Trigger: The browser calls the analyze endpoint. An API Lambda fires the durable pipeline asynchronously and returns AWS AppSync connection details.
  3. Subscribe: The browser opens a WebSocket to AppSync Events and subscribes to the pipeline's execution channel.
  4. Pipeline: A single durable function executes four checkpointed steps: preprocess, analyze (fan-out), synthesize, and store.
  5. Dashboard: Results stream to a shared display as each tile completes, with Jarvis-style bounding box overlays on detected objects.

The entire backend is two Lambda functions: one API handler and one durable pipeline function. No queue infrastructure. No separate orchestration service. No worker pool management.

Walking through the pipeline

Take a look at the pipeline handler. The entire orchestration reads as sequential code: four steps, top to bottom.

export const handler = withDurableExecution(
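  async (event: AnalysisPipelineEvent, context: DurableContext) => {
    // Note: imageFormat and ch (the AppSync Events channel) are defined in the
    // full source; their setup is omitted from this excerpt.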

    // Step 1: preprocess - moderate + build region grid
    const preprocessed = await context.step('preprocess', async () => {
      const gridSize = Number(event.gridSize ?? 3);
      const imageBase64 = await fetchImageBase64(event);
      await moderateImage(imageBase64, imageFormat);
      return { regions: buildRegions(gridSize) };
    });

    // Step 2: context.map - parallel region inference
    const mapResults = await context.map(
      'analyze-regions',
      preprocessed.regions,
      async (ctx: DurableContext, region: ImageRegion, index: number) => {
        return await ctx.step(`analyze-region-${index}`, async () => {
          const imageBase64 = await fetchImageBase64(event);
          const finding = await analyzeRegion(imageBase64, imageFormat, region);
          await publish(ch, [{ type: 'region', index, status: 'done', finding }]);
          return {
            regionIndex: finding.regionIndex,
            regionLabel: finding.regionLabel,
            analysis: finding.analysis.slice(0, 500),
            detectedObjects: (finding.detectedObjects ?? []).slice(0, 8),
          };
        });
      },
      { maxConcurrency: 5 },
    );

    const successfulFindings = mapResults.succeeded()
      .map(item => item.result as RegionFinding);

    // Step 3: synthesize
    const synthesis = await context.step('synthesize', () =>
      synthesizeFindings(successfulFindings)
    );

    // Step 4: store
    const stored = await context.step('store', async () => {
      // Persist to DynamoDB + publish dashboard event via AppSync
    });
  }
);


I'll walk through each step and what it does for you at scale.

Step 1: Preprocess

The first step handles content moderation and builds the region grid. The grid size is a parameter: set it to 3 for a 3×3 grid (9 regions) or 8 for an 8×8 grid (64 regions). Choose it based on the image, since larger or more complex images benefit from finer-grained tiling.
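The post doesn't show the grid builder itself, so here's a minimal sketch of what one could look like. The `ImageRegionSketch` shape (normalized 0 to 1 coordinates plus a label) and the name `buildRegionsSketch` are my assumptions for illustration. The repository's actual `buildRegions()` and `ImageRegion` may differ.

```typescript
// Hypothetical region shape: coordinates are fractions of the full image.
interface ImageRegionSketch {
  index: number;
  label: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Builds an N×N grid of equally sized regions in row-major order.
function buildRegionsSketch(gridSize: number): ImageRegionSketch[] {
  if (!Number.isInteger(gridSize) || gridSize < 1) {
    throw new RangeError('gridSize must be a positive integer');
  }
  const size = 1 / gridSize;
  const regions: ImageRegionSketch[] = [];
  for (let row = 0; row < gridSize; row++) {
    for (let col = 0; col < gridSize; col++) {
      regions.push({
        index: row * gridSize + col,
        label: `row ${row + 1}, col ${col + 1}`,
        x: col * size,
        y: row * size,
        width: size,
        height: size,
      });
    }
  }
  return regions;
}
```

Because the regions are small plain objects with no image bytes, the whole array is cheap to return from a checkpointed step.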

The durable runtime checkpoints this step. If the Lambda function dies after preprocessing completes, replay skips directly to step 2. The moderation check and grid computation don't repeat.
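If the replay behavior feels abstract, this toy model may help. It is an in-memory illustration only, assuming checkpoints are keyed by step name. The real durable runtime persists checkpoints outside the function and is not implemented this way. `toyStep` and `runToyPipeline` are hypothetical names.

```typescript
// Toy checkpoint store: step name -> saved result.
type Checkpoints = Map<string, unknown>;

// Runs fn once per step name; on replay, returns the saved result instead.
async function toyStep<T>(store: Checkpoints, name: string, fn: () => Promise<T>): Promise<T> {
  if (store.has(name)) {
    return store.get(name) as T; // replay: skip the work entirely
  }
  const result = await fn();
  store.set(name, result); // checkpoint before moving on
  return result;
}

let preprocessRuns = 0;

// A two-stage pipeline that can be told to crash right after preprocessing.
async function runToyPipeline(store: Checkpoints, crashAfterPreprocess: boolean): Promise<number> {
  const regionCount = await toyStep(store, 'preprocess', async () => {
    preprocessRuns += 1; // counts how often the expensive work really ran
    return 9;
  });
  if (crashAfterPreprocess) {
    throw new Error('simulated crash');
  }
  return regionCount;
}
```

Call `runToyPipeline` once with a crash and again without, sharing the same store, and `preprocessRuns` stays at 1. The second run picks up the saved result and moves straight on.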

Step 2: context.map(), the tiled inference step

This is the core of the pattern. context.map() takes the array of regions from step 1 and fans them out as independent concurrent invocations. Each region gets its own checkpointed step. Each invocation fetches the image independently, runs inference against Bedrock, and returns findings for that region.

const mapResults = await context.map(
  'analyze-regions',
  preprocessed.regions,
  async (ctx: DurableContext, region: ImageRegion, index: number) => {
    return await ctx.step(`analyze-region-${index}`, async () => {
      const imageBase64 = await fetchImageBase64(event);
      const finding = await analyzeRegion(imageBase64, imageFormat, region);
      return { /* region findings */ };
    });
  },
  { maxConcurrency: 5 },
);


Three things to notice here.

First, maxConcurrency: 5 caps how many tiles process simultaneously. For the demo I set this to 5. In production, you'd match this to your Bedrock throughput quota: 20, 50, or higher depending on your provisioned capacity.

Second, each tile re-fetches the image from S3 rather than receiving it as input. Image bytes are too large for checkpoint storage, so each tile must be self-contained.

Third, each tile's result is independently checkpointed. If tile 6 out of 9 fails, every tile that already completed keeps its result. Only tile 6 retries.

The model invocation itself uses the Amazon Bedrock Converse API:

export async function invokeNova(
  prompt: string,
  imageBase64: string,
  imageFormat: ImageFormat
): Promise<string> {
  const response = await client.send(new ConverseCommand({
    modelId: MODEL_ID,
    messages: [{
      role: 'user',
      content: [
        { image: { format: imageFormat, source: { bytes: new Uint8Array(Buffer.from(imageBase64, 'base64')) } } },
        { text: prompt }
      ]
    }],
    inferenceConfig: { maxTokens: 512 }
  }));
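  // content[0].text can be absent (for example, an empty or non-text reply),
  // so fall back to an empty string to honor the Promise<string> return type.
  return response.output?.message?.content?.[0]?.text ?? '';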
}


I'm using Amazon Nova Lite for the demo because it's fast and cost-effective for concurrent vision calls. However, the model is a pluggable parameter. You can swap to Anthropic Claude for more nuanced reasoning on the synthesis step, route to an Amazon SageMaker endpoint for a custom-trained detection model, or use different models for different steps entirely.

The orchestration pattern doesn't change. Only the inference call changes.

Step 3: Synthesize

After the map operation completes, all successful region findings are available as an array. The synthesize step aggregates them into a coherent scene description with overall object detection results and computer vision insights.

const successfulFindings = mapResults.succeeded()
  .map(item => item.result as RegionFinding);

const synthesis = await context.step('synthesize', () =>
  synthesizeFindings(successfulFindings)
);


Model selection becomes a scaling lever at this step. The tiled inference step runs N times concurrently, so you want it fast and cheap. The synthesis step runs once and needs to reason across all findings. You might want a more capable model here. Same orchestration code, different model routing per step based on the complexity of the task.
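One part of aggregation needs no model at all: converting a box detected inside a tile into full-image coordinates. Here's a minimal sketch, assuming both the tile and the detection are expressed as normalized 0 to 1 rectangles. The `Box` shape and `toImageCoords` are hypothetical. The repository's finding format may differ.

```typescript
// Hypothetical rectangle: all values are fractions of the enclosing frame.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Maps a box given relative to a tile into coordinates relative to the image.
function toImageCoords(tile: Box, local: Box): Box {
  return {
    x: tile.x + local.x * tile.width,
    y: tile.y + local.y * tile.height,
    width: local.width * tile.width,
    height: local.height * tile.height,
  };
}

// A detection filling the lower-right quarter of the lower-right tile (2×2 grid)
// lands in the lower-right sixteenth of the image.
const lowerRightTile: Box = { x: 0.5, y: 0.5, width: 0.5, height: 0.5 };
const detection: Box = { x: 0.5, y: 0.5, width: 0.5, height: 0.5 };
const globalBox = toImageCoords(lowerRightTile, detection);
```

Doing this translation in plain code keeps the synthesis prompt focused on reasoning across findings rather than on arithmetic.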

Step 4: Store

The final step persists the analysis result to Amazon DynamoDB and publishes a dashboard event through AppSync. Because this runs inside a checkpointed step, a failure here doesn't repeat the expensive inference steps. Only the storage operation retries.

Scale mechanics: what happens as N grows

The pipeline I've shown works with a 3×3 grid: 9 tiles, 9 inference calls. What happens when you need 64 tiles? Or 400? The code doesn't change. But the architecture decisions I made become increasingly important.

Image size drives tile count

The grid size is a parameter. A 3×3 grid works for a demo image. A high-resolution satellite capture might need an 8×8 grid. A whole-slide pathology image at diagnostic resolution might need a 20×20 grid or larger.

The buildRegions() function generates the grid based on that parameter. The context.map() call processes whatever array it receives. From the orchestration's perspective, 9 regions and 400 regions are the same operation at different scales.
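As a rough illustration of how image size could drive that parameter, here's a sketch that picks the smallest grid whose tiles stay under a maximum side length. `chooseGridSize` and the pixel dimensions below are my own assumptions, not part of the demo.

```typescript
// Hypothetical helper: smallest N such that each tile in an N×N grid has a
// longest side of at most maxTileSidePx pixels.
function chooseGridSize(widthPx: number, heightPx: number, maxTileSidePx: number): number {
  const longestSide = Math.max(widthPx, heightPx);
  return Math.max(1, Math.ceil(longestSide / maxTileSidePx));
}

// Illustrative inputs only:
const demoGrid = chooseGridSize(3000, 2000, 1024);         // 3  ->   9 tiles
const satelliteGrid = chooseGridSize(3840, 2160, 512);     // 8  ->  64 tiles
const pathologyGrid = chooseGridSize(20000, 16000, 1024);  // 20 -> 400 tiles
```

The tile count is the square of the result, so doubling the longest side quadruples the number of inference calls.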

Concurrency cap matches your throughput

The maxConcurrency option controls how many tiles process simultaneously. Set it to 5 for a demo running against on-demand Bedrock. Set it to 50 for a production workload with provisioned throughput. Set it to 200 for a batch job with a high-throughput SageMaker endpoint. The durable runtime manages the fan-out and concurrency without you building a queue or a semaphore.
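For comparison, this is roughly the in-process plumbing you would otherwise write yourself: a worker-pool map with a concurrency cap and per-item outcomes. It's a sketch with hypothetical names (`mapWithConcurrency`, `Settled`), and it has no durability: if the process dies, every result is lost.

```typescript
// Per-item outcome, so one failure never hides the other results.
type Settled<T> =
  | { index: number; status: 'succeeded'; result: T }
  | { index: number; status: 'failed'; error: unknown };

// Runs fn over items with at most maxConcurrency calls in flight at once.
async function mapWithConcurrency<I, O>(
  items: readonly I[],
  maxConcurrency: number,
  fn: (item: I, index: number) => Promise<O>,
): Promise<Settled<O>[]> {
  const results = new Array<Settled<O>>(items.length);
  let next = 0; // shared cursor: each worker claims the next unprocessed index

  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { index, status: 'succeeded', result: await fn(items[index], index) };
      } catch (error) {
        results[index] = { index, status: 'failed', error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(maxConcurrency, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

This covers the cap and the partial-failure bookkeeping, but not retries, checkpoints, or recovery after a crash, which is the part the durable runtime adds.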

The 256 KB checkpoint limit enforces clean architecture

Durable function checkpoints have a 256 KB size limit per step result. This means you cannot pass image bytes through a checkpoint. They're too large. Each tile re-fetches the image from S3 independently.

At 9 tiles, this feels like an overhead you'd rather avoid. At 400 tiles, it's the only sane architecture. You want each tile to be a self-contained unit that reads its input, runs inference, and returns a small result object. The checkpoint limit enforces this discipline from day one.
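A cheap way to keep yourself honest is to measure a result before returning it from a step. The sketch below assumes the limit applies to the JSON-serialized result, which is my approximation rather than a documented guarantee. `fitsInCheckpoint` and `utf8ByteLength` are hypothetical helpers.

```typescript
// 256 KB, as stated above. How the runtime measures it is an assumption here.
const CHECKPOINT_LIMIT_BYTES = 256 * 1024;

// Counts UTF-8 bytes without relying on Buffer or TextEncoder.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const char of text) {
    const codePoint = char.codePointAt(0) ?? 0;
    bytes += codePoint < 0x80 ? 1 : codePoint < 0x800 ? 2 : codePoint < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// True when the serialized result is small enough to checkpoint.
function fitsInCheckpoint(result: unknown): boolean {
  const serialized = JSON.stringify(result) ?? '';
  return utf8ByteLength(serialized) <= CHECKPOINT_LIMIT_BYTES;
}

// A trimmed finding is tiny; a base64 image payload is not.
const smallFinding = { regionIndex: 4, analysis: 'a red truck near a loading dock', detectedObjects: ['truck'] };
const fakeImagePayload = { imageBase64: 'A'.repeat(300 * 1024) };
```

This is also why the pipeline code truncates `analysis` to 500 characters and caps `detectedObjects` at 8 entries: the per-tile result stays far below the limit no matter what the model returns.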

For higher tile counts, you can eliminate the per-tile S3 API calls entirely by mounting your image bucket with Amazon S3 Files. With S3 Files, the Lambda function reads the image directly from the local filesystem. No GetObject calls, no SDK overhead, no presigning. The image is a file path. At 9 tiles the difference is negligible. At 400 concurrent tiles each making a GetObject call, filesystem access becomes a meaningful optimization.

Partial failure at scale

At 9 tiles, one failure is an annoyance. You might tolerate restarting all 9. At 64 tiles, restarting all 64 because tile 47 hit a timeout is a waste of compute, time, and money. At 400 tiles, it's unacceptable. The mapResults object gives you fine-grained failure handling:

const successfulFindings = mapResults.succeeded()
  .map(item => item.result as RegionFinding);

if (mapResults.failureCount > 0) {
  mapResults.failed().forEach(item =>
    context.logger.error('Region failed', { index: item.index, error: String(item.error) })
  );
}


Successful tiles keep their checkpointed results. Failed tiles can be logged, retried independently, or excluded from the synthesis. The pipeline degrades gracefully rather than failing catastrophically.
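What "degrades gracefully" means in practice is a policy decision. One possible policy, sketched here with hypothetical names (`RegionOutcome`, `coverage`, `canSynthesize`) and an arbitrary 90% threshold, is to synthesize only when enough of the image was analyzed and to report which tiles are missing.

```typescript
// Hypothetical per-tile outcome, independent of the SDK's result type.
interface RegionOutcome {
  index: number;
  succeeded: boolean;
}

interface Coverage {
  missing: number[]; // indexes of tiles with no usable result
  ratio: number;     // fraction of tiles that succeeded, 0 when there are none
}

function coverage(outcomes: readonly RegionOutcome[]): Coverage {
  const missing = outcomes.filter(o => !o.succeeded).map(o => o.index);
  const ratio = outcomes.length === 0 ? 0 : (outcomes.length - missing.length) / outcomes.length;
  return { missing, ratio };
}

// Proceed to synthesis only if at least minCoverage of the tiles succeeded.
function canSynthesize(outcomes: readonly RegionOutcome[], minCoverage = 0.9): boolean {
  return coverage(outcomes).ratio >= minCoverage;
}
```

With 64 tiles and only tile 47 missing, coverage is 63/64 and synthesis goes ahead. With 3 of 9 tiles missing, coverage is about 0.67 and the pipeline can stop before spending a synthesis call on an incomplete picture.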

Model selection as a scaling lever

As tile count grows, cost per inference call matters more. With 9 tiles, using a capable (expensive) model for each tile is reasonable. With 400 tiles, you want the cheapest model that produces acceptable results for the per-tile work, and reserve the capable model for the single synthesis step. The orchestration code stays identical. You change a model ID parameter, not the pipeline structure.
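The arithmetic behind that is simple enough to sketch. The prices below are invented purely for illustration and are not real Bedrock pricing. They're expressed in millionths of a dollar so the math stays in integers.

```typescript
// Hypothetical per-call prices in micro-dollars (1,000,000 = $1).
interface PricingMicroUsd {
  perTileCall: number;
  synthesisCall: number;
}

// Total cost of one image: N tile calls plus a single synthesis call.
function pipelineCostMicroUsd(tileCount: number, pricing: PricingMicroUsd): number {
  return tileCount * pricing.perTileCall + pricing.synthesisCall;
}

// Made-up numbers: a "cheap" model at 200 and a "capable" model at 4,000 per call.
const capableEverywhere: PricingMicroUsd = { perTileCall: 4000, synthesisCall: 4000 };
const cheapTilesCapableSynthesis: PricingMicroUsd = { perTileCall: 200, synthesisCall: 4000 };

const allCapable400 = pipelineCostMicroUsd(400, capableEverywhere);     // 1,604,000
const mixed400 = pipelineCostMicroUsd(400, cheapTilesCapableSynthesis); //    84,000
```

Under these assumed prices, the mixed setup costs about a nineteenth of the all-capable one at 400 tiles. At 9 tiles the same comparison is 40,000 versus 5,800, a gap small enough in absolute terms that it may not matter.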

Real-time observability at scale

Every tile publishes its completion status through AWS AppSync Events:

await publish(ch, [{ type: 'region', index, status: 'done', finding }]);


At 9 tiles, this produces a satisfying progress indicator. Users watch regions light up on a dashboard as inference completes. At 64 tiles, real-time observability becomes essential rather than nice-to-have. Without per-tile status events, a 64-tile pipeline is a black box that either succeeds after two minutes or fails with no indication of where it stalled.

The dashboard in this demo subscribes to the pipeline's execution channel and renders results as they arrive. Each tile's bounding box detections overlay onto the original image in real time. At scale, this pattern gives operators visibility into pipeline health without polling: which tiles completed, which are in progress, which failed.
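On the receiving side, the dashboard can fold those events into a progress state. Here's a minimal sketch of such a reducer. The event shape follows the `publish` call shown earlier, but the `'failed'` status and the names `Progress`, `applyEvent`, and `pendingCount` are my assumptions. Tracking indexes in sets keeps the reducer idempotent in case an event is delivered more than once.

```typescript
// Event shape based on the publish call; 'failed' is an assumed extra status.
interface RegionEvent {
  type: 'region';
  index: number;
  status: 'done' | 'failed';
}

interface Progress {
  total: number;
  done: ReadonlySet<number>;
  failed: ReadonlySet<number>;
}

function initialProgress(total: number): Progress {
  return { total, done: new Set<number>(), failed: new Set<number>() };
}

// Pure reducer: returns a new Progress with the event applied.
function applyEvent(progress: Progress, event: RegionEvent): Progress {
  const done = new Set<number>(progress.done);
  const failed = new Set<number>(progress.failed);
  if (event.status === 'done') {
    done.add(event.index);
    failed.delete(event.index); // a retry that succeeds clears the failure
  } else if (!done.has(event.index)) {
    failed.add(event.index);
  }
  return { total: progress.total, done, failed };
}

function pendingCount(progress: Progress): number {
  return progress.total - progress.done.size - progress.failed.size;
}
```

Feeding each incoming WebSocket message through `applyEvent` gives the operator the three numbers that matter: completed, failed, and still pending.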

Get started

The complete source, including deploy instructions, frontend setup, and teardown, is available on GitHub: image-analysis-orchestration.

To experiment with scale, change the gridSize parameter when triggering the pipeline. Start with 3 (9 tiles). Try 5 (25 tiles). Push to 8 (64 tiles) and watch how the same code handles increased concurrency with checkpointed resilience.


Tiled inference is already your pattern. If you're working with images that don't fit in one model call (and at production resolution, most interesting images don't), you're already tiling, processing in parallel, and aggregating results. With durable functions, you get checkpointed, resilient orchestration for that pattern without building separate infrastructure. The context.map() call that handles 9 tiles handles 900. Your orchestration scales with your image.

This isn't a toy demo. It's the skeleton of production batch inference.