

How Databricks is turning video into searchable, actionable intelligence
Justin Monaldo · 2026-06-27 · via Databricks

A utility company deploys drones to inspect hundreds of miles of power lines. A police department pulls hours of traffic camera footage to investigate a hit-and-run accident. An urban planning team leverages camera footage to analyze pedestrian and traffic flow.

Terabytes of video data are generated every single day, and that footage can provide valuable insights into everything from operational efficiency to public safety. But almost none of it gets analyzed in any meaningful way, because combing through unstructured video data is massively time-consuming and expensive.

Imagine being able to simply apply natural language queries to video content at scale to not just find specific content—but analyze, assess, and learn from it.

Databricks can support exactly that. The approach? Treat video as a data engineering problem.

How did Databricks change the approach to video analysis?

The traditional approach to video analysis is to throw more and more human analysts at the problem. Advancements in deep learning, computer vision, and most recently vision language models (VLMs) have made it possible for computers to identify objects in videos with high accuracy. But scaling inference and orchestrating pipelines over huge quantities of unstructured data make these systems logistically difficult for organizations to build. This is especially true when applying VLMs to the problem. VLMs offer flexibility in prompting, because the model does not need to be pre-trained or fine-tuned on specific classes before use, but they are larger and slower than traditional object detection models, which presents scaling challenges.

In Databricks, you can focus on how video analysis using these models fits into data pipelines, instead of the complexities of model inference and infrastructure.

Users can search video footage instantly using VLMs and natural language.

How does Databricks process and analyze video at scale?

This approach can be demonstrated in a Databricks app deployed directly in a Databricks workspace. A user uploads a video or points to one already stored in a Databricks Volume, enters a natural language prompt describing what they're looking for (e.g. white box trucks, security guards, solar panels), and kicks off the processing pipeline with a single click.

From there, Databricks Serverless GPU Compute (SGC) takes over. A Lakeflow job is triggered, which grabs pre-warmed GPUs and starts processing the video through Meta's SAM3 segmentation model within seconds. The model identifies objects of interest matching the prompt in each frame of the video. The video is truncated down to only those moments and rewritten into another Databricks Volume. For example, a 26-minute traffic camera video was reduced to one minute and 55 seconds of relevant footage, with original timestamps preserved so reviewers can jump back to the source if needed. Each truncated clip is then passed to a foundation model via the Databricks Foundation Model API (FMAPI) for AI-generated summarization, producing text that can be written to a table or passed to additional downstream processes.
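The truncation step can be pictured as turning per-frame detection results into a short list of time ranges in the original video. The sketch below is an illustration only, not the code from the Databricks repo: the names `Segment`, `frames_to_segments`, `max_gap_s`, and `padding_s` are hypothetical, and it assumes the detector has already produced one boolean per frame.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Segment:
    """A span of the source video, in seconds from the start of the original file."""
    start_s: float
    end_s: float


def frames_to_segments(
    frame_hits: Sequence[bool],
    fps: float,
    max_gap_s: float = 1.0,
    padding_s: float = 0.5,
) -> List[Segment]:
    """Collapse per-frame detection flags into padded, merged time segments.

    Assumption: frame_hits[i] is True when the prompt matched frame i.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    duration_s = len(frame_hits) / fps
    segments: List[Segment] = []
    for index, hit in enumerate(frame_hits):
        if not hit:
            continue
        # Pad each matching frame, clamped to the bounds of the source video.
        start = max(0.0, index / fps - padding_s)
        end = min(duration_s, (index + 1) / fps + padding_s)
        if segments and start - segments[-1].end_s <= max_gap_s:
            # Close enough to the previous segment: extend it instead of starting a new clip.
            previous = segments[-1]
            segments[-1] = Segment(previous.start_s, max(previous.end_s, end))
        else:
            segments.append(Segment(start, end))
    return segments


def kept_seconds(segments: Sequence[Segment]) -> float:
    """Total duration of footage that survives truncation."""
    return sum(s.end_s - s.start_s for s in segments)
```

Because every segment stores offsets into the source file, a reviewer can always seek back to the original footage. For scale, the 26-minute example above (1,560 seconds) reduced to 1 minute 55 seconds (115 seconds) keeps roughly 7% of the video.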

Because this entire process is treated as a data engineering problem, the pipeline is explicitly model-agnostic, leveraging MLflow so that users can choose the model they prefer, or even bring new or fine-tuned models to the workflow. MLflow model signatures standardize the model inputs and outputs to ensure continuity and flexibility. Any model that you download from Hugging Face or train from scratch can be used in this pipeline. SAM3 can be swapped for YOLO models, other transformer-based vision models, or fine-tuned domain-specific models.
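The idea of swapping models behind a fixed input/output contract can be sketched in plain Python. This sketch does not use MLflow: in the pipeline described above, MLflow model signatures play the role of the contract check. The names `FrameDetector`, `run_detector`, and the two toy detectors are hypothetical stand-ins, not real models.

```python
from typing import List, Protocol, Sequence


class FrameDetector(Protocol):
    """Hypothetical contract: return one boolean per input frame for a text prompt."""

    def detect(self, frames: Sequence[bytes], prompt: str) -> List[bool]:
        ...


class EveryNthFrameDetector:
    """Toy stand-in for a real model: flags every n-th frame and ignores the prompt."""

    def __init__(self, n: int) -> None:
        self.n = n

    def detect(self, frames: Sequence[bytes], prompt: str) -> List[bool]:
        return [i % self.n == 0 for i in range(len(frames))]


class NonEmptyFrameDetector:
    """Second toy stand-in: flags frames whose payload is non-empty."""

    def detect(self, frames: Sequence[bytes], prompt: str) -> List[bool]:
        return [len(frame) > 0 for frame in frames]


def run_detector(detector: FrameDetector, frames: Sequence[bytes], prompt: str) -> List[bool]:
    """Run any detector, then verify its output against the shared contract."""
    hits = detector.detect(frames, prompt)
    if len(hits) != len(frames) or not all(isinstance(h, bool) for h in hits):
        raise TypeError("detector must return exactly one bool per input frame")
    return hits
```

Downstream steps depend only on `run_detector`, so replacing one detector with another requires no changes to the rest of the pipeline, as long as the new model honors the same contract.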

That flexibility extends to the summarization and anomaly detection layer too. Any multimodal foundation model or smaller image captioning model can be used to convert frame contents into text descriptions. These descriptions can feed text-based AI workflows that summarize video for analyst review, or that identify unexpected content and flag video segments for review. Making models interchangeable without breaking the pipeline makes this example extensible to almost any video processing use case.

Because serverless GPU compute is preconfigured to work with popular NVIDIA GPUs and deep learning frameworks, it’s just a matter of writing your data engineering code. You don’t have to worry about GPU compute capacity or Python package version compatibility with CUDA.

How does the pipeline handle video at scale?

The app-triggered workflow is just one way to interact with the pipeline. The same pipeline can run as a file- or event-driven process: when a video lands in a Databricks Volume, it automatically triggers the Lakeflow job to produce the truncated output and text-based analysis without any human intervention. Downstream, that text can then trigger alerts, route to reviewers, or feed into additional AI processing.

Databricks generates a truncated video and AI-powered summary, surfacing only the most relevant moments for fast or automated review.

Concurrency is handled through a simple configuration. You can dump 20 videos in at once and it will kick off 20 runs of that same job at the same time. Each run grabs its own serverless GPU compute independently, scaling horizontally as needed, and releases resources when done. No cluster management required, and no paying for GPUs when they’re not in use.
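As a rough illustration of what such a configuration might look like, the sketch below builds a job settings payload with a file-arrival trigger and a concurrency limit. The field names (`max_concurrent_runs`, `trigger.file_arrival.url`, `notebook_task`) follow my understanding of the Databricks Jobs API and should be checked against the current API reference. The job name, Volume path, and notebook path are hypothetical, the serverless GPU compute settings are deliberately omitted, and the source does not say how the actual pipeline fans out one run per video.

```python
import json
from typing import Any, Dict


def build_job_settings(
    volume_url: str,
    notebook_path: str,
    max_concurrent_runs: int = 20,
) -> Dict[str, Any]:
    """Build a job settings payload (assumed shape; verify against the Jobs API docs)."""
    if max_concurrent_runs < 1:
        raise ValueError("max_concurrent_runs must be at least 1")
    return {
        "name": "video-intelligence-pipeline",  # hypothetical job name
        # Upper bound on how many runs of this job may execute at the same time.
        "max_concurrent_runs": max_concurrent_runs,
        "trigger": {
            "pause_status": "UNPAUSED",
            # Start the job when new files arrive under this location.
            "file_arrival": {"url": volume_url},
        },
        "tasks": [
            {
                "task_key": "process_video",
                "notebook_task": {"notebook_path": notebook_path},
                # Compute configuration intentionally left out of this sketch.
            }
        ],
    }


if __name__ == "__main__":
    settings = build_job_settings(
        volume_url="/Volumes/main/video/incoming/",  # hypothetical Volume path
        notebook_path="/Workspace/pipelines/process_video",  # hypothetical notebook
    )
    print(json.dumps(settings, indent=2))
```

Keeping the limit as a single parameter means raising or lowering parallelism is a one-line change rather than a cluster resizing exercise.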

Where can video intelligence be applied?

This app and pipeline are a starting point. Once deployed to a Databricks workspace, the underlying architecture supports any scenario where large volumes of video need to be processed, searched, or summarized. This includes infrastructure inspection, physical security, public safety, airport operations, and more. The GitHub repo containing the app and pipeline code is publicly available for teams who want to deploy it, extend it, or adapt it to their own use cases.

Databricks orchestrates an end-to-end video intelligence pipeline that ingests, processes and analyzes video at scale to deliver searchable insights in minutes.

Build your video intelligence pipeline on Databricks today

See how your agency can process, summarize and search massive volumes of video without complex ML workflows. Explore Databricks for Public Sector and connect with our public sector team.