Dataform Deep Dive 1: Exploring the API
Ben Watson · 2026-05-24 · via DEV Community


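Series overview

This blog series is for Dataform users who want to explore beyond Dataform's core functionality. Among other things, we will dig into Dataform's powerful API, build automated CI/CD pipelines in GitHub Actions, build a pipeline for monitoring costs, and modify config {} blocks through code.

Dataform's API is the natural starting point for exploring Dataform's more powerful and less documented features. Understanding the API lets users treat Dataform as the central hub for automating every part of a modern data platform.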

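Key concepts of the Dataform API

The API has two objects that are the basic building blocks of any workflow:

  1. CompilationResult - the compiled state of a Dataform workspace at a specific point in time (the result of the build process).
  2. WorkflowInvocation - a single execution of a CompilationResult (the result of the run process).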

CompilationResult

from google.cloud import dataform_v1

client = dataform_v1.DataformClient()

project_id = "my-project"
region = "europe-west2"
repository_id = "analytics"
workspace_id = "dev"

workspace = client.workspace_path(
    project_id,
    region,
    repository_id,
    workspace_id,
)

repository = client.repository_path(
    project_id,
    region,
    repository_id,
)

compilation_result = dataform_v1.CompilationResult(
    workspace=workspace
)

response = client.create_compilation_result(
    parent=repository,
    compilation_result=compilation_result,
)

print(response.name)


The printed name takes the form projects/<project_id>/locations/<region>/repositories/<repository_id>/compilationResults/<compilation_result_id> (where <compilation_result_id> is a UUID v4).
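The name format above can be split back into its parts when a script needs, say, only the repository or the compilation ID. The following is a minimal sketch using only the standard library; the helper name `parse_compilation_result_name` is hypothetical and is not part of the Dataform client.

```python
import re
import uuid

# Pattern for the resource name format described above.
_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/repositories/(?P<repository>[^/]+)"
    r"/compilationResults/(?P<compilation_result>[^/]+)$"
)


def parse_compilation_result_name(name: str) -> dict[str, str]:
    """Split a CompilationResult resource name into its components.

    Hypothetical helper for illustration only.
    """
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a CompilationResult name: {name!r}")
    parts = match.groupdict()
    # Raises ValueError if the final segment is not a valid UUID string.
    uuid.UUID(parts["compilation_result"])
    return parts


# The UUID below is made up for the example.
example = (
    "projects/my-project/locations/europe-west2/repositories/analytics"
    "/compilationResults/123e4567-e89b-42d3-a456-426614174000"
)
print(parse_compilation_result_name(example)["repository"])
```

The check only confirms that the last segment is a well-formed UUID; it does not enforce the version.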

Whenever it detects a code change, the Dataform UI generates a CompilationResult object, although the UI hides its ID. After an execution, the ID can be seen on the UI's Executions page.

Dataform has generated a `CompilationResult`

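A CompilationResult contains information about every action in the Dataform workspace, which lets users walk the DAG in code and extract details such as each file's config {} block. Each action can then be looped over as follows: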

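# COMPILATION_RESULT is the resource name printed by the previous snippet.
request = dataform_v1.QueryCompilationResultActionsRequest(
    name=COMPILATION_RESULT
)
response = client.query_compilation_result_actions(
    request=request
)
# The call returns a pager; iterating it directly fetches every page.
for action in response:
    ...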


WorkflowInvocation

from google.cloud import dataform_v1

client = dataform_v1.DataformClient()

project_id = "my-project"
region = "europe-west2"
repository_id = "analytics"

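repository = client.repository_path(
    project_id, region, repository_id
)

# compilation_result_name is the `response.name` value returned by
# create_compilation_result in the previous snippet. The request object
# built there has no name until the API assigns one.
invocation = dataform_v1.WorkflowInvocation(
    compilation_result=compilation_result_name
)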

response = client.create_workflow_invocation(
    parent=repository,
    workflow_invocation=invocation,
)

print(f'{response.name}: {response.state}')
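The state printed right after creation will usually still be in progress, so a script that needs the outcome has to poll. The sketch below shows one possible polling loop. It takes a `get_state` callable so it can run without the client, and the state names in `TERMINAL_STATES` are my understanding of the `WorkflowInvocation.State` enum, so treat them as an assumption to verify.

```python
import time
from typing import Callable

# Assumed terminal state names; verify against the API reference.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_terminal_state(
    get_state: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state until it returns a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state} after {timeout_seconds}s")
        sleep(poll_seconds)


# Simulated sequence of states, standing in for repeated API lookups.
states = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
print(wait_for_terminal_state(lambda: next(states), sleep=lambda _: None))
```

With the real client, `get_state` could be something like `lambda: client.get_workflow_invocation(name=response.name).state.name`.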


A WorkflowInvocation is created in the Dataform UI every time an execution is run:

A Dataform execution in the UI

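A WorkflowInvocation contains information about each executed action, which lets users tie actions to their BigQuery job IDs and see the objects that were created (for example, tables or views). Each executed action can be iterated over as follows: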

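# WORKFLOW_INVOCATION is the resource name printed by the previous snippet.
request = dataform_v1.QueryWorkflowInvocationActionsRequest(
    name=WORKFLOW_INVOCATION
)
response = client.query_workflow_invocation_actions(
    request=request
)
# The call returns a pager; iterating it directly fetches every page.
for action in response:
    ...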


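Tying them together

The Dataform API really shows its power when these two objects are combined. A typical flow looks like this:

  • The workspace is updated (manually or through Git).
  • A CompilationResult is generated from the workspace.
  • A WorkflowInvocation is created from that compilation.
  • BigQuery executes the resulting DAG.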

Mermaid architecture diagram showing the relationship between `CompilationResult` and `WorkflowInvocation`

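These two objects alone are enough to automate building and running Dataform DAGs outside the Dataform UI. This makes CI/CD pipelines straightforward, which we will explore in a later post.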
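The compile-then-invoke flow can be wrapped in a single function, sketched below. The function name `compile_and_invoke` is hypothetical. It expects `client` to behave like the `dataform_v1.DataformClient` used earlier and passes plain dicts in place of message classes, which I understand the generated client accepts; swap in `dataform_v1.CompilationResult(...)` and `dataform_v1.WorkflowInvocation(...)` if that assumption does not hold.

```python
def compile_and_invoke(client, repository: str, workspace: str) -> str:
    """Compile a workspace, start a run from it, and return the run's name.

    `client` is assumed to expose create_compilation_result and
    create_workflow_invocation with the keyword arguments used earlier.
    """
    compiled = client.create_compilation_result(
        parent=repository,
        compilation_result={"workspace": workspace},
    )
    # Use the name assigned by the API, not the request object.
    invocation = client.create_workflow_invocation(
        parent=repository,
        workflow_invocation={"compilation_result": compiled.name},
    )
    return invocation.name
```

Because the client is passed in, the same function can be exercised in tests with a stub object and used from a CI job with a real client.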