This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Gemma 4 CAD Orchestrator is a cloud-native, AI-powered parametric CAD application that lets engineers describe mechanical parts in plain English and instantly see them rendered in 2D blueprint view and 3D interactive viewport — powered by Google DeepMind's Gemma 4 26B A4B IT via Vertex AI Model Garden.
Demo
https://gemma4-cad-orchestrator-176775177828.us-central1.run.app/
Code
How I Used Gemma 4
How I Built This
1. The Idea
Mechanical engineers spend hours in legacy CAD suites manually modeling parts that already exist entirely in their heads. I wanted to eliminate this friction by creating a deterministic, zero-install, text-to-geometry pipeline. By simply describing a part in natural language—"a mounting plate 100mm long, 60mm wide, 8mm thick with 10mm holes"—the AI instantly generates a fully interactive 3D model, synchronized 2D engineering blueprints, and an editable parametric feature tree.
2. Choosing the AI Backend
The foundational challenge was identifying an LLM capable of reliably parsing technical engineering intent into rigid, structured JSON schemas without hallucinating geometries. To maximize resilience, I built the system model-agnostic from day one with a pluggable backend interface that dynamically scans and cascades through available endpoints:
- Ollama (Local workstation fallback via `gemma4:31b` or `gemma4:26b`)
- Vertex AI Gemma 4 MaaS (Production-grade, global managed infrastructure)
- Hugging Face Inference API (Alternative serverless endpoint)
- Google Gemini API (High-availability operational fallback)
The Express backend detects which endpoints are reachable and falls back to the next one in the stack when a call fails.
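
One way to sketch this cascade, assuming a hypothetical `CadBackend` interface. The interface, `generateWithFallback`, and `stubBackend` are illustrative names rather than the project's actual code. Backends are tried in array order, and any that report unavailable or throw are skipped:

```typescript
// Hypothetical backend contract; names are illustrative, not the project's real API.
interface CadBackend {
  name: string;
  isAvailable(): Promise<boolean>;
  generate(prompt: string): Promise<string>;
}

interface CascadeResult {
  backend: string; // which backend produced the answer
  text: string;    // raw model output
}

// Try each backend in priority order; skip ones that are unavailable or that throw.
async function generateWithFallback(
  backends: CadBackend[],
  prompt: string,
): Promise<CascadeResult> {
  const failures: string[] = [];
  for (const backend of backends) {
    try {
      if (!(await backend.isAvailable())) {
        failures.push(`${backend.name}: unavailable`);
        continue;
      }
      const text = await backend.generate(prompt);
      return { backend: backend.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${backend.name}: ${reason}`);
    }
  }
  throw new Error(`All backends failed (${failures.join("; ")})`);
}

// Stub factory so the sketch runs without network access.
// Omitting `reply` makes the stub throw during generation.
function stubBackend(name: string, available: boolean, reply?: string): CadBackend {
  return {
    name,
    isAvailable: async () => available,
    generate: async () => {
      if (reply === undefined) throw new Error("generation failed");
      return reply;
    },
  };
}
```

With real adapters for Ollama, Vertex AI, Hugging Face, and Gemini in the array, the array order would define the fallback priority.
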
3. System Architecture
 User's Browser                    Cloud Run                       Vertex AI
┌──────────────┐     HTTP     ┌──────────────────┐    gRPC    ┌──────────────────┐
│ Three.js 3D  │◄────────────►│ Express Server   │◄──────────►│ Gemma 4 26B      │
│ Canvas 2D    │    /api/*    │ @google/genai    │    MaaS    │ (global)         │
│ Copilot UI   │              │ System Prompt    │            │                  │
└──────────────┘              │ JSON parser      │            └──────────────────┘
                              └──────────────────┘
                                       │
                                  ┌────▼────┐
                                 │google.json│
4. Operational Pipeline
- Intent Capture: The user defines a component concept inside the terminal-style Copilot UI.
- Prompt Contextualization: The middleware intercepts the request and injects a strict system prompt that constrains the output to 9 geometric primitives, explicit parametric ranges, and few-shot examples of valid request/response pairs.
- Inference Execution: Gemma 4 uses its high-speed Mixture-of-Experts (MoE) routing matrix to generate a strictly typed JSON payload holding `shapeType`, `params`, a descriptive `explanation`, a sequential `featureTree`, and raw native coordinates.
- Client Translation: The frontend parses the schema. Three.js spins up the WebGL interactive viewport using responsive `OrbitControls`, while an HTML5 Canvas programmatically renders the 2D orthographic blueprints with accurate technical dimensions, centerlines, and hidden layer calculations.
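
The client-side parsing step could look roughly like the sketch below. The field names `shapeType`, `params`, `explanation`, and `featureTree` come from the pipeline above; the value types (numeric `params`, a `featureTree` of strings) are assumptions, and the raw coordinates field is left out because its name is not given. `CadPayload`, `isCadPayload`, and `parseCadPayload` are hypothetical names.

```typescript
// Assumed payload shape: field names are from the post, value types are guesses.
interface CadPayload {
  shapeType: string;
  params: Record<string, number>;
  explanation: string;
  featureTree: string[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Type guard: checks structure only, not whether the dimensions make engineering sense.
function isCadPayload(value: unknown): value is CadPayload {
  if (!isRecord(value)) return false;
  const { shapeType, params, explanation, featureTree } = value;
  return (
    typeof shapeType === "string" &&
    isRecord(params) &&
    Object.values(params).every((v) => typeof v === "number" && Number.isFinite(v)) &&
    typeof explanation === "string" &&
    Array.isArray(featureTree) &&
    featureTree.every((step) => typeof step === "string")
  );
}

// Parse raw model text; return null instead of throwing so the UI can ask for a retry.
function parseCadPayload(raw: string): CadPayload | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isCadPayload(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

Returning `null` rather than throwing keeps malformed model output from crashing the viewport code; the renderer only ever sees a payload that passed the guard.
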
5. Key Technical Decisions
| Decision | Tactical Benefit |
|---|---|
| Serverless Cloud Run | Zero cluster overhead, auto-scaling execution to 0, isolated 512MB RAM footprints—ideal for hyper-efficient, cost-conscious microservice hosting. |
| Pure Client-Side Renders | Leveraged vanilla HTML5 Canvas and native Three.js via CDN. Zero heavy server-side image processing, reducing latency to near real-time. |
| MoE Routing Constraints | Utilizing the 26B A4B variant routes work to 3.8B active parameters per token, slashing generation times while holding high-level structural intelligence. |
| Unified `@google/genai` Integration | Erased SDK technical debt by utilizing the new standardized Google SDK, establishing clean async patterns via `ai.models.generateContent()`. |
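
A minimal sketch of the `ai.models.generateContent()` call pattern named in the table. To keep it runnable without credentials, the function takes any object with that method shape; a client built with `new GoogleGenAI({ vertexai: true, project, location: "global" })` from `@google/genai` should fit, though that wiring is an assumption here. `GenAiLike`, `requestCadJson`, the model ID argument, and `temperature: 0` are illustrative choices, not taken from the project.

```typescript
// Minimal structural type for the single SDK call used below.
// Assumption: the real @google/genai client is compatible with this shape.
interface GenAiLike {
  models: {
    generateContent(request: {
      model: string;
      contents: string;
      config?: { systemInstruction?: string; temperature?: number };
    }): Promise<{ text?: string }>;
  };
}

// Send the user's description plus the system prompt; return the raw model text.
async function requestCadJson(
  ai: GenAiLike,
  modelId: string,      // caller supplies the Model Garden ID; none is hard-coded here
  systemPrompt: string,
  userPrompt: string,
): Promise<string> {
  const response = await ai.models.generateContent({
    model: modelId,
    contents: userPrompt,
    // temperature 0 is an assumed setting to favor repeatable output
    config: { systemInstruction: systemPrompt, temperature: 0 },
  });
  if (!response.text) {
    throw new Error("Model returned no text");
  }
  return response.text;
}
```

Injecting the client also makes the route handler easy to test with a stub, which matters when the SDK surface changes mid-project.
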
6. Technical Engineering Challenges Overcome
- API Paradigm Shifts: Rebuilding routing structures during mid-hackathon SDK shifts required migrating from traditional model instantiations (`getGenerativeModel`) to direct top-level unified client calls.
- Undocumented Endpoint Properties: Debugging the `global` routing requirement for Model Garden MaaS instances required checking runtime variables, as generic regional project variables consistently failed network handshakes.
- JSON Conformance Over Enforcement: Standard models often fail to follow JSON output rules when technical jargon increases. This was solved by modifying the prompt structure to split thinking blocks from payload data blocks using distinctive tag sets.
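
The tag-splitting idea from the last item can be sketched as follows. The post does not say which tags the prompt uses, so the `json` tag name and the `extractPayloadBlock` helper are hypothetical. The function assumes a plain alphanumeric tag name and returns the contents of the last tagged block, on the reasoning that earlier text may quote an example payload.

```typescript
// Hypothetical tag name; the actual tags used by the prompt are not stated in the post.
const PAYLOAD_TAG = "json";

// Return the trimmed text inside the last <tag>...</tag> block, or null if none exists.
// Assumes `tag` is a plain alphanumeric name (no regex metacharacters).
function extractPayloadBlock(modelText: string, tag: string = PAYLOAD_TAG): string | null {
  const pattern = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`, "g");
  let last: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(modelText)) !== null) {
    last = (match[1] ?? "").trim();
  }
  return last;
}

// Example reply with a reasoning block followed by the payload block.
const sampleReply =
  "<think>User wants a cube with 20 mm sides.</think>\n" +
  '<json>{"shapeType":"cube"}</json>';
const payloadText = extractPayloadBlock(sampleReply); // '{"shapeType":"cube"}'
```

Only the extracted block is handed to `JSON.parse`, so free-form reasoning text never reaches the parser.
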
7. Core Parametric Library
| Primitive | Param 1 | Param 2 | Param 3 | Param 4 |
|---|---|---|---|---|
| plate | length | width | thickness | holeDia |
| bracket | leg1 | leg2 | width | thickness |
| spacer | length | outer | bore | — |
| block | length | width | height | bore |
| cube | length | width | height | — |
| cylinder | radius | height | — | — |
| sphere | radius | — | — | — |
| cone | radius1 | radius2 | height | — |
| torus | ringRadius | tubeRadius | — | — |
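
The table above maps naturally onto a typed registry. The sketch below assumes every listed parameter is required, which the post does not confirm (a bore or hole diameter may well be optional); `PRIMITIVES`, `isShapeType`, and `missingParams` are hypothetical names.

```typescript
// Parameter names per primitive, copied from the table above.
const PRIMITIVES = {
  plate: ["length", "width", "thickness", "holeDia"],
  bracket: ["leg1", "leg2", "width", "thickness"],
  spacer: ["length", "outer", "bore"],
  block: ["length", "width", "height", "bore"],
  cube: ["length", "width", "height"],
  cylinder: ["radius", "height"],
  sphere: ["radius"],
  cone: ["radius1", "radius2", "height"],
  torus: ["ringRadius", "tubeRadius"],
} as const;

type ShapeType = keyof typeof PRIMITIVES;

function isShapeType(name: string): name is ShapeType {
  return Object.prototype.hasOwnProperty.call(PRIMITIVES, name);
}

// Return the parameter names that are absent or non-numeric,
// or null when the shape itself is not one of the nine primitives.
// Assumption: all listed parameters are treated as required.
function missingParams(
  shapeType: string,
  params: Record<string, number>,
): string[] | null {
  if (!isShapeType(shapeType)) return null;
  const missing: string[] = [];
  for (const key of PRIMITIVES[shapeType]) {
    const value = params[key];
    if (typeof value !== "number" || !Number.isFinite(value)) {
      missing.push(key);
    }
  }
  return missing;
}
```

A check like this can run before rendering, so a payload that names a valid primitive but omits a dimension is caught before it reaches Three.js.
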
8. What's Next
- Multimodal Blueprint Auditing: Gemma 4 natively supports image and video inputs. The immediate next phase is letting users snap a photo of a rough whiteboard sketch and pass it directly to Gemma to reconstruct it as a CAD model.
- Vector & Asset Export: Implement programmatic transformation modules to export geometries directly into industrial formats like STEP, DXF, and STL.
- Edge Modifications: Introduce dynamic modifier hooks to compute automated Chamfer, Fillet, and Shell algorithms across active objects.