This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Static textbooks are a thing of the past. I built Gemminate, an intelligent, agentic learning platform that ingests raw textbook PDFs and instantly transforms them into interactive, personalized, and highly dynamic learning journeys.
Instead of forcing students to read hundreds of pages sequentially, Gemminate uses AI to act as a 1-on-1 tutor. When a user uploads a PDF, the platform:
- Extracts and Maps: Analyzes the Table of Contents and page contents to build a hierarchical, interactive "Learning Map." based on the learner's objectives.
- Generates Interactive UI: Seamlessly weaves standard textbook reading with AI-generated Quizzes, Flashcards, Feynman Technique exercises, and Video suggestions directly into the syllabus.
- Creates On-The-Fly Visualizations: Automatically generates highly interactive D3.js and Three.js HTML visualizations directly in the browser to explain complex topics mathematically and visually.
- Grades Handwritten Exams (Multimodal): Features a "Qualify" stage where students upload photos of their handwritten answers, which the AI reads, evaluates, and scores before unlocking the next chapter.
Built with FastAPI, LangGraph, FAISS (RAG), and Vanilla JS, Gemminate turns any dense PDF into an engaging, multimodal, and interactive classroom experience.
Demo
Live Application: gemminate.com
Video Walkthrough: [https://youtu.be/PnnDTs6KpNc]
Code
How I Used Gemma 4
Gemma 4 isn't just a chatbot in this project, it is the core reasoning engine driving the entire backend pipeline. I specifically chose the Gemma 4 26B Mixture-of-Experts (MoE) model (google/gemma-4-26b-a4b-it) via OpenRouter, as it perfectly bridged the gap between complex reasoning, rapid token throughput, and cost-efficiency.
Here is how Gemma 4 powers Gemminate's core features:
- Native Multimodal Vision for Handwritten Grading: One of Gemminate's standout features is the "Qualify" module. I utilized Gemma 4's native vision capabilities to evaluate photos of a user's handwritten physics/math answers. The model cross-references the student's handwriting against the textbook's context, scoring the test out of 10. I also used the vision model to analyze complex textbook diagrams and generate pedagogical meta-descriptions of the layout.
-
Complex Agentic Code Generation (D3.js):
When a user requests a visual explanation (e.g.,
@visual Magnetic Force), Gemma 4 is prompted to generate a complete, interactive HTML document featuring D3.js or Three.js code. The 26B MoE model's advanced coding capabilities allowed it to consistently output valid, bug-free declarative D3 selection code, complete with CSS and sliders for variable manipulation. - 128K Context Window for Chapter Mapping: To build the hierarchical "Learning Map," I pass massive arrays of page-by-page textbook summaries into Gemma 4. Leveraging its massive context window, the model successfully synthesizes these summaries into a unified JSON tree of chapters, subchapters, and dynamically inserted "learning node" injections (quizzes/flashcards) without losing track of the page numbering.
-
Structured Data & JSON adherence:
Gemminate heavily relies on LangGraph pipelines that require strict JSON outputs for Quiz, Flashcard, and Tree generation. Gemma 4 handled deep structural formatting flawlessly, successfully escaping LaTeX math notations (
\\alpha,\\[ \\]) inside JSON string values so the frontend MathJax could render them beautifully.
Gemma 4's 26B MoE model proved to be the ultimate workhorse: fast enough to stream dynamic D3 code to the frontend in seconds, and smart enough to accurately grade handwritten calculus.
@azanzijiomekong was very instrumental in ensuring the product lived up to expectation.




















