This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
OpenShift Virtualization Migration Advisor — a local-first assessment tool that ingests legacy hypervisor configurations (VMware .vmx, libvirt domain XML, OVF, RHV/oVirt exports) and produces a structured migration report for moving workloads to Red Hat OpenShift Virtualization.
The problem it solves is specific and unglamorous: enterprises consolidating off vSphere and legacy KVM have a discovery bottleneck. Their VM inventories live in config files that contain infrastructure secrets — storage paths, VLAN topology, encryption key references, FIPS posture, licence keys. Sending those to a hosted LLM is a non-starter for regulated workloads.
So I built the assessment to run entirely on the host machine. Paste a config or upload an inventory → get a six-section migration report covering inventory mapping, OpenShift Virt primitive equivalents (VirtualMachine, DataVolume, NetworkAttachmentDefinition, StorageClass), compatibility risk flags, MTV-vs-virt-v2v tooling recommendation, effort sizing, and security posture preservation. Nothing leaves the box.
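To make the ingest step concrete, here is a minimal sketch of how a .vmx file could be reduced to an inventory row before any model is involved. It assumes the usual `key = "value"` line format of .vmx files. The `summarize` schema and the `SAMPLE` text are hypothetical illustrations, not the repository's parser or its sample file.

```python
import re

# Matches lines of the form: key = "value"
_VMX_LINE = re.compile(r'^\s*([^#\s=][^=]*?)\s*=\s*"(.*)"\s*$')


def parse_vmx(text: str) -> dict[str, str]:
    """Parse .vmx text into a flat dict of lower-cased keys to raw string values."""
    settings = {}
    for line in text.splitlines():
        match = _VMX_LINE.match(line)
        if match:  # blank lines, comments, and malformed lines are skipped
            settings[match.group(1).lower()] = match.group(2)
    return settings


def summarize(settings: dict[str, str]) -> dict:
    """Reduce raw settings to inventory fields (hypothetical schema)."""
    vcpus = int(settings.get("numvcpus", "1"))
    cores = int(settings.get("cpuid.corespersocket", "1"))
    return {
        "name": settings.get("displayname", "unknown"),
        "vcpu": vcpus,
        "sockets": vcpus // cores,
        "cores_per_socket": cores,
        "memory_mib": int(settings.get("memsize", "0")),
        "guest_os": settings.get("guestos", "unknown"),
        "firmware": settings.get("firmware", "bios"),
        "disk_controller": settings.get("scsi0.virtualdev", "unknown"),
        "nic_model": settings.get("ethernet0.virtualdev", "unknown"),
    }


# Illustrative input only; it mirrors the values in the report below,
# not the actual contents of the repository's sample file.
SAMPLE = '''
displayName = "web-tier-01"
numvcpus = "4"
cpuid.coresPerSocket = "2"
memsize = "8192"
guestOS = "rhel9-64"
firmware = "efi"
scsi0.virtualDev = "pvscsi"
ethernet0.virtualDev = "vmxnet3"
'''

if __name__ == "__main__":
    print(summarize(parse_vmx(SAMPLE)))
```

Extracting these fields deterministically first means the model only has to reason about the mapping and the risks, not re-derive basic facts from raw text.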
Demo
The tool generated this report from a single VMware .vmx file, fully on-device using Gemma 4 26B MoE via Ollama. Screenshots below; full generated output included so judges can see the actual model output, not a curated highlight.

Local Gradio UI. Pick a sample → click Generate → report streams in from local Gemma 4.

Six-section migration report streamed from Gemma 4 26B MoE running locally. No cloud API calls.
Full Report (generated locally)
From samples/vmware-web-tier.vmx — 311.7s on a 16 GB-VRAM laptop with 85% CPU offload
> Source detected: VMware vSphere (.vmx)
> Model: gemma4:26b
> Generated locally in: 311.7s on a consumer laptop GPU + CPU offload
1. Inventory Summary
| Name | vCPU | Memory | Disk(s) | Network(s) | Guest OS | Notes |
|---|---|---|---|---|---|---|
| web-tier-01 | 4 (2 sockets × 2 cores) | 8192 MiB | web-tier-01.vmdk | VLAN-100-DMZ | rhel9-64 | Production Nginx/PHP-FPM; requires UEFI Secure Boot; vSphere encrypted. |
2. OpenShift Virtualization Equivalents
| Resource | OpenShift Virtualization Primitive | Configuration Details |
|---|---|---|
| Compute | VirtualMachine | spec.template.spec.domain.cpu: 2 sockets, 2 cores; spec.template.spec.domain.resources.requests.memory: 8Gi |
| Firmware | VirtualMachine | spec.template.spec.domain.firmware: efi (Secure Boot enabled) |
| Storage | DataVolume | storageClassName: [not specified in source]; requires conversion from pvscsi to virtio-scsi |
| Network | NetworkAttachmentDefinition | Target VLAN-100-DMZ via Multus CNI |
3. Compatibility & Risk Flags
- Critical - SCSI Controller Mismatch: Source uses pvscsi. OpenShift Virt requires virtio-scsi. Driver injection or kernel reconfiguration may be necessary if the RHEL 9 initramfs lacks virtio drivers (unlikely for RHEL 9).
- Critical - Encryption / KMS Breakage: Source uses vSphere-specific KMS (vsphere-kms-key-001). This encryption cannot be migrated. Data must be decrypted during migration or re-encrypted using OpenShift-native methods (encrypted PVCs).
- High - Network Driver Mismatch: vmxnet3 must be replaced with virtio for optimal performance and compatibility.
- High - UEFI Secure Boot: OpenShift Virt supports UEFI; the migration pipeline must ensure the VirtualMachine object is correctly configured with firmware: efi and a bootloader compatible with the new virtual hardware.
- Medium - CD-ROM dependency: The rhel-9.4.iso is attached. Must be manually re-attached via a ConfigMap or DataVolume if required for post-migration configuration.
4. Migration Path
web-tier-01: MTV (Migration Toolkit for Virtualization).
Justification: MTV is the recommended path for vSphere-to-OpenShift migrations — it automates the conversion of VMDK to DataVolume and handles the mapping of vSphere networks to NetworkAttachmentDefinition.
5. Effort & TCO Notes
- Effort (web-tier-01): S. Justification: Single, modern Linux guest (RHEL 9) with standard hardware requirements and no complex PCI passthrough.
- TCO / Licensing:
  - Requires OpenShift Container Platform (OCP) subscription
  - RHEL guest licensing applies (standard RHEL / RHEL for Virtual Datacenters)
  - Consolidation: If other web-tier nodes exist in the source, migrate them in a single MTV plan to reduce migration window and operational overhead.
6. Security & Compliance
| Source Security Feature | OpenShift Virtualization Equivalent |
|---|---|
| vSphere Encryption (KMS) | Encrypted PVCs (via StorageClass / CSI) or dm-crypt within the Guest OS |
| VLAN-100-DMZ Isolation | NetworkAttachmentDefinition + NetworkPolicy for micro-segmentation |
| UEFI Secure Boot | VirtualMachine spec firmware: efi with Secure Boot enabled |
| Production Workload Isolation | Namespace-level isolation in OpenShift |
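For readers who want to see where the report's mappings end up, the following is a rough sketch of a KubeVirt VirtualMachine manifest built from the values above. It is an illustration under stated assumptions, not output from the tool or from MTV: the NetworkAttachmentDefinition name `vlan-100-dmz`, the DataVolume name, and the `build_vm_manifest` helper are all hypothetical. KubeVirt expects SMM to be enabled when Secure Boot is on, so the sketch sets both.

```python
import json


def build_vm_manifest(name, sockets, cores, memory, network_name):
    """Return a KubeVirt VirtualMachine manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {"name": name},
        "spec": {
            "runStrategy": "Halted",  # leave the VM stopped until it has been reviewed
            "template": {
                "spec": {
                    "domain": {
                        "cpu": {"sockets": sockets, "cores": cores, "threads": 1},
                        "resources": {"requests": {"memory": memory}},
                        # UEFI with Secure Boot; SMM must be enabled alongside it
                        "firmware": {"bootloader": {"efi": {"secureBoot": True}}},
                        "features": {"smm": {"enabled": True}},
                        "devices": {
                            # bus "scsi" is served by a virtio-scsi controller in KubeVirt
                            "disks": [{"name": "rootdisk", "disk": {"bus": "scsi"}}],
                            "interfaces": [
                                {"name": "dmz", "bridge": {}, "model": "virtio"}
                            ],
                        },
                    },
                    # Secondary network attached through Multus
                    "networks": [
                        {"name": "dmz", "multus": {"networkName": network_name}}
                    ],
                    "volumes": [
                        {"name": "rootdisk", "dataVolume": {"name": f"{name}-rootdisk"}}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_vm_manifest("web-tier-01", 2, 2, "8Gi", "vlan-100-dmz")
    print(json.dumps(manifest, indent=2))
```

The JSON output can be applied the same way a YAML manifest would be. A real migration would also need the DataVolume and NetworkAttachmentDefinition objects that this manifest only references.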
Code
Repository: https://github.com/Bharathtrainer/openshift-migration-advisor
How I Used Gemma 4
I chose Gemma 4 26B MoE (gemma4:26b) after starting on 31B Dense and discovering it was the wrong tool for this workload.
The honest path: I picked 31B Dense first because the highest-quality reasoning seemed like the obvious choice for infrastructure assessment. Two problems surfaced on real-world inputs:
- Ollama Flash Attention prefill stall on Dense (ollama#15350) hangs the 31B variant on prompts beyond ~3–4K tokens. A multi-VM datacenter inventory blows past that on the first VM. The bug is specific to Dense's hybrid sliding+global attention; MoE handles the same prompts cleanly.
- Active-parameter efficiency. 26B MoE activates ~4B parameters per token versus 31B for Dense. On a consumer laptop GPU, that's the difference between a model that works (with some CPU offload) and one that doesn't fit at all.
What I kept by picking MoE over Dense:
- 256K context window — enough to ingest an entire small-datacenter inventory in one shot
- Stable long-prompt prefill on Ollama's current build
- Native reasoning mode via the <|think|> system-prompt token
- Workable throughput on consumer hardware: generation runs even when 85% of layers spill to CPU
Honest performance note: the report above generated in 311.7 seconds on a 16 GB-VRAM laptop GPU with 85% CPU offload (ollama ps confirms the split). On a workstation with 24+ GB VRAM the same generation should land in 30–60 seconds. This is exactly the kind of detail you want a tool to expose, not hide — local AI's pitch is data sovereignty, and the tradeoff is hardware-dependent latency. Field engineers running this for offline assessment will accept 5 minutes for a report they can't legally send to a cloud API.
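If a tool wants to surface that split rather than hide it, one option is to read the PROCESSOR column of `ollama ps`. The sketch below assumes that column looks like `85%/15% CPU/GPU` for a mixed load or `100% GPU` for a fully resident model. The `parse_processor_split` helper is hypothetical and not part of the repository.

```python
import re


def parse_processor_split(processor: str) -> tuple[int, int]:
    """Return (cpu_percent, gpu_percent) from an `ollama ps` PROCESSOR cell."""
    cell = processor.strip()
    # Mixed placement, e.g. "85%/15% CPU/GPU"
    mixed = re.fullmatch(r"(\d+)%/(\d+)%\s+CPU/GPU", cell)
    if mixed:
        return int(mixed.group(1)), int(mixed.group(2))
    # Single-device placement, e.g. "100% GPU" or "100% CPU"
    single = re.fullmatch(r"100%\s+(CPU|GPU)", cell)
    if single:
        return (100, 0) if single.group(1) == "CPU" else (0, 100)
    raise ValueError(f"unrecognized PROCESSOR value: {processor!r}")


if __name__ == "__main__":
    cpu, gpu = parse_processor_split("85%/15% CPU/GPU")
    print(f"{cpu}% of the model is on CPU, {gpu}% on GPU")
```

Printing this next to the elapsed time in the report header would tell a reader at a glance whether a slow run was hardware-bound.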
When MoE is not the right pick: short, single-turn, hard math/code reasoning where Dense's per-token capacity matters more than throughput. For long, structured, enterprise-document reasoning over large configs, MoE wins. That's the call this build makes, and the rationale is documented in the README with the GitHub issue link, not vibes.
One Gemma 4-specific detail worth flagging: I follow the recommended sampling (temperature=1.0, top_p=0.95, top_k=64) and set OLLAMA_FLASH_ATTENTION=1 and OLLAMA_KV_CACHE_TYPE=q4_0 to keep the KV cache compact enough for a 16K context window. Those settings are the difference between this running at usable speed and not running at all.
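As a rough sketch of how those settings could be passed to a local Ollama server, the snippet below builds a request for Ollama's `/api/generate` endpoint and streams the reply. It assumes the default local port 11434 and takes "16K" to mean `num_ctx=16384`. The function names are hypothetical and this is not the repository's client code.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_generate_request(prompt: str, num_ctx: int = 16384) -> dict:
    """Build the JSON body for a streaming generate call with the sampling above."""
    return {
        "model": "gemma4:26b",
        "prompt": prompt,
        "stream": True,
        "options": {
            "temperature": 1.0,
            "top_p": 0.95,
            "top_k": 64,
            "num_ctx": num_ctx,  # assumption: 16K context means 16384 tokens
        },
    }


def stream_report(prompt: str):
    """Yield text chunks from a locally running Ollama server."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:  # the stream is one JSON object per line
            if not raw_line.strip():
                continue
            chunk = json.loads(raw_line)
            yield chunk.get("response", "")
            if chunk.get("done"):
                break


if __name__ == "__main__":
    for text in stream_report("Summarize this VM config: ..."):
        print(text, end="", flush=True)
```

OLLAMA_FLASH_ATTENTION and OLLAMA_KV_CACHE_TYPE are read by the Ollama server process, so they need to be set in the environment that launches `ollama serve`, not in the client script.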
Built entirely on a laptop. No cloud API key was used at any point in the construction of this submission. The report you see above was generated by Gemma 4 running on the same machine.