Building a Production-Grade MLOps Home Lab on Windows: K8s, LLM, RAG and GitLab CI
Slimane BOUH · 2026-05-24 · via DEV Community

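TL;DR: I built a full MLOps stack on a Windows 11 machine with Multipass and k3s. This is the real guide: it records every error I hit and how I fixed it. No fluff, no perfect screenshots. Only what actually works.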


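Why build a home lab?

Cloud bills for learning pile up fast. A home lab gives you:

  • Real Kubernetes: not a toy Minikube setup
  • A full MLOps stack: MLflow, Minio, Airflow, Ollama, Qdrant
  • GitLab CI/CD: real pipelines, not tutorial demos
  • Zero cost: it runs on hardware you already own
  • A safe sandbox: break things with no consequences

The goal is not just to get things running. The goal is to practice the full DevOps and MLOps workflow end to end: push code → pipeline triggers → Terraform plans → services deploy → metrics show up in Grafana.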


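My setup

Resource      Details
OS            Windows 11 Pro
RAM           32 GB
CPU           8 cores
Disk          500 GB SSD
Hypervisor    Hyper-V (native on Windows Pro)
VM manager    Multipass
Kubernetes    k3s

Architecture decision: I keep Windows as my daily driver and run everything inside an Ubuntu VM through Multipass. The separation is clean, the VM starts and stops on demand, and there are no dual-boot headaches.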


The stack

Windows 11 (daily driver)
│
├── 🌐 GitLab.com (SaaS — free tier)
│    └── Pipelines + Container Registry
│
└── Multipass → vm-k3s (10 GB RAM / 4 CPU / 80 GB)
     │
     ├── ☸️  k3s (Kubernetes)
     │
     ├── ⚙️  MLOps
     │    ├── MLflow      — experiment tracking
     │    ├── Minio       — S3-compatible artifact storage
     │    └── Airflow     — pipeline orchestration
     │
     ├── 🤖 LLM Stack
     │    ├── Ollama      — run LLMs locally (CPU)
     │    └── LiteLLM    — unified OpenAI-compatible API
     │
     ├── 🔍 RAG Stack
     │    ├── Qdrant      — vector database
     │    └── LangChain   — RAG orchestration
     │
     ├── 📊 Observability
     │    ├── Prometheus  — metrics
     │    ├── Grafana     — dashboards
     │    └── Loki        — centralized logs
     │
     └── 🔐 HashiCorp Vault — secrets management



Step 1: Enable Hyper-V and install the tools

First, enable Hyper-V on Windows Pro (Multipass needs it):

# Run as Administrator
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-All
# Reboot when prompted


After the reboot, install the tools with winget:

winget install Canonical.Multipass
winget install Git.Git
winget install Microsoft.VisualStudioCode
winget install Hashicorp.Terraform
winget install Helm.Helm
winget install Kubernetes.kubectl


Why winget rather than Chocolatey? I first tried choco install multipass and hit this error:
Exception calling "Start": "The specified executable is not a valid application for this OS platform."
winget installs the official installer directly. Use winget.


Step 2: Create the VM

multipass launch `
  --name vm-k3s `
  --cpus 4 `
  --memory 10G `
  --disk 80G `
  22.04


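Memory tip: I first tried 16G and got:
Failed to allocate 16384 MB of RAM: Insufficient system resources
Windows was already using about 20 GB. On a 32 GB machine, 10 GB is the sweet spot: the OS stays comfortable and k3s still has plenty of room.

Check how much RAM is actually free before creating the VM (both values are reported in kilobytes):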

Get-CimInstance Win32_OperatingSystem | Select-Object FreePhysicalMemory, TotalVisibleMemorySize
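
The sizing arithmetic behind that tip can be written down as a small shell function. This is only a sketch: `vm_budget_gb` is a hypothetical helper, and the 2 GB host safety margin is my assumption, not a figure from the measurements above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole gigabytes that can safely be given to the VM.
# Arguments (integers, in GB): total host RAM, RAM the host OS already
# uses, and a safety margin to leave free on the host.
vm_budget_gb() {
  local total_gb=$1 host_used_gb=$2 margin_gb=$3
  local budget=$(( total_gb - host_used_gb - margin_gb ))
  # Never report a negative budget.
  if (( budget < 0 )); then
    budget=0
  fi
  echo "$budget"
}

# Numbers from this article: a 32 GB machine with Windows using about 20 GB.
# The 2 GB margin is an assumed value.
echo "VM budget: $(vm_budget_gb 32 20 2) GB"   # prints: VM budget: 10 GB
```

With these inputs a 16G request cannot fit, which matches the allocation error above, while 10G does.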


Enter the VM:

multipass shell vm-k3s
# Prompt becomes: ubuntu@vm-k3s:~$



Step 3: Install Docker + k3s

Inside the VM:

# Docker
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker ubuntu
sudo systemctl enable docker && sudo systemctl start docker

# k3s — lightweight Kubernetes
curl -sfL https://get.k3s.io | sh -s - \
  --write-kubeconfig-mode 644 \
  --disable traefik \
  --docker

sleep 20
sudo k3s kubectl get nodes
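
A fixed sleep 20 can be too short on a slow disk and needlessly long on a fast one. A polling helper is one possible alternative. The sketch below is an assumption on my part, not part of the k3s installer: `retry_until` is a hypothetical name, and the attempt count and delay in the usage comment are arbitrary.

```shell
#!/usr/bin/env bash
# retry_until is a hypothetical helper, not part of k3s or kubectl.
# Usage: retry_until <max_attempts> <delay_seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry_until() {
  local max_attempts=$1 delay=$2 attempt
  shift 2
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    # Run the command quietly; stop on the first success.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause only when another attempt will follow.
    if (( attempt < max_attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage inside the VM (assumes k3s was installed as shown above):
#   retry_until 30 2 sudo k3s kubectl wait --for=condition=Ready node --all --timeout=5s
```

The helper is generic, so the same function can wait for any command that exits with 0 once a service is ready.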


Configure kubectl:

mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown ubuntu:ubuntu ~/.kube/config
echo 'export KUBECONFIG=~/.kube/config' >> ~/.bashrc
source ~/.bashrc

kubectl get nodes
# NAME      STATUS   ROLES                  AGE   VERSION
# vm-k3s    Ready    control-plane,master   30s   v1.29.x



Step 4: Helm + namespaces

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Add repos
helm repo add grafana         https://grafana.github.io/helm-charts
helm repo add prometheus      https://prometheus-community.github.io/helm-charts
helm repo add open-telemetry  https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo add hashicorp       https://helm.releases.hashicorp.com
helm repo update

# Create namespaces
for ns in mlops llm rag monitoring logging vault; do
  kubectl create namespace $ns
done
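
Re-running the loop above fails with AlreadyExists once the namespaces are in place. If you expect to replay this step, one common idiom is to render the manifest client-side and pipe it to kubectl apply. A minimal sketch, where `ensure_namespaces` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# ensure_namespaces is a hypothetical helper name.
# It renders each Namespace manifest client-side and applies it, so a
# namespace that already exists is left in place instead of raising an error.
ensure_namespaces() {
  local ns
  for ns in "$@"; do
    kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -
  done
}

# Usage inside the VM, with the namespaces from this step:
#   ensure_namespaces mlops llm rag monitoring logging vault
```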



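Step 5: Connect GitLab CI

I use GitLab.com SaaS, not a self-hosted GitLab. That saves 6 GB of RAM: GitLab CE alone needs 6 GB or more. The free tier is enough for a home lab.

Create a project on gitlab.com and grab the registration token under Settings → CI/CD → Runners, then: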

# Install GitLab Runner
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install -y gitlab-runner
sudo usermod -aG docker gitlab-runner

# Register
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com" \
  --registration-token "YOUR_TOKEN_HERE" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock" \
  --docker-privileged \
  --description "homelab-runner" \
  --tag-list "homelab,k8s,mlops,terraform" \
  --run-untagged true

sudo gitlab-runner start


Your runner shows up green in GitLab within seconds. From now on, every push triggers real CI/CD on your local machine.
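
To confirm that pushes really land on the home-lab runner, a tiny pipeline is enough. The sketch below writes a minimal .gitlab-ci.yml. `write_ci_file`, the stage name and the job name are hypothetical placeholders; the only parts tied to this setup are the homelab tag from the registration command and GitLab's predefined CI_* variables.

```shell
#!/usr/bin/env bash
# write_ci_file is a hypothetical helper. The stage and job names are
# placeholders; the "homelab" tag matches the runner registered above.
# Usage: write_ci_file /path/to/your/repo
write_ci_file() {
  local target_dir=${1:-.}
  # The quoted heredoc keeps $CI_* literal so GitLab expands them, not bash.
  cat > "$target_dir/.gitlab-ci.yml" <<'EOF'
stages:
  - smoke

runner-smoke-test:
  stage: smoke
  tags:
    - homelab
  script:
    - echo "Commit $CI_COMMIT_SHORT_SHA on runner $CI_RUNNER_DESCRIPTION"
    - uname -a
EOF
}
```

Commit the generated file and push. If the job log shows the homelab-runner description, the runner wiring works.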


Step 6: Deploy Minio (the right way)

This is where I hit my first major blocker.

What I tried:

helm install minio bitnami/minio \
  --namespace mlops \
  --set auth.rootUser=minioadmin \
  --set auth.rootPassword=minioadmin123


What happened:

Failed to pull image "docker.io/bitnami/minio:2025.7.23-debian-12-r3": not found
Error: ErrImagePull → ImagePullBackOff


The Bitnami Helm chart references a Docker image tag that does not exist on Docker Hub yet. It is a timing problem.

The fix: use the official Minio image directly:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: mlops
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        image: quay.io/minio/minio:latest
        command: ["minio", "server", "/data", "--console-address", ":9001"]
        env:
        - name: MINIO_ROOT_USER
          value: "minioadmin"
        - name: MINIO_ROOT_PASSWORD
          value: "minioadmin123"
        ports:
        - containerPort: 9000
          name: api
        - containerPort: 9001
          name: console
---
apiVersion: v1
kind: Service
metadata:
  name: minio
  namespace: mlops
spec:
  type: NodePort
  ports:
  - name: api
    port: 9000
    nodePort: 30900
  - name: console
    port: 9001
    nodePort: 30901
  selector:
    app: minio
EOF


quay.io/minio/minio is Minio's own registry: always current, with no tag mismatches.


Step 7: Deploy MLflow

MLflow needs Minio as its artifact store. I use SQLite to keep things simple (a home lab does not need PostgreSQL):

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow
  namespace: mlops
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      initContainers:
      - name: create-minio-bucket
        image: quay.io/minio/mc:latest
        command: ["/bin/sh", "-c"]
        args:
          - |
            mc alias set minio http://minio:9000 minioadmin minioadmin123
            mc mb minio/mlflow --ignore-existing
      containers:
      - name: mlflow
        image: ghcr.io/mlflow/mlflow:latest
        command:
          - mlflow
          - server
          - --host=0.0.0.0
          - --port=5000
          - --backend-store-uri=sqlite:///mlflow.db
          - --default-artifact-root=s3://mlflow/
          - --serve-artifacts
        env:
        - name: MLFLOW_S3_ENDPOINT_URL
          value: "http://minio:9000"
        - name: AWS_ACCESS_KEY_ID
          value: "minioadmin"
        - name: AWS_SECRET_ACCESS_KEY
          value: "minioadmin123"
        - name: AWS_DEFAULT_REGION
          value: "us-east-1"
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: mlflow
  namespace: mlops
spec:
  type: NodePort
  ports:
  - port: 5000
    nodePort: 30500
  selector:
    app: mlflow
EOF


The initContainer creates the mlflow bucket in Minio automatically before the server starts, so no manual setup is needed.


Step 8: Run an LLM locally with Ollama

This is the fun part: a real LLM running on your own machine, inside Kubernetes, on CPU only.

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: llm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434
        env:
        - name: OLLAMA_NUM_PARALLEL
          value: "1"
        - name: OLLAMA_MAX_LOADED_MODELS
          value: "1"
        resources:
          requests:
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "4Gi"
            cpu: "3"
        volumeMounts:
        - name: ollama-data
          mountPath: /root/.ollama
      volumes:
      - name: ollama-data
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: llm
spec:
  type: NodePort
  ports:
  - port: 11434
    nodePort: 31434
  selector:
    app: ollama
EOF


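Critical: always set resource limits on a shared VM. Without them, Ollama will eat all available memory and the other containers get OOM-killed. I learned this the hard way.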
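
One way to spot the pods at risk is to list every container that has no memory limit. The helper below is a sketch under assumptions: `list_unlimited` is a hypothetical name, and it relies on python3 being available in the VM (Ubuntu 22.04 ships it).

```shell
#!/usr/bin/env bash
# list_unlimited is a hypothetical helper. It reads the JSON printed by
# `kubectl get pods -A -o json` on stdin and prints one line per container
# that has no resources.limits.memory, as namespace/pod/container.
list_unlimited() {
  python3 -c '
import json, sys

pods = json.load(sys.stdin).get("items", [])
for pod in pods:
    meta = pod["metadata"]
    for container in pod["spec"].get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        if "memory" not in limits:
            print("{}/{}/{}".format(meta["namespace"], meta["name"], container["name"]))
'
}

# Usage inside the VM:
#   kubectl get pods -A -o json | list_unlimited
```

In this lab the Minio and MLflow manifests set no limits, so both would show up in the output until you add a resources block to them.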

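Choosing a model for CPU

Model            RAM needed   Suitability
Mistral 7B Q4    4.3 GB       Too heavy for the 4 GB limit
Phi-3 Mini       3.5 GB       Still heavy
llama3.2:1b      1.3 GB       ✅ Good fit for a CPU home lab
gemma2:2b        1.6 GB       ✅ Good alternative
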
OLLAMA_POD=$(kubectl get pod -n llm -l app=ollama -o jsonpath='{.items[0].metadata.name}')

# Pull the model
kubectl exec -n llm $OLLAMA_POD -- ollama pull llama3.2:1b

# Test it
kubectl exec -n llm $OLLAMA_POD -- ollama run llama3.2:1b "Explain RAG in 2 sentences"


Test it through the API:

VM_IP=$(hostname -I | awk '{print $1}')
curl -s http://$VM_IP:31434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:1b",
    "prompt": "What is MLOps?",
    "stream": false
  }' | python3 -c "import sys,json; print(json.load(sys.stdin)['response'])"



Step 9: Observability stack

# Prometheus + Grafana
helm install kube-prometheus prometheus/kube-prometheus-stack \
  --namespace monitoring \
  --set grafana.service.type=NodePort \
  --set grafana.service.nodePort=30300 \
  --set grafana.adminPassword=admin123 \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false

# Loki (log aggregation) + Promtail (log shipper)
helm install loki grafana/loki-stack \
  --namespace logging \
  --set grafana.enabled=false \
  --set promtail.enabled=true


Open Grafana:

VM_IP=$(hostname -I | awk '{print $1}')
echo "Grafana: http://$VM_IP:30300"
# Login: admin / admin123


Add Loki as a data source in Grafana:

  • Settings → Data sources → Add → Loki
  • URL: http://loki.logging:3100

Logs and metrics now live together in one dashboard.


Step 10: Vault for secrets management

helm install vault hashicorp/vault \
  --namespace vault \
  --set server.dev.enabled=true \
  --set server.dev.devRootToken=root \
  --set ui.enabled=true \
  --set ui.serviceType=NodePort \
  --set ui.serviceNodePort=30820

# Store your first secret
VAULT_POD=$(kubectl get pod -n vault -l app.kubernetes.io/name=vault -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n vault $VAULT_POD -- vault secrets enable kv-v2
kubectl exec -n vault $VAULT_POD -- vault kv put secret/homelab \
  minio_key=minioadmin \
  minio_secret=minioadmin123

VM_IP=$(hostname -I | awk '{print $1}')
echo "Vault UI: http://$VM_IP:30820  (token: root)"


In your GitLab CI pipelines, pull secrets from Vault instead of hard-coding them in variables.
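
As a rough illustration, a CI job could read the secret stored above through Vault's HTTP API. This is a sketch, not a hardened setup. Both function names are hypothetical, and it assumes VAULT_ADDR (for example the Vault NodePort URL printed above) and VAULT_TOKEN are supplied to the job as masked CI/CD variables, and that python3 and curl exist in the job image.

```shell
#!/usr/bin/env bash
# Both functions are hypothetical helpers for a CI job.

# Print one field from a KV v2 read response given on stdin.
# KV v2 nests the secret payload under .data.data.
kv2_field() {
  python3 -c '
import json, sys
print(json.load(sys.stdin)["data"]["data"][sys.argv[1]])
' "$1"
}

# Fetch secret/homelab over the HTTP API and print one field.
# Assumes VAULT_ADDR and VAULT_TOKEN are set in the environment.
vault_homelab_field() {
  curl -fsS -H "X-Vault-Token: $VAULT_TOKEN" \
    "$VAULT_ADDR/v1/secret/data/homelab" | kv2_field "$1"
}

# Usage in a job script:
#   export AWS_ACCESS_KEY_ID=$(vault_homelab_field minio_key)
#   export AWS_SECRET_ACCESS_KEY=$(vault_homelab_field minio_secret)
```

The dev-mode root token is fine for a lab, but it is the one value that still has to live in a CI/CD variable.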


The big picture: all services running

VM_IP=$(hostname -I | awk '{print $1}')
echo "=== Your Home Lab ==="
echo "MLflow   : http://$VM_IP:30500"
echo "Minio    : http://$VM_IP:30901"
echo "Grafana  : http://$VM_IP:30300  (admin/admin123)"
echo "Vault    : http://$VM_IP:30820  (token: root)"
echo "Ollama   : http://$VM_IP:31434"
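
Printing the URLs does not tell you whether anything answers on them. A quick reachability check could look like the sketch below. `probe` is a hypothetical helper, and UP only means the port returned some HTTP response within five seconds, not that the service is healthy.

```shell
#!/usr/bin/env bash
# probe is a hypothetical helper: prints "<name> UP" when the URL returns
# any HTTP response within 5 seconds, otherwise "<name> DOWN".
probe() {
  local name=$1 url=$2
  if curl -s -o /dev/null --max-time 5 "$url"; then
    echo "$name UP"
  else
    echo "$name DOWN"
  fi
}

# Usage inside the VM, with the NodePorts from this article:
#   VM_IP=$(hostname -I | awk '{print $1}')
#   probe MLflow  "http://$VM_IP:30500"
#   probe Minio   "http://$VM_IP:30901"
#   probe Grafana "http://$VM_IP:30300"
#   probe Vault   "http://$VM_IP:30820"
#   probe Ollama  "http://$VM_IP:31434"
```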



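Lessons learned

1. Bitnami Helm charts break on image tags
Do not use bitnami/minio: it references an image that does not exist yet. Use quay.io/minio/minio:latest directly.

2. Always set Kubernetes resource limits on a shared VM
Without limits, one greedy pod (looking at you, Ollama) will get everything else OOMKilled. Always set limits.memory.

3. Memory planning matters more than you think
On a 32 GB machine, Windows itself uses about 20 GB. That leaves about 12 GB for your VM. Size it carefully: 10 GB for the VM is the realistic sweet spot.

4. GitLab SaaS > self-hosted for a home lab
Self-hosted GitLab CE needs 6 GB or more of RAM just to idle. The GitLab.com free tier gives you unlimited private repositories, 400 CI/CD minutes per month, and a container registry. Use it.

5. Start small with LLMs on CPU
Without a GPU, do not run Mistral 7B on CPU. llama3.2:1b is surprisingly capable, good enough for RAG experiments, and uses only 1.3 GB. If you need more power, add GPU passthrough later.

6. Use winget, not choco, for Multipass on Windows
The Chocolatey Multipass package uses an installer that fails on recent Windows builds. winget install Canonical.Multipass works every time.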


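What's next

  • Kubeflow Pipelines: proper ML pipeline orchestration on K8s
  • OpenTelemetry Collector: unified export of traces, metrics, and logs
  • Datadog integration: ship all observability data to the cloud
  • Terraform IaC: replace every kubectl apply with proper infrastructure as code
  • RAG pipeline: Qdrant + LangChain + Ollama, end to end

Quick reference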

# VM management (Windows PowerShell)
multipass list                    # list VMs
multipass shell vm-k3s            # enter VM
multipass suspend vm-k3s          # pause (saves RAM)
multipass start vm-k3s            # resume

# Inside the VM
kubectl get pods -A               # all pods
kubectl top pods -A               # RAM/CPU usage
free -mh                          # available RAM
watch kubectl get pods -A         # live monitoring





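This was built with coffee and a lot of kubectl describe pod debugging. If you run into a problem not covered here, leave a comment and I will be happy to help.

If this saved you some time, leave a ❤️. It helps other people find it too.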