How to Build a Self-Hosted AI Code Review Tool in Python
Ayi NEDJIMI · 2026-05-23 · via DEV Community

Every team has the same code review problem: PRs sit for days, reviewers miss subtle logic bugs, and security issues slip through because nobody carefully checked the authentication layer. Linters catch syntax and style issues, but they don't reason about intent. A language model can — and you can run it entirely on your own infrastructure without sending a single line of your source code to a third party.

This guide walks you through building a self-hosted AI code review tool in Python. It reads a git diff, sends it to a locally hosted language model, and returns structured review comments you can pipe directly into your CI workflow.

Why Self-Hosted Matters

Sending your source code to an external API is a significant trust decision. For proprietary code, regulated industries, or anything security-sensitive, you want model inference happening inside your own perimeter. Ollama handles this cleanly: it runs GGUF-format models locally and exposes an OpenAI-compatible HTTP endpoint under /v1, so the OpenAI Python SDK can talk to it for chat completions. You get a familiar API surface with no source code leaving your machine.

The architecture is intentionally simple:

  • A Python script reads a git diff (or file path)
  • It splits the diff into manageable chunks
  • Each chunk is sent to the local LLM with a structured system prompt
  • The model returns JSON-formatted review comments
  • You aggregate and display them — or feed them into your CI gate

Setting Up

You need Python 3.11+, the openai SDK (it works against any compatible endpoint), and Ollama running locally with a code-focused model. codellama:13b works well; deepseek-coder:6.7b is faster and nearly as accurate for review tasks.

pip install openai
ollama pull deepseek-coder:6.7b


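Store your config in a .env file. The script reads these values from environment variables, so swapping models requires no code changes. Note that the script does not load the file itself: export the variables into your shell, or load the file at startup, before running it.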

OLLAMA_BASE_URL=http://localhost:11434/v1
OLLAMA_API_KEY=ollama
OLLAMA_MODEL=deepseek-coder:6.7b

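Since the reviewer only calls os.getenv, something has to put the .env values into the environment first. A minimal stdlib-only sketch of that step follows; `load_env_file` is a hypothetical helper name, not part of the original script, and it handles only plain KEY=VALUE lines. The python-dotenv package is a common alternative if you prefer a dependency.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines and export them without overriding existing variables."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key:
            loaded[key] = value
            # Variables already set in the real environment take precedence.
            os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env_file()` at the top of the script, before the client is constructed, would make the .env file take effect while still letting CI-provided variables win.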

The Core Reviewer

The script reads a diff from a file argument or stdin (which makes it trivial to wire into a git hook), sends it to the model, and parses the structured output.

import os, json, sys
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1"),
    api_key=os.getenv("OLLAMA_API_KEY", "ollama"),
)
MODEL = os.getenv("OLLAMA_MODEL", "deepseek-coder:6.7b")

SYSTEM_PROMPT = (
    "You are a senior software engineer performing a code review.\n"
    "Analyze the provided code diff and return a JSON array of review comments.\n"
    "Each comment must have: severity (critical/warning/suggestion), "
    "line (int or null), message (str), fix (str or null).\n"
    "Return ONLY valid JSON. No prose outside the JSON array."
)

def review_diff(diff_text: str, max_chunk_chars: int = 6000) -> list[dict]:
    lines = diff_text.splitlines(keepends=True)
    chunks, current, current_len = [], [], 0
    for line in lines:
        if current_len + len(line) > max_chunk_chars and current:
            chunks.append("".join(current))
            current, current_len = [], 0
        current.append(line)
        current_len += len(line)
    if current:
        chunks.append("".join(current))

    all_comments = []
    for chunk in chunks:
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Review this diff:\n\n{chunk}"},
            ],
            temperature=0.1,
            max_tokens=1024,
        )
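        # content can be None (for example on an empty completion), so guard before stripping.
        raw = (response.choices[0].message.content or "").strip()
        try:
            comments = json.loads(raw)
        except json.JSONDecodeError:
            # Report unparseable output instead of silently dropping the chunk.
            print("warning: model returned non-JSON output; chunk skipped", file=sys.stderr)
            continue
        if isinstance(comments, list):
            all_comments.extend(comments)
    return all_comments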

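SEVERITY_ORDER = {"critical": 0, "warning": 1, "suggestion": 2}

if __name__ == "__main__":
    # Read the diff from a file argument, or from stdin when none is given.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as f:
            diff = f.read()
    else:
        diff = sys.stdin.read()
    # Keep only dict entries; unknown severities sort last instead of raising ValueError.
    comments = [c for c in review_diff(diff) if isinstance(c, dict)]
    comments.sort(key=lambda c: SEVERITY_ORDER.get(str(c.get("severity")), len(SEVERITY_ORDER)))
    has_critical = False
    for c in comments:
        print(f"[{str(c.get('severity', '?')).upper()}] line {c.get('line', '?')}: {c.get('message', '')}")
        if c.get("fix"):
            print(f"{c['fix']}\n")
        if c.get("severity") == "critical":
            has_critical = True
    sys.exit(1 if has_critical else 0)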


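The script exits with code 1 if any critical issue is found, making it trivial to use as a blocking pre-push hook or CI gate. Be aware that a chunk whose output fails to parse contributes no comments, so the gate fails open in that case.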
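The reviewer discards any chunk whose output is not pure JSON, yet local models often wrap the array in a Markdown fence or add a sentence around it despite the prompt. A more forgiving parser might look like the sketch below. `parse_review_comments` and `ALLOWED_SEVERITIES` are hypothetical names introduced here, and downgrading unknown severities to "suggestion" is an assumption you may want to reverse for a stricter gate.

```python
import json
import re

ALLOWED_SEVERITIES = {"critical", "warning", "suggestion"}
FENCE = "`" * 3  # Markdown code-fence marker, built here to avoid a literal fence


def parse_review_comments(raw: str) -> list[dict]:
    """Extract a JSON array of comments from model output, tolerating fences and stray prose."""
    text = raw.strip()
    # Drop a surrounding Markdown code fence (optionally tagged "json") if present.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1).strip()
    # Fall back to the outermost [...] span when prose surrounds the array.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(text[start : end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    comments = []
    for item in data:
        # Keep only objects that carry a string message.
        if not isinstance(item, dict) or not isinstance(item.get("message"), str):
            continue
        severity = str(item.get("severity", "")).lower()
        # Assumption: unknown severities are treated as suggestions.
        item["severity"] = severity if severity in ALLOWED_SEVERITIES else "suggestion"
        comments.append(item)
    return comments
```

Inside review_diff, this would stand in for the json.loads call. Because every returned entry is a dict with a known severity, the sorting and exit-code logic can rely on that shape.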

Integrating into CI

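For GitHub Actions, run the reviewer on every pull request diff. The job below uses a GitHub-hosted runner, so the endpoint in OLLAMA_BASE_URL must be reachable from it and the diff travels over the network to get there; use a self-hosted runner if that is not acceptable: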

name: AI Code Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install openai
      - name: Run AI review
        env:
          OLLAMA_BASE_URL: ${{ secrets.OLLAMA_BASE_URL }}
          OLLAMA_API_KEY: ${{ secrets.OLLAMA_API_KEY }}
          OLLAMA_MODEL: deepseek-coder:6.7b
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > pr.diff
          python reviewer.py pr.diff


For self-hosted CI (Gitea Actions, GitLab CI, Jenkins), point OLLAMA_BASE_URL at your internal Ollama instance. The runner needs network access to it, but nothing leaves your perimeter. If your Ollama node lives on a private subnet, use a dedicated runner in that subnet rather than routing through a proxy.

Hardening the Prompt for Security Review

The default prompt covers general code quality. When you want security-focused output — useful as a pre-merge gate on sensitive services — specialize the system prompt:

SECURITY_PROMPT = (
    "You are a security-focused code reviewer.\n"
    "Flag only security vulnerabilities: injection flaws, auth bypasses, "
    "insecure deserialization, hardcoded credentials, missing input validation, "
    "race conditions, and OWASP Top 10 patterns.\n"
    "Return a JSON array: [{severity, cwe, line, message, fix}]. "
    "Return ONLY valid JSON."
)


Swap this in for SYSTEM_PROMPT. The cwe field is useful if you want to integrate findings with a vulnerability tracker or feed them into a risk scoring pipeline.
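To make the cwe field usable downstream, it helps to normalize it first, since a model may emit "CWE-89", "cwe-89", "89", or a bare integer. The sketch below groups findings under a canonical key; `group_by_cwe` and the "unclassified" bucket are hypothetical names chosen for illustration, not part of any tracker's API.

```python
import re
from collections import defaultdict


def group_by_cwe(comments: list[dict]) -> dict[str, list[dict]]:
    """Bucket findings by CWE identifier; findings without one go under 'unclassified'."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for comment in comments:
        cwe = comment.get("cwe")
        # Use the first run of digits so "CWE-89", "cwe-89", "89" and 89 all map to "CWE-89".
        match = re.search(r"\d+", str(cwe)) if cwe is not None else None
        key = f"CWE-{match.group(0)}" if match else "unclassified"
        grouped[key].append(comment)
    return dict(grouped)
```

The grouped result can then be serialized per CWE for whatever tracker or scoring pipeline you feed it into.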

Keep in mind that language models produce false positives at a non-trivial rate. Treat this layer as a fast first-pass triage, not a substitute for manual review. For a structured view of what to actually check before shipping to production, our security hardening checklists cover the most common vulnerability classes by language and framework.

Splitting Large Diffs by File

Chunking by character count works, but it can split a file mid-hunk and confuse the model. Splitting by file boundary gives better results:

import re

def split_diff_by_file(diff_text: str) -> list[str]:
    parts = re.split(r'(?=^diff --git )', diff_text, flags=re.MULTILINE)
    return [p for p in parts if p.strip()]

def review_all_files(diff_text: str) -> list[dict]:
    all_comments = []
    for file_diff in split_diff_by_file(diff_text):
        all_comments.extend(review_diff(file_diff))
    return all_comments


For very large per-file diffs (300+ changed lines), split further on the @@ hunk markers. As a rule of thumb, review quality drops off once a single request carries more than roughly 4,000 tokens of diff, so smaller, focused chunks tend to produce better output than one large dump.
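Splitting on hunk markers can be sketched as follows. `split_file_diff_by_hunk` is a hypothetical helper that expects the diff of a single file, such as one element returned by split_diff_by_file, and repeats the file header on every chunk so the model still knows which file it is looking at.

```python
def split_file_diff_by_hunk(file_diff: str) -> list[str]:
    """Split one file's diff into per-hunk chunks, repeating the file header on each."""
    header: list[str] = []
    hunks: list[list[str]] = []
    for line in file_diff.splitlines(keepends=True):
        if line.startswith("@@"):
            # A new hunk begins at every "@@ -a,b +c,d @@" marker.
            hunks.append([line])
        elif hunks:
            hunks[-1].append(line)
        else:
            # Everything before the first marker is the file header (diff --git, ---, +++).
            header.append(line)
    if not hunks:
        # No hunk markers (a pure rename or binary change, for instance): keep the diff whole.
        return [file_diff] if file_diff.strip() else []
    head = "".join(header)
    return [head + "".join(hunk) for hunk in hunks]
```

Each returned chunk can be passed to review_diff on its own. Hunks are reviewed without their neighbours in this scheme, so findings that depend on two distant changes in the same file may be missed.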

The Takeaway

Self-hosted AI code review earns its place in the pipeline as a fast, cheap first-pass filter. It catches common patterns — missing error handling, SQL queries built with f-strings, hardcoded secrets, unvalidated user input — before a human reviewer ever opens the PR. The setup is lightweight: Ollama, one Python file, a CI step.

What it won't replace: architectural review, business logic validation, and nuanced security analysis that requires understanding your domain. The model doesn't know your codebase's invariants or threat model. But for the low-hanging fruit, it consistently earns its keep.

From here, you can extend this foundation: add a SQLite store to track comment trends over time, wire up the GitHub Reviews API to post inline comments on the PR diff, or build a prompt library with different reviewer personas (security, performance, readability). The pattern is solid — the specialization is up to you.
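As one illustration of the first extension, a SQLite store could be as small as the sketch below. The table layout, the `reviews.db` filename, and the function names (`open_store`, `record_comments`, `severity_counts`) are all assumptions made for this example rather than anything prescribed above.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path: str = "reviews.db") -> sqlite3.Connection:
    """Open (or create) the comment store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS comments (
               id INTEGER PRIMARY KEY,
               reviewed_at TEXT NOT NULL,
               commit_sha TEXT NOT NULL,
               severity TEXT NOT NULL,
               line INTEGER,
               message TEXT NOT NULL
           )"""
    )
    return conn


def record_comments(conn: sqlite3.Connection, commit_sha: str, comments: list[dict]) -> int:
    """Insert one row per review comment and return how many rows were written."""
    now = datetime.now(timezone.utc).isoformat()
    rows = [
        (
            now,
            commit_sha,
            str(c.get("severity", "suggestion")),
            c.get("line") if isinstance(c.get("line"), int) else None,
            str(c.get("message", "")),
        )
        for c in comments
    ]
    # Using the connection as a context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO comments (reviewed_at, commit_sha, severity, line, message) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


def severity_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Return the number of stored comments per severity."""
    cur = conn.execute("SELECT severity, COUNT(*) FROM comments GROUP BY severity")
    return dict(cur.fetchall())
```

Calling record_comments after each review run, with the commit SHA supplied by your CI environment, would be enough to start charting severity counts over time.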


I run AYI NEDJIMI Consultants, a cybersecurity consulting firm. We publish free security hardening checklists — PDF and Excel.