Offline Agentic Coding


2026-04-27 · via Will Angel's Blog

Tags: AI, LLMs, Agents, Local models, Ollama, Coding

Published 2026-04-27

offline agentic coding: a handdrawn aeroplane

You can use Ollama as the backend for Claude Code!

ollama launch claude --model

This lets you use Claude Code with local models. I'm writing this from an airplane with no internet connection.
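Since nothing can be downloaded once you are offline, it helps to confirm beforehand that the models you want are already pulled. Here is a minimal sketch, assuming Ollama's local HTTP server is running on its default address and that its `/api/tags` endpoint returns a JSON object with a `models` list. The helper names (`installed_models`, `missing_models`, `fetch_tags`) are hypothetical, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434"


def installed_models(payload: dict) -> list[str]:
    """Return the sorted model names found in an /api/tags response body."""
    return sorted(m["name"] for m in payload.get("models", []))


def missing_models(wanted: list[str], payload: dict) -> list[str]:
    """Return the wanted model names that are not installed locally."""
    have = set(installed_models(payload))
    return [name for name in wanted if name not in have]


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of locally available models from the Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

Calling `missing_models(["qwen3.6:35b"], fetch_tags())` before boarding should return an empty list if that model is already on disk. The tag shown is taken from this post; check the exact spelling on your machine.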

Overall model comparisons

Gemma4:e2b did not finish any tasks despite being blazing fast at over 100 tokens per second.

qwen3-coder-next:q4_K_M actually did reasonably well. It felt a bit worse than Haiku in quality and was notably slower. It took around half an hour to fill up 75k tokens of context, which is about 40 tokens per second, while using 50-60 GB of memory.
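The throughput figure follows directly from the two numbers above. A quick check, treating "75k of context" as 75,000 tokens and "half an hour" as exactly 30 minutes (both are rough figures, so the result is approximate):

```python
def tokens_per_second(tokens: int, minutes: float) -> float:
    """Average throughput given a token count and elapsed wall-clock minutes."""
    return tokens / (minutes * 60)


# 75,000 tokens over 30 minutes (1,800 seconds)
rate = tokens_per_second(75_000, 30)
print(f"{rate:.1f} tokens/s")  # prints "41.7 tokens/s"
```

This is an average over the whole session, so it blends prompt processing, generation, and any tool-call waits into one number.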

qwen3.6:35b was also fairly reasonable. It did an adequate job writing a small local data processing job, but was also fairly slow.

Gemma4:31b felt the most 'Claude-like' in Claude Code, but was also fairly slow and occasionally required some jostling and interruption.

Overall

I don't seriously recommend local agentic coding with LLMs. You need some serious hardware to run decent models, and it's still slow. It's a nice capability to have locally, but it probably isn't better than coding by hand. It's still very cool to have a computer that can program itself, though, and amazing that a consumer device can locally run models and software that match the original GPT-3-era, ChatGPT-style experience.
