惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

博客园 - 司徒正美
aimingoo的专栏
aimingoo的专栏
MongoDB | Blog
MongoDB | Blog
云风的 BLOG
云风的 BLOG
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
酷 壳 – CoolShell
酷 壳 – CoolShell
博客园 - 聂微东
Y
Y Combinator Blog
T
Tailwind CSS Blog
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
S
SegmentFault 最新的问题
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
博客园 - 【当耐特】
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
J
Java Code Geeks
美团技术团队
Google DeepMind News
Google DeepMind News
博客园_首页
Apple Machine Learning Research
Apple Machine Learning Research
T
The Blog of Author Tim Ferriss

DEV Community

Why AI Should Not Write SQL Against ERP Databases Vibe coding works until it doesn't. The debt is real. Shipping at the Edge: Migrating a Coffee Subscription Platform to Cloudflare Workers Stop Tab-Switching: A Developer's Guide to Color Tools That Actually Fit the Workflow DevOps vs MLOps vs AIOps: What Changes, What Stays, and a Simple Roadmap to Get Started 5 n8n Automations Every WooCommerce Store Needs (Save 10+ Hours/Week) What I Learned Building My Own AI Harness Hytale Servers Will Fail Treasure Hunts Until We Fix Our Event Handling Redux in React: Managing Global State Like a Pro Unfreezing Your GitHub Actions: Troubleshooting Stuck Deployments and Protecting Your Git Repo Statistics Unlocking Project Discoverability on GHES: A Key to Software Engineering Productivity When the Cleanup Code Becomes the Project Rockpack 8.0 - A React Scaffolder Built for the Age of AI-Assisted Development Mismanaging the Treasure Hunt Engine in Hytale Servers Will Get You Killed Stop Calling It an AI Assistant. It’s Already Managing Your Company Why Hardcoded Automations Fail AI Agents Why I built a post-quantum signing API (and why JWT is on borrowed time) Weekend Thought: Frontend Build Tools Suffer From Work Amnesia AI Is Changing Engineering Culture More Than We Realize A 10-Line Playwright Trick That Saved Me Hours on Every Sephora Run Everyone Was Focused on Gemini, But Infinite Scaler Was the Real Twister "Gemma 4 Analyzed My Bank Statements – Apparently I 'Have a Problem' with Coffee and Late-Night Apps" #css #webdev #beginners #codenewbie The Hidden Layer Every AI Developer Must Learn AlphaEvolve: Google DeepMind's Gemini-Powered Evolutionary Coding Agent RDS Reserved Instance Pricing: Every Engine, Every Rule, Real Dollar Savings How To Build An AI-Powered MVP Without Burning Your Startup Budget In 2026 Reading a Psychrometric Chart Without Getting Lost LMR-BENCH: Can LLM Agents Reproduce NLP Research Code? (EMNLP 2025) How to turn text into colors (without AI) Building Real-Time Apps in Node.js with Rivalis: WebSockets, Rooms, Actors, and a Binary Wire This Week In React #282 : Security, Fate, TanStack, Redux, Jotai | Hermes-node, Expo, Rozenite, Harness | TC39, Bun, pnpm, npm, Yarn, Node AI Copilot vs AI Agent Architecture - What's Actually Different (And Why It Matters) Smart Contract Security: NEAR's Futures Surge and AI Token Risks Database Maintenance: Tracing Production Incidents to Their Root Cause Stop juggling AI SDKs in PHP — meet Prisma Google Quietly Changed What “Apps” Mean at I/O 2026 The Infrastructure Team Is the Real Single Point of Failure Building SQLite from Scratch: 740 Lines of C++23 to Understand Every Byte of a .db File The 4 Levels of Hermes Agent Scaling Framework: From One Hermes Agent to a Fully Automated Team Your AI Has a Memory. It Just Doesn’t Know What to Remember. Claprec: Engineering Tradeoffs - Limited time vs. Perfection (6/6) Building a Daily Google News API Monitor in Python Building RookDuel Avikal: From Chess Steganography to Post-Quantum Archival Security Google I/O e IA: o que realmente muda na vida do dev? Color Contrast Failures: The Number One Accessibility Issue and How to Fix It # I Watched 15 Hours of Hermes Agent Videos So You Don't Have To Cómo solucionar el bucle infinito en useEffect con objetos y arrays en React The First Agent-Centric Cloud Security Platform — And Why We Didn't Build It That Way On Purpose Most Treasure Hunts Engines on Hytale Servers Are Built to Fail - Lessons from a Burned Database GhostScan v3.0 — From Closed-Source EXE to Open-Source Pentest Framework De hojas de cálculo a IA: construyendo una plataforma SRM moderna When is AI fine in education? Python Tools for Managing API Rate Limits in Data Pipelines How to Implement Exponential Backoff for Rate-Limited APIs in Python "My Web Chat Wasn't a Real Channel. That Broke My Agent Pipeline" next-advanced-sitemap v1.0.7 — safer URL ingestion & automatic trimming for Next.js sitemap generation I keep seeing people build an AI lead processing agent when they really need a 6-step rules engine AI Powered Student Learning Assistant Using Gemma 4 How I Built a Drop-In Proxy to Slash My OpenAI Bills by 20%+ Automatically Building a Sarcastic AI English Tutor with Persona-as-Code and Gemini Audio Input for Pronunciation Correction Five Years Later, I Finally Have 96GB VRAM — What It Actually Unlocks for Agent Loops Turning a 1-Line Idea Into a 40-Second Short with a 10-Beat Local Video Pipeline Running LTX-2.3 Alongside TTS on a Single 96GB GPU with a Cold-Start Architecture Cutting LTX-2 22B Peak VRAM by 40% with fp8_cast — and Why optimum-quanto Was a Trap HiDream Skeleton Mode: Prompt Beats OpenPose Ref — 8 Patterns Benchmarked Replicating a Language-Learning Comedy Short with Claude Code — Gemini as a Multimodal Sub-Agent HiDream-O1-Image 3–8x Faster: Benchmarking Steps, CFG, and Resolution AWS Savings Plan Buying Strategy: How to Layer, Size, and Time Commitments application.properties I built a macro tracker powered by AI + attitude Solace: A Global Mental Health First Responder Built with Gemma 4 Why Blocking Prompt Injection Is Wrong — and What to Do Instead The AI code tools Dutch developers actually use in 2026 (field notes) Automatic Error Recovery in AI Agent Networks You Are Not Choosing Building a Cinematic Adaptive Learning Intelligence with Gemma 4, Gemini, and OpenAI(Powered by Gemma 4) CLAUDE.md for Angular: 13 Rules That Make AI Write Idiomatic, Production-Ready Components I tested 7 vector databases for my RAG stack in 2026, here's the one nobody is talking about (yet) Claude agreed with a false fact I gave it. Confidently. That broke my workflow Google's "Budget" Model Just Beat Its Own Flagship. Here's What That Actually Means for Developers. How I built a monitoring SaaS for Joomla, WordPress & PrestaShop agencies Shifting from Passive Dashboards to Automated Remediation: A Guide to Next-Generation FinOps and CloudZero Alternatives Automating CSV WooCommerce Imports Without Plugins Why Wobbly Plugs and Overheating Outlets Are More Dangerous Than You Think (UL 498 Explained) Building an AI Model Evaluation Pipeline on AWS for Audio Content Generation Your Side Project Is Not a Business Neurodiversity and the two layers of cognition GitHub Internal Repositories Breached: Source Code and Internal Data Allegedly Exfiltrated in 2026 Supply Chain Attack Stop drowning in files: auto-organize your Google Drive with n8n (free workflow JSON) Secure Firmware Updates with a Secure Element: Building Trust Into the Bootloader I Thought Domain-Driven Design Was a Waste of Time. I Was Wrong. AI Content Is Getting Tagged Like Livestock — And That's Actually Good ESP32 Into a Speech-to-Text Device Why Simple Audio Transcription Fails in Healthcare: The Need for Clinical Reasoning Engines The 114KB Span Attribute That Hid Our LCP Data How to Scale AI Development Beyond Prototype Speed Agent Execution Environments: Cloud Sandbox vs Local GUI vs Hybrid AI code review checklist that actually catches problems What’s the best tech stack for AI app development?
Run Powerful AI Coding Locally on a Normal Laptop
devfirstcomm · 2026-05-22 · via DEV Community

Run Powerful AI Coding Locally on a Normal Laptop
A Developer-Friendly Guide to Setting Up ROO Code + Ollama + Qwen (8GB/16GB RAM)

If you are a developer who wants to use AI coding assistants locally without paying for cloud APIs or owning a high-end GPU, this guide is for you.

In this article, we will set up:

ROO Code inside Visual Studio Code
Ollama for running local AI models
Qwen2.5-Coder model locally
Optimized for:
8GB RAM laptops
16GB RAM laptops
No dedicated GPU / No VRAM

By the end, you’ll have your own private AI coding assistant running fully offline.
Why Run AI Locally?

Running AI locally gives developers:

✅ No API cost
✅ Better privacy
✅ Faster experimentation
✅ Offline development
✅ Full control over models
✅ No dependency on cloud providers

Recommended Hardware
Configuration Recommended Model
8GB RAM qwen2.5-coder:1.5b
16GB RAM qwen2.5-coder:7b
16GB+ RAM qwen2.5-coder:14b (slow but possible)

If you have no GPU, don’t worry. Ollama can run models entirely on CPU.

Step 1 — Install Visual Studio Code

  1. Download and install:
  2. Visual Studio Code
  3. Use the official website:

After installation:

code --version

Verify VS Code is properly installed.

Step 2 — Install Ollama

Install:

Ollama

Windows

Download installer from the official Ollama website.

Verify installation:

ollama --version

Step 3 — Start Ollama

Run:

ollama serve

This starts the local AI server at:

http://localhost:11434

Keep this terminal running.

Step 4 — Install Qwen Coding Model
For 8GB RAM Systems

Recommended:

ollama run qwen2.5-coder:1.5b
Why?

  1. Lightweight
  2. Fast on CPU
  3. Good enough for:
  4. Code generation
  5. Refactoring
  6. Unit tests
  7. Small automation tasks

For 16GB RAM Systems

Recommended:
ollama run qwen2.5-coder:7b

This gives much better:

  1. Reasoning
  2. Architecture suggestions
  3. Refactoring quality
  4. Multi-file understanding

Step 5 — Test the Model

Try:

ollama run qwen2.5-coder:7b

Then ask:

Who are you and create a hello world example in python

If the model responds, you’re ready.

Step 6 — Install ROO Code Extension

Inside VS Code:

Open Extensions
Search:

Roo Code
Install the extension

ROO Code converts VS Code into an AI-powered development environment.

Step 7 — Configure ROO Code for Ollama

Open ROO Code settings.

Set:

Provider: Ollama

API Endpoint:

http://localhost:11434

Model:

For 8GB RAM:

qwen2.5-coder:1.5b

For 16GB RAM:

qwen2.5-coder:7b

Save settings.

Step 8 — First AI Coding Test

Open a project and ask ROO Code:

Create a Java Spring Boot CRUD API with Controller, Service, Repository

Or:

Generate Cypress automation for login page

You now have a local AI coding assistant.

Best Practices for Low-RAM Systems
For 8GB RAM Machines
Recommended Settings
Setting Value
Context Window Small
Concurrent Apps Minimal
Model 1.5B
Browser Tabs Limited
Avoid

❌ Running Docker + AI together
❌ Opening large IDE projects
❌ Using 7B models continuously

Best Practices for 16GB RAM Machines

You can comfortably use:

qwen2.5-coder:7b
Medium-size repositories
Spring Boot projects
React applications
Cypress automation generation

Recommended:

OLLAMA_NUM_PARALLEL=1

This prevents RAM spikes.

Performance Optimization Tips
Reduce Model Temperature

Better coding consistency:

temperature = 0.2
Keep Context Smaller

Instead of entire repositories:

✅ Open only relevant folders

This improves response quality and speed.

Restart Ollama Occasionally

Long sessions can consume memory.

Restart:

ollama stop
ollama serve
Recommended Models by Use Case
Use Case Recommended Model
Basic coding qwen2.5-coder:1.5b
Java development qwen2.5-coder:7b
Test automation qwen2.5-coder:7b
Architecture discussion qwen2.5-coder:7b
Large enterprise code DeepSeek-Coder 14B (16GB+)
What Works Surprisingly Well Locally?

Even without a GPU, local models perform very well for:

✅ Boilerplate generation
✅ Refactoring
✅ Unit tests
✅ Cypress automation
✅ SQL generation
✅ Spring Boot scaffolding
✅ API creation
✅ Debugging suggestions
✅ Documentation generation

Limitations

Be realistic about CPU-only setups.

You may experience:

Slower response time
Limited context handling
Occasional hallucinations
Reduced multi-file reasoning

But for day-to-day development, the experience is still highly productive.

My Recommended Setup
For Most Developers
8GB RAM
Ollama + qwen2.5-coder:1.5b + Roo Code
16GB RAM
Ollama + qwen2.5-coder:7b + Roo Code

This provides the best balance between:

Performance
Memory usage
Coding quality
Stability
Final Thoughts

Local AI development is no longer limited to expensive GPUs.

Today, even a normal laptop can run surprisingly capable coding assistants using:

Ollama
Qwen2.5-Coder
Visual Studio Code
ROO Code

For developers working in Java, Spring Boot, React, Cypress, AI automation, and system design — this setup is an excellent starting point into the world of local AI engineering.

Useful Commands Cheat Sheet

Start Ollama

ollama serve

Run 1.5B model

ollama run qwen2.5-coder:1.5b

Run 7B model

ollama run qwen2.5-coder:7b

List installed models

ollama list

Remove model

ollama rm qwen2.5-coder:7b
Tags