Zero-Cost AI in VS Code
DmitryGanin · 2026-05-24 · via DEV Community

Zero-Cost AI: Accessing Premium Models in VS Code Without API Keys

How I built a VS Code extension that gives you free access to Qwen-Max and DeepSeek models using only your existing web account: no billing and no API tokens, with usage bounded only by the provider's free web tier.


🎯 The Problem with Modern AI Tools

The state of AI development has become expensive:

  • API keys needed for every provider
  • Per-token pricing that adds up quickly
  • Rate limits blocking your workflow
  • Multiple subscriptions to different services
  • Browser sessions constantly expiring

Premium AI services charge significant monthly fees:

  • ChatGPT Plus: $20/month
  • Claude Pro: $20/month
  • Gemini Advanced: $20/month

That's $720/year for basic access to all three! And you still need separate accounts for each service.

What if there was another way?


💡 The Solution: Browser-Based Authentication

I developed AI Free VSCode — an open-source extension that leverages your existing free tier accounts from AI providers through browser automation.

Key Innovation

Instead of requiring API keys (which often have strict rate limits), the extension:

  1. Uses Playwright to automate a real Chromium browser session
  2. Stores authentication cookies locally
  3. Makes requests to the same web endpoints the provider's own site calls
  4. Gives you full access to the same free tier available on their websites

Result?

  • Zero cost - uses existing free accounts
  • No API keys - just sign in once
  • Higher limits than metered API keys - the same quota you get when browsing the website
  • Native integration - works directly in Copilot Chat
  • Agent mode - full tool calling support


🏗 Architecture Overview

Extension Structure

ai-free-vscode/
├── src/
│   ├── extension.mjs          # Entry point & commands
│   ├── lmProvider.mjs         # Unified LM provider interface
│   ├── deepseek/
│   │   ├── auth.mjs           # Browser login with Playwright
│   │   ├── client.mjs         # API client implementation
│   │   ├── provider.mjs       # Model logic & session management
│   │   └── config.mjs         # Configuration constants
│   ├── qwen/
│   │   ├── auth.mjs           # Qwen authentication
│   │   ├── client.mjs         # Qwen API client
│   │   └── provider.mjs       # Qwen model implementation
│   ├── utils/
│   │   ├── logger.mjs         # Debug logging
│   │   ├── rateLimiter.mjs    # Rate limiting protection
│   │   ├── responseValidator.mjs
│   │   └── tokenValidator.mjs
│   └── promptUtils.mjs        # Message formatting
├── package.json               # Extension manifest
└── README.md                  # Documentation


Core Components

1. Authentication Flow

The extension registers commands for users to authenticate:

context.subscriptions.push(
  vscode.commands.registerCommand("deepseek.login", async () => {
    await clearProfileSession(); // Clear old session
    const result = await loginAndSaveAuth(); // New login via Playwright
    auth.cookieHeader = result.cookieHeader;
    auth.token = result.token;
  }),
);


Process:

  • Opens Chromium browser via Playwright
  • User signs into provider normally
  • Session cookies captured and stored locally
  • Cookies used for subsequent API requests
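The last two steps can be sketched as follows. This is a minimal sketch, not the extension's actual code: `buildCookieHeader` is a hypothetical helper, the cookies and host are made up, and it assumes each cookie object has the `name`, `value` and `domain` fields that Playwright's `browserContext.cookies()` returns.

```javascript
// Hypothetical helper: turn captured cookies into a Cookie request header.
// Assumes each cookie has the { name, value, domain } shape that Playwright's
// browserContext.cookies() returns.
function buildCookieHeader(cookies, host) {
  return cookies
    .filter((c) => {
      // A leading dot means the cookie also applies to subdomains.
      const domain = c.domain.startsWith(".") ? c.domain.slice(1) : c.domain;
      return host === domain || host.endsWith("." + domain);
    })
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}

// Example with made-up cookies for a made-up host.
const captured = [
  { name: "session", value: "abc123", domain: ".example.com" },
  { name: "theme", value: "dark", domain: "chat.example.com" },
  { name: "other", value: "x", domain: "unrelated.test" },
];
console.log(buildCookieHeader(captured, "chat.example.com"));
// prints "session=abc123; theme=dark"
```

Filtering by host keeps cookies from unrelated sites, which a shared browser profile may also contain, out of the request.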

2. Unified Provider Interface

All models are unified under a single vendor namespace:

class AiFreeVscodeChatModelProvider {
  async provideLanguageModelChatResponse(
    model,
    messages,
    options,
    progress,
    token,
  ) {
    // Convert VS Code messages to API format
    const convertedMessages = convertMessages(messages);
    const tools = convertToolSchemas(options?.tools);
    const prompt = messagesToPrompt(convertedMessages, tools);

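    // modelId comes from the selected model; auth, onText, onThinking and
    // signal are not shown in this excerpt
    const modelId = model.id;

    // Route to appropriate provider
    switch (model.family) {
      case "deepseek":
        await deepseekComplete({ modelId, prompt, auth, onText, signal });
        break;
      case "qwen":
        await qwenComplete({
          modelId,
          prompt,
          auth,
          onText,
          onThinking,
          signal,
        });
        break;
      default:
        // Fail loudly instead of silently returning an empty response
        throw new Error(`Unsupported model family: ${model.family}`);
    }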
  }
}


Process:

  • Routes VS Code chat requests to appropriate provider
  • Handles both DeepSeek and Qwen models
  • Converts messages to API format
  • Manages streaming responses
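As the number of providers grows, the routing step could also be expressed as a lookup table rather than a `switch`. The following is a sketch under assumptions: `registerProvider` and `resolveProvider` are hypothetical names, and the two registered functions are stand-ins for `deepseekComplete` and `qwenComplete` that only echo the prompt.

```javascript
// Hypothetical registry mapping a model family to its completion function.
const providers = new Map();

function registerProvider(family, complete) {
  providers.set(family, complete);
}

function resolveProvider(family) {
  const complete = providers.get(family);
  if (!complete) {
    throw new Error(`Unsupported model family: ${family}`);
  }
  return complete;
}

// Stand-ins for deepseekComplete / qwenComplete; real ones would call the web API.
registerProvider("deepseek", ({ prompt, onText }) => onText(`[deepseek] ${prompt}`));
registerProvider("qwen", ({ prompt, onText }) => onText(`[qwen] ${prompt}`));

const chunks = [];
resolveProvider("qwen")({ prompt: "hello", onText: (t) => chunks.push(t) });
console.log(chunks); // prints [ '[qwen] hello' ]
```

With this shape, adding a provider is one `registerProvider` call, and an unknown family fails with an explicit error.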

3. Smart Session Management

Maintains conversation continuity with session caching:

const sessionIdCache = new Map();

async function runComplete({
  modelId,
  prompt,
  auth,
  threadKey,
  messagesCount,
}) {
  // Start fresh session for first message in thread
  if (messagesCount === 1) {
    sessionIdCache.delete(threadKey);
  }

  // Try cached session first (for conversation continuity)
  const cachedSessionId = sessionIdCache.get(threadKey);
  if (cachedSessionId) {
    const ok = await attempt(cachedSessionId);
    if (ok) return; // Success!
  }

  // Retry with new session
  const sessionId = await client.createSession({ signal });
  sessionIdCache.set(threadKey, sessionId);
  await attempt(sessionId);
}
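The excerpt above relies on `attempt` and `client`, which are defined elsewhere. To make the control flow concrete, here is a self-contained sketch in which both are injected as stand-ins (an assumption made for illustration). Unlike the excerpt, it also reports a failure when even the fresh session is rejected.

```javascript
// Sketch of "reuse the cached session, fall back to a fresh one".
// createSession and attempt are injected stand-ins (assumptions), so the
// control flow can be exercised without a real provider.
const sessionCache = new Map();

async function completeWithSession({ threadKey, messagesCount, createSession, attempt }) {
  // First message in a thread: never reuse an old session.
  if (messagesCount === 1) sessionCache.delete(threadKey);

  const cached = sessionCache.get(threadKey);
  if (cached !== undefined && (await attempt(cached))) {
    return cached;
  }

  // Cached session missing or rejected: create a new one and try once more.
  const fresh = await createSession();
  sessionCache.set(threadKey, fresh);
  if (!(await attempt(fresh))) {
    sessionCache.delete(threadKey);
    throw new Error("Request failed even with a fresh session");
  }
  return fresh;
}

// Fake provider where session "s1" has expired.
let counter = 1;
const fake = {
  createSession: async () => `s${++counter}`,
  attempt: async (id) => id !== "s1",
};
sessionCache.set("thread-a", "s1");
completeWithSession({ threadKey: "thread-a", messagesCount: 3, ...fake })
  .then((id) => console.log(id)); // prints "s2"
```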



🔧 Installation & Setup

Step 1: Install the Extension

Download the latest .vsix file from Releases and install via VS Code Extensions panel.

Or develop locally:

git clone https://github.com/AppsGanin/ai-free-vscode
cd ai-free-vscode
npm install  # installs dependencies + Playwright Chromium


Press F5 to launch in Extension Development Host.

Step 2: Sign In to Provider

  1. Open Command Palette (Cmd+Shift+P / Ctrl+Shift+P)
  2. Run "AI Free VSCode: DeepSeek: Sign In (Playwright)"
  3. A browser window opens automatically
  4. Log in to your account normally
  5. Window closes when session is saved

Repeat for Qwen or other supported providers.

Step 3: Start Chatting

  • Open Copilot Chat panel (⌃⌘I on macOS, Ctrl+Alt+I on Windows/Linux)
  • Select your preferred model from dropdown
  • Start asking questions!

🚀 Supported Models

Model              | ID               | Max Tokens | Use Case
DeepSeek V4        | deepseek-default | 8K output  | General purpose
DeepSeek V4 Expert | deepseek-expert  | 8K output  | Complex reasoning
Qwen2.5-Max        | qwen-max         | -          | Powerful tasks
Qwen3.6-Plus       | qwen-plus        | 1M context | Long documents
Qwen3-Max          | qwen3-max        | -          | Flagship quality
Qwen3-Coder        | qwen-coder       | 1M context | Code generation
Qwen3.5-Flash      | qwen-flash       | -          | Fastest responses

All models support tool calling for Agent mode operations like:

  • File reading/writing
  • Terminal execution
  • Multi-step debugging
  • Code refactoring

🛠 Technical Deep Dive

How Messages Are Processed

1. Message Conversion

VS Code messages are converted to API-compatible format:

function convertMessages(messages) {
  return messages
    .map((msg) => {
      // msg.role is a LanguageModelChatMessageRole enum value, not a string
      const role =
        msg.role === LanguageModelChatMessageRole.Assistant ? "assistant" : "user";

      // Handle text content
      const content = msg.content
        .map((part) =>
          part instanceof LanguageModelTextPart ? part.value : "",
        )
        .join("");

      // Extract tool calls from assistant
      const toolCalls = msg.content
        .filter((p) => p instanceof LanguageModelToolCallPart)
        .map((p) => ({
          id: p.callId,
          type: "function",
          function: { name: p.name, arguments: JSON.stringify(p.input) },
        }));

      // Generate separate "tool" messages for results
      const toolResults = msg.content
        .filter((p) => p instanceof LanguageModelToolResultPart)
        .map((p) => ({
          role: "tool",
          tool_call_id: p.callId,
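          // p.content is an array of parts, so join its text parts
          content: p.content
            .map((c) => (c instanceof LanguageModelTextPart ? c.value : ""))
            .join(""),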
        }));

      return [
        ...toolResults,
        { role, content, tool_calls: toolCalls.length ? toolCalls : undefined },
      ];
    })
    .flat();
}
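The provider shown earlier passes the converted messages to `messagesToPrompt`, which the article does not show. The following is a guess at what such a step could look like, assuming the web endpoint accepts a single text prompt. The wording of the tool instructions and the `name`/`description` fields on each tool are assumptions made for illustration.

```javascript
// Hypothetical prompt flattener: assumes the web endpoint takes one text
// prompt, so roles, tool results and tool schemas are serialized into it.
const FENCE = "`".repeat(3); // three backticks

function messagesToPrompt(messages, tools = []) {
  const lines = [];
  if (tools.length > 0) {
    lines.push("You may call these tools:");
    for (const t of tools) {
      lines.push(`- ${t.name}: ${t.description}`);
    }
    lines.push(`To call one, reply with a ${FENCE}tool_call block containing JSON.`);
    lines.push("");
  }
  for (const m of messages) {
    if (m.role === "tool") {
      lines.push(`Tool result (${m.tool_call_id}): ${m.content}`);
    } else {
      lines.push(`${m.role === "assistant" ? "Assistant" : "User"}: ${m.content}`);
    }
  }
  return lines.join("\n");
}

const prompt = messagesToPrompt(
  [
    { role: "user", content: "What is in README.md?" },
    { role: "tool", tool_call_id: "call-1", content: "# Demo" },
  ],
  [{ name: "read_file", description: "Read a file from the workspace" }],
);
console.log(prompt);
```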


2. Tool Call Detection

The system detects markdown fences indicating tool calls:

const TOOL_FENCES = ["```tool_call", "\ntool_call\n{", "tool_call\n{"];

function findFence(str) {
  let best = -1;
  for (const fence of TOOL_FENCES) {
    const idx = str.indexOf(fence);
    if (idx !== -1 && (best === -1 || idx < best)) best = idx;
  }
  return best;
}

// Stream processing
streamBuf += text;
const idx = findFence(streamBuf);
if (idx !== -1) {
  // Emit text before fence, suppress tool call block
  flushStream(streamBuf.slice(0, idx));
  streamBuf = "";
  inToolCall = true;
}


This prevents raw tool call blocks from appearing in the chat UI while still executing them properly.
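One detail a streaming filter has to handle is a fence that arrives split across two chunks. The sketch below shows one way to do that by holding back any trailing text that could still turn into the marker. It is an independent illustration, not the extension's code: `createToolCallFilter` and the single `MARKER` are assumptions.

```javascript
// Sketch of a streaming filter that hides tool-call blocks from the chat UI.
// It holds back a trailing partial marker, so a fence split across two
// chunks is still detected. Marker and names are assumptions.
const MARKER = "`".repeat(3) + "tool_call";

function createToolCallFilter(emit) {
  let buf = "";
  let inToolCall = false;
  let toolCallText = "";

  return {
    push(chunk) {
      if (inToolCall) {
        toolCallText += chunk; // collect the raw block, never show it
        return;
      }
      buf += chunk;
      const idx = buf.indexOf(MARKER);
      if (idx !== -1) {
        if (idx > 0) emit(buf.slice(0, idx));
        toolCallText = buf.slice(idx);
        buf = "";
        inToolCall = true;
        return;
      }
      // Keep the longest suffix that could still grow into the marker.
      let keep = 0;
      for (let n = Math.min(buf.length, MARKER.length - 1); n > 0; n--) {
        if (MARKER.startsWith(buf.slice(-n))) {
          keep = n;
          break;
        }
      }
      const safe = buf.slice(0, buf.length - keep);
      if (safe) emit(safe);
      buf = buf.slice(buf.length - keep);
    },
    end() {
      // Flush held-back text if it never became a marker.
      if (!inToolCall && buf) emit(buf);
      buf = "";
      return inToolCall ? toolCallText : null;
    },
  };
}

const shown = [];
const filter = createToolCallFilter((text) => shown.push(text));
filter.push("Reading the file. `");
filter.push("``tool_call\n{\"name\":\"read_file\"}");
const rawCall = filter.end();
console.log(shown.join("")); // prints "Reading the file. "
```

The value returned by `end()` is the raw block, which a caller could then parse and execute.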

3. Thinking Mode Support

For models with explicit reasoning phases:

let thinkingStarted = false;
let thinkingText = "";

const onThinking = async (text) => {
  thinkingText += text;
  thinkingStarted = true;
};

// When content starts, emit thinking as collapsible block
if (thinkingStarted) {
  progress.report(new LanguageModelThinkingPart(thinkingText, "thinking-0"));
}


VS Code displays this as a native collapsible "💭 Thinking" section above responses.


🔐 Security & Privacy

What Happens to Your Data?

  • Cookies stored locally - Only your machine, encrypted by OS
  • No cloud storage - We never transmit your credentials
  • Session isolation - Each provider maintains separate sessions
  • No telemetry - No usage statistics sent anywhere

Error Handling

try {
  await client.complete({ ... });
} catch (e) {
  // Graceful handling of various errors
  if (e.isNotSignedIn) {
    showErrorMessage("Please sign in first");
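  } else if (e.isAuthError) {
    // Cookie expired - clear the stale session so the next sign-in starts clean
    await clearProfileSession();
    throw e;
  } else if (isBizError(e)) {
    // Business logic error with formatted message
    progress.report(new TextPart(formatBizError(e.bizCode, e.bizMsg)));
  } else {
    // Unrecognized errors should not be swallowed silently
    throw e;
  }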
}
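The handler above checks flags such as `isAuthError` and helpers such as `isBizError`, which are defined elsewhere in the project. A minimal sketch of how such pieces might fit together follows. The `ProviderError` class, the status codes treated as auth failures, and the sample `bizCode` are all assumptions for illustration, not the provider's real values.

```javascript
// Hypothetical error type carrying the flags the handler checks.
// The status codes and the bizCode field are assumptions for illustration.
class ProviderError extends Error {
  constructor(message, { status = 0, bizCode = null, bizMsg = "" } = {}) {
    super(message);
    this.name = "ProviderError";
    this.status = status;
    this.bizCode = bizCode;
    this.bizMsg = bizMsg;
  }

  // 401/403 usually mean the stored cookies are no longer accepted.
  get isAuthError() {
    return this.status === 401 || this.status === 403;
  }
}

function isBizError(e) {
  return e instanceof ProviderError && e.bizCode !== null;
}

function formatBizError(code, msg) {
  return `Provider error ${code}: ${msg}`;
}

const expired = new ProviderError("Unauthorized", { status: 401 });
const quota = new ProviderError("Rejected", {
  status: 200,
  bizCode: 40003, // made-up code
  bizMsg: "Daily quota reached",
});
console.log(expired.isAuthError); // prints true
console.log(isBizError(quota)); // prints true
console.log(formatBizError(quota.bizCode, quota.bizMsg));
// prints "Provider error 40003: Daily quota reached"
```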


⚠️ Limitations & Caveats

Important Considerations

  1. Terms of Service - Automating browser sessions may violate provider ToS
  2. Account Risk - Your account could be restricted (use at your own risk)
  3. Stability - Providers can change APIs without notice
  4. Single Session - Only one active user session at a time
  5. No Enterprise Support - Not suitable for corporate compliance requirements

Mitigation Strategies

  • Use separate accounts from main email
  • Don't abuse the service (reasonable usage only)
  • Keep extension updated for API changes
  • Maintain backups of important code/settings
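"Reasonable usage" can be enforced in code with a client-side limiter. The repository lists a `rateLimiter.mjs` utility whose contents are not shown here, so the following is an independent sketch of a sliding-window limiter. The limit of 2 requests per second is purely illustrative and does not reflect any provider's actual quota.

```javascript
// Minimal sliding-window limiter: at most `limit` requests per `windowMs`.
// The numbers used below are illustrative, not the limits of any provider.
function createRateLimiter(limit, windowMs, now = Date.now) {
  const timestamps = [];
  return {
    // Returns 0 if a request may go out now, else the wait time in ms.
    tryAcquire() {
      const t = now();
      // Drop timestamps that have left the window.
      while (timestamps.length > 0 && t - timestamps[0] >= windowMs) {
        timestamps.shift();
      }
      if (timestamps.length < limit) {
        timestamps.push(t);
        return 0;
      }
      return windowMs - (t - timestamps[0]);
    },
  };
}

// Example with a fake clock: 2 requests per 1000 ms.
let clock = 0;
const limiter = createRateLimiter(2, 1000, () => clock);
console.log(limiter.tryAcquire()); // prints 0
console.log(limiter.tryAcquire()); // prints 0
console.log(limiter.tryAcquire()); // prints 1000
clock = 1000;
console.log(limiter.tryAcquire()); // prints 0
```

Injecting the clock keeps the limiter testable without real delays. In the extension it would default to `Date.now`.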

🚀 Real-World Use Cases

Scenario 1: Code Review Assistant

# Ask about potential bugs
User: "Review this Python function for memory leaks:"
[User pastes code]

Assistant: Analyzes code structure, identifies resource leaks,
suggests fixes with explanations


Scenario 2: Database Query Optimization

-- Paste slow query
EXPLAIN SELECT * FROM users WHERE created_at > NOW() - INTERVAL '7 days';

Assistant: Suggests indexing strategies, query rewriting,
and alternative approaches


Scenario 3: Full Stack Debugging

  1. Identify error in terminal
  2. Ask assistant to analyze stack trace
  3. Get root cause explanation
  4. Receive fix suggestion with code example
  5. Apply fix directly in editor

🤝 Contributing

This is an open-source hobby project built by enthusiasts, for enthusiasts.

Ways to contribute:

  1. Fix bugs - See open issues
  2. Add new models - Implement additional AI providers
  3. Improve docs - Clarify setup instructions
  4. Enhance UX - Better error messages, UI improvements
  5. Write tests - Increase coverage for edge cases

Getting started:

git clone https://github.com/AppsGanin/ai-free-vscode
cd ai-free-vscode
npm install
# Edit code, press F5 to test


Contributions welcome! PRs are always appreciated.


📝 Legal Disclaimer

This extension is unofficial and not affiliated with any AI provider.

  • Use at your own risk - Automating web sessions may violate ToS
  • No guarantees - May stop working if providers change APIs
  • No liability - Authors not responsible for consequences

Always review Terms of Service before use.


🎯 Conclusion

AI Free VSCode demonstrates that you don't need expensive API keys or multiple subscriptions to access premium AI capabilities. By leveraging browser automation and existing free tiers, we've created a solution that:

  • 💰 Costs nothing - literally $0 monthly subscription
  • 🚀 Works instantly - sign in once and keep working until the session expires
  • 🔒 Respects privacy - all data stays local
  • 🛠️ Integrates seamlessly - native VS Code experience

Whether you're a student learning to code, an indie developer building your startup, or just someone who wants powerful AI tools without breaking the bank - this extension removes financial barriers and puts cutting-edge technology in your hands.


Ready to try it?

Let's democratize AI access together! 🚀