Introducing the Runpod Assistant: Manage Your Cloud GPU Resources with Natural Language
Brendan McKeag · 2026-04-01 · via Runpod Blog.

We're excited to announce the launch of the Runpod Assistant: a free, built-in chatbot that lets you manage your pods, endpoints, and infrastructure using plain English. Whether you need to spin up a pod, check GPU availability across data centers, or shut everything down at once, the Assistant puts those actions a conversation away.

What Is the Runpod Assistant?

The Runpod Assistant is a chatbot available at any time from the upper right corner of the Runpod console. It's knowledgeable about the Runpod ecosystem and AI more generally, and it offers functionality comparable to our REST API, but through natural language instead of code.

Figure: Runpod console top navigation with the Runpod Assistant robot icon, search, and notifications.

Think of it as a conversational interface to your Runpod account. You can ask it questions, give it commands, or use it as a sounding board when you're trying to figure out the right GPU for a workload.

What Can It Do?

Here are some things you can do with the Assistant right out of the box:

  • Start and stop Pods and Endpoints — reference them by ID, or just say "stop all running Pods" and the Assistant will figure out the rest.
  • Update Serverless Endpoint configuration — for example, scale active workers up or down.
  • Check GPU availability — ask which data centers have high availability for a specific GPU like the A100 PCIe, and get an instant answer instead of clicking through DCs one by one.
  • Create network volumes and other resources.
  • Search the Runpod knowledge base — get quick answers about persistent storage, network volumes, templates, and more.
  • Get general AI guidance — ask it things like "What's the best GPU to run Llama 3.5 9B?" and it'll walk you through your options.

A Walkthrough: See It in Action

Ask It Anything

When you open the Assistant, it will suggest a few prompts to get started, such as "How do I use persistent storage?" or "Create a template." But you can ask it anything, the way you'd ask any LLM. For example, asking for an ELI5 of what a Pod is will get you a clear, plain-language explanation.

Figure: Runpod Assistant chat panel explaining what a Pod is in response to an ELI5 question.

Start and Stop Resources

You can ask the Assistant to start a specific Pod by ID, and it will look it up and start it for you. Because there's a cost involved, it will always ask you to approve the action before executing. The same goes for Serverless Endpoints: you can say something like "update active workers for my serverless endpoint to three," and it will find the endpoint, show you what it's about to do, and wait for your confirmation.
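The approve-before-execute behavior described above can be pictured as a small gate in front of any billable change. The following is only an illustrative sketch, not how the Assistant is implemented: `PlannedAction`, `run_with_approval`, and the IDs are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PlannedAction:
    """One billable change, described before anything is executed."""
    resource_id: str   # hypothetical Pod or Endpoint ID
    description: str   # human-readable summary shown to the user


def run_with_approval(
    action: PlannedAction,
    confirm: Callable[[str], bool],
    execute: Callable[[PlannedAction], None],
) -> bool:
    """Show the action and execute it only if the user approves."""
    prompt = f"About to {action.description} ({action.resource_id}). Proceed?"
    if not confirm(prompt):
        return False  # declined: nothing was changed
    execute(action)
    return True


if __name__ == "__main__":
    executed = []
    action = PlannedAction("endpoint-abc", "set active workers to 3")
    # A real UI would ask the user; here the answers are hard-coded.
    print(run_with_approval(action, lambda _: False, executed.append))
    print(run_with_approval(action, lambda _: True, executed.append))
    print(len(executed))
```

Passing `confirm` and `execute` in as callables keeps the gate independent of any particular UI or API client, which also makes it easy to test.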

Batch Operations

This is where things get really powerful. Instead of managing resources one at a time, you can issue broader commands like: "Stop all costs for Serverless Endpoints and Pods. Reduce active workers to zero and stop any running pods."

The Assistant will give you a summary of everything it's about to do, ask you to confirm, and then execute the operations across multiple Endpoints and Pods simultaneously. You're not limited to feeding it specific IDs. You can just say "stop all Pods" and it will figure out what's running, give you a status report, and let you shut everything down in one fell swoop.
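To make the "summarize first, then execute" flow concrete, here is a minimal sketch of how a batch plan could be derived from an inventory of resources. The record shapes, status strings, and the `plan_stop_all` helper are assumptions made up for illustration; they are not the Assistant's internals or the Runpod API's response format.

```python
from typing import Dict, List


def plan_stop_all(pods: List[Dict], endpoints: List[Dict]) -> List[str]:
    """Return human-readable steps that would bring ongoing costs to zero.

    Assumed (hypothetical) record shapes:
      pod      -> {"id": str, "status": "RUNNING" | "EXITED"}
      endpoint -> {"id": str, "active_workers": int}
    """
    steps = []
    for pod in pods:
        # Only running Pods need to be stopped.
        if pod["status"] == "RUNNING":
            steps.append(f"stop Pod {pod['id']}")
    for ep in endpoints:
        # Endpoints already at zero active workers are left alone.
        if ep["active_workers"] > 0:
            steps.append(f"set active workers to 0 on Endpoint {ep['id']}")
    return steps


pods = [{"id": "pod-a", "status": "RUNNING"}, {"id": "pod-b", "status": "EXITED"}]
endpoints = [{"id": "ep-1", "active_workers": 3}, {"id": "ep-2", "active_workers": 0}]

for step in plan_stop_all(pods, endpoints):
    print(step)
```

The plan is plain data, so it can be shown to the user in full before any single step runs.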

Get GPU Recommendations

The Assistant can help you think through GPU selection for your workload. Ask it something like "I want to run Llama 3.5 9B. What's the best GPU?" and it'll come back with a plan, for example an A40 as a solid all-around pick, a 4090 if you want the cheapest option that works (its 24 GB is enough), or an A100 if you want maximum throughput. It'll also give you quick rules of thumb, like 24 GB of VRAM for single-user workloads and 80 GB for high concurrency.
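The "24 GB is enough" remark follows from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below works that out for a 9B-parameter model. The 20% headroom for KV cache and activations is an illustrative assumption, not a Runpod figure; real usage depends on context length, batch size, and the inference runtime.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes).

    (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes-per-GB
    simplifies to the product below.
    """
    return params_billions * bytes_per_param


def fits(params_billions: float, bytes_per_param: float,
         vram_gb: float, headroom: float = 0.2) -> bool:
    """Rough check: weights plus an assumed headroom fraction fit in VRAM."""
    needed = weight_memory_gb(params_billions, bytes_per_param) * (1 + headroom)
    return needed <= vram_gb


# 9B parameters at 16-bit precision (2 bytes per parameter)
print(weight_memory_gb(9, 2))   # weights only, in GB
print(fits(9, 2, vram_gb=24))   # 24 GB card, e.g. a 4090
print(fits(9, 2, vram_gb=16))   # a smaller 16 GB card for comparison
```

At 16-bit precision the weights alone come to 18 GB, which is why a 24 GB card works for a single user while leaving little room for large batches or long contexts.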

Check Data Center Availability

Once you've decided on a GPU, you can ask "Which data centers have high availability for A100 PCIe?" and get your answer immediately. This is much faster than using the UI, where you'd need to open the deploy screen and click through data centers one by one to check availability.

When to Use the Assistant vs. the API

The Assistant and the API complement each other. For tasks that need exact parameters, such as programmatic integrations, automated pipelines, and precise configurations, the API is the better choice. But for more generalized queries, exploration, and quick management tasks, the Assistant has your back. Some things are just more naturally expressed in plain language than through API calls.
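For comparison, here is roughly what the earlier "update active workers to three" request might look like when expressed as an API call. This is a hedged sketch: the base URL, the `PATCH /endpoints/{id}` route, and the `workersMin` field are assumptions about Runpod's REST API that should be checked against the current API reference, and the endpoint ID and key are placeholders. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: base URL, route, and field name follow Runpod's public REST API.
# Verify them against the current API reference before relying on this.
API_BASE = "https://rest.runpod.io/v1"


def build_set_active_workers_request(
    endpoint_id: str, active_workers: int, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a request that sets an Endpoint's active workers."""
    body = json.dumps({"workersMin": active_workers}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/endpoints/{endpoint_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_set_active_workers_request("my-endpoint-id", 3, "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

The API version requires knowing the endpoint ID, the route, and the field name up front, which is the kind of precision that suits pipelines; the Assistant looks those details up from a plain-language request.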

Try It Out

The Runpod Assistant is free to use and available now. Just click the chat icon in the upper right corner of your console to get started. We've only scratched the surface of what the Assistant can do, so experiment with it and see what works for your workflow.

Have ideas for useful prompts or features? We'd love to hear about them. Drop by our Discord and let us know.

Author profile: Brendan McKeag