How to fine-tune a model using Axolotl
Eliot Cowley · 2025-10-31 · via Runpod Blog.


Model fine-tuning is the process of adapting a pre-trained machine learning model to perform better on a specific task or dataset. This technique allows for improved performance and efficiency compared to training a model from scratch, as it leverages the knowledge already learned by the model.

Fine-tuning is ideal when you have limited data but want to enhance model performance. It is especially useful when your task differs from the task the model was originally trained on.

Fine-tuning is a form of transfer learning: it uses the knowledge an existing model already has as the starting point for learning new tasks. It’s easier and cheaper to hone an existing model’s capabilities than to train a new model from scratch. For example, you can fine-tune an existing Large Language Model (LLM) to adjust its tone when responding to inquiries, or give it knowledge specific to your domain or business.

What you’ll learn

In this blog post you’ll learn how to:

  • Deploy a pod on Runpod based on the axolotl-runpod template
  • SSH into the pod
  • Fine-tune a Llama 3 model using LoRA
  • Prompt the fine-tuned model using an interactive UI

Requirements

1. Deploy a pod

You can fine-tune a model on Runpod using Axolotl, an open-source tool for fine-tuning AI models. Let’s deploy a pod that will fine-tune a model based on a dataset, and then run that model so we can test how it has changed.

  1. Log in to the Runpod Console and select Pod Templates from the left sidebar.
  2. Search for “axolotl” and select the axolotl-runpod template. This template uses an official Axolotl Docker image.

Runpod Hub search results for axolotl with the axolotl-runpod pod template highlighted

  3. Select Deploy Pod.

Runpod console axolotl-runpod template page with the Deploy Pod button highlighted

  4. Select a GPU to use to train and run the model. GPUs with more memory (VRAM) typically cost more but perform better, while GPUs with less VRAM are cheaper but slower. I went with the RTX A4000, a previous-generation NVIDIA GPU. Make sure that you choose an NVIDIA GPU, because the template expects one.
  5. Enter a Pod Name.
  6. Leave the other settings at their defaults and select Deploy On-Demand.

Runpod console pricing and pod summary for an RTX A4000 with the Deploy On-Demand button highlighted

  7. Wait for the pod to initialize. When the pod is ready, a green circle is displayed next to its name.

Runpod console breadcrumb showing a running pod named axolotl_fine_tuning
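
When choosing a GPU, a rough back-of-the-envelope estimate of how much memory the model weights alone occupy can help. The sketch below is only an illustration: the function name and the bytes-per-parameter table are assumptions made for this example, and training needs additional memory for gradients, optimizer state, and activations, so treat the result as a lower bound.

```python
# Rough estimate of the GPU memory needed just to hold a model's weights.
# Training needs more than this (gradients, optimizer state, activations),
# so the result is a lower bound, not a sizing guarantee.

# Approximate storage cost per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1, "4bit": 0.5}


def weight_memory_gb(num_params: float, dtype: str = "bf16") -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # A one-billion-parameter model (like the one fine-tuned later in this
    # post) and an eight-billion-parameter model, for comparison.
    for name, params in [("1B model", 1e9), ("8B model", 8e9)]:
        for dtype in ("bf16", "4bit"):
            size = weight_memory_gb(params, dtype)
            print(f"{name} in {dtype}: ~{size:.1f} GB of weights")
```

By this estimate, a one-billion-parameter model in 16-bit precision needs about 2 GB for its weights, which leaves headroom on a 16 GB card such as the RTX A4000.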

2. Explore the workspace

Now that we have a pod that has Axolotl up and running, let’s access the pod from a terminal on our local machine and see the files and directories in our workspace.

  1. If you have added a public SSH key to your Runpod account, the pod’s SSH connection panel in the Runpod console shows a command that you can copy and paste into a terminal on your local machine to connect to your pod.

Runpod console SSH connection panel showing the SSH command for connecting to a pod

  2. After you connect, you should see a welcome message from Axolotl in your terminal:

Terminal showing the Axolotl cloud image welcome banner after connecting to a pod over SSH

  3. Let’s explore the files in our workspace. By default, you should be in /workspace/axolotl. Enter dir to see the files and folders in this directory. There’s a lot of stuff here!

Terminal listing the contents of the Axolotl repository directory

  4. To train a model with Axolotl, we must use a configuration file. Axolotl provides example configuration files for many different models. Enter cd examples and then dir to list them:

Terminal listing model folders in the Axolotl examples directory

  5. Let’s look at the Llama 3 examples. Llama is a family of LLMs from Meta, and Llama 3 is the previous generation. Enter cd llama-3 and then dir to list the example configuration files:

Terminal listing YAML config files in the Axolotl llama-3 examples folder

  6. There are a lot of YAML files here, many with confusingly similar names. Let’s look specifically at lora-1b.yml.

    LoRA, which stands for Low-Rank Adaptation, is a fine-tuning technique that freezes the model’s original weights and trains a small set of additional low-rank matrices instead. Because far fewer parameters are updated, it is faster and less costly than fully retraining the model. The 1b in the filename means that the model has one billion parameters.

  7. Let’s read through the configuration file. Open it in a text editor:

    nano lora-1b.yml

  8. Notice the first few fields:

Axolotl YAML config snippet setting Llama-3.2-1B as the base model with an Alpaca-format dataset

The model we will fine-tune is NousResearch/Llama-3.2-1B, a Llama 3.2 text model with one billion parameters that even lower-end hardware can run.
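
To get a feel for why LoRA is cheaper than full fine-tuning, the following sketch counts trainable parameters for a single weight matrix. It is a simplified illustration, not Axolotl's implementation: the 2048 x 2048 layer size and the rank of 16 are assumed values chosen for the example, not values read from the configuration file.

```python
# LoRA replaces the update to a frozen weight matrix W (d_out x d_in) with the
# product of two small trainable matrices, B (d_out x r) and A (r x d_in).
# Only A and B are trained, so the trainable parameter count drops sharply.


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole matrix is fine-tuned."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated when only the LoRA matrices A and B are trained."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Illustrative (assumed) layer size and rank.
    d_out, d_in, rank = 2048, 2048, 16
    full = full_param_count(d_out, d_in)
    lora = lora_param_count(d_out, d_in, rank)
    print(f"Full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
    print(f"LoRA trains {lora / full:.2%} of the layer's parameters")
```

With these assumed numbers, LoRA trains 65,536 parameters instead of 4,194,304 for that one layer, which is under 2% of the original count.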

We will train it on the teknium/GPT4-LLM-Cleaned dataset, a GPT-4 LLM instruction dataset with OpenAI disclaimers and refusals filtered out.
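
The configuration treats this dataset as Alpaca-format data, where each record has an instruction, an optional input, and an output. The sketch below shows how such a record is commonly turned into a training prompt, using the template from the original Stanford Alpaca project. The helper name is hypothetical, the sample output text is a placeholder, and the exact template Axolotl applies may differ.

```python
# Prompt templates from the original Stanford Alpaca project. The exact
# wording Axolotl uses internally may differ; this is an illustration only.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_alpaca_prompt(record: dict) -> str:
    """Format one Alpaca-style record into the text the model is prompted with."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


if __name__ == "__main__":
    # Example record; the output value is a placeholder, not real dataset text.
    record = {
        "instruction": "Give three tips for staying healthy.",
        "input": "",
        "output": "(expected answer text from the dataset)",
    }
    print(build_alpaca_prompt(record))
```

During training, the model learns to continue the prompt with the record's output field.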

Press Ctrl+X to exit the Nano text editor. On a Mac, this is still Control+X, not Cmd+X. If Nano asks whether to save changes, press N.

Okay, we familiarized ourselves with the pod’s workspace and chose a configuration file. Now let’s go ahead and fine-tune our model!

  1. Enter cd /workspace/axolotl to navigate back to the starting directory so that our model is saved in the correct place. Now enter the following command:

    axolotl train examples/llama-3/lora-1b.yml

  2. This may take a while depending on the GPU you chose for your pod. Once Axolotl finishes fine-tuning the model, you should see something like the following message:

    Training completed! Saving trained model to ./outputs/lora-out.

  3. Let’s test the model to see if it responds to prompts in a way that aligns with the dataset. Enter the following command to start interactively prompting the model we just fine-tuned. The first argument is the location of the configuration file that we used to train the model; the --lora-model-dir argument is the location of the fine-tuned model:

    axolotl inference examples/llama-3/lora-1b.yml \
    --lora-model-dir="./outputs/lora-out" --gradio

  4. Axolotl should generate URLs that you can open in your browser to prompt the model using a UI called Gradio. Open the public URL:

Axolotl Gradio interface showing a health-tips prompt and the fine-tuned model's response

  5. Enter a prompt from the dataset, such as:

    Give three tips for staying healthy.

  6. Check the model’s output against the expected output from the dataset. For example, in my testing, the above prompt resulted in the following output:

    1. Eat nutritious foods and exercise regularly
    2. Get plenty of sleep and avoid stressors like nicotine, alcohol or excess caffeine intake.
    3. Manage your weight by eating a balanced diet and engaging in physical activity at least 30 minutes per day.


    Whereas the expected output from the dataset is:

    1. Eat a balanced and nutritious diet: Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein, whole grains, and healthy fats. This helps to provide your body with the essential nutrients to function at its best and can help prevent chronic diseases.
    2. Engage in regular physical activity: Exercise is crucial for maintaining strong bones, muscles, and cardiovascular health. Aim for at least 150 minutes of moderate aerobic exercise or 75 minutes of vigorous exercise each week.
    3. Get enough sleep: Getting enough quality sleep is crucial for physical and mental well-being. It helps to regulate mood, improve cognitive function, and supports healthy growth and immune function. Aim for 7-9 hours of sleep each night.
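
    It’s close, but not quite the same. A fine-tuned model is not expected to reproduce dataset answers word for word, and because LoRA trains only a small number of added parameters, it may fit the dataset less closely than full fine-tuning. But still pretty good!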
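
If you want something more repeatable than eyeballing the two answers, a small script can give a rough overlap score. The sketch below uses Python's standard difflib module and compares only the first tip from each answer above. The helper names are hypothetical, and character or word overlap is a crude proxy: it says nothing about whether the answer is actually good.

```python
import re
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 ratio of how much two strings overlap, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def shared_words(a: str, b: str) -> set:
    """Return the set of lowercase words that appear in both strings."""
    words_a = set(re.findall(r"[a-z]+", a.lower()))
    words_b = set(re.findall(r"[a-z]+", b.lower()))
    return words_a & words_b


if __name__ == "__main__":
    # First tip from the model's answer and from the dataset's answer.
    model_output = "Eat nutritious foods and exercise regularly"
    expected_output = "Eat a balanced and nutritious diet"

    print(f"Similarity ratio: {similarity(model_output, expected_output):.2f}")
    print(f"Shared words: {sorted(shared_words(model_output, expected_output))}")
```

A ratio of 1.0 would mean the strings are identical; lower values mean the model phrased its answer differently from the dataset.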

Next steps

Congratulations, you’ve fine-tuned a model based on a dataset! Runpod and Axolotl enable you to take existing models and adapt them to new contexts, without requiring you to create your own model from scratch. Here are some things you can do to take this further:

  • As you saw, our fine-tuned model didn’t exactly match our dataset’s expected output. Try fully fine-tuning a model using another of Axolotl’s example configuration files (examples/llama-3/fft-8b.yaml) and check the output against the dataset’s expected output. Because this configuration fully fine-tunes an eight-billion-parameter model, the output should match the dataset more closely, but expect to need a GPU with considerably more memory than the RTX A4000.
  • Try fine-tuning a model using Quantized Low-Rank Adaptation (QLoRA). QLoRA is similar to LoRA, but it also quantizes the base model, storing its weights as lower-precision (typically 4-bit) values. This makes fine-tuning with QLoRA even more memory-efficient than LoRA, at the cost of some precision in the output. Axolotl provides example configuration files that use QLoRA, such as examples/llama-3/qlora.yml, which fine-tunes an eight-billion-parameter model. Compare the time it takes to fine-tune a model using full fine-tuning, LoRA, and QLoRA.
  • Runpod also offers an Axolotl serverless template. Try spinning up an endpoint and fine-tuning a model by sending a JSON request.
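
To make the quantization trade-off mentioned in the QLoRA item concrete, here is a minimal sketch that rounds a few weights to 4-bit integers and back. It deliberately uses a simple uniform "absmax" scheme for readability; QLoRA itself uses a different 4-bit data type (NF4) with block-wise scaling, and the sample weights and function names here are made up for illustration.

```python
# Simplified 4-bit quantization: map each weight to an integer code in
# [-7, 7] using a single scale factor, then map it back. The round trip
# loses a little precision, which is the trade-off quantization makes.


def quantize_4bit(weights: list) -> tuple:
    """Return (integer codes, scale) for a list of float weights."""
    largest = max(abs(w) for w in weights)
    scale = largest / 7 if largest > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: list, scale: float) -> list:
    """Reconstruct approximate float weights from codes and scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.12, -0.53, 0.07, 0.91, -0.33]  # made-up example values
    codes, scale = quantize_4bit(weights)
    restored = dequantize(codes, scale)
    for original, code, approx in zip(weights, codes, restored):
        print(f"{original:+.2f} -> code {code:+d} -> {approx:+.3f}")
    worst = max(abs(o - r) for o, r in zip(weights, restored))
    print(f"Largest round-trip error: {worst:.3f}")
```

Each weight now needs only 4 bits instead of 16 or 32, and the reconstruction error stays within half of one quantization step.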

Note: When you’re done with your pod, don’t forget to terminate it; otherwise, it will keep costing you money!

Author profile: Eliot Cowley
