Introducing Storage Buckets on the Hugging Face Hub

Hugging Face - Blog

Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs ALTK‑Evolve: On‑the‑Job Learning for AI Agents Safetensors is Joining the PyTorch Foundation Holo3: Breaking the Computer Use Frontier Any Custom Frontend with Gradio's Backend A New Framework for Evaluating Voice Agents (EVA) Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine‑Tuning, and On‑Device Optimizations One-Shot Any Web App with Gradio's gr.HTML CUGA on Hugging Face: Democratizing Configurable AI Agents New in llama.cpp: Model Management Building Deep Research: How we Achieved State of the Art OVHcloud on Hugging Face Inference Providers 🔥 20x Faster TRL Fine-tuning with RapidFire AI Building for an Open Future - our new partnership with Google Cloud Aligning to What? Rethinking Agent Generalization in MiniMax M2 Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac Sentence Transformers is joining Hugging Face! Unlock the power of images with AI Sheets Supercharge your OCR Pipelines with Open Models Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face Get your VLM running in 3 simple steps on Intel CPUs Nemotron-Personas-India: Synthesized Data for Sovereign AI Introducing RTEB: A New Standard for Retrieval Evaluation Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models VibeGame: Exploring Vibe Coding Games Nemotron-Personas-Japan: ソブリン AI のための合成データセット Swift Transformers Reaches 1.0 – and Looks to the Future Smol2Operator: Post-Training GUI Agents for Computer Use SyGra: The One-Stop Framework for Building Data for LLMs and SLMs Gaia2 and ARE: Empowering the community to study agents Scaleway on Hugging Face Inference Providers 🔥 Democratizing AI Safety with RiskRubric.ai Public AI on Hugging Face Inference Providers 🔥 `LeRobotDataset:v3.0`: Bringing large-scale datasets to `lerobot` Visible Watermarking with Gradio Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason! Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers Fine-tune Any LLM from the Hugging Face Hub with Together AI Jupyter Agents: training LLMs to reason with notebooks mmBERT: ModernBERT goes Multilingual Welcome EmbeddingGemma, Google's new efficient embedding model SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence Make your ZeroGPU Spaces go brrr with ahead-of-time compilation NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset Generate Images with Claude and Hugging Face From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels MCP for Research: How to Connect AI to Research Tools Kimina-Prover-RL Arm & ExecuTorch 0.7: Bringing Generative AI to the masses Neural Super Sampling is here! TextQuests: How Good are LLMs at Text-Based Video Games? 🇵🇭 FilBench - Can LLMs Understand and Generate Filipino? Introducing AI Sheets: a tool to work with datasets using open AI models! Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training Vision Language Model Alignment in TRL ⚡️ Welcome GPT OSS, the new open-source model family from OpenAI! Measuring Open-Source Llama Nemotron Models on DeepResearch Bench 📚 3LM: A Benchmark for Arabic LLMs in STEM and Code Implementing MCP Servers in Python: An AI Shopping Assistant with Gradio Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ Parquet Content-Defined Chunking TimeScope: How Long Can Your Video Large Multimodal Model Go? Fast LoRA inference for Flux with Diffusers and PEFT Accelerate a World of LLMs on Hugging Face with NVIDIA NIM Arc Virtual Cell Challenge: A Primer Consilium: When Multiple LLMs Collaborate Back to The Future: Evaluating AI Agents on Predicting Future Events Five Big Improvements to Gradio MCP Servers Ettin Suite: SoTA Paired Encoders and Decoders Migrating the Hub from Git LFS to Xet Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models Asynchronous Robot Inference: Decoupling Action Prediction and Execution ScreenEnv: Deploy your full stack Desktop Agent Building the Hugging Face MCP Server Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Creating custom kernels for the AMD MI300 Upskill your LLMs With Gradio MCP Servers SmolLM3: smol, multilingual, long-context reasoner Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure Efficient MultiModal Data Pipeline Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models Training and Finetuning Sparse Embedding Models with Sentence Transformers Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub Gemma 3n fully available in the open-source ecosystem! Transformers backend integration in SGLang (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware Groq on Hugging Face Inference Providers 🔥 How Long Prompts Block Other Requests - Optimizing LLM Performance Learn the Hugging Face Kernel Hub in 5 Minutes Convert Transformers to ONNX with Hugging Face Optimum Intel and Hugging Face Partner to Democratize Machine Learning Hardware Acceleration Director of Machine Learning Insights [Part 3: Finance Edition] The Annotated Diffusion Model Deep Q-Learning with Space Invaders Graphcore and Hugging Face Launch New Lineup of IPU-Ready Transformers Introducing Pull Requests and Discussions 🥳 Efficient Table Pre-training without Real Data: An Introduction to TAPEX An Introduction to Q-Learning Part 2/2 How Sempre Health is leveraging the Expert Acceleration Program to accelerate their ML roadmap

Lucain Pouget, Eliott Coyac, Adrien Carreira, Victor Mustar, Jul · 2026-03-10 · via Hugging Face - Blog

Back to Articles

Hugging Face Models and Datasets repos are great for publishing final artifacts. But production ML generates a constant stream of intermediate files (checkpoints, optimizer states, processed shards, logs, traces, etc.) that change often, arrive from many jobs at once, and rarely need version control.

Storage Buckets are built exactly for this: mutable, S3-like object storage you can browse on the Hub, script from Python, or manage with the hf CLI. And because they are backed by Xet, they are especially efficient for ML artifacts that share content across files.

Why we built Buckets

Git starts to feel like the wrong abstraction pretty quickly when you're dealing with:

Training clusters writing checkpoints and optimizer states throughout a run
Data pipelines processing raw datasets iteratively
Agents storing traces, memory, and shared knowledge graphs

The storage need in all these cases is the same: write fast, overwrite when needed, sync directories, remove stale files, and keep things moving.

A Bucket is a non-versioned storage container on the Hub. It lives under a user or organization namespace, has standard Hugging Face permissions, can be private or public, has a page you can open in your browser, and can be addressed programmatically with a handle like hf://buckets/username/my-training-bucket.

Why Xet matters

Buckets are built on Xet, Hugging Face’s chunk-based storage backend, and this matters more than it might seem.

Instead of treating files as monolithic blobs, Xet breaks content into chunks and deduplicates across them. Upload a processed dataset that’s mostly similar to the raw one? Many chunks already exist. Store successive checkpoints where large parts of the model are frozen? Same story. Buckets skip the bytes that are already there, which means less bandwidth, faster transfers, and more efficient storage.

This is a natural fit for ML workloads. Training pipelines constantly produce families of related artifacts — raw and processed data, successive checkpoints, Agent traces and derived summaries — and Xet is designed to take advantage of that overlap.

For Enterprise customers, billing is based on deduplicated storage, so shared chunks directly reduce the billed footprint. Deduplication helps with both speed and cost.

Pre-warming: bringing data close to compute

Buckets live on the Hub, which means global storage by default. But not every workload can afford to pull data from wherever it happens to live, for distributed training and large-scale pipelines, storage location directly affects throughput.

Pre-warming lets you bring hot data closer to the cloud provider and region where your compute runs. Instead of data traveling across regions on every read, you declare where you need it and Buckets make sure it's already there when your jobs start. This is especially useful for training clusters that need fast access to large datasets or checkpoints, and for multi-region setups where different parts of a pipeline run in different clouds.

We are partnering with AWS and GCP to start with, more more cloud providers coming in the future.

Getting started

You can get a bucket up and running in under 2 minutes with the hf CLI. First thing is to install it and log in:

curl -LsSf https://hf.co/cli/install.sh | bash
hf auth login

Create a bucket for your project:

hf buckets create my-training-bucket --private

Say your training job is writing checkpoints locally to ./checkpoints. Sync that directory into the Bucket:

hf buckets sync ./checkpoints hf://buckets/username/my-training-bucket/checkpoints

For large transfers, you might want to see what will happen before anything moves. --dry-run prints the plan without executing anything:

hf buckets sync ./checkpoints hf://buckets/username/my-training-bucket/checkpoints --dry-run

You can also save the plan to a file for review and apply it later:

hf buckets sync ./checkpoints hf://buckets/username/my-training-bucket/checkpoints --plan sync-plan.jsonl
hf buckets sync --apply sync-plan.jsonl

Once done, inspect the Bucket from the CLI:

hf buckets list username/my-training-bucket -h

or browse it directly on the Hub at https://huggingface.co/buckets/username/my-training-bucket.

That is the whole loop. Create a bucket, sync your working data into it, check on it when you need to, and save the versioned repo for when something is worth publishing. For one-off operations, hf buckets cp copies individual files and hf buckets remove cleans up stale objects.

Using Buckets from Python

Everything above also works from Python via huggingface_hub (available since v1.5.0). The API follows the same pattern: create, sync, inspect.

from huggingface_hub import create_bucket, list_bucket_tree, sync_bucket

create_bucket("my-training-bucket", private=True, exist_ok=True)

sync_bucket(
    "./checkpoints",
    "hf://buckets/username/my-training-bucket/checkpoints",
)

for item in list_bucket_tree(
    "username/my-training-bucket",
    prefix="checkpoints",
    recursive=True,
):
    print(item.path, item.size)

This makes it straightforward to integrate Buckets into training scripts, data pipelines, or any service that manages artifacts programmatically. The Python client also supports batch uploads, selective downloads, deletes, and bucket moves for when you need finer control.

Bucket support is also available in JavaScript via @huggingface/hub (since v2.10.5), so you can integrate Buckets into Node.js services and web applications as well.

Filesystem integration

Buckets also work through HfFileSystem, the fsspec-compatible filesystem in huggingface_hub. This means you can list, read, write, and glob Bucket contents using standard filesystem operations — and any library that supports fsspec can access Buckets directly.

from huggingface_hub import hffs

# List files in a bucket directory
hffs.ls("buckets/username/my-training-bucket/checkpoints", detail=False)

# Glob for specific files
hffs.glob("buckets/username/my-training-bucket/**/*.parquet")

# Read a file directly
with hffs.open("buckets/username/my-training-bucket/config.yaml", "r") as f:
    print(f.read())

Because fsspec is the standard Python interface for remote filesystems, libraries like pandas, Polars, and Dask can read from and write to Buckets using hf:// paths with no extra setup:

import pandas as pd

# Read a CSV directly from a Bucket
df = pd.read_csv("hf://buckets/username/my-training-bucket/results.csv")

# Write results back
df.to_csv("hf://buckets/username/my-training-bucket/summary.csv")

This makes it easy to plug Buckets into existing data workflows without changing how your code reads or writes files.

From Buckets to versioned repos

Buckets are the fast, mutable place where artifacts live while they are still in motion. Once something becomes a stable deliverable, it usually belongs to a versioned model or dataset repo.

On the roadmap, we plan to support direct transfers between Buckets and repos in both directions: promote final checkpoint weights into a model repo, or commit processed shards into a dataset repo once a pipeline completes. The working layer and the publishing layer stay separate, but fit into one continuous Hub-native workflow.

Trusted by launch partners

Before opening Buckets to everyone, we ran a private beta with a small group of launch partners.

A huge thank you to Jasper, Arcee, IBM, and PixAI for testing early versions, surfacing bugs, and sharing feedback that directly shaped this feature.

Conclusion and resources

Storage Buckets bring a missing storage layer to the Hub. They give you a Hub-native place for the mutable, high-throughput side of ML: checkpoints, processed data, Agent traces, logs, and everything else that is useful before it becomes final.

Because they are built on Xet, Buckets are not just easier to use than forcing everything through Git. They are also more efficient for the kinds of related artifacts AI systems produce all the time. That means faster transfers, better deduplication, and on Enterprise plans, billing that benefits from the deduplicated footprint.

If you already use the Hub, Buckets let you keep more of your workflow in one place. If you come from S3-style storage, they give you a familiar model with better alignment to AI artifacts and a clear path toward final publication on the Hub.

Buckets are included in existing Hub storage plans. Free accounts come with storage to get started, and PRO and Enterprise plans offer higher limits. See the storage page for details.