Save network costs with VictoriaMetrics remote write protocol
Aliaksandr Valialkin · 2023-03-08 · via VictoriaMetrics: Simple & Reliable Monitoring for Everyone on VictoriaMetrics

Prometheus remote write protocol

Prometheus uses the remote write protocol to send data to remote storage systems such as VictoriaMetrics. See these docs on how to set up Prometheus to send data to VictoriaMetrics. The protocol is very simple: it writes the collected raw samples into a WriteRequest protobuf message, compresses the message with the Snappy compression algorithm, and sends it to the remote storage in an HTTP POST request.

vmagent uses the Prometheus remote write protocol for transferring the collected samples to the remote storage specified via -remoteWrite.url.

The Prometheus remote write protocol serves well in most cases, but it isn't optimized for low network bandwidth usage. It can consume large amounts of network traffic when millions of samples per second must be transferred to the remote storage. This is OK when the remote storage is located in the same network as Prometheus, the network has no bandwidth limits, and network transfer is free. In reality the remote storage may be located in another datacenter, availability zone or region. In this case there may be limits on network bandwidth and/or charges for transferring the data from Prometheus to the remote storage.

Let's calculate the monthly cost of transferring data from Prometheus located in Google Cloud to remote storage located in AWS at a rate of 1 million samples/sec. Our internal stats show that an average data sample in production takes around 50 bytes on the wire when sent via the Prometheus remote write protocol. So 1 million samples/sec requires 50MB/sec of network bandwidth. This translates to 50MB/sec * 3600sec * 24h * 30d = 129600GB of network traffic per month. 1GB of egress network traffic costs $0.08 at Google Cloud according to this pricing. So the monthly network transfer costs will be around 129600GB * $0.08 = $10K.
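The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the constants are the figures quoted in the text, the function names are made up for illustration, and gigabytes are assumed to be decimal (10^9 bytes).

```python
# Back-of-the-envelope egress cost, using the figures quoted in the text.
BYTES_PER_SAMPLE = 50          # average on-the-wire size per sample
SAMPLES_PER_SEC = 1_000_000    # ingestion rate
EGRESS_USD_PER_GB = 0.08       # egress price quoted in the text
SECONDS_PER_MONTH = 3600 * 24 * 30


def monthly_egress_gb(bytes_per_sample: float, samples_per_sec: float) -> float:
    """Return monthly traffic in decimal gigabytes (1 GB = 10**9 bytes)."""
    return bytes_per_sample * samples_per_sec * SECONDS_PER_MONTH / 1e9


def monthly_egress_cost(bytes_per_sample: float, samples_per_sec: float,
                        usd_per_gb: float) -> float:
    """Return the monthly egress bill in USD."""
    return monthly_egress_gb(bytes_per_sample, samples_per_sec) * usd_per_gb


traffic_gb = monthly_egress_gb(BYTES_PER_SAMPLE, SAMPLES_PER_SEC)
cost_usd = monthly_egress_cost(BYTES_PER_SAMPLE, SAMPLES_PER_SEC, EGRESS_USD_PER_GB)
print(f"{traffic_gb:.0f} GB/month, ${cost_usd:.0f}/month")
```

Changing BYTES_PER_SAMPLE is the quickest way to see how a smaller per-sample size affects the bill.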

High network costs when transferring data from vmagent to Prometheus-compatible remote storage are quite common. That's why we started exploring how to reduce them.

The Prometheus remote write protocol transfers all the labels with each sample. Real-world samples usually contain many labels. For example:

process_cpu_seconds_total{
  job="foo",
  instance="bar",
  env="prod",
  namespace="default",
  container="qwerty",
  pod="abcdef",
  ...
} <value> <timestamp>

The average length of the metric name plus all the labels per sample in production is around 200 bytes according to our stats. When the sample is encoded into a TimeSeries protobuf message, its size becomes even bigger than the plaintext representation shown above. Snappy compression reduces the average per-sample size on the wire to 50 bytes. But 50 bytes is still a lot compared to the 0.4-0.8 bytes of disk space needed per sample stored by VictoriaMetrics.

Possible solutions for reducing network bandwidth costs

A single time series usually consists of many samples. These samples are sent to the remote storage at a regular interval, known as scrape_interval in the Prometheus ecosystem. So we can assign a small id to each new time series seen on the wire, and from then on send the series id together with (value, timestamp) to the remote storage instead of sending the metric name plus labels again. This allows sending <4-byte series id> + <8-byte value> + <8-byte timestamp> = 20 bytes per sample instead of 200 bytes of metric name plus labels. That is 50/20 = 2.5x smaller than what the Prometheus remote write protocol puts on the wire.
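A minimal sketch of the sender side of this idea, assuming a hypothetical fixed layout of a 4-byte unsigned series id, an 8-byte float value and an 8-byte millisecond timestamp. The class and function names are illustrative and are not part of any VictoriaMetrics or Prometheus API.

```python
import struct


class SeriesIDTable:
    """Sender-side map from a series key (metric name plus labels) to a small id."""

    def __init__(self):
        self._ids = {}

    def get_or_assign(self, series_key: str):
        """Return (series_id, is_new).

        is_new tells the sender that the full name and labels must be sent once,
        so that the receiver can learn what the id stands for.
        """
        sid = self._ids.get(series_key)
        if sid is not None:
            return sid, False
        sid = len(self._ids)
        self._ids[series_key] = sid
        return sid, True


# Hypothetical wire layout: little-endian uint32 id, float64 value, int64 timestamp.
SAMPLE = struct.Struct("<Idq")


def encode_sample(series_id: int, value: float, timestamp_ms: int) -> bytes:
    return SAMPLE.pack(series_id, value, timestamp_ms)


def decode_sample(buf: bytes):
    return SAMPLE.unpack(buf)


table = SeriesIDTable()
key = 'process_cpu_seconds_total{job="foo",instance="bar"}'
sid, is_new = table.get_or_assign(key)        # first sight: labels must be sent too
sid_again, is_new_again = table.get_or_assign(key)  # later scrapes: id only
packet = encode_sample(sid_again, 12.5, 1678270000000)
print(len(packet), "bytes per sample")
```

The receiver would need the mirror-image map from series id back to name plus labels, which is exactly the state discussed in the list of issues below.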

We can go further and send the varint-encoded difference (aka delta) between the current value and the previous value for each sample. The same encoding technique can be applied to timestamps as well. This reduces the on-the-wire sample size to ~10 bytes according to our tests.
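The sketch below illustrates delta plus varint encoding on integer timestamps. It assumes zigzag mapping for signed deltas and the common 7-bits-per-byte varint layout; the text does not say which exact encoding was tested, and float values would first need some integer representation, which is left out here.

```python
def zigzag(n: int) -> int:
    """Map signed ints to unsigned ones so that small negative deltas stay small."""
    return n * 2 if n >= 0 else -n * 2 - 1


def unzigzag(u: int) -> int:
    return u >> 1 if u & 1 == 0 else -((u + 1) >> 1)


def put_varint(out: bytearray, u: int) -> None:
    """Append u as a little-endian base-128 varint (7 payload bits per byte)."""
    while u >= 0x80:
        out.append((u & 0x7F) | 0x80)
        u >>= 7
    out.append(u)


def get_varint(buf: bytes, pos: int):
    """Read one varint starting at pos; return (value, next_pos)."""
    shift = 0
    result = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if b < 0x80:
            return result, pos
        shift += 7


def encode_deltas(values) -> bytes:
    """Encode each value as the zigzag varint of its difference from the previous one."""
    out = bytearray()
    prev = 0
    for v in values:
        put_varint(out, zigzag(v - prev))
        prev = v
    return bytes(out)


def decode_deltas(buf: bytes):
    values = []
    pos = 0
    prev = 0
    while pos < len(buf):
        u, pos = get_varint(buf, pos)
        prev += unzigzag(u)
        values.append(prev)
    return values


# Five timestamps in milliseconds, 15 seconds apart.
timestamps = [1678270000000 + 15000 * i for i in range(5)]
encoded = encode_deltas(timestamps)
print(len(encoded), "bytes instead of", 8 * len(timestamps))
```

Only the first timestamp pays for its full magnitude; every following one is encoded as a small delta, which is why regularly scraped series compress well with this scheme.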

Then a block of encoded samples can be compressed with zstd compression in order to reduce per-sample size to ~5 bytes.

This data transfer protocol reduces network bandwidth usage by 50/5 = 10x compared to the Prometheus remote write protocol!

Unfortunately, this protocol has the following issues:

  • It requires maintaining non-trivial amounts of state on both the sender and the receiver. The sender must maintain a map for locating the series id by time series name plus labels. The receiver must maintain a map for locating the time series name plus labels by series id. Both maps can be quite big when samples for millions of time series are transferred over the network.

  • The maps can grow indefinitely when old time series are constantly substituted by new time series (aka high churn rate).

  • The state must be maintained individually for each connection (or session) between the sender and the receiver. This means that memory usage at the receiver may get out of control when many independent senders transfer data to a single remote storage. It also means that the protocol may be hard to use over HTTP: you need either to pass the same session id across multiple HTTP requests or to stream the data in a single request body via chunked transfer encoding. This may complicate load balancing and horizontal scaling on the receiver side.

  • It requires additional CPU time for encoding and decoding compared to the Prometheus remote write protocol.

That’s why we decided to use a much simpler approach.

VictoriaMetrics remote write protocol

The first version of the VictoriaMetrics remote write protocol is almost identical to the Prometheus remote write protocol. It writes the collected raw samples into a WriteRequest protobuf message, but then compresses it with zstd instead of Snappy before sending it to the remote storage in an HTTP POST request.

The implementation of this protocol is very simple on both the sender and the receiver side: just replace Snappy compression with zstd compression and change Content-Encoding: snappy to Content-Encoding: zstd in the HTTP request headers.
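The change can be pictured as swapping one function and one header value. The sketch below is not the vmagent implementation: build_remote_write_request is a hypothetical helper, and because neither Snappy nor zstd is assumed to be available in the Python standard library, zlib at two compression levels is used purely as a stand-in for both compressors. The Content-Type value is the one the Prometheus remote write specification uses for protobuf payloads.

```python
import zlib
from typing import Callable, Dict, Tuple


def build_remote_write_request(
    write_request: bytes,
    compress: Callable[[bytes], bytes],
    content_encoding: str,
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for an HTTP POST carrying a serialized WriteRequest."""
    headers = {
        "Content-Type": "application/x-protobuf",
        "Content-Encoding": content_encoding,
    }
    return headers, compress(write_request)


# Stand-ins only: real code would call a Snappy and a zstd library here.
def snappy_stand_in(data: bytes) -> bytes:
    return zlib.compress(data, 1)


def zstd_stand_in(data: bytes) -> bytes:
    return zlib.compress(data, 9)


# Placeholder bytes instead of a real serialized WriteRequest protobuf message.
payload = b'process_cpu_seconds_total{job="foo",instance="bar"}' * 100

prom_headers, prom_body = build_remote_write_request(payload, snappy_stand_in, "snappy")
vm_headers, vm_body = build_remote_write_request(payload, zstd_stand_in, "zstd")
print(prom_headers["Content-Encoding"], "->", vm_headers["Content-Encoding"])
```

Everything except the compress callable and the Content-Encoding value is shared between the two requests, which is what keeps the protocol change small.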

The VictoriaMetrics remote write protocol reduces network traffic costs by 2x-4x compared to the Prometheus remote write protocol, at the cost of slightly higher CPU usage (+10% according to our production stats). Put another way, it can reduce monthly network transfer costs from $10K to as little as $2.5K when transferring a million samples per second between different cloud providers.
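As a quick check of these numbers, the snippet below divides the rounded $10K baseline by both ends of the 2x-4x range. The function name is illustrative, and the sketch assumes cost scales linearly with traffic.

```python
BASELINE_USD_PER_MONTH = 10_000  # rounded baseline from the earlier estimate


def reduced_cost(baseline_usd: float, reduction_factor: float) -> float:
    """Monthly cost after traffic shrinks by reduction_factor."""
    return baseline_usd / reduction_factor


for factor in (2, 4):
    cost = reduced_cost(BASELINE_USD_PER_MONTH, factor)
    print(f"{factor}x less traffic -> ${cost:,.0f}/month")
```

The $2.5K figure corresponds to the 4x end of the range; at 2x the bill would be around $5K.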

How to enable VictoriaMetrics remote write protocol?

Just upgrade VictoriaMetrics components (including vmagent) to v1.88 or later.

vmagent automatically detects whether the configured remote storage supports the VictoriaMetrics remote write protocol and, if so, uses it instead of the Prometheus remote write protocol for data transfer. See these docs for more details.

Future work

We are going to explore and implement more advanced algorithms in future versions of the VictoriaMetrics remote write protocol in order to achieve bigger cost savings.

It would be great if Prometheus itself and other Prometheus-compatible remote storage systems supported the VictoriaMetrics remote write protocol as well. The whole Prometheus ecosystem would benefit from this!