惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

V2EX - 技术
V2EX - 技术
L
LangChain Blog
IT之家
IT之家
S
SegmentFault 最新的问题
博客园 - 三生石上(FineUI控件)
H
Hackread – Cybersecurity News, Data Breaches, AI and More
T
The Blog of Author Tim Ferriss
Blog — PlanetScale
Blog — PlanetScale
N
Netflix TechBlog - Medium
U
Unit 42
B
Blog RSS Feed
GbyAI
GbyAI
Microsoft Security Blog
Microsoft Security Blog
博客园 - 司徒正美
Apple Machine Learning Research
Apple Machine Learning Research
T
Threatpost
C
CERT Recently Published Vulnerability Notes
Cisco Talos Blog
Cisco Talos Blog
The Register - Security
The Register - Security
Vercel News
Vercel News
S
Schneier on Security
Spread Privacy
Spread Privacy
C
Cyber Attacks, Cyber Crime and Cyber Security
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
博客园 - 叶小钗
雷峰网
雷峰网
博客园_首页
人人都是产品经理
人人都是产品经理
P
Palo Alto Networks Blog
The Hacker News
The Hacker News
T
Tor Project blog
L
Lohrmann on Cybersecurity
Know Your Adversary
Know Your Adversary
D
Darknet – Hacking Tools, Hacker News & Cyber Security
C
Cybersecurity and Infrastructure Security Agency CISA
P
Privacy International News Feed
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
T
Tenable Blog
V
Vulnerabilities – Threatpost
大猫的无限游戏
大猫的无限游戏
博客园 - 【当耐特】
V
V2EX
Security Latest
Security Latest
A
About on SuperTechFans
Cloudbric
Cloudbric
S
Security Affairs
MongoDB | Blog
MongoDB | Blog
Y
Y Combinator Blog
Martin Fowler
Martin Fowler
TaoSecurity Blog
TaoSecurity Blog

Ivan on Containers, Kubernetes, and Server-Side

A grounded take on agentic coding for production environments Server-Side Playgrounds Reimagined: Build, Boot, and Network Your Own Virtual Labs [not a] Kubernetes 101 - Pods, Deployments, and Services As an Attempt To Automate Age-Old Infra Patterns JavaScript or TypeScript? How To Benefit From the Dichotomy On Software Design... and Good Writing Building a Firecracker-Powered Course Platform To Learn Docker and Kubernetes How To Publish a Port of a Running Container What Actually Happens When You Publish a Container Port A Visual Guide to SSH Tunnels: Local and Remote Port Forwarding Debugging Containers Like a Pro Docker: How To Debug Distroless And Slim Containers How To Extract Container Image Filesystem Using Docker | iximiuz Labs In Pursuit of Better Container Images: Alpine, Distroless, Apko, Chisel, DockerSlim, oh my! How To Start Programming In Go: Advice For Fellow DevOps Engineers Kubernetes Ephemeral Containers and kubectl debug Command How To Develop Kubernetes CLIs Like a Pro Docker Container Commands Explained: Understand, Don't Memorize | iximiuz Labs Learning Docker with Docker - Toying With DinD For Fun And Profit How To Extend Kubernetes API - Kubernetes vs. Django The Influence of Plumbing on Programming How To Call Kubernetes API from Go - Types and Common Machinery How To Call Kubernetes API using Simple HTTP Client Kubernetes API Basics - Resources, Kinds, and Objects OpenFaaS - Run Containerized Functions On Your Own Terms Learning Containers From The Bottom Up Docker Containers vs. Kubernetes Pods - Taking a Deeper Look | iximiuz Labs Learn-by-Doing Platforms for Dev, DevOps, and SRE Folks How HTTP Keep-Alive can cause TCP race condition How to Work with Container Images Using ctr | iximiuz Labs Multiple Containers, Same Port, no Reverse Proxy... Exploring Go net/http Package - On How Not To Set Socket Options Disposable Local Development Environments with Vagrant, Docker, and Arkade DevOps, SRE, and Platform Engineering My Choice of Programming Languages How to learn PromQL with Prometheus Playground Prometheus Cheat Sheet - Basics (Metrics, Labels, Time Series, Scraping) Rust - Writing Parsers With nom Parser Combinator Framework pq - parse and query log files as time series Prometheus Cheat Sheet - Moving Average, Max, Min, etc (Aggregation Over Time) Prometheus Cheat Sheet - How to Join Multiple Metrics (Vector Matching) The Need For Slimmer Containers Understanding Rust Privacy and Visibility Model Bridge vs. Switch: Takeaways from a Real Data Center Tour | iximiuz Labs From LAN to VXLAN: Networking Basics for Non-Network Engineers | iximiuz Labs KiND - How I Wasted a Day Loading Local Docker Images Go, HTTP handlers, panic, and deadlocks Exploring Kubernetes Operator Pattern Making Sense Out Of Cloud Native Buzz Service Discovery in Kubernetes: Combining the Best of Two Worlds API Developers Never REST How Container Networking Works: Building a Bridge Network From Scratch | iximiuz Labs Traefik: canary deployments with weighted load balancing Service Proxy, Pod, Sidecar, oh my! You Need Containers To Build Images You Don't Need an Image To Run a Container Not Every Container Has an Operating System Inside Working with container images in Go Master Go While Learning Containers Implementing Container Runtime Shim: Interactive Containers How to use Flask with gevent (uWSGI and Gunicorn editions) My 10 Years of Programming Experience Implementing Container Runtime Shim: First Code Implementing Container Runtime Shim: runc Kubernetes Repository On Flame Dealing with process termination in Linux (with Rust examples) conman - [the] Container Manager: Inception Journey From Containerization To Orchestration And Beyond Linux PTY - How docker attach and docker exec Commands Work Inside Illustrated introduction to Linux iptables From Docker Container to Bootable Linux Disk Image Пишем свой веб-сервер на Python: протокол HTTP 9001 способ создать веб-сервер на Python Explaining async/await in 200 lines of code Explaining event loop in 100 lines of code Save the day with gevent Пишем свой веб-сервер на Python: процессы, потоки и асинхронный I/O Truly optional scalar types in protobuf3 (with Go examples) Node.js Writable streams distilled Node.js Readable streams distilled How to on starting processes (mostly in Linux) Дайджест интересных ссылок – Июль 2016 Пишем свой веб-сервер на Python: сокеты Наследование в JavaScript Мастерить!
Prometheus Is Not a TSDB
Ivan Velichko · 2021-07-24 · via Ivan on Containers, Kubernetes, and Server-Side

Misconception - the right word to explain my early Prometheus experience. I came to Prometheus with vast Graphite and moderate InfluxDB experience. In my eyes, Graphite was a highly performant but fairly limited system. Metrics in Graphite are just strings (well, dotted), and the values are always stored aggregated with the lowest possible resolution of 1 second. But due to these limitations, Graphite is fast. In contrast, InfluxDB adopts Metrics 2.0 format with multiple tags and fields per metric. It also allows the storage of non-aggregated data points with impressive nanosecond precision. But this power needs to be used carefully. Otherwise, you'll get all sorts of performance issues.

Prometheus is not a TSDB

For some reason, I expected Prometheus to reside somewhere in between these two systems. A kinda-sorta system that takes the best of both worlds: rich labeled metrics, non-aggregated values, and high query performance.

And at first, it indeed felt as such! But then I started noticing that I cannot really explain some of the query results. Like at all. Or sometimes, I couldn't find evidence in metrics that just had to be there. Like the metrics were showing me a different picture than I was observing with my eyes while analyzing raw data such as web server access logs.

So, I started looking for more details. I wanted to understand precisely how metrics are collected, how they are stored, what a query execution model is, et cetera, et cetera. And at first, I was shocked by my findings! Oftentimes, the Prometheus behavior didn't make any sense, especially comparing to Graphite or InfluxDB! But then it occurred to me that I was missing one important detail...

Both Graphite and InfluxDB are pure time-series databases (TSDB). Yes, they are often used as metric storage for monitoring purposes. But every particular setup of these systems comes with certain trade-offs and bolt-on additions addressing performance or reliability concerns. For instance, there is often a statsd-like daemon in front of your Graphite doing preaggregation; you use different rollup strategies for older data points, etc. But normally, you are aware of that. So when you query the last couple of days of metrics, you expect them to have a secondly precision. But when you query something a week or a month old, you already know that each data point represents a minute of aggregated data, not a second.

However, Prometheus is not a TSDB.

Prometheus is a monitoring system that just happens to use a TSDB under the hood. So, some of these trade-offs one would usually find in a typical Graphite installation were already taken by Prometheus authors. But maybe not just all of them were made clear in the documentation 🙈

So, when I would find a certain query producing quirky results, it would be due to my misunderstanding of Prometheus as a whole, not just its query execution model. Aiming for results that would be reasonable for a pure TSDB apparently may be way too expensive in a monitoring context. Prometheus, as a metric collection system, is tailored for monitoring purposes from day one. It does provide good collection, storage, and query performance. But it may sacrifice the data precision (scraping metrics every 10-30 seconds), or completeness (tolerating missing scrapes with a 5m long lookback delta), or extrapolate your latency distribution instead of keeping the actual measurements (that's how histogram_quantile() actually works).

Once I realized this difference, I reconsidered my expectations from the system. However, I still needed a better way to understand the PromQL's query execution model than the official documentation provides. There is also plenty of Prometheus and PromQL articles out there, but they give you just a shallow overview of the query language. Only a sweeping search finally revealed these two much more profound resources - PromLabs' blog and Robust Perception's blog. Both are apparently written by Prometheus creators.

But practice is even better than reading 😉 So, I went ahead and hacked a local Prometheus playground to solidify my learnings. And of course, I'm always striving to write down stuff on the way, mostly for future me, but feel free to check out my other Prometheus posts:

PromQL diagrams