惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

博客园_首页
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Cyberwarzone
Cyberwarzone
C
CERT Recently Published Vulnerability Notes
Hacker News: Ask HN
Hacker News: Ask HN
AI
AI
T
The Exploit Database - CXSecurity.com
C
Cybersecurity and Infrastructure Security Agency CISA
Project Zero
Project Zero
Security Latest
Security Latest
Google Online Security Blog
Google Online Security Blog
Schneier on Security
Schneier on Security
P
Proofpoint News Feed
K
Kaspersky official blog
Security Archives - TechRepublic
Security Archives - TechRepublic
Help Net Security
Help Net Security
L
LINUX DO - 最新话题
Attack and Defense Labs
Attack and Defense Labs
T
Threatpost
P
Privacy International News Feed
P
Privacy & Cybersecurity Law Blog
www.infosecurity-magazine.com
www.infosecurity-magazine.com
PCI Perspectives
PCI Perspectives
博客园 - Franky
C
Cisco Blogs
aimingoo的专栏
aimingoo的专栏
Stack Overflow Blog
Stack Overflow Blog
T
Tor Project blog
N
Netflix TechBlog - Medium
The Last Watchdog
The Last Watchdog
Know Your Adversary
Know Your Adversary
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
The GitHub Blog
The GitHub Blog
Latest news
Latest news
Recorded Future
Recorded Future
M
MIT News - Artificial intelligence
博客园 - 叶小钗
H
Hacker News: Front Page
S
Secure Thoughts
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
阮一峰的网络日志
阮一峰的网络日志
S
Schneier on Security
Blog — PlanetScale
Blog — PlanetScale
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
腾讯CDC
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
量子位
L
LINUX DO - 热门话题

Ivan on Containers, Kubernetes, and Server-Side

A grounded take on agentic coding for production environments Server-Side Playgrounds Reimagined: Build, Boot, and Network Your Own Virtual Labs [not a] Kubernetes 101 - Pods, Deployments, and Services As an Attempt To Automate Age-Old Infra Patterns JavaScript or TypeScript? How To Benefit From the Dichotomy On Software Design... and Good Writing Building a Firecracker-Powered Course Platform To Learn Docker and Kubernetes How To Publish a Port of a Running Container What Actually Happens When You Publish a Container Port A Visual Guide to SSH Tunnels: Local and Remote Port Forwarding Debugging Containers Like a Pro Docker: How To Debug Distroless And Slim Containers How To Extract Container Image Filesystem Using Docker | iximiuz Labs In Pursuit of Better Container Images: Alpine, Distroless, Apko, Chisel, DockerSlim, oh my! How To Start Programming In Go: Advice For Fellow DevOps Engineers Kubernetes Ephemeral Containers and kubectl debug Command How To Develop Kubernetes CLIs Like a Pro Docker Container Commands Explained: Understand, Don't Memorize | iximiuz Labs Learning Docker with Docker - Toying With DinD For Fun And Profit How To Extend Kubernetes API - Kubernetes vs. Django The Influence of Plumbing on Programming How To Call Kubernetes API from Go - Types and Common Machinery How To Call Kubernetes API using Simple HTTP Client Kubernetes API Basics - Resources, Kinds, and Objects OpenFaaS - Run Containerized Functions On Your Own Terms Learning Containers From The Bottom Up Docker Containers vs. Kubernetes Pods - Taking a Deeper Look | iximiuz Labs Learn-by-Doing Platforms for Dev, DevOps, and SRE Folks How HTTP Keep-Alive can cause TCP race condition How to Work with Container Images Using ctr | iximiuz Labs Multiple Containers, Same Port, no Reverse Proxy... Exploring Go net/http Package - On How Not To Set Socket Options Disposable Local Development Environments with Vagrant, Docker, and Arkade DevOps, SRE, and Platform Engineering My Choice of Programming Languages How to learn PromQL with Prometheus Playground Prometheus Cheat Sheet - Basics (Metrics, Labels, Time Series, Scraping) Rust - Writing Parsers With nom Parser Combinator Framework pq - parse and query log files as time series Prometheus Cheat Sheet - Moving Average, Max, Min, etc (Aggregation Over Time) Prometheus Cheat Sheet - How to Join Multiple Metrics (Vector Matching) The Need For Slimmer Containers Understanding Rust Privacy and Visibility Model Bridge vs. Switch: Takeaways from a Real Data Center Tour | iximiuz Labs From LAN to VXLAN: Networking Basics for Non-Network Engineers | iximiuz Labs KiND - How I Wasted a Day Loading Local Docker Images Go, HTTP handlers, panic, and deadlocks Exploring Kubernetes Operator Pattern Making Sense Out Of Cloud Native Buzz Service Discovery in Kubernetes: Combining the Best of Two Worlds API Developers Never REST How Container Networking Works: Building a Bridge Network From Scratch | iximiuz Labs Traefik: canary deployments with weighted load balancing Service Proxy, Pod, Sidecar, oh my! You Need Containers To Build Images You Don't Need an Image To Run a Container Not Every Container Has an Operating System Inside Working with container images in Go Master Go While Learning Containers Implementing Container Runtime Shim: Interactive Containers How to use Flask with gevent (uWSGI and Gunicorn editions) My 10 Years of Programming Experience Implementing Container Runtime Shim: First Code Implementing Container Runtime Shim: runc Kubernetes Repository On Flame Dealing with process termination in Linux (with Rust examples) conman - [the] Container Manager: Inception Journey From Containerization To Orchestration And Beyond Linux PTY - How docker attach and docker exec Commands Work Inside Illustrated introduction to Linux iptables From Docker Container to Bootable Linux Disk Image Пишем свой веб-сервер на Python: протокол HTTP 9001 способ создать веб-сервер на Python Explaining async/await in 200 lines of code Explaining event loop in 100 lines of code Save the day with gevent Пишем свой веб-сервер на Python: процессы, потоки и асинхронный I/O Truly optional scalar types in protobuf3 (with Go examples) Node.js Writable streams distilled Node.js Readable streams distilled How to on starting processes (mostly in Linux) Дайджест интересных ссылок – Июль 2016 Пишем свой веб-сервер на Python: сокеты Наследование в JavaScript Мастерить!
Prometheus Is Not a TSDB
Ivan Velichko · 2021-07-24 · via Ivan on Containers, Kubernetes, and Server-Side

Misconception - the right word to explain my early Prometheus experience. I came to Prometheus with vast Graphite and moderate InfluxDB experience. In my eyes, Graphite was a highly performant but fairly limited system. Metrics in Graphite are just strings (well, dotted), and the values are always stored aggregated with the lowest possible resolution of 1 second. But due to these limitations, Graphite is fast. In contrast, InfluxDB adopts Metrics 2.0 format with multiple tags and fields per metric. It also allows the storage of non-aggregated data points with impressive nanosecond precision. But this power needs to be used carefully. Otherwise, you'll get all sorts of performance issues.

Prometheus is not a TSDB

For some reason, I expected Prometheus to reside somewhere in between these two systems. A kinda-sorta system that takes the best of both worlds: rich labeled metrics, non-aggregated values, and high query performance.

And at first, it indeed felt as such! But then I started noticing that I cannot really explain some of the query results. Like at all. Or sometimes, I couldn't find evidence in metrics that just had to be there. Like the metrics were showing me a different picture than I was observing with my eyes while analyzing raw data such as web server access logs.

So, I started looking for more details. I wanted to understand precisely how metrics are collected, how they are stored, what a query execution model is, et cetera, et cetera. And at first, I was shocked by my findings! Oftentimes, the Prometheus behavior didn't make any sense, especially comparing to Graphite or InfluxDB! But then it occurred to me that I was missing one important detail...

Both Graphite and InfluxDB are pure time-series databases (TSDB). Yes, they are often used as metric storage for monitoring purposes. But every particular setup of these systems comes with certain trade-offs and bolt-on additions addressing performance or reliability concerns. For instance, there is often a statsd-like daemon in front of your Graphite doing preaggregation; you use different rollup strategies for older data points, etc. But normally, you are aware of that. So when you query the last couple of days of metrics, you expect them to have a secondly precision. But when you query something a week or a month old, you already know that each data point represents a minute of aggregated data, not a second.

However, Prometheus is not a TSDB.

Prometheus is a monitoring system that just happens to use a TSDB under the hood. So, some of these trade-offs one would usually find in a typical Graphite installation were already taken by Prometheus authors. But maybe not just all of them were made clear in the documentation 🙈

So, when I would find a certain query producing quirky results, it would be due to my misunderstanding of Prometheus as a whole, not just its query execution model. Aiming for results that would be reasonable for a pure TSDB apparently may be way too expensive in a monitoring context. Prometheus, as a metric collection system, is tailored for monitoring purposes from day one. It does provide good collection, storage, and query performance. But it may sacrifice the data precision (scraping metrics every 10-30 seconds), or completeness (tolerating missing scrapes with a 5m long lookback delta), or extrapolate your latency distribution instead of keeping the actual measurements (that's how histogram_quantile() actually works).

Once I realized this difference, I reconsidered my expectations from the system. However, I still needed a better way to understand the PromQL's query execution model than the official documentation provides. There is also plenty of Prometheus and PromQL articles out there, but they give you just a shallow overview of the query language. Only a sweeping search finally revealed these two much more profound resources - PromLabs' blog and Robust Perception's blog. Both are apparently written by Prometheus creators.

But practice is even better than reading 😉 So, I went ahead and hacked a local Prometheus playground to solidify my learnings. And of course, I'm always striving to write down stuff on the way, mostly for future me, but feel free to check out my other Prometheus posts:

PromQL diagrams