Distributed caching systems and MySQL — PlanetScale
Brian Morrison II · 2023-10-11 · via Blog — PlanetScale


As the usage of your app grows, performance can steadily decline.

There’s nothing necessarily surprising about that statement, but what is surprising is the number of bottlenecks that can surface and the options available to you to fix them. One such bottleneck can be directly related to the time it takes to read from and write to your database. After all, behind the complexities of a relational database, you’re still working with storage systems that have inherent IO latency.

This is where a well-architected caching system can help.

A good caching system can reduce the load on your database and increase the general performance of your application. A faster application results in happier users and potentially more revenue, which is always a good thing! However, caching systems have their own setup complexities, along with a number of gotchas that might creep up unexpectedly.

In this article, we’ll explore backend caching, how to implement it, how caching works with MySQL, and potential issues to watch out for.

What is a cache?

At its very core, a cache is a component that stores data so that the data may be accessed more quickly.

Caches can be either hardware or software-based. Caching systems often store data in memory, which makes accessing and manipulating that data much faster than going through traditional storage such as solid-state or spinning hard drives. A cache can run locally on the same server as the application, or it can run independently on its own system.

Caches that run independently of other services are known as distributed caches.

Redis and Memcached are two very popular distributed caching systems that can be accessed by external systems. These systems use the memory of the machine they are running on to store data in a key/value setup, allowing developers to quickly retrieve data based on a specific key. They can even be configured on multiple servers as a cluster, adding to their distributed nature.

So how can caching and MySQL work together for a faster application?

Note

Want to learn about how Instagram scaled performance in the early days? Check out this Database Caching Tech talk from Rick Branson, Instagram's first full-time backend engineer, where he spills it all.

Implement caching into a MySQL environment

Older versions of MySQL did some minimal caching in the form of the query cache, but it left much to be desired, and the feature was deprecated in MySQL 5.7.20 and removed in MySQL 8.0.

A caching system is only useful once it holds data. MySQL does not have any built-in mechanism to hydrate (or fill) a cache, so the responsibility for this task lies with the application code. Ultimately, you’ll need to build a system that returns data from the cache if it exists there, or reads it from the database and hydrates the cache if it does not.

There are two common patterns that can be used when designing a caching system: look-aside and look-through.

Example: retrieving follower count for a user

For the examples that follow, we’ll be using the following database diagram. It mimics a social media platform with two tables: users and followers.

Diagram of the social media app tables

Each example will show how caching can reduce load times for viewing the number of followers a given user has. This may sound like a relatively simple problem, but consider the load that would be placed on a database if a user has a particularly high number of followers. Every time the user’s profile is viewed, a SELECT COUNT query would have to be run against a table to simply return a number.

The query to look up the number of followers would be this:

SELECT COUNT(*) FROM followers WHERE follower_id = ?

Look-aside cache

A look-aside cache is a system that sits outside of the data access path of your database.

Typically this setup has two distinct steps in its workflow. Using our example of retrieving follower count, the application would first check to see if the cache contains the follower count for the requested user. If the cache does not contain that information (this is known as a cache miss), the code would grab the value directly from the database, populate the cache for future requests, and finally, return it to the caller.

Here is what that flow might look like visually:

A look-aside cache diagram
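The look-aside flow described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the dict stands in for a distributed cache such as Redis, and query_follower_count() is a hypothetical placeholder for running the SELECT COUNT query against MySQL. The key format and sample data are assumptions made for the example.

```python
# Hypothetical stand-ins: a dict plays the role of Redis/Memcached, and
# query_follower_count() plays the role of the SELECT COUNT(*) query.
cache = {}
db_queries = 0
FOLLOWERS_IN_DB = {42: 1120}  # user_id -> follower count (sample data)


def query_follower_count(user_id):
    """Pretend to run the COUNT query against MySQL and track how often it runs."""
    global db_queries
    db_queries += 1
    return FOLLOWERS_IN_DB.get(user_id, 0)


def get_follower_count(user_id):
    """Look-aside read: the application checks the cache, then the database."""
    key = f"followers:{user_id}"
    count = cache.get(key)
    if count is None:                          # cache miss
        count = query_follower_count(user_id)  # read from the database
        cache[key] = count                     # hydrate the cache for future requests
    return count                               # return the value to the caller


first = get_follower_count(42)   # miss: goes to the database
second = get_follower_count(42)  # hit: served from the cache
print(first, second, db_queries)  # prints: 1120 1120 1
```

Note that the application code owns both steps here. The cache itself knows nothing about the database.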

Look-through cache

A look-through cache is a system that sits in line with the data access path, in front of the database.

This scenario would have the code hit the cache directly. If a cache miss is experienced, it would be up to the cache software to request the data from the database and populate itself. During this time, the caller would wait for that part of the process to complete before returning to the user.

This diagram demonstrates what a look-through cache setup might look like:

A look-through cache diagram
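To contrast with the look-aside pattern, here is a rough sketch of a look-through cache in Python. The LookThroughCache class and its loader argument are hypothetical names invented for this example. The point is only that the cache, not the caller, is responsible for fetching from the database on a miss.

```python
class LookThroughCache:
    """Hypothetical cache that sits in front of the database and loads misses itself."""

    def __init__(self, loader):
        self._loader = loader  # function the cache calls when it has no value
        self._store = {}
        self.misses = 0

    def get(self, key):
        if key not in self._store:
            self.misses += 1
            # The cache, not the caller, requests the data and populates itself.
            # The caller waits here until the load completes.
            self._store[key] = self._loader(key)
        return self._store[key]


FOLLOWERS_IN_DB = {42: 1120}  # stand-in for the followers table (sample data)


def load_follower_count(user_id):
    """Placeholder for running the COUNT query against MySQL."""
    return FOLLOWERS_IN_DB.get(user_id, 0)


follower_counts = LookThroughCache(load_follower_count)
first = follower_counts.get(42)   # miss: the cache loads from the database
second = follower_counts.get(42)  # hit: no database access
```

The calling code only ever talks to follower_counts, which keeps the database lookup logic in one place.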

Potential issues with caching

In the following sections, we’ll explore some potential issues with caching systems, along with ways that these problems can be mitigated.

Inconsistent data

One issue is that data within the cache may no longer be accurate or up to date.

Let's say our caller requests the follower count at 9:00 am and a value of 1,120 is returned. Then at 9:15 am, that user gains an extra 1,500 followers because a post went viral. You’ll need to consider a method by which the cache is updated.

The first potential solution is to simply update the cache whenever a new follower is added.

The benefit of this approach is that it is relatively straightforward. Instead of only writing the new follower to the database, you’d also increment the value currently stored in the cache. Where this becomes problematic is that for each of those 1,500 new followers, you’re also updating your cache (and your database) 1,500 times.
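A small sketch makes the cost of this first approach visible. The names below (add_follower, follower_rows, cache_writes) are hypothetical, and a list and a dict stand in for the followers table and the cache.

```python
cache = {"followers:42": 1120}  # stand-in for Redis/Memcached, already hydrated
follower_rows = []              # stand-in for the followers table
cache_writes = 0


def add_follower(user_id, follower_id):
    """Write the new follower to the database, then bump the cached count."""
    global cache_writes
    follower_rows.append((user_id, follower_id))  # placeholder for the database write
    key = f"followers:{user_id}"
    if key in cache:         # only adjust a value that is already cached;
        cache[key] += 1      # otherwise the next read will hydrate it
        cache_writes += 1


# The viral post: 1,500 new followers arrive.
for follower_id in range(1500):
    add_follower(42, follower_id)

print(cache["followers:42"], cache_writes)  # prints: 2620 1500
```

The cached count stays accurate, but every new follower costs one database write and one cache write. With Redis, the increment would likely map to the atomic INCR command rather than a read followed by a write.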

The second solution is to store a cache expiration with each value.

In this scenario, you could have a cache expiration value of 20 minutes. When the cache is populated at 9:00 am, that value will stay in the cache until 9:20 am. When the user’s follower count increases by 1,500 at 9:15 am, any callers between 9:15 am and 9:20 am will receive the old count, but the window is short enough that the stale value may be an acceptable price for avoiding excessive load on the rest of your architecture.
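The expiration approach can be sketched as follows. This is an illustration under stated assumptions: time is passed in explicitly as seconds since 9:00 am so the timeline is easy to follow, and the dicts are stand-ins for the cache and the database. Real caches usually handle expiration for you (Redis, for example, lets you set an expiry on a key).

```python
TTL_SECONDS = 20 * 60  # 20-minute expiration, as in the scenario above

cache = {}             # key -> (value, expires_at); stand-in for Redis/Memcached
database = {42: 1120}  # user_id -> follower count; stand-in for MySQL


def query_follower_count(user_id):
    """Placeholder for running the COUNT query against MySQL."""
    return database[user_id]


def get_follower_count(user_id, now):
    """Return the cached count unless it has expired; `now` is seconds since 9:00 am."""
    key = f"followers:{user_id}"
    entry = cache.get(key)
    if entry is not None and now < entry[1]:
        return entry[0]                      # still within the expiration window
    value = query_follower_count(user_id)    # expired or missing: reload
    cache[key] = (value, now + TTL_SECONDS)  # store the value with a new expiration
    return value


at_9_00 = get_follower_count(42, now=0)        # cache populated: 1120
database[42] += 1500                           # the post goes viral at 9:15 am
at_9_17 = get_follower_count(42, now=17 * 60)  # stale but cached: 1120
at_9_20 = get_follower_count(42, now=20 * 60)  # expired, reloaded: 2620
```

Compared with updating the cache on every write, the 1,500 new followers cause no cache writes at all. The cost is up to 20 minutes of staleness.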

Each solution has its tradeoffs, which is something to consider for your environment when configuring a cache.

Thundering herd

The thundering herd problem refers to a situation in which too many callers try to update the cache at the same time, causing performance issues.

Using the same example from the previous section, let's assume you had 3,000 clients attempt to request data from the cache at exactly 9:20 am. All 3,000 processes would determine that the cache is expired and they’d attempt to rehydrate the cache simultaneously. Not only could this flood the database with 3,000 identical queries, but it's entirely unnecessary, since a single query would be enough to refresh the value.

A solution to this issue would be to configure what's called a “cache lease.”

Essentially, each client would have a unique identifier. When the first caller attempts to rehydrate the cache, the system will note its unique ID as the process responsible for updating that specific value. All other clients would receive the old value while the cache is being updated.

Once the value is updated, the lease is released until the value in the cache expires again, and future requests will receive the most up-to-date value.
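Here is one way the lease idea could look in Python. This is a single-process sketch, so the LeasedValue class, its method names, and the integer client IDs are all hypothetical. In a real distributed cache, taking the lease would need to be an atomic operation on the cache server (for instance, Redis can set a key only if it does not already exist).

```python
TTL_SECONDS = 20 * 60


class LeasedValue:
    """Hypothetical cache entry that allows only one refresher at a time."""

    def __init__(self, value, now):
        self.value = value
        self.expires_at = now + TTL_SECONDS
        self.lease_holder = None  # ID of the client currently refreshing, if any

    def read(self, client_id, now):
        """Return (value, should_refresh) for this caller."""
        if now < self.expires_at:
            return self.value, False       # still fresh
        if self.lease_holder is None:
            self.lease_holder = client_id  # first caller takes the lease
            return self.value, True
        return self.value, False           # someone else is refreshing: serve the old value

    def complete_refresh(self, client_id, new_value, now):
        """Store the new value and release the lease (lease holder only)."""
        if client_id != self.lease_holder:
            return False
        self.value = new_value
        self.expires_at = now + TTL_SECONDS
        self.lease_holder = None
        return True


entry = LeasedValue(1120, now=0)  # populated at 9:00 am
nine_twenty = 20 * 60

# 3,000 clients arrive at 9:20 am, before any refresh has finished.
results = [entry.read(client_id, nine_twenty) for client_id in range(3000)]
refreshers = [cid for cid, (_, refresh) in enumerate(results) if refresh]
values_served = {value for value, _ in results}

# Only the lease holder queries the database and writes the new value back.
entry.complete_refresh(refreshers[0], 2620, nine_twenty)
value_after, _ = entry.read(7, nine_twenty + 1)
```

In this run, only client 0 is asked to refresh, every caller is served 1,120 in the meantime, and reads after the refresh return 2,620.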