MySQL replication: Best practices and considerations — PlanetScale
Brian Morrison II · 2023-11-15 · via Blog — PlanetScale


MySQL offers a wide array of options to configure replication, but with all of those options, how can you be sure you are doing it right?

Replication is the first step to providing a higher level of availability to your MySQL database. A well-configured replication architecture can be the difference between your data being highly available and your MySQL setup becoming a management nightmare. At PlanetScale, we support hundreds of thousands of database clusters, all using replication to provide high availability, so we have a little bit of experience in this arena!

In this article, we’re going to explore some of the best practices when it comes to replication, both locally and across longer distances.

Use an active/passive configuration

When replicating with active/passive mode, one MySQL server acts as the source and all other servers are read-only replicas from that source.

In this configuration, the replicas can serve read-only queries, but all writes must be sent to the source. This spreads the read load across the replicas. Note, however, that with the default asynchronous replication mode (covered in more detail later), there may be some delay between when data is written to the source and when it is available on a replica. Keep this in mind when designing your application.
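As a rough illustration of the read/write split described above, here is a minimal Python sketch. The class name, the host names, and the `needs_fresh_read` flag are all hypothetical, and the statement classification is deliberately naive (it treats anything starting with `SELECT` or `SHOW` as a read), so take it as a sketch of the idea rather than a production router.

```python
import random


class ReplicationRouter:
    """Route writes to the source and reads to replicas (active/passive)."""

    def __init__(self, source, replicas):
        self.source = source
        self.replicas = list(replicas)

    def route(self, sql, needs_fresh_read=False):
        """Return the host that should run this statement."""
        # Naive classification: real routers must also handle cases such as
        # SELECT ... FOR UPDATE, which needs to run on the source.
        is_read = sql.lstrip().upper().startswith(("SELECT", "SHOW"))
        # Writes always go to the source. Reads that must see the caller's
        # latest write go there too, because async replicas may lag behind.
        if not is_read or needs_fresh_read or not self.replicas:
            return self.source
        return random.choice(self.replicas)


# Hypothetical host names, for illustration only.
router = ReplicationRouter("db-source", ["db-replica-1", "db-replica-2"])
print(router.route("INSERT INTO orders (id) VALUES (1)"))  # db-source
print(router.route("SELECT * FROM orders"))  # one of the replicas
print(router.route("SELECT * FROM orders WHERE id = 1", needs_fresh_read=True))  # db-source
```

The `needs_fresh_read` escape hatch is one simple way to account for replication delay: a read that must reflect a write the caller just made is sent to the source instead of a replica.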

The alternate configuration is active/active, which means multiple servers are actively used to read and write data.

Active/active might seem like a good idea since you have two servers to process write requests. However, each server also has to process the other's write workload, which makes write distribution more of an illusion. Failover between servers can appear seamless, but conflicts can easily occur because MySQL has no native conflict resolution logic. When conflicts do occur, neither node can be considered the source of truth for a rebuild without significant data loss.

We always recommend using an active/passive configuration for replication, and sharding if you need more throughput from your database.

Use GTIDs

Global Transaction Identifiers (GTIDs) are unique IDs attached to each transaction within a replicated environment.

By default, replicas will read the binary log file on a source database and track the processed records based on the position within that file. As transactions are processed by the replica, the position within that file continues to advance to indicate what has and has not been processed until this point. This system is relatively fragile as issues can occur if the source crashes and the logs need to be restored.

With GTIDs enabled, each transaction is assigned an ID so replicas can concretely determine if a transaction has been processed or not.

Each GTID is a UUID that represents the source server followed by an auto-incrementing ID, formatted as 14a54b2f-2ad0-43b6-b803-72b5d7151d3b:1. As transactions are handled on the source, the UUID remains the same but the ID continues to grow. When a replica processes a transaction from the source, the GTID or GTID range (formatted as 14a54b2f-2ad0-43b6-b803-72b5d7151d3b:1-10) is stored in the gtid_executed table.

This process dramatically reduces the risk of your data getting out of sync because a transaction was either skipped or applied more than once.
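To make the GTID format concrete, the following Python sketch parses a GTID set and checks whether a given transaction is covered by it. The function names are hypothetical, and the parser only assumes the plain `uuid:interval` notation shown above; it is not how MySQL itself stores or compares GTIDs.

```python
def parse_gtid_set(gtid_set):
    """Parse 'uuid:1-10:15,uuid2:7' into {uuid: [(start, end), ...]}."""
    result = {}
    for part in gtid_set.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # The UUID contains hyphens but no colons, so ':' is a safe separator.
        uuid, *intervals = part.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A bare number such as '15' is a one-transaction interval.
            ranges.append((int(start), int(end or start)))
    return result


def has_executed(gtid_set, gtid):
    """Return True if a single GTID ('uuid:n') is covered by the set."""
    uuid, _, txn = gtid.partition(":")
    n = int(txn)
    ranges = parse_gtid_set(gtid_set).get(uuid.lower(), [])
    return any(lo <= n <= hi for lo, hi in ranges)


executed = "14a54b2f-2ad0-43b6-b803-72b5d7151d3b:1-10"
print(has_executed(executed, "14a54b2f-2ad0-43b6-b803-72b5d7151d3b:7"))   # True
print(has_executed(executed, "14a54b2f-2ad0-43b6-b803-72b5d7151d3b:11"))  # False
```

Because every transaction carries its own identifier, a replica can answer "have I already applied this?" without knowing anything about file names or byte offsets in the source's binary log.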

Use the correct replication mode

MySQL has two different modes to replicate data, and you should know how they behave to ensure you have the best mode set for your environment.

By default, MySQL is configured with asynchronous replication. In this mode, transactions are sent to the source and then read and processed by each replica independently. The source does not verify that any replica in the environment has processed the transaction.

The other form of replication supported by MySQL is semi-synchronous replication, enabled as a plugin.

With semi-synchronous replication, transactions are still received by the source and processed by each replica. However, the source waits until at least one replica accepts the transaction before responding to the caller. The benefit is stronger data consistency, since at least two database servers in your environment will have the data, at the cost of some added response time. PlanetScale uses semi-synchronous replication for our databases within a given region.

With semi-sync mode enabled, there are some additional options also available that can be tweaked.

By default, the primary server will wait 10 seconds for a replica with semi-sync mode enabled to acknowledge the transaction. Once that timeout expires, the source stops waiting and falls back to asynchronous replication. This value can be modified, and if you rely on semi-sync for data consistency, you should set it high enough to guarantee consistency. We set the timeout value extremely high to ensure that the data in our databases is always consistent.
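The effect of the timeout can be shown with a toy model in Python. This is only a sketch under simplifying assumptions: the function name and its inputs are hypothetical, it models a single commit, and it is not MySQL's implementation. It does reflect the documented behavior that one acknowledgement is enough and that the source reverts to asynchronous replication when nobody acknowledges in time.

```python
def commit_wait(ack_times_ms, timeout_ms=10_000):
    """Toy model of how long a semi-sync source blocks on one commit.

    ack_times_ms: time each semi-sync replica takes to acknowledge,
                  or None if that replica never answers.
    Returns (mode, waited_ms).
    """
    answered = [t for t in ack_times_ms if t is not None]
    fastest = min(answered, default=None)
    if fastest is not None and fastest <= timeout_ms:
        # One acknowledgement is enough; the source does not wait for the rest.
        return ("semi-sync", fastest)
    # Nobody acknowledged in time: the source stops waiting and continues
    # asynchronously, so this commit is not guaranteed to be on any replica.
    return ("async-fallback", timeout_ms)


print(commit_wait([3, 40]))                          # ('semi-sync', 3)
print(commit_wait([None, 12_000]))                   # ('async-fallback', 10000)
print(commit_wait([None, 12_000], timeout_ms=60_000))  # ('semi-sync', 12000)
```

The last two calls show the trade-off: with the default timeout a slow replica causes a silent downgrade to asynchronous behavior, while a higher timeout keeps the guarantee at the price of a longer wait for the caller.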

It’s also worth mentioning that you can mix and match these two modes.

If you want to guarantee that one specific server always contains an up-to-date copy of your database, but also want additional replicas for more resiliency, you could configure one replica with semi-sync and one without. This means when data is written to the source, it will always make sure that the one server with semi-sync enabled has received that transaction before responding, and the other replicas in the cluster will catch up when they can. In a disaster scenario (discussed further down this article), this can help you easily identify the best candidate to recover from.

Understanding the difference between these two modes can help you decide which one makes more sense for your business.

Log storage

As mentioned earlier in this article, each replica in your environment will read from the binary logs of the source as their source of truth.

By default, these logs are stored on the same disk as the database. As you can imagine, a busy database can get bogged down when a single disk has to handle both workloads (manipulating the database and reading binary logs for replication). For that reason, the better approach is to store binary logs on a separate disk from the database.

This approach can also save you some money in cloud environments where free volumes have hard IOPS limits.
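One quick way to sanity-check this on a server is to compare the device IDs of the data directory and the binary log directory. The Python sketch below uses a hypothetical helper name, and the paths in the comment are assumptions; in MySQL the data location is set by `datadir` and the binary log base name by `log_bin`, so substitute whatever those point to on your system.

```python
import os


def on_same_device(path_a, path_b):
    """Return True if both paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Runnable demonstration: a directory is trivially on the same device as itself.
print(on_same_device(os.getcwd(), os.getcwd()))  # True

# On a database host you might instead compare (hypothetical paths):
#   on_same_device("/var/lib/mysql", "/var/lib/mysql-binlog")
```

A result of `True` means the two directories share a filesystem. A result of `False` only shows they are on different filesystems, which could still be partitions of one physical disk, so treat it as a hint rather than proof.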

Monitor replication

All infrastructure requires monitoring to catch issues proactively, and replication is no exception.

If left unmonitored, you’d have no idea whether or not your data is actually being replicated once it's configured. SolarWinds Database Performance (formerly VividCortex) is one of the more popular database monitoring solutions available and does support monitoring replication. At PlanetScale, we use Prometheus to monitor replication, along with other metrics, for the clusters we manage.

Regardless of the solution used, make sure that when issues do occur, the proper people are notified so that things can be fixed before they become a real issue.
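Whatever tool collects the data, the underlying check is usually simple. The Python sketch below evaluates one replica's status and returns a list of problems. The function name and the 30-second threshold are assumptions for illustration; the dictionary keys follow the column names of `SHOW REPLICA STATUS` in recent MySQL 8.0 releases (older versions use different names).

```python
def replication_alerts(status, max_lag_seconds=30):
    """Return a list of human-readable problems for one replica.

    status: dict shaped like one row of SHOW REPLICA STATUS.
    max_lag_seconds: illustrative threshold, tune it for your workload.
    """
    alerts = []
    if status.get("Replica_IO_Running") != "Yes":
        alerts.append("I/O thread is not running")
    if status.get("Replica_SQL_Running") != "Yes":
        alerts.append("SQL thread is not running")
    lag = status.get("Seconds_Behind_Source")
    if lag is None:
        # MySQL reports NULL here when the replica is not actively replicating.
        alerts.append("lag is unknown")
    elif lag > max_lag_seconds:
        alerts.append(f"lag of {lag}s exceeds {max_lag_seconds}s")
    return alerts


healthy = {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "Yes", "Seconds_Behind_Source": 2}
broken = {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "No", "Seconds_Behind_Source": None}
print(replication_alerts(healthy))  # []
print(replication_alerts(broken))   # ['SQL thread is not running', 'lag is unknown']
```

An empty list means nothing needs attention; anything else is what you would forward to whoever is on call.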

Create a failover strategy

Software and hardware will inevitably fail, so planning for failover ahead of time can minimize the pain when it does happen.

One of the major benefits of using replication is the increase in resiliency by having more than one server containing your data online at any given time. Your team should have a good strategy ready in case the primary data source fails. The following is an example of what an unplanned failover might look like:

  1. Take measures to ensure the downed source won't come back online. If it returns unexpectedly, it could cause replication issues.
  2. Identify the replica you want to choose as the new source and unset the read_only option. If semi-sync is used, this would be the replica you’ve configured with the plugin along with the source.
  3. Update your application to direct queries to the newly promoted source.
  4. Update the other replicas to start replicating from the new source.
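Steps 2 and 4 above can be sketched in Python as a small planner that picks the new source and lists the statements to run on each server. This is an illustrative sketch, not a failover tool: the function name and the dictionary keys are hypothetical, the SQL uses the MySQL 8.0.23+ syntax and assumes GTID auto-positioning is in use, and credentials and TLS options are omitted. Fencing the old source (step 1) and repointing the application (step 3) are outside its scope.

```python
def plan_failover(replicas):
    """Pick a new source and list the statements to run on each server.

    replicas: list of dicts with hypothetical keys 'host', 'semi_sync' (bool)
              and 'executed' (number of transactions applied so far).
    Returns (new_source_host, {host: [sql, ...]}).
    """
    # Prefer a semi-sync replica, then whichever has applied the most.
    new_source = max(replicas, key=lambda r: (r["semi_sync"], r["executed"]))
    plan = {
        new_source["host"]: [
            "STOP REPLICA;",
            "RESET REPLICA ALL;",
            "SET GLOBAL super_read_only = OFF;",
            "SET GLOBAL read_only = OFF;",
        ]
    }
    for r in replicas:
        if r is new_source:
            continue
        plan[r["host"]] = [
            "STOP REPLICA;",
            "CHANGE REPLICATION SOURCE TO "
            f"SOURCE_HOST = '{new_source['host']}', SOURCE_AUTO_POSITION = 1;",
            "START REPLICA;",
        ]
    return new_source["host"], plan


replicas = [
    {"host": "replica-a", "semi_sync": False, "executed": 1050},
    {"host": "replica-b", "semi_sync": True, "executed": 1048},
]
host, plan = plan_failover(replicas)
print(host)  # replica-b
for server, statements in plan.items():
    print(server, statements)
```

Note that `replica-b` wins even though `replica-a` has applied more transactions. In this sketch the semi-sync replica is preferred because it is the one the source was waiting on, which matches the guidance in step 2.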

Considerations when replicating to other data centers

When building in the cloud, you’ll have the ability to deploy services to almost anywhere in the world, and this includes databases too.

Each cloud provider is made up of a number of geographical regions. Within those regions are multiple data centers that are close enough to be considered in the same region, but far enough to survive many disasters. These data centers are known as availability zones or AZs for short.

If possible, replicating your database to other physical locations is definitely a best practice, but it comes with some additional considerations.

The time it takes to send data over the wire is one such consideration.

Since the network traffic needs to travel a greater distance, replicating between locations introduces additional latency between replicas. Luckily, the latency between AZs is often not very high. In fact, AWS claims single-digit millisecond latency between availability zones in the same region.

Regions are much farther from one another and often have significant latency.

At the time of this writing, cloudping.co reported that the latency between us-east-1 and us-west-1 is over 60ms. Replication in itself has a bit of a delta between the time that data is written to the source and the time it is written to a replica, known as replication lag. This is exacerbated when replicating across longer distances.

As such, replicating across regions should be done in asynchronous mode so as to not cause unnecessary delay for the application making requests.
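A back-of-the-envelope calculation in Python shows why. If every commit had to wait for an acknowledgement from a replica in another region, a single connection issuing commits one after another could never exceed one commit per network round trip. The function name is hypothetical, and the sketch assumes the quoted latency is a round-trip figure and ignores disk and processing time.

```python
def max_serial_commits_per_second(round_trip_ms):
    """Upper bound on back-to-back commits from one connection when every
    commit must wait one network round trip for a replica acknowledgement."""
    return 1000.0 / round_trip_ms


# Cross-region, using the roughly 60 ms figure quoted above.
print(round(max_serial_commits_per_second(60), 1))  # 16.7
# Between availability zones, assuming a 2 ms round trip.
print(round(max_serial_commits_per_second(2), 1))   # 500.0
```

Concurrent connections can commit in parallel, so total throughput can be higher than this bound, but each individual caller would still pay the full round trip on every write.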

Conclusion

Knowing the available options when configuring replication is not enough. Understanding the best configuration for these options can make a tremendous difference in how well your MySQL cluster operates and serves data.

If this article has helped you better understand replication and how it should be configured, let us know by sharing it on X and tagging @planetscale!