MySQL InnoDB Cluster is the High Availability solution for MySQL. It delivers automatic failover when a member fails and guarantees zero data loss (RPO=0).
RPO: the Recovery Point Objective is the maximum amount of data, measured as a period of time before a disruption, that can be lost without exceeding the tolerance set by the Business Continuity Plan.
Example: our business architecture requires RPO=2 minutes. This means that in case of a major issue, up to 2 minutes of data can be lost.
However, and we saw this recently in Europe, an entire data center can “disappear” instantaneously… So it’s also important to consider a Disaster Recovery plan.
The best option, of course, is to set up this DR site in a different region. However, the latency between the two sites can then be too high for an InnoDB Cluster (Group Replication) to perform well under a heavy workload.
In this case, Asynchronous Replication is still the best option. Using asynchronous replication means that zero data loss cannot be guaranteed between the source and the replica (RPO>0), but this is usually acceptable for Disaster Recovery, as the lag is usually measured in seconds.
Let’s see how to create such a DR solution. The solution presented here is based on MySQL 8.0.23.
Prepare the production
On the production side (the MySQL InnoDB Cluster), a user for asynchronous replication must be created and granted some privileges:
mysql-node1> CREATE USER 'repl'@'%' IDENTIFIED BY 'password' REQUIRE SSL;
mysql-node1> GRANT REPLICATION SLAVE, BACKUP_ADMIN, CLONE_ADMIN ON *.* TO 'repl'@'%';
mysql-node1> GRANT SELECT ON performance_schema.* TO 'repl'@'%';
These statements must be entered on the Primary member.
Please pay attention to some of these privileges. CLONE_ADMIN, together with BACKUP_ADMIN on the donor side, is what lets us use the wonderful CLONE plugin. The last privilege (SELECT ON performance_schema.*) is usually not required for asynchronous replication, but we will see later why it’s very important in our case.
Prepare the Asynchronous Replica
After installing MySQL on the server we will use for DR, we need to prepare it to act as the asynchronous replica:
mysql-replica> SET PERSIST_ONLY gtid_mode=on;
mysql-replica> SET PERSIST_ONLY enforce_gtid_consistency=true;
mysql-replica> RESTART;
mysql-replica> INSTALL PLUGIN clone SONAME 'mysql_clone.so';
mysql-replica> SET GLOBAL clone_valid_donor_list='mysql-node1:3306';
mysql-replica> CLONE INSTANCE FROM repl@"mysql-node1":3306 IDENTIFIED BY 'password';
mysql-replica> SET PERSIST server_id=round(uuid_short()/1000000000);
mysql-node1 is the Primary member of the production MySQL InnoDB Cluster.
server_id must be unique; you can use any value you want. In this example I use a random one generated by uuid_short().
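Before going further, it can be worth checking that the clone completed and that the settings above are in place. The following is only a minimal sketch, assuming it is run on the replica once the server is back up after the clone; it is not part of the original procedure.

```sql
-- Run on the replica after CLONE INSTANCE has finished and the server has restarted.
-- STATE is expected to read 'Completed'; a non-zero ERROR_NO points at a failed clone.
SELECT STATE, SOURCE, ERROR_NO, ERROR_MESSAGE, GTID_EXECUTED
  FROM performance_schema.clone_status;

-- Confirm the GTID settings and the server_id configured above.
SELECT @@global.gtid_mode,
       @@global.enforce_gtid_consistency,
       @@global.server_id;
```

A non-empty GTID_EXECUTED is what allows the channel to be created later with source_auto_position=1.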
Asynchronous Connection Failover
If you follow my blog, you know that I used to advise going through MySQL Router when setting up asynchronous replication from an InnoDB Cluster. This is not required anymore!
Now the best solution is to use the Asynchronous Connection Failover mechanism.
- https://dev.mysql.com/doc/refman/8.0/en/replication-asynchronous-connection-failover.html
- https://mysqlhighavailability.com/automatic-asynchronous-replication-connection-failover/
This is what we will deploy now. The first thing to do is to retrieve the Group Replication UUID from the production MySQL InnoDB Cluster:
mysql-nodeX> show global variables like 'group_replication_group_name'\G
*************************** 1. row ***************************
Variable_name: group_replication_group_name
Value: b2b2b6de-9ad7-11eb-88a0-020017018a7b
And on the future asynchronous replica used as DR, we can set up the automated asynchronous failover:
mysql-replica> SELECT asynchronous_connection_failover_add_managed("async_from_idc", "GroupReplication", "b2b2b6de-9ad7-11eb-88a0-020017018a7b", "mysql-node1", 3306, "", 80, 60);
The call to this UDF will return the following text:
The UDF asynchronous_connection_failover_add_managed() executed successfully.
Setup & Start Asynchronous Replication Channel
We are now ready to set up the asynchronous replication channel and start it:
mysql-replica> CHANGE REPLICATION SOURCE TO
    source_host='mysql-node1',
    source_port=3306,
    source_user='repl',
    source_password='password',
    source_auto_position=1,
    source_ssl=1,
    source_retry_count=3,
    source_connect_retry=10,
    source_connection_auto_failover=1
    FOR CHANNEL 'async_from_idc';
async_from_idc is just a name; you can use whatever you want, but it must be the same as the one defined in the asynchronous connection failover configuration.
And we can start replication:
mysql-replica> start replica for channel 'async_from_idc';
We are done!
Now, if the production MySQL InnoDB Cluster promotes a new Primary node, the asynchronous replica will also automatically change its source! \o/
If you need to use your DR site for production, it’s very easy to “upgrade” it to an InnoDB Cluster once it has been promoted: stop and reset the asynchronous replication channel first.
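To see which source the channel is currently connected to, the replication configuration and status tables can be joined. This is a hedged sketch, assuming the channel name async_from_idc used in this post; run it on the replica before and after a Primary change to compare the HOST value.

```sql
-- Run on the replica. HOST is the source the channel currently points to.
-- SOURCE_CONNECTION_AUTO_FAILOVER should be '1' for this channel.
SELECT c.CHANNEL_NAME,
       c.HOST,
       c.PORT,
       c.SOURCE_CONNECTION_AUTO_FAILOVER,
       s.SERVICE_STATE,
       s.LAST_ERROR_NUMBER,
       s.LAST_ERROR_MESSAGE
  FROM performance_schema.replication_connection_configuration AS c
  JOIN performance_schema.replication_connection_status AS s
    ON s.CHANNEL_NAME = c.CHANNEL_NAME
 WHERE c.CHANNEL_NAME = 'async_from_idc';
```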
Setting up an asynchronous replication channel from one MySQL InnoDB Cluster to another MySQL InnoDB cluster is not supported and does not work at this time.
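Stopping and resetting the channel on the promoted DR server could look like the sketch below. It assumes the channel name and group UUID used in this post. Removing the managed group from the failover list is my own assumption of a sensible cleanup step and is not described above.

```sql
-- Run on the DR server once it has been promoted.
STOP REPLICA FOR CHANNEL 'async_from_idc';

-- Assumption: also remove the managed group from the failover source list.
SELECT asynchronous_connection_failover_delete_managed(
         'async_from_idc',
         'b2b2b6de-9ad7-11eb-88a0-020017018a7b');

-- Remove the channel configuration entirely.
RESET REPLICA ALL FOR CHANNEL 'async_from_idc';
```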
Observability
It’s also possible to verify the setup using performance_schema tables:
mysql-replica> select * from performance_schema.replication_asynchronous_connection_failover_managed\G
*************************** 1. row ***************************
 CHANNEL_NAME: async_from_idc
 MANAGED_NAME: b2b2b6de-9ad7-11eb-88a0-020017018a7b
 MANAGED_TYPE: GroupReplication
CONFIGURATION: {"Primary_weight": 80, "Secondary_weight": 60}

mysql-replica> select * from performance_schema.replication_asynchronous_connection_failover\G
*************************** 1. row ***************************
     CHANNEL_NAME: async_from_idc
             HOST: mysql-node1
             PORT: 3306
NETWORK_NAMESPACE:
           WEIGHT: 80
     MANAGED_NAME: b2b2b6de-9ad7-11eb-88a0-020017018a7b
*************************** 2. row ***************************
     CHANNEL_NAME: async_from_idc
             HOST: mysql-node2
             PORT: 3306
NETWORK_NAMESPACE:
           WEIGHT: 60
     MANAGED_NAME: b2b2b6de-9ad7-11eb-88a0-020017018a7b
*************************** 3. row ***************************
     CHANNEL_NAME: async_from_idc
             HOST: mysql-node3
             PORT: 3306
NETWORK_NAMESPACE:
           WEIGHT: 60
     MANAGED_NAME: b2b2b6de-9ad7-11eb-88a0-020017018a7b
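Since the RPO of this DR setup depends on how far the replica is behind, it can also help to look at the apply delay. The query below is a rough sketch, assuming the channel name used in this post and reasonably synchronized clocks on both sites. It only reports the delay of the last applied transaction per applier worker, so it says nothing useful when replication is stopped.

```sql
-- Run on the replica. Compares the commit time on the original source with
-- the time the replica finished applying the same transaction.
SELECT CHANNEL_NAME,
       WORKER_ID,
       LAST_APPLIED_TRANSACTION,
       TIMESTAMPDIFF(MICROSECOND,
                     LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP,
                     LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP) / 1000000
         AS apply_delay_seconds
  FROM performance_schema.replication_applier_status_by_worker
 WHERE CHANNEL_NAME = 'async_from_idc';
```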
Troubleshooting
As I wrote earlier in this post, the privileges of the replication user are very important. If the user is not allowed to select from performance_schema tables, you will see the following error in the error log and the failover won’t succeed:
The IO thread failed to detect if the source belongs to the group majority on the source (host:mysql-node1 port:3306 network_namespace:) for channel 'async_from_idc'.
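If this error shows up, a first check is the set of privileges held by the replication user. A minimal sketch, assuming the 'repl'@'%' account created earlier in this post:

```sql
-- Run on the Primary member of the production cluster.
SHOW GRANTS FOR 'repl'@'%';

-- If SELECT on performance_schema is missing, add it
-- (same statement as in the preparation step).
GRANT SELECT ON performance_schema.* TO 'repl'@'%';
```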
Don’t forget that in MySQL 8.0 you can now also access the error log from the performance_schema.error_log table!
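For example, the most recent messages mentioning the channel could be retrieved like this. It is only a sketch: the filter on the channel name is an assumption about what is convenient to search for, not something the error log requires.

```sql
-- Run on the replica. Shows the latest error log entries that mention the channel.
SELECT LOGGED, PRIO, ERROR_CODE, SUBSYSTEM, DATA
  FROM performance_schema.error_log
 WHERE DATA LIKE '%async_from_idc%'
 ORDER BY LOGGED DESC
 LIMIT 10;
```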