Learn how MySQL’s online DDL operations can affect your data and explore best practices to protect your tables during ALTER TABLE and OPTIMIZE TABLE operations.
Azure Database for MySQL - Flexible Server is built on the open-source MySQL database engine, and the service supports MySQL 8.0 and newer versions. This means that users can take advantage of the flexibility and advanced capabilities of MySQL’s latest features while benefitting from a fully managed database service.
While newer versions and features can provide a lot of value, recently identified issues in MySQL 8.0 and later make it important to be aware of the risks that certain operations carry, particularly online schema changes.
Issues with data loss and duplicate keys with Online DDL
Online Data Definition Language (DDL) operations are a powerful feature in MySQL, enabling schema changes like ALTER TABLE or OPTIMIZE TABLE with minimal impact on table availability. These operations are designed to reduce downtime by allowing concurrent reads and writes during schema modifications, making them an essential tool for managing active databases efficiently.
However, a recent post on the Percona blog, "Who Ate My MySQL Table Rows?", highlights critical risks associated with MySQL 8.0.x versions after 8.0.27 and all versions beyond 8.4.y. Specifically, the open-source INPLACE algorithm, commonly used for online schema changes, can lead to data loss and duplicate key errors under certain conditions. These issues stem from limitations in the INPLACE algorithm during ALTER TABLE and OPTIMIZE TABLE operations, and they can compromise data integrity and system reliability.
These risks are called out in the following bug reports:
- Bug #115511: Data loss during online ALTER operations with concurrent DML
- Bug #115608: Duplicate key errors caused by online ALTER operations
Documented issues related to the INPLACE algorithm (used for online DDL) can cause:
- Data Loss: Rows may be accidentally deleted or become inaccessible.
- Duplicate Keys: Indexes can end up with duplicate entries, leading to data consistency issues and potential replication errors.
Problems arise when INPLACE operations, such as ALTER TABLE or OPTIMIZE TABLE, run concurrently with:
- DML operations (INSERT, UPDATE, DELETE): Modifications to table data during the rebuild.
- Purge activity: Background cleanup of old row versions in InnoDB.
These scenarios can lead to anomalies resulting from race conditions and incomplete synchronization between concurrent activities.
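As a rough illustration of the pattern described above (not a reproduction of the bugs), the sketch below overlaps an INPLACE rebuild in one session with ordinary writes in another. The `orders` table and its columns are hypothetical and exist only for this example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  note        VARCHAR(255)
) ENGINE=InnoDB;

-- Session A: a table rebuild that explicitly requests the INPLACE algorithm.
ALTER TABLE orders ENGINE=InnoDB, ALGORITHM=INPLACE;

-- Session B: DML issued while Session A's rebuild is still in progress.
INSERT INTO orders (customer_id, note) VALUES (42, 'created during rebuild');
UPDATE orders SET note = 'updated during rebuild' WHERE customer_id = 42;
DELETE FROM orders WHERE customer_id = 42;
```

Whether a given overlap actually triggers an anomaly depends on timing and on background purge activity, so this shows only the kind of workload to watch for.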
Impact on Azure Database for MySQL - Flexible Server Customers
For Azure Database for MySQL - Flexible Server customers running the affected MySQL versions (8.0.x versions after 8.0.27 and all versions beyond 8.4.y), this issue is particularly critical as it affects:
- Data Integrity: When schema changes such as ALTER TABLE or OPTIMIZE TABLE use the INPLACE algorithm, rows may be lost or duplicated if the operation runs concurrently with DML activity (e.g., INSERT, UPDATE, or DELETE) or background purge tasks. This can compromise the accuracy and reliability of the database, potentially leading to incorrect query results or the loss of critical business data.
- Replication Instability: Duplicate keys or missing rows can interrupt replication processes, which rely on a consistent data stream across the primary and replica servers. These issues can arise when there are concurrent insertions into the table during schema changes, leading to data inconsistencies between the primary and replicas. Such inconsistencies may result in replication lag, errors, or even a complete breakdown of high-availability setups, requiring manual intervention to restore synchronization.
- Operational Downtime: Resolving these issues often involves manually syncing data or restoring backups. These recovery efforts can be time-consuming and disruptive, leading to extended downtime for applications and potential business impact.
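One way to look for the symptoms listed above is to compare a table between the primary and a replica and to scan for repeated key values. This is a minimal sketch, assuming the hypothetical `orders` table with primary key `id`; it is not an official detection procedure, and the checksum comparison is only meaningful when no writes are in flight and the replica has caught up.

```sql
-- Run on both the primary and the replica, then compare the two results.
CHECKSUM TABLE orders;

-- List key values that appear more than once (should return no rows).
SELECT id, COUNT(*) AS copies
FROM orders
GROUP BY id
HAVING COUNT(*) > 1;
```

Both statements read the whole table, so on large tables they are best run during a quiet period.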
Recommendations for safe schema changes on Azure Database for MySQL flexible servers
To minimize the risks of data loss and duplicate keys while making schema changes, follow these best practices:
- Set old_alter_table=ON to default to the COPY algorithm
Enable the old_alter_table server parameter so that ALTER TABLE operations without a specified ALGORITHM default to the COPY algorithm instead of INPLACE. This reduces the risk for users who do not explicitly specify the ALGORITHM in their commands. Learn more on how to configure server parameters in Azure Database for MySQL.
- Avoid using ALGORITHM=INPLACE
Do not explicitly use ALGORITHM=INPLACE for ALTER TABLE commands, as it increases the risk of data loss or duplicate keys.
- Back up your data before schema changes
Always perform a full on-demand backup of your server before executing schema changes. This precaution ensures data recoverability in case of unexpected issues. Learn more on how to take full on-demand backups for your server.
- Avoid concurrent DML during schema changes
Schedule schema changes like ALTER TABLE and OPTIMIZE TABLE during application maintenance windows when no concurrent write activity occurs. This minimizes race conditions and synchronization conflicts.
- Use External Tools for Safer Online Schema Changes
Consider using external tools like pt-online-schema-change to modify table definitions without blocking concurrent changes. These tools enable you to make schema changes with minimal impact on availability and performance. Learn more about pt-online-schema-change.
Disclaimer: The pt-online-schema-change tool is not managed or supported by Microsoft; use it at your discretion.
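The first two recommendations above can be sketched in SQL as follows. This is a minimal example, assuming a hypothetical `orders` table and a hypothetical `shipped_at` column, and assuming your account is permitted to change the session value of old_alter_table. On Azure Database for MySQL, the server-wide value is managed through server parameters rather than set from a client session.

```sql
-- Check the current value of old_alter_table.
SHOW VARIABLES LIKE 'old_alter_table';

-- Override it for the current connection only, so that ALTER TABLE
-- statements without an ALGORITHM clause do not use INPLACE.
SET SESSION old_alter_table = ON;

-- Alternatively, request the COPY algorithm explicitly per statement.
ALTER TABLE orders
  ADD COLUMN shipped_at DATETIME NULL,
  ALGORITHM=COPY;

-- Rebuild a table with the COPY algorithm.
ALTER TABLE orders ENGINE=InnoDB, ALGORITHM=COPY;
```

Keep in mind that the COPY algorithm does not permit concurrent writes to the table while it runs, so these statements still belong in a maintenance window.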
Mitigation plans
To address these risks, we’re actively working to integrate the necessary fixes to ensure a more robust and reliable experience for our customers.
- New Servers Fully Secured by End of February 2025
All new Azure Database for MySQL - Flexible Server instances created after March 1, 2025 will include the latest fixes, ensuring that schema changes are safeguarded against data loss and duplicate key risks.
- Rollout for Existing Servers
For existing servers, we will roll out patches during upcoming maintenance windows by the end of Q1 of calendar year 2025. We recommend monitoring your Azure portal for scheduled maintenance windows and Release notes for announcements about critical updates and patches.
- Priority updates available upon request
If you require an urgent update outside of the scheduled maintenance windows, you can contact Azure Support. Provide the necessary server details and an appropriate maintenance window, and our team will work with you to prioritize the patching process. Note that priority patching will be available by February 2025. We recommend monitoring Release notes for announcements about critical updates and patches.
Conclusion
Safely managing schema changes on MySQL servers requires understanding the risks associated with online DDL operations, such as potential data loss and duplicate keys.
To help safeguard data integrity and maintain server stability, implement the best practices above: default to the COPY algorithm, use offline operations where feasible, and schedule changes during low-activity periods.
Fixes are expected by the end of February 2025, and new Azure Database for MySQL flexible servers will be fully protected against these bugs. We will apply updates to existing servers during maintenance windows in Q1 2025.
Following the recommendations above will help ensure that you can confidently make schema changes while preserving the reliability and performance of your server.