Online Data Definition Language (DDL) is a crucial feature for modern databases. It allows schema changes without significant downtime or locking that could disrupt database operations. These changes are carried out while the database remains available for reads and writes.

Online DDL is particularly important for applications requiring high availability and where maintenance windows are not feasible or are too disruptive. This capability is crucial for large, operational databases that need to scale, evolve, and undergo schema changes without affecting the user experience or service availability.

In this blog post, we’ll explore what online DDL is and then dive into how TiDB, a distributed SQL database, efficiently handles online DDL operations.

What is Online DDL?

Online DDL performs schema changes such as adding columns, creating indexes, or modifying table structures while minimizing the impact on the database’s availability and performance. Traditional DDL operations in relational database management systems (RDBMSs) often require locking tables, which can lead to downtime or slow performance, but online DDL aims to mitigate these issues.

A distributed SQL database such as TiDB implements online DDL using the protocol introduced in the Google F1 paper. In short, each DDL change is broken into smaller transition stages that prevent data corruption, and the system tolerates an individual node lagging behind by at most one schema version at a time.
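
The staged transitions can be sketched as a small state machine. This is a minimal illustration, not TiDB code: the state names follow the F1 paper (which also runs a data reorganization step before the final state), while `SchemaState`, `next_state`, `cluster_is_safe`, and `roll_out` are hypothetical names chosen for this example.

```python
from enum import IntEnum


class SchemaState(IntEnum):
    """States a new schema element (for example, an index) moves through."""
    ABSENT = 0       # the element does not exist yet
    DELETE_ONLY = 1  # entries are removed on delete, never added or read
    WRITE_ONLY = 2   # entries are maintained by all writes, but not read
    PUBLIC = 3       # the element is fully usable


def next_state(state: SchemaState) -> SchemaState:
    """Advance exactly one step; intermediate states are never skipped."""
    return SchemaState(min(state + 1, SchemaState.PUBLIC))


def cluster_is_safe(node_states: list) -> bool:
    """Nodes may disagree, but only between two adjacent states."""
    return max(node_states) - min(node_states) <= 1


def roll_out(node_count: int) -> list:
    """Move every node to PUBLIC one step at a time, checking the invariant."""
    nodes = [SchemaState.ABSENT] * node_count
    target = SchemaState.ABSENT
    while target != SchemaState.PUBLIC:
        target = next_state(target)
        for i in range(node_count):  # nodes pick up the new version one by one
            nodes[i] = target
            assert cluster_is_safe(nodes)
    return nodes
```

Because every step is small, a node that is one version behind never produces data that a node on the newer version cannot handle. Jumping straight from absent to public would break that guarantee, which is what `cluster_is_safe` rejects.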

Key Aspects of Online DDL

  • Non-blocking Operations: Avoids long-term locks on tables.
  • Minimal Downtime: Keeps the database available during schema changes.
  • Concurrency: Allows simultaneous DML (Data Manipulation Language) operations.
  • Phased Approach: Performs changes in multiple stages to reduce impact.

How MySQL Handles Online DDL

MySQL, particularly with the InnoDB storage engine, has robust support for online DDL. Here’s how MySQL addresses online DDL:

Non-blocking Schema Changes

MySQL’s InnoDB engine allows many schema changes without blocking access to the table. For instance, adding a column or creating an index can be done without locking the entire table.

Figure: Non-blocking schema changes with online DDL in MySQL.

  • Metadata Update: MySQL updates the table’s metadata to include the new column.
  • Background Processing: The InnoDB engine performs the addition of the new column in the background. Existing rows are updated incrementally to include the new column without blocking read or write operations.
  • Concurrent Operations: While the background task is running, the table remains fully accessible for other operations. New data includes the new column, and existing data is updated gradually.

Online Index Creation

MySQL supports creating indexes concurrently, meaning the table remains available for reads and writes while the index builds.

Figure: Online index creation with online DDL in MySQL.

  • Metadata Preparation: MySQL updates the table metadata to include the index definition.
  • Index Build Process: The InnoDB engine builds the index in the background. It reads existing table data and populates the index incrementally.
  • Minimal Disruption: During the index build, the table remains accessible. Queries and updates proceed without being blocked, and the new index is used only after it is fully built and activated.
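
In MySQL, an `ALTER TABLE` statement can state its online requirements explicitly through the `ALGORITHM` and `LOCK` clauses. The statement then fails with an error if the server cannot honor them, rather than quietly taking a stronger lock. The helper below is a minimal sketch that only builds the SQL text. The function name `online_alter` and the table, column, and index names are hypothetical, and identifiers are interpolated without escaping, so it is for illustration only.

```python
def online_alter(table: str, action: str,
                 algorithm: str = "INPLACE", lock: str = "NONE") -> str:
    """Build an ALTER TABLE statement with explicit ALGORITHM and LOCK clauses."""
    if algorithm not in {"DEFAULT", "INPLACE", "COPY"}:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    if lock not in {"DEFAULT", "NONE", "SHARED", "EXCLUSIVE"}:
        raise ValueError(f"unsupported lock level: {lock}")
    return f"ALTER TABLE {table} {action}, ALGORITHM={algorithm}, LOCK={lock};"


# Hypothetical table and column names, used only to show the output shape.
add_column_sql = online_alter("orders", "ADD COLUMN note VARCHAR(255)")
add_index_sql = online_alter("orders", "ADD INDEX idx_customer (customer_id)")
```

Requesting `LOCK=NONE` is a useful guard in deployment scripts: a change that would block writes is rejected up front and can be rescheduled or restructured.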

Phased Approach

MySQL uses a phased approach for applying schema changes, ensuring minimal disruption and maintaining performance:

  • Preparation Phase:
    • Metadata is updated and initial preparations are made for the schema change.
    • This phase ensures that the change propagates across the system safely.
  • Reorganization Phase:
    • The actual changes are applied in the background. For example, when adding a column, the data is backfilled incrementally.
    • This phase allows concurrent read and write operations to continue without significant impact.
  • Commit Phase:
    • The schema change is finalized and made active.
    • The metadata is updated to reflect the new schema, making the changes visible and operational.

These capabilities ensure that MySQL databases can remain available and performant during schema changes, making it a robust choice for many applications. By understanding the underlying processes, users can better appreciate the efficiency and reliability of MySQL’s online DDL operations.

How TiDB Handles Online DDL

Online DDL is essential for making schema changes without downtime or disruptions, because it keeps databases available and performant during updates. We’ll now explore how TiDB manages online DDL, looking at its three-phase schema change protocol and at examples of common DDL operations.

Three-Phase Schema Change Protocol

TiDB employs a three-phase approach to handle online DDL operations:

1. Prepare Phase:

  • TiDB updates metadata and prepares for the schema change.
  • Ensures the change propagates safely across the cluster.

2. Reorganization Phase:

  • The actual changes are applied in the background.
  • For example, building a new index or adding a column incrementally.
  • Allows concurrent DML operations, ensuring the database remains fully operational.

3. Commit Phase:

  • The schema change is finalized and made active.
  • Metadata is updated to reflect the new schema, making changes visible.
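
The overall control flow of the three phases can be sketched as a small driver. This is a simplified model under stated assumptions, not TiDB’s implementation: `run_online_ddl` and its parameters are hypothetical, and `between_batches` stands in for application DML that keeps running while the background work proceeds.

```python
from typing import Callable, Iterable


def run_online_ddl(prepare: Callable[[], None],
                   batches: Iterable[list],
                   reorganize: Callable[[list], None],
                   commit: Callable[[], None],
                   between_batches: Callable[[], None] = lambda: None) -> list:
    """Drive a schema change through prepare, reorganize, and commit."""
    log = []
    prepare()                      # metadata change only, no data is touched
    log.append("prepare")
    for batch in batches:          # background work in small units
        reorganize(batch)
        log.append(f"reorganize:{len(batch)}")
        between_batches()          # other reads and writes interleave here
    commit()                       # the new schema becomes visible
    log.append("commit")
    return log


rows = list(range(10))
processed = []
log = run_online_ddl(
    prepare=lambda: None,
    batches=[rows[i:i + 4] for i in range(0, len(rows), 4)],
    reorganize=processed.extend,
    commit=lambda: None,
)
```

The point of the batching is that no single step holds the table for long, so ordinary traffic is never blocked for the duration of the whole change.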

Examples of Online DDL in TiDB

In this section, we’ll explore some common examples of online DDL operations in TiDB.

Adding a Column

When adding a column in TiDB, the process avoids table locks and ensures continuous availability. Here’s how it works technically:

  • Prepare Phase: TiDB updates the table’s metadata to include the new column definition. This change propagates across all nodes in the cluster to ensure consistency.
  • Reorganization Phase: The new column is added in the background. TiDB uses a background task to backfill the column data for existing rows incrementally. During this phase, the table remains accessible for reads and writes, with new writes including the new column data.
  • Commit Phase: Once the backfill is complete, TiDB finalizes the schema change. The metadata is updated to reflect the new schema, making the new column fully active and visible for all subsequent operations.

This approach allows TiDB to add a column without long-term locking, ensuring minimal disruption to ongoing operations.

Figure: Adding a column in TiDB with online DDL.
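
The toy model below illustrates why the table stays readable and writable while a column is being added: rows that have not been backfilled yet fall back to the default recorded in the metadata. It is a sketch of the phases as described in this post. The `Table` class, its methods, and the sample data are hypothetical, and a real engine may avoid rewriting existing rows at all.

```python
class Table:
    """In-memory table: rows are dicts, added columns map to a default."""

    def __init__(self, rows: dict):
        self.rows = rows
        self.added_columns = {}

    def add_column_prepare(self, name: str, default) -> None:
        # Prepare: only metadata changes, so no row is touched or locked.
        self.added_columns[name] = default

    def read(self, row_id) -> dict:
        # Rows not yet backfilled take the default from the metadata.
        return {**self.added_columns, **self.rows.get(row_id, {})}

    def write(self, row_id, **values) -> None:
        # New writes carry the new column immediately.
        self.rows[row_id] = {**self.read(row_id), **values}

    def backfill(self, batch_size: int):
        # Reorganization: pause after each batch so other work can interleave.
        ids = sorted(self.rows)
        for start in range(0, len(ids), batch_size):
            for row_id in ids[start:start + batch_size]:
                self.rows[row_id] = self.read(row_id)
            yield


table = Table({1: {"id": 1}, 2: {"id": 2}, 3: {"id": 3}})
table.add_column_prepare("status", "new")
job = table.backfill(batch_size=2)
next(job)                        # first batch covers rows 1 and 2
table.write(3, status="paid")    # concurrent write before row 3 is backfilled
for _ in job:                    # remaining batch
    pass
```

The concurrent write to row 3 survives the backfill because the backfill only fills in values that are missing and never overwrites data a writer has already supplied.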

Creating an Index

Creating an index in TiDB is another common operation that benefits from online DDL. The process ensures that queries and updates continue smoothly:

  • Prepare Phase: TiDB prepares for the index creation by updating metadata and ensuring the change is distributed across the cluster.
  • Reorganization Phase: The index is built in the background. TiDB processes existing data to populate the index incrementally, allowing read and write operations to proceed concurrently.
  • Commit Phase: The new index is finalized and made active. The metadata is updated, and the index becomes available for query optimization, improving query performance without downtime.

Figure: Creating an index in TiDB.
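
The interaction between the background build and concurrent writes can be sketched as follows. This is a simplified single-process model, not TiDB’s implementation: `build_index_online` and the sample data are hypothetical. Writes that arrive during the build maintain the index themselves, and the backfill is idempotent, so a row handled by both paths ends up with exactly one correct entry.

```python
def build_index_online(rows: dict, column: str, concurrent_writes: list,
                       batch_size: int = 2) -> dict:
    """Backfill an index (value -> set of primary keys) while writes arrive."""
    index = {}

    def put(pk, value):
        index.setdefault(value, set()).add(pk)   # adding twice is harmless

    def remove(pk, value):
        pks = index.get(value, set())
        pks.discard(pk)                          # no-op if never indexed
        if not pks:
            index.pop(value, None)

    def apply_write(pk, new_row):
        if pk in rows:
            remove(pk, rows[pk][column])         # drop the stale entry
        rows[pk] = new_row
        put(pk, new_row[column])                 # the write maintains the index

    pending = list(concurrent_writes)
    keys = sorted(rows)                          # rows present when the build starts
    for start in range(0, len(keys), batch_size):
        for pk in keys[start:start + batch_size]:
            put(pk, rows[pk][column])            # backfill reads the current value
        if pending:
            apply_write(*pending.pop(0))         # one write lands between batches
    for write in pending:
        apply_write(*write)
    return index


rows = {1: {"city": "Oslo"}, 2: {"city": "Lima"}, 3: {"city": "Oslo"}}
writes = [(3, {"city": "Lima"}), (4, {"city": "Rome"})]
index = build_index_online(rows, "city", writes)
```

Row 3 is updated before the backfill reaches it and row 4 is inserted after the build has started, yet both end up indexed correctly. The resulting index matches what a build from scratch over the final table contents would produce.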

Modifying a Column

Modifying a column, such as changing its type or attributes, is handled with similar efficiency:

  • Prepare Phase: TiDB updates the table’s metadata to reflect the changes in the column’s definition.
  • Reorganization Phase: The actual data transformation is performed in the background. For example, changing a column from INT to BIGINT involves converting existing data incrementally, ensuring that the table remains accessible.
  • Commit Phase: The schema change is finalized. The metadata is updated to reflect the new column type, making the changes fully active.

Figure: Modifying a column in TiDB.
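
One way to picture an incremental type change is a hidden shadow column that is filled in batches and swapped in at commit time. The sketch below is an assumption-laden illustration rather than TiDB’s actual code: `ColumnTypeChange` and the shadow column naming are hypothetical. Because Python integers have no fixed width, an integer-to-string conversion stands in for the INT to BIGINT example.

```python
class ColumnTypeChange:
    """Convert one column in place through a hidden shadow column."""

    def __init__(self, rows: dict, column: str, convert):
        self.rows, self.column, self.convert = rows, column, convert
        self.shadow = f"_{column}_new"   # hypothetical hidden column name

    def write(self, pk, value) -> None:
        # DML during reorganization keeps both representations in sync.
        row = self.rows.setdefault(pk, {})
        row[self.column] = value
        row[self.shadow] = self.convert(value)

    def backfill(self, batch_size: int = 2):
        # Reorganization: convert existing rows batch by batch.
        keys = sorted(self.rows)
        for start in range(0, len(keys), batch_size):
            for pk in keys[start:start + batch_size]:
                row = self.rows[pk]
                row[self.shadow] = self.convert(row[self.column])
            yield

    def commit(self) -> None:
        # Commit: the converted values replace the originals.
        for row in self.rows.values():
            row[self.column] = row.pop(self.shadow)


rows = {1: {"qty": 7}, 2: {"qty": 8}, 3: {"qty": 9}}
change = ColumnTypeChange(rows, "qty", convert=str)
job = change.backfill(batch_size=2)
next(job)                # rows 1 and 2 are converted
change.write(1, 70)      # update to a row that was already converted
change.write(4, 10)      # insert that arrives during the change
for _ in job:            # remaining batch
    pass
change.commit()
```

Writing both representations during the change is what keeps an already converted row from going stale when it is updated before the commit.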

Advantages of Online DDL in TiDB

Online DDL operations offer significant benefits, particularly in high-availability, large-scale database environments. TiDB leverages online DDL to enhance its performance and reliability. In this section, we’ll explore the specific advantages that TiDB’s implementation of online DDL provides, such as maintaining high availability, ensuring scalability, and offering operational flexibility.

  • High Availability: Ensures the database remains available for reads and writes.
  • Scalability: Efficiently handles schema changes across large, distributed datasets.
  • Operational Flexibility: Schema changes can be made without scheduling downtime.

Challenges and Considerations

While TiDB’s online DDL offers numerous advantages, it’s important to be aware of potential challenges and considerations. Understanding these can help in planning and executing schema changes more effectively. In this section, we’ll discuss the resource usage implications, the complexity of certain schema changes, and the importance of version compatibility when using online DDL in TiDB.

  • Resource Usage: Online DDL operations can be resource-intensive. Monitoring and managing resources is crucial to prevent performance issues.
  • Complex Schema Changes: Some changes might require careful planning and execution.
  • Version Compatibility: Ensure the TiDB version supports the specific online DDL operations needed.
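
To manage resource usage, TiDB exposes system variables that control how aggressively the reorganization phase backfills data, and an `ADMIN SHOW DDL JOBS` statement for checking progress. The helper below is a minimal sketch that only assembles those statements. `reorg_tuning_statements` is a hypothetical name, and the variable names, defaults, and permitted ranges should be checked against the documentation for your TiDB version.

```python
def reorg_tuning_statements(worker_cnt: int, batch_size: int) -> list:
    """Return SQL that adjusts backfill pressure and lists DDL job progress."""
    if worker_cnt < 1 or batch_size < 1:
        raise ValueError("worker_cnt and batch_size must be positive")
    return [
        # Fewer workers and smaller batches reduce pressure on the cluster,
        # at the cost of a longer-running schema change.
        f"SET GLOBAL tidb_ddl_reorg_worker_cnt = {worker_cnt};",
        f"SET GLOBAL tidb_ddl_reorg_batch_size = {batch_size};",
        # Shows the state and progress of running and recent DDL jobs.
        "ADMIN SHOW DDL JOBS;",
    ]


statements = reorg_tuning_statements(worker_cnt=2, batch_size=128)
```

Lowering these values during peak hours and raising them in quieter periods is one way to trade the duration of a schema change against its impact on foreground traffic.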

Conclusion

Online DDL is an essential feature for modern databases, enabling schema changes with minimal disruption. TiDB’s robust implementation, with its three-phase approach and support for concurrent operations, ensures high availability and performance even during schema modifications. By leveraging TiDB’s capabilities, organizations can maintain operational flexibility and scalability, making it an excellent choice for managing large-scale, distributed databases.

Understanding and utilizing online DDL in TiDB can significantly enhance the efficiency and reliability of database operations, helping businesses stay agile and responsive to changing requirements. As you consider the needs of your database management strategy, TiDB’s advanced features offer a compelling solution to ensure seamless schema changes without compromising on performance or availability.
