Introduction to TiDB’s Internal Architecture

Modern data management demands the ability to handle vast amounts of data with scalability, high availability, and transactional integrity. Enter TiDB, an open-source, MySQL-compatible distributed SQL database built to support Hybrid Transactional and Analytical Processing (HTAP) workloads. TiDB aims to provide a one-stop database solution for OLTP, OLAP, and HTAP services, bringing together the best of relational databases and NoSQL systems.

What is TiDB?

At its core, TiDB is a distributed SQL database that brings together the robustness of traditional relational databases and the scalability of NoSQL systems. It is compatible with the MySQL protocol, which means migrating from MySQL to TiDB typically requires minimal application changes. This compatibility, coupled with its flexible and elastic architecture, makes TiDB a formidable player in the database world.
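
As a quick illustration of that compatibility, a standard MySQL client can talk to TiDB directly. The sketch below assumes a local TiDB instance listening on its default port, 4000; once connected, ordinary MySQL statements work unchanged:

-- Connect with the stock MySQL command-line client (TiDB listens on port 4000 by default):
--   mysql --host 127.0.0.1 --port 4000 -u root
SELECT VERSION();   -- returns a MySQL-compatible version string that also embeds the TiDB version
SHOW DATABASES;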

Key Components and Layers of TiDB

TiDB comprises several core components that work in concert to provide seamless data management. These include:

  1. TiDB Server: The stateless SQL layer that receives client queries, parses and optimizes them, and generates execution plans. Because it holds no data, it can be scaled horizontally.
  2. TiKV: The distributed Key-Value (KV) storage engine that stores the actual data. It is designed for high availability and horizontal scalability.
  3. Placement Driver (PD): PD is the brain of the TiDB cluster. It manages metadata, schedules data distribution, and provides transaction timestamp allocation.
  4. TiFlash: A columnar analytical engine that maintains replicas of TiKV data in column format, enabling real-time analytical queries to run alongside transactional workloads.
(Diagram: interaction between the TiDB Server, TiKV, PD, and TiFlash components.)

Together, these components deliver high throughput, low latency, and strong consistency, making TiDB an excellent choice for diverse data workloads.
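
One way to see these components in a running cluster is to query TiDB's built-in metadata tables. The sketch below assumes an operational cluster and uses the INFORMATION_SCHEMA.CLUSTER_INFO table, which lists every TiDB, TiKV, PD, and TiFlash instance:

-- List each component instance together with its address and version.
SELECT TYPE, INSTANCE, VERSION
FROM INFORMATION_SCHEMA.CLUSTER_INFO
ORDER BY TYPE;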

Why Understanding TiDB’s Architecture Matters

Understanding the architecture of TiDB is crucial for several reasons. Firstly, it allows database administrators and architects to make informed decisions about deploying and scaling their TiDB clusters. Secondly, developers can optimize their applications’ performance by leveraging the unique capabilities of each TiDB component. Lastly, a thorough understanding of TiDB’s internals can help troubleshoot and resolve issues more efficiently, ensuring smooth and reliable operation.

Storage Layer: TiKV and Placement Driver

The storage layer of TiDB comprises two main components: TiKV and the Placement Driver (PD). This layer is pivotal in ensuring data integrity, high availability, and efficient data distribution.

Role and Structure of TiKV

TiKV stands as the backbone of the TiDB storage layer. It is a distributed, transactional Key-Value (KV) storage engine designed to handle vast amounts of data with ACID compliance.

  1. Distributed Nature: TiKV’s architecture enables it to distribute data across multiple nodes. Each node, or TiKV server, stores a portion of the overall dataset and works in collaboration with other nodes to provide a unified storage solution.
  2. Transactional Support: TiKV natively supports distributed transactions. By using a two-phase commit protocol, TiKV ensures ACID properties across its distributed environment.
  3. RocksDB: At the core of each TiKV node is RocksDB, an embeddable persistent KV store that ensures high-performance storage and retrieval of data.
  4. Regions and Raft Groups: Data in TiKV is divided into small chunks called Regions. Each Region is replicated to multiple nodes (three replicas by default), and the replicas form a Raft group to ensure data consistency and high availability.

Below is a simplified example of how data insertion is handled in TiKV:

INSERT INTO user (id, name) VALUES (1, 'Alice');

Internally, this statement becomes a put operation in TiKV: the key encodes the table ID and the row's primary key (1), and the value encodes the row data (name = 'Alice'). TiDB then uses Raft consensus to replicate the write across multiple TiKV nodes, ensuring durability and consistency.
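
To see where that row physically lives, TiDB exposes Region metadata through SQL. The following sketch assumes the user table from the example above already exists:

-- Show the Regions that hold the user table's data, including each Region's
-- key range, the store holding the Raft leader, and the peer replicas.
SHOW TABLE user REGIONS;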

(Diagram: the data insertion process in TiKV with Raft consensus.)

Data Distribution and Sharding

TiKV shards data to distribute it evenly across the cluster. Sharding is implemented through Regions: each Region covers a contiguous range of keys, and PD balances Regions across nodes so that data and load stay evenly distributed.

  1. Regions: Each Region holds a subset of the data and is the fundamental unit of data movement within the cluster. When a Region grows beyond a configurable size threshold, it splits into two smaller Regions; conversely, small adjacent Regions can merge into a larger one to reduce overhead.
  2. Raft Groups: Each Region is replicated across multiple nodes, forming a Raft group. This ensures that even if one node fails, other replicas can take over, maintaining the availability and integrity of data.

The key outcome of this sharding is horizontal scalability. New nodes can be added to the TiKV cluster, and Regions will be automatically balanced, enabling the system to handle increased load without compromising performance.
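
Splitting normally happens automatically as Regions grow, but TiDB also lets you pre-split a table's key range, which helps avoid write hotspots on a brand-new table. A sketch, assuming the user table has an integer primary key and that the value range 0 to 1,000,000 is representative:

-- Pre-split the user table's key space into 16 Regions spread across the cluster.
SPLIT TABLE user BETWEEN (0) AND (1000000) REGIONS 16;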

The Role of Placement Driver (PD) in TiDB

The Placement Driver (PD) is the control center of the TiDB cluster. It is responsible for:

  1. Metadata Management: PD stores and manages the metadata for the TiDB cluster, including Region information, data distribution, and cluster topology.
  2. Data Distribution and Balancing: PD continuously monitors the load and data distribution across TiKV nodes. It dynamically adjusts the placement of Regions to balance the load and optimize performance.
  3. Timestamp Services: PD acts as the timestamp oracle (TSO) of the TiDB cluster, allocating globally unique, monotonically increasing timestamps that transactions use to stay consistent across the distributed environment.

The PD component is designed for high availability and fault tolerance. In production it runs as a cluster of at least three nodes so that a quorum can be maintained, ensuring the continuous, consistent operation of the TiDB cluster. Data placement can also be steered declaratively with SQL placement policies, which PD enforces when scheduling replicas:

CREATE PLACEMENT POLICY p1 PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" FOLLOWERS=4;
ALTER DATABASE test PLACEMENT POLICY=p1;

In the example above, the p1 policy places the Raft leaders for data in the test database in us-east-1 and spreads four follower replicas across us-east-1 and us-west-1.
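
Placement policies can also be attached to individual tables, and the scheduling progress PD makes can be checked from SQL. A brief sketch reusing the p1 policy, with orders as a hypothetical table:

-- Apply the same placement policy to a single table.
ALTER TABLE orders PLACEMENT POLICY = p1;

-- Inspect the defined placement rules and whether PD has finished scheduling them.
SHOW PLACEMENT;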

By understanding the pivotal roles played by TiKV and PD, database administrators can better manage data distribution and ensure the high availability and performance of their TiDB clusters.

Compute Layer: TiDB Server

The compute layer in TiDB is responsible for SQL parsing, query optimization, and transaction management. This layer is represented by the TiDB server component, which serves as the bridge between application requests and the underlying storage layer.

SQL Layer and Parser

The TiDB server acts as a stateless SQL layer that handles SQL queries from applications. It adheres to the MySQL protocol, making it straightforward to integrate with existing applications that use MySQL. The SQL layer involves several key processes:

  1. Connection Handling: TiDB server manages client connections, facilitating communication between applications and the database.
  2. SQL Parsing: Incoming SQL statements are parsed into abstract syntax trees, decomposing each statement into its constituent parts for further processing.
  3. Query Optimization: The parsed queries undergo optimization to generate efficient execution plans. This involves logical and physical optimization steps to ensure optimal query performance.

The following example illustrates a simple query execution flow:

SELECT name FROM user WHERE id = 1;

When this query is issued, the TiDB server parses it, generates an execution plan, and delegates the actual data retrieval to the TiKV layer, pushing filtering and aggregation down to TiKV's coprocessor where possible.
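
The plan chosen by the optimizer can be inspected with EXPLAIN. For a lookup on the primary key, TiDB typically produces a Point_Get plan, a single-key read sent to the TiKV Region that owns the key (exact output varies by version):

EXPLAIN SELECT name FROM user WHERE id = 1;
-- A primary-key equality lookup is usually planned as a Point_Get operator
-- rather than a full table or index scan.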

Execution Engine

The execution engine in the TiDB server is responsible for executing the query plans generated during query optimization. It interprets the execution plan and coordinates with the underlying storage layer to fetch and manipulate data as required.

  1. Physical Operators: The TiDB execution engine uses a range of physical operators (e.g., table scan, index scan, join, aggregation) to execute various parts of the query plan.
  2. Vectorized Execution: To maximize performance, TiDB uses vectorized execution, which processes data in batches rather than row-by-row. This significantly speeds up query execution, particularly for analytical workloads.
  3. Concurrency Control: The execution engine can run multiple queries concurrently, leveraging the distributed nature of the TiDB cluster to enhance throughput and reduce latency.
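
To observe the physical operators actually being run, EXPLAIN ANALYZE executes the statement and reports per-operator runtime statistics such as actual rows and execution time. A sketch against the user table used earlier:

-- Execute the query and report actual time, row counts, and memory per operator.
EXPLAIN ANALYZE SELECT COUNT(*) FROM user WHERE id > 100;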

Transaction Model and Concurrency Control

TiDB supports both optimistic and pessimistic transaction models to cater to different application requirements.

  1. Optimistic Transactions: Suitable for workloads with low contention, optimistic transactions assume that conflicts are rare. They check for conflicts only during the commit phase, retrying the transaction if necessary.
  2. Pessimistic Transactions: Ideal for high-contention scenarios, pessimistic transactions lock resources as they are accessed, preventing other transactions from modifying the same data until the locks are released.
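
The transaction mode can be selected per session or per individual transaction; the following is a minimal sketch using TiDB's standard syntax:

-- Make pessimistic locking the default for this session.
SET SESSION tidb_txn_mode = 'pessimistic';

-- Or choose the mode explicitly for a single transaction.
BEGIN OPTIMISTIC;
-- ... statements ...
COMMIT;

BEGIN PESSIMISTIC;
-- ... statements ...
COMMIT;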

Below is an example of a simple transaction in TiDB:

START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
COMMIT;

TiDB translates these SQL statements into key-value operations on TiKV, ensuring ACID properties through its Percolator-inspired two-phase commit protocol and timestamp-based concurrency control.
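
The timestamps involved are allocated by PD's timestamp oracle (TSO). Inside an open transaction you can observe the start timestamp that PD handed out, as this small sketch shows:

BEGIN;
-- The transaction's start timestamp, allocated by PD's TSO.
SELECT @@tidb_current_ts;
-- Decode the TSO value into a wall-clock time.
SELECT TIDB_PARSE_TSO(@@tidb_current_ts);
ROLLBACK;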

By understanding the intricacies of the TiDB server’s compute layer, users can fine-tune their queries and transactions to achieve optimal performance and reliability.

High Availability and Fault Tolerance

One of the standout features of TiDB is its robust high availability and fault tolerance mechanisms. These features ensure that the database remains operational even in the face of hardware failures or network issues.

Raft Consensus Algorithm

At the heart of TiDB’s fault tolerance is the Raft consensus algorithm. Raft ensures that the data is consistently replicated across multiple nodes, providing high availability and durability.

  1. Leader Election: Within each Raft group, one replica is elected leader and handles all writes for its Region. If the leader fails, the remaining replicas quickly elect a new leader, ensuring continuity.
  2. Log Replication: The leader node replicates all write requests to follower nodes. Once a majority of nodes acknowledge the write, it is considered committed, providing strong consistency.
  3. Fault Tolerance: Because data is replicated across multiple nodes, TiDB tolerates node failures without data loss. A Region remains available as long as a majority of its replicas are reachable, so the cluster as a whole keeps operating.
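
Raft leadership and replica placement can be observed from SQL. The sketch below uses the INFORMATION_SCHEMA.TIKV_REGION_PEERS table to count how many Region leaders each TiKV store currently holds, a distribution PD works to keep balanced:

-- Count Raft leaders per TiKV store.
SELECT STORE_ID, COUNT(*) AS leader_count
FROM INFORMATION_SCHEMA.TIKV_REGION_PEERS
WHERE IS_LEADER = 1
GROUP BY STORE_ID;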

Data Replication and Failover

TiDB employs a multi-replica model to enhance data availability and fault tolerance.

  1. Multi-Replica Storage: TiKV stores multiple replicas of each piece of data across different nodes. This redundancy ensures that data is preserved even if some nodes fail.
  2. Automatic Failover: When a node fails, TiDB automatically redirects reads and writes to the remaining healthy replicas. The Raft protocol guarantees that committed writes survive the failover, and TiDB's lock-resolution mechanism commits or rolls back any transactions the failed node left in flight.
  3. Cross-Data Center Replication: TiDB can be deployed across multiple data centers, offering disaster recovery and high availability even in the event of a data center outage.
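
The number of replicas maintained for each Region is a PD configuration value, three by default, and it can be inspected from SQL, as in this sketch:

-- Check the cluster-wide replication factor enforced by PD.
SHOW CONFIG WHERE type = 'pd' AND name = 'replication.max-replicas';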

Self-Healing Mechanisms and Automatic Recovery

TiDB is designed to detect and recover from failures automatically, minimizing downtime and manual intervention.

  1. Heartbeat Mechanism: TiKV nodes continuously send heartbeat signals to the PD. If a node fails to send a heartbeat, PD triggers a failover process.
  2. Region Rebalancing: When a node becomes unavailable, PD rebalances Regions to ensure even data distribution across the remaining nodes. New replicas are created to maintain the desired replication factor.
  3. Automatic Repair: If a replica is lost or damaged, PD schedules new replicas to be created from healthy ones, restoring the configured replication factor and preserving data integrity and availability.
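
Store health, as reported through those heartbeats, is also visible from SQL. A sketch using the INFORMATION_SCHEMA.TIKV_STORE_STATUS table, which reflects what PD currently knows about each TiKV store:

-- Show each store's state (Up, Offline, Down, ...) and the Regions and leaders it hosts.
SELECT STORE_ID, ADDRESS, STORE_STATE_NAME, REGION_COUNT, LEADER_COUNT
FROM INFORMATION_SCHEMA.TIKV_STORE_STATUS;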

By leveraging these high availability and fault tolerance features, TiDB provides a resilient and reliable database platform suitable for mission-critical applications.

Conclusion

TiDB’s innovative architecture marries the scalability and flexibility of NoSQL systems with the robustness and transactional capabilities of traditional relational databases. By understanding the key components of TiDB—TiKV, PD, and the TiDB server—users can fully harness its power to handle a diverse range of data workloads efficiently.

Whether it’s ensuring high availability through Raft consensus, optimizing queries with advanced execution engines, or maintaining transactional integrity with sophisticated concurrency controls, TiDB stands out as a comprehensive solution for modern data management challenges.

For those keen to dive deeper into TiDB, exploring the documentation and contributing to the open-source project can provide valuable insights and opportunities to shape the future of distributed SQL databases.

For more detailed information and technical documentation, visit the official TiDB documentation.


Last updated September 16, 2024