TiDB 4.0 GA, Gearing You Up for an Unpredictable World with a Real-Time HTAP Database
Author: Siddon Tang (Chief Architect at PingCAP)
Transcreator: Caitin Chen; Editor: Tom Dewan
Today, I'm proud to announce that TiDB 4.0 has reached general availability (GA). TiDB is an elastic, real-time Hybrid Transactional/Analytical Processing (HTAP) database, and, best of all, it's now ready for the cloud.
The first half of 2020 has seen a world of uncertainty and unpredictability, and it's high time for technology to keep up, especially databases, because they are the foundation of every business in the world. In a previous post, our CTO Ed Huang shared his thoughts about the future of databases. We believe that the future of the database is about unification, adaptiveness, and intelligence. As you may have seen in our TiDB 4.0 preview, our new release is a big step closer to achieving that vision.
TiDB 4.0 has made great progress in its stability, ease of use, performance, and cloud-native services. New features in version 4.0 also let TiDB support more application scenarios, and many of our users have been testing and adopting the 4.0 version in their production environments. We believe the TiDB 4.0 GA version is now the right database solution to prepare you for the unpredictable world.
TiDB 4.0 is a real-time HTAP, truly elastic, cloud-native database, which meets your application requirements in various scenarios. In fact, TiDB 4.0 can do the work of three different databases:
- If your application data is correlated, and you need to guarantee atomicity, consistency, isolation, durability (ACID) compliance while your storage capacity is expected to be in a controllable range, TiDB can perform like a traditional relational database.
- If your application data is scattered in the system and not correlated, and if you need to scale your storage capacity but don't require ACID compliance, TiDB can perform like a NoSQL database.
- If you need to do real-time data analytics and associate multiple tables to do aggregation operations, TiDB can perform like an analytical database.
TiDB 4.0 is a one-stop solution for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) applications to process HTAP workloads. Even if you are not sure whether your application is OLTP or OLAP, you can efficiently run your application on TiDB.
TiFlash is an extended analytical engine and columnar store for TiDB. A TiDB database that incorporates TiFlash lets you perform real-time HTAP analytics.
TiFlash offers you:
- Strong consistency in real time. TiDB replicates updated data to TiFlash in real time to ensure that TiFlash processes the latest (not just fresh) data.
- Automatic storage selection. TiDB intelligently determines whether to select row storage or column storage to cope with various query scenarios without your intervention.
- Flexibility and workload isolation. Both row and column stores scale separately.
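To give a feel for what automatic storage selection means, here is a toy sketch of a cost-based choice between a row store and a column store. It is not TiDB's optimizer or its cost model: the `Query` fields, the cost formulas, and the `COLUMN_STORE_STARTUP_COST` constant are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Query:
    rows_scanned: int    # estimated number of rows the query must read
    columns_needed: int  # columns the query actually references
    table_columns: int   # total number of columns in the table


# Hypothetical fixed overhead for starting a column-store read.
COLUMN_STORE_STARTUP_COST = 1_000


def row_store_cost(q: Query) -> int:
    # A row store reads whole rows, so every column is paid for.
    return q.rows_scanned * q.table_columns


def column_store_cost(q: Query) -> int:
    # A column store reads only the referenced columns, plus a startup cost.
    return COLUMN_STORE_STARTUP_COST + q.rows_scanned * q.columns_needed


def choose_store(q: Query) -> str:
    # Pick whichever store has the lower estimated cost.
    return "row" if row_store_cost(q) <= column_store_cost(q) else "column"


point_lookup = Query(rows_scanned=1, columns_needed=20, table_columns=20)
aggregation = Query(rows_scanned=10_000_000, columns_needed=2, table_columns=20)

print(choose_store(point_lookup))  # row
print(choose_store(aggregation))   # column
```

In this toy model, a point lookup stays on the row store, while a wide scan that touches only a couple of columns goes to the column store.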
Serverless computing is a very important concept in the field of cloud services. In TiDB 4.0, we not only support real-time HTAP analytics in the cloud, but also implement the first version of an elastic scheduling mechanism. This turns TiDB into a truly serverless database in the cloud.
Now, you only need to deploy your TiDB cluster in the cloud (or your own Kubernetes cluster) with the minimum cluster topology and configure scaling rules. For example, one rule could be that when TiDB CPU usage exceeds 50%, a TiDB node is automatically added. Then, based on your own application load, TiDB will automatically:
- Auto-scale. As the service load rises and falls, TiDB adds or removes instances to match the number of service requests.
- Split hot Regions. TiDB splits Regions (the basic unit for data storage in TiDB's storage engine) that have high read loads.
- Isolate hotspots. TiDB moves hot application data to a separate instance to ensure that it does not affect other applications.
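The kind of rule described above ("when TiDB CPU usage exceeds 50%, a TiDB node is automatically added") can be sketched as a simple threshold check. This is only an illustration, not TiDB's scheduling logic or its configuration format: the function name, the scale-in threshold, and the node limits are assumptions; only the 50% scale-out threshold comes from the example in the text.

```python
def desired_tidb_nodes(
    current_nodes: int,
    cpu_usage: float,
    scale_out_above: float = 0.50,  # from the example rule: add a node above 50% CPU
    scale_in_below: float = 0.20,   # assumed value, not from the post
    min_nodes: int = 2,             # assumed lower bound
    max_nodes: int = 10,            # assumed upper bound
) -> int:
    """Return the node count a threshold-based rule would ask for."""
    if cpu_usage > scale_out_above and current_nodes < max_nodes:
        return current_nodes + 1
    if cpu_usage < scale_in_below and current_nodes > min_nodes:
        return current_nodes - 1
    return current_nodes


print(desired_tidb_nodes(current_nodes=2, cpu_usage=0.65))  # 3
print(desired_tidb_nodes(current_nodes=3, cpu_usage=0.10))  # 2
print(desired_tidb_nodes(current_nodes=3, cpu_usage=0.35))  # 3
```

Changing one node at a time and keeping a gap between the two thresholds is a common way to avoid flapping, but a real scheduler would likely also look at trends over a time window.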
This feature is new in TiDB 4.0. We believe it will be the cornerstone of many product possibilities in the future.
Compared with version 3.0, TiDB 4.0 has achieved a significant performance boost. For OLTP scenarios, the Sysbench and TPC-C benchmarks have improved by 30% to 50%. For OLAP scenarios, the TPC-H performance has increased by about 100% on average over a set of 22 queries. In addition, TiFlash greatly enhances TiDB's real-time analytics capabilities without affecting OLTP tasks.
TiDB versions to compare:
TiDB 3.0.13 vs. TiDB 4.0.0
|Component|EC2 type|Instance count|
|---|---|---|
|Placement Driver (PD)|AWS m5.xlarge|3|
We ran Sysbench tests on 16 tables, each with 10 million rows of data. For more detailed information on the system configurations and the tests we ran, see the Sysbench Performance Test Report.
The tests showed that compared with version 3.0, TiDB 4.0's Point Select performance increased by about 14%, and read-write performance increased by 31%.
In the TPC-C test, we found that TiDB 4.0 performed about 50% better. For test details, see the TiDB TPC-C Performance Test Report.
We ran TPC-H queries on TiDB 3.0.13 and TiDB 4.0 to compare their OLAP capabilities. For test details, see the TiDB TPC-H Performance Test Report – v4.0 vs. v3.0.
Because TiDB 4.0 introduces TiFlash to strengthen TiDB's HTAP capabilities, our test objects were:
- TiDB 3.0.13 that only read data from TiKV
- TiDB 4.0 that only read data from TiKV
- TiDB 4.0 that automatically read data from TiKV or TiFlash through cost-based query optimization
The test results showed that TiDB's query performance notably improved—about 100% on average.
TiDB 4.0 also brings many other improvements in security, ecosystem tooling, and core database functionality.
- TiDB supports Transport Layer Security (TLS) and can dynamically update the certificate online.
- TiDB supports encryption at rest, to ensure data reliability and security.
TiDB 4.0 introduces TiUP, a component manager that streamlines installing and configuring a TiDB cluster into a few easy commands. With TiUP, you can get your cluster up in just one minute! For details, see Deploy a TiDB Cluster Using TiUP.
It's difficult to troubleshoot issues in a distributed database because system information is scattered among different machines. TiDB 4.0 offers TiDB Dashboard, a graphical interface with various built-in widgets that lets you easily diagnose, monitor, and manage your clusters. You can read our TiDB Dashboard blog posts to learn more.
As users store more and more data in TiDB, quickly backing up and restoring data becomes a big challenge. In TiDB 4.0, we provide Backup & Restore (BR), a distributed backup and restore tool that offers high backup and restore speeds—1 GB/s or more for 10 TB of data. BR already supports backup to AWS S3 and will soon support backup to Google Cloud Storage. To get more details, see our BR documentation and blog post.
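To put the quoted speed in perspective, here is a quick back-of-the-envelope calculation. It assumes decimal units (1 TB = 1000 GB) and a throughput that stays constant for the whole run; `transfer_hours` is just a helper name for this sketch.

```python
def transfer_hours(data_tb: float, throughput_gb_per_s: float) -> float:
    """Time to move `data_tb` terabytes at a sustained rate, in hours."""
    data_gb = data_tb * 1000  # decimal units: 1 TB = 1000 GB
    seconds = data_gb / throughput_gb_per_s
    return seconds / 3600


# 10 TB at 1 GB/s
print(round(transfer_hours(10, 1.0), 2))  # 2.78
```

In other words, at a sustained 1 GB/s, 10 TB takes a little under three hours.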
In TiDB 4.0, we introduce a new TiDB change data capture framework, TiCDC. This open-source feature replicates TiDB's incremental changes to downstream platforms. For more details, see TiCDC: Replication Latency in Milliseconds for 100+ TB Clusters.
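The general idea of change data capture can be shown with a toy example: a stream of row changes is replayed, in commit order, against a downstream copy. This is not TiCDC's event format or protocol. `ChangeEvent`, its fields, and `apply_changes` are hypothetical names, and ordering by a commit timestamp is an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple, Optional


class ChangeEvent(NamedTuple):
    commit_ts: int               # hypothetical commit timestamp used for ordering
    op: str                      # "put" or "delete"
    key: str
    value: Optional[str] = None


def apply_changes(downstream: Dict[str, str], events: List[ChangeEvent]) -> Dict[str, str]:
    """Replay change events on a downstream key-value copy in commit order."""
    for ev in sorted(events, key=lambda e: e.commit_ts):
        if ev.op == "put":
            downstream[ev.key] = ev.value
        elif ev.op == "delete":
            downstream.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op}")
    return downstream


# Events may arrive out of order; sorting by commit_ts restores commit order.
events = [
    ChangeEvent(3, "delete", "b"),
    ChangeEvent(1, "put", "a", "1"),
    ChangeEvent(2, "put", "b", "2"),
    ChangeEvent(4, "put", "a", "9"),
]
print(apply_changes({}, events))  # {'a': '9'}
```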
- Previously, the transaction size in TiDB was limited to 100 MB. But TiDB 4.0 sets the transaction size limit to 10 GB. Now you can easily process large amounts of data in a single transaction without having to consider batch processing. For details, see Large Transactions in TiDB.
- TiDB 4.0 has officially adopted pessimistic locking as its default transaction model. With pessimistic locking, TiDB 4.0 is more compatible with MySQL, which makes it easier to migrate your application from MySQL to TiDB. For more details, see TiDB Pessimistic Transaction Model and Pessimistic Locking: Better MySQL Compatibility, Fewer Rollbacks Under High Load.
- When TiDB runs for a long time and the data changes, the optimizer may select the wrong index, which can lead to slow queries that affect the application's normal operation. To solve this problem, TiDB 4.0 introduces SQL Plan Management (SPM), a mechanism that narrows down the optimizer's search space to execution plans that are proven to perform well. SPM avoids performance degradation caused by unanticipated plan changes, and you don't have to change the application code. To learn more, see SQL Plan Management: Never Worry About Slow Queries Again.
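To illustrate why the pessimistic model mentioned in the list above tends to mean fewer rollbacks, here is a toy in-memory comparison. It is not how TiDB implements transactions: `Row`, `WriteConflict`, `optimistic_add`, and `pessimistic_add` are hypothetical names, and a single Python lock stands in for a row lock. The optimistic path checks for conflicts only at commit time, while the pessimistic path locks the row before changing it.

```python
import threading


class WriteConflict(Exception):
    """Raised when an optimistic commit finds the row changed underneath it."""


class Row:
    def __init__(self, value: int):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()


def pessimistic_add(row: Row, delta: int) -> None:
    # Take the row lock before touching the data, so competing writers wait.
    with row.lock:
        row.value += delta
        row.version += 1


def optimistic_add(row: Row, delta: int, concurrent_writer=None) -> None:
    # Read without locking and remember the version that was seen.
    seen_version = row.version
    new_value = row.value + delta
    if concurrent_writer is not None:
        concurrent_writer(row)  # simulate another transaction committing first
    # Validate only at commit time; a changed version forces a rollback.
    with row.lock:
        if row.version != seen_version:
            raise WriteConflict("row changed since it was read")
        row.value = new_value
        row.version += 1


row = Row(10)
try:
    optimistic_add(row, 1, concurrent_writer=lambda r: pessimistic_add(r, 5))
except WriteConflict:
    print("optimistic transaction rolled back")
print(row.value)  # 15: only the concurrent writer's change was applied


def worker(target: Row) -> None:
    for _ in range(1000):
        pessimistic_add(target, 1)


counter = Row(0)
threads = [threading.Thread(target=worker, args=(counter,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 4000: writers waited for the lock instead of failing
```

In the optimistic case the application has to catch the conflict and retry; in the pessimistic case the second writer simply waits, which is closer to what applications written for MySQL expect.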
This post shows only a few of the highlights in TiDB 4.0. For a full list of features and improvements, check out the TiDB 4.0 GA Release Notes. If you're running an earlier version of TiDB and want to try 4.0, read Upgrade TiDB Using TiUP. You're also welcome to join our community on Slack and send us your feedback.
This release is a big step forward for TiDB on its way to becoming "the database of the future." We'd like to thank all our contributors and TiDB users who have helped us shape TiDB into what it is today.