What companies are using TiDB in production?

TiDB is trusted by over 3000 global enterprises across a variety of industries, such as financial services, gaming, and e-commerce. Users include Square (US), Shopee (Singapore), and China UnionPay (China).

How is TiDB different from other relational databases like MySQL?

TiDB is a next-generation, distributed relational database that can independently scale both computing and storage capacity by adding new nodes. Unlike traditional relational databases that only scale vertically, TiDB offers horizontal scalability, high availability with automatic failover, HTAP capabilities for both OLTP and OLAP workloads, and MySQL protocol compatibility so you can replace MySQL without changing application code.

What is the relationship between TiDB and TiDB Cloud?

TiDB is an open-source database best suited for organizations that want to run it on-premises or in their own data centers. TiDB Cloud is a fully managed cloud Database-as-a-Service (DBaaS) built on TiDB, with an easy-to-use web-based management console for managing TiDB clusters in mission-critical production environments.

Is TiDB compatible with MySQL?

TiDB is highly compatible with the MySQL protocol and the common features and syntax of MySQL 5.7 and MySQL 8.0. Ecosystem tools for MySQL such as phpMyAdmin, Navicat, MySQL Workbench, and DBeaver can all be used with TiDB. Some MySQL features are not supported in TiDB due to architectural differences in a distributed system.

What programming languages can I use to work with TiDB?

You can use any programming language supported by the MySQL client or driver, including Java, Go, Python, Ruby, PHP, and more.

How does TiDB support strong consistency?

TiDB implements Snapshot Isolation consistency, delivering REPEATABLE-READ for MySQL compatibility. Data is redundantly copied between TiKV nodes using the Raft consensus algorithm to ensure recoverability in the event of node failure. TiDB uses a replication log and State Machine model — write requests go to a Leader node which replicates the command to Followers as a log, and once the majority of nodes receive the log, it is committed and applied.
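The write path described above can be sketched in a few lines of Python. This is a simplified illustration, not TiKV's implementation: the Replica class and the replicate_write function are hypothetical names, and the sketch leaves out the terms, leader election, and retries that real Raft requires.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data: `log` holds replicated commands, `state` is the state machine."""
    name: str
    reachable: bool = True
    log: list = field(default_factory=list)
    state: dict = field(default_factory=dict)

    def append(self, entry):
        # A follower acknowledges only if it actually received the entry.
        if not self.reachable:
            return False
        self.log.append(entry)
        return True

    def apply(self, entry):
        key, value = entry
        self.state[key] = value


def replicate_write(leader, followers, entry):
    """Return True if `entry` reached a majority of replicas and was applied."""
    leader.log.append(entry)
    acked = [leader]  # the leader counts as one copy
    for follower in followers:
        if follower.append(entry):
            acked.append(follower)
    majority = (len(followers) + 1) // 2 + 1
    if len(acked) < majority:
        return False  # not committed, so nothing is applied
    for replica in acked:
        replica.apply(entry)
    return True


# Three replicas with one follower down: two of three is still a majority.
leader = Replica("tikv-1")
followers = [Replica("tikv-2"), Replica("tikv-3", reachable=False)]
committed = replicate_write(leader, followers, ("k", "v"))
print(committed)
```

With three replicas, the write commits as long as one follower acknowledges it. If both followers are unreachable, the function returns False and no state machine changes.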

Where can I run TiDB?

TiDB is available for bare metal, cloud-based, or hybrid installations. A Kubernetes Operator is available, and you can also use TiUP to quickly deploy a test environment on your laptop or a full production cluster across many nodes.

How does TiDB ensure high availability?

TiDB uses the Raft consensus algorithm to ensure data is highly available and safely replicated throughout storage in Raft Groups. Data is redundantly copied between TiKV nodes across different Availability Zones to protect against machine or data center failure. Automatic failover ensures your service stays online continuously.

What support is available for TiDB customers?

TiDB is supported by a team with experience running mission-critical use cases for over 3000 global enterprises across financial services, e-commerce, enterprise applications, and gaming. 24/7 support is available for TiDB Enterprise Subscription users.

What are PD, TiDB, TiKV, and TiFlash nodes in a TiDB Cluster?

PD (Placement Driver) is the brain of the TiDB cluster, storing metadata and sending data scheduling commands to TiKV nodes. TiDB is the SQL computing layer that aggregates query results and is horizontally scalable. TiKV is the transactional store for OLTP data, maintained in multiple replicas with native high availability. TiFlash is the analytical storage layer that replicates data from TiKV in real-time to support OLAP workloads using columnar storage.

How does TiDB replicate data between TiKV nodes?

TiKV divides the key-value space into key ranges called Regions. Data is distributed across all nodes using Regions as the basic unit, with PD responsible for spreading Regions evenly. TiDB uses the Raft consensus algorithm to replicate data by Regions — multiple replicas of a Region form a Raft Group, and each data change is recorded as a Raft log that is reliably replicated across nodes.
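As a rough illustration of the idea, the sketch below maps a key to its Region with a binary search over Region start keys, then spreads each Region's replicas across stores. The split keys, store names, and round-robin placement are assumptions made for the example. PD's real scheduler weighs many more factors, such as store capacity and load.

```python
import bisect
from collections import Counter

# Hypothetical split keys: Region i covers [start_keys[i], start_keys[i + 1]).
REGION_START_KEYS = ["", "g", "n", "t"]


def region_for_key(key, start_keys=REGION_START_KEYS):
    """Return the index of the Region whose key range contains `key`."""
    return bisect.bisect_right(start_keys, key) - 1


def place_replicas(num_regions, stores, replicas=3):
    """Round-robin stand-in for PD: each Region gets `replicas` distinct stores."""
    if replicas > len(stores):
        raise ValueError("need at least as many stores as replicas")
    placement = {}
    for region in range(num_regions):
        placement[region] = [
            stores[(region + i) % len(stores)] for i in range(replicas)
        ]
    return placement


stores = ["tikv-1", "tikv-2", "tikv-3", "tikv-4"]
placement = place_replicas(len(REGION_START_KEYS), stores)
# Count how many Region replicas each store holds.
load = Counter(store for group in placement.values() for store in group)

print(region_for_key("apple"))  # 0
print(region_for_key("melon"))  # 1
print(placement[0])
print(dict(load))
```

Each entry in placement corresponds to one Raft Group: the replicas of a single Region living on different stores.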

How do I make use of TiDB HTAP capabilities?

As a Hybrid Transactional Analytical Processing (HTAP) database, TiDB automatically replicates data between the OLTP store (TiKV) and OLAP store (TiFlash) in real-time. This eliminates the need for a separate data warehouse and supports real-time analytics on transactional data. Typical HTAP use cases include user personalization, AI recommendations, fraud detection, business intelligence, and real-time reporting.
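One practical detail, drawn from the TiDB documentation rather than the answer above: a table gets its columnar copy once a TiFlash replica is configured for it with ALTER TABLE ... SET TIFLASH REPLICA. The sketch below only builds the SQL strings. The schema and table names (shop, orders) and the helper functions are hypothetical, and running the statements requires a MySQL-compatible driver connected to a cluster that has TiFlash nodes.

```python
def quote_ident(name):
    """Quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def tiflash_replica_ddl(schema, table, replicas=1):
    """Build the statement that asks TiDB to keep `replicas` TiFlash copies of a table."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"ALTER TABLE {target} SET TIFLASH REPLICA {int(replicas)}"


def replica_status_query():
    """Parameterized query (driver-style %s placeholders) for replication progress."""
    return (
        "SELECT AVAILABLE, PROGRESS FROM information_schema.tiflash_replica "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
    )


ddl = tiflash_replica_ddl("shop", "orders")
print(ddl)
print(replica_status_query())
```

Once the replica is reported as available, analytical queries on that table can be served from the columnar store without any change to the application's SQL.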

Is there an easy migration path from another RDBMS to TiDB?

Yes. TiDB provides TiDB Lightning and a Data Migration Tool to migrate data from MySQL databases. Since TiDB implements the MySQL wire protocol, you can use the MySQL client directly. TiKV APIs are also available for Java, Go, Rust, and Python.

What is the difference between TiDB Community Edition and the Enterprise Subscription?

Some features such as audit logging are not included in the Community Edition. The most significant difference is the inclusion of Enterprise Support at the Enterprise Subscription level, providing 24/7 professional support for production environments.

How does TiDB protect data privacy and ensure security?

TiDB includes Transport Layer Security (TLS) and Transparent Data Encryption (TDE) for encryption at rest. It operates across two network planes: one for application-to-TiDB server communication and one for internal data communication. TiDB also supports extended syntax for Subject Alternative Name verification and TLS context for internal communication.

What companies are using TiDB Cloud in production?

TiDB Cloud is trusted by enterprises including Catalyst (US), KNN3 Network (Singapore), and CAPCOM (Japan), alongside thousands of other global organizations across financial services, SaaS, Web3, gaming, and e-commerce.

What is TiDB Cloud?

TiDB Cloud is a fully managed cloud Database-as-a-Service (DBaaS) built on TiDB. It allows developers and DBAs to deploy on Amazon Web Services or Google Cloud through an intuitive console, handling infrastructure management and cluster deployment so teams can focus on building applications. Clusters can be scaled in or out with a simple click.

Is TiDB Cloud compatible with MySQL?

TiDB Cloud is highly compatible with the MySQL protocol and the common features and syntax of MySQL 5.7 and MySQL 8.0. MySQL ecosystem tools including phpMyAdmin, Navicat, MySQL Workbench, and DBeaver can all be used with TiDB Cloud.

Where can I run TiDB Cloud?

TiDB Cloud is currently available on Amazon Web Services (AWS) and Google Cloud.

How does TiDB Cloud ensure high availability?

TiDB Cloud uses the Raft consensus algorithm to replicate data safely across TiKV nodes in different Availability Zones, protecting against machine or data center failure. As a SaaS provider, PingCAP meets SOC 2 Type 2, ISO 27001, ISO 27701, PCI DSS, GDPR, and HIPAA standards to ensure data security, availability, and confidentiality.

What support is available for TiDB Cloud customers?

TiDB Cloud is supported by the same team behind TiDB, with experience running mission-critical workloads for over 3000 global enterprises. 24/7 support is available for all TiDB Cloud users.

How do I make use of TiDB Cloud HTAP capabilities?

TiDB Cloud automatically replicates data between the OLTP store (TiKV) and OLAP store (TiFlash) in real-time, enabling real-time analytics on transactional data without a separate data pipeline. Typical use cases include AI recommendations, fraud detection, business intelligence, and real-time reporting.

Is there an easy migration path from another RDBMS to TiDB Cloud?

Yes. TiDB provides TiDB Lightning and a Data Migration Tool for migrating from MySQL. TiDB Cloud implements the MySQL wire protocol so existing MySQL clients work directly. TiKV APIs are also available for Java, Go, Rust, and Python.

TiDB 4.0: An Elastic, Real-Time HTAP Database Ready for the Cloud

TiDB is an open-source, distributed, Hybrid Transactional/Analytical Processing (HTAP) database built by PingCAP and its open-source community. At TiDB DevCon 2020, the TiDB community’s annual technical conference, more than 80 developers, TiDB users, and partners from all over the world joined online to share their first-hand development and practical experience with TiDB. The topics covered finance, telecommunications, e-commerce, logistics, video, information, education, medical care, and many other industries. At the conference, we presented the technical details of TiDB 4.0’s general availability (GA) release and its performance in production environments. More than 3,000 people signed up to watch the live broadcast and exchanged ideas in the group.

This post is based on the keynote speech Max Liu, CEO at PingCAP, gave at this conference.

Last year, at TiDB DevCon 2019, we released TiDB 3.0 Beta. Today, at TiDB DevCon 2020, I’m so excited to show you TiDB 4.0 GA’s cutting-edge features and functionalities.

I’ve always believed that, in this day and age, databases should be more real-time, more flexible, and easier to use. TiDB 4.0, an elastic, cloud-native, real-time HTAP database, is exactly that kind of database, because it provides:

Serverless TiDB
Real-time HTAP
Cloud-native TiDB

They are the most exciting and appealing characteristics to TiDB users. They distinguish TiDB from other databases.

Serverless TiDB

Serverless computing is a very important concept in the field of cloud services. If you use a large-scale TiDB cluster, you might want to reduce costs. Now, with serverless capabilities, TiDB 4.0 can automatically scale in or out in Kubernetes based on your application load.

Serverless TiDB brings you these advantages:

Automatic elastic scaling in the cloud reduces costs. In the past, when you wanted to launch a system, the first task was capacity planning: assessing how many servers you needed. However, even the best plans can be inaccurate in practice. For example, you may have prepared 50 servers, but after the system ran in the production environment for a month, you found that 5 machines were enough. This led to a lot of wasted resources. Now, the entire system can elastically scale in the cloud, ensuring the most efficient use of your database resources.
More importantly, TiDB’s elastic scaling means that you never need to provision system resources according to your application’s peak load. For example, when you have two load peaks in the morning and evening, you provision to peak capacity for 24 hours. But in fact, each peak lasts for only about 2 hours. That is to say, your application peaks are a total of 4 hours but you pay for 24 hours of peak capacity resources. Now with serverless TiDB, you can save resources and costs during off-peak hours. You can save about 70% of the costs, or even more.
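The arithmetic behind a figure like 70% can be checked with a short calculation. This is a back-of-the-envelope sketch: the 4 peak hours come from the example above, while the assumption that off-peak load needs only 15% of peak capacity is ours, chosen for illustration. The function name is hypothetical.

```python
def autoscaling_savings(peak_hours, offpeak_fraction, hours_per_day=24):
    """Fraction of capacity-hours saved versus provisioning for peak all day.

    Capacity is measured in units of peak capacity, so always-on peak
    provisioning costs `hours_per_day` capacity-hours.
    """
    always_peak = hours_per_day
    elastic = peak_hours + (hours_per_day - peak_hours) * offpeak_fraction
    return 1 - elastic / always_peak


# 4 peak hours at full capacity, 20 off-peak hours at an assumed 15% of peak.
saving = autoscaling_savings(peak_hours=4, offpeak_fraction=0.15)
print(f"{saving:.1%}")  # 70.8%
```

Under these assumptions the elastic cluster uses 4 + 20 × 0.15 = 7 capacity-hours instead of 24. The saving shrinks as off-peak load approaches peak load and disappears entirely when the two are equal.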

In addition, TiDB can automatically scale based on your application’s needs to handle unpredictable workloads. For example, no one knows when a product will suddenly become popular. If you let the system scale automatically based on actual load, this may be “life-saving” for a business, since human intervention is often too slow and too late.
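To make load-based scaling concrete, here is a sketch of the proportional rule used by the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of observed load to target load, rounded up. This is not TiDB's own scaling algorithm, which the text does not describe. The function name, the utilization figures, and the replica limits are all assumptions for illustration.

```python
def desired_replicas(current, cpu_percent, target_percent,
                     min_replicas=1, max_replicas=10):
    """HPA-style rule: ceil(current * observed / target), clamped to limits.

    Utilization values are whole-number percentages so that the ceiling
    division below stays in exact integer arithmetic.
    """
    raw = -(-current * cpu_percent // target_percent)  # integer ceiling division
    return max(min_replicas, min(max_replicas, raw))


# Load spike: 3 nodes at 90% CPU with a 60% target, so scale out.
print(desired_replicas(3, 90, 60))  # 5
# Quiet period: 5 nodes at 12% CPU, so scale in.
print(desired_replicas(5, 12, 60))  # 1
```

The clamp matters in both directions: max_replicas caps the cost of a runaway spike, and min_replicas keeps the cluster available when load drops to zero.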

Serverless scaling

Real-time HTAP analytics

In TiDB 4.0, we introduce TiFlash, an extended analytical engine and columnar store for TiDB. A TiDB database that incorporates TiFlash lets you perform real-time HTAP analytics.

Why real-time HTAP analytics?

In today’s world, everyone wants everything to be faster and simpler. But if you still use a database in the traditional way to gain insights from large volumes of data, you can’t meet this “faster, simpler” demand. This is because in the traditional way, you need to go through a series of complicated processes to extract the changing information, events, and logs from the database and then analyze the data. In this process, a long delay often occurs. Working with outdated information can mean poor decisions and economic losses.

TiFlash is seamlessly integrated with TiDB, and it inherits TiDB’s easy-to-maintain characteristics, such as online data definition language (DDL), seamless scalability, and automatic fault tolerance. And TiFlash can be synchronized with the row-store engine automatically in real time.

With TiFlash, TiDB 4.0 can be at least 10 times faster than TiDB 3.0 in scenarios with many complex analytical queries, and you never need to worry about data inconsistency. No matter whether TiFlash processes simple Online Transactional Processing (OLTP) workloads or complex Online Analytical Processing (OLAP) workloads, it always guarantees data consistency and freshness. It can also automatically scale in or out.

A case for real-time HTAP analytics

Now let’s see an example. Look at the architecture diagram below. Almost every company with data at a certain scale has used an architecture like this. I know a TiDB user who once built a complex system like this for a scenario with only dozens of TB of data, just to handle OLTP workloads and run report queries. To do so, they had to connect to Kafka and an extract, transform, load (ETL) tool, reserialize the report query results, and then store the results in a storage system such as HBase. Is there a way to simplify the entire architecture?

A complex architecture

When we recommended TiDB 4.0 to them, they accepted and deployed it in their production environment. As shown in the diagram below, we put TiDB in the middle layer, and the system complexity was greatly reduced.

TiDB’s real-time HTAP architecture

The TiDB HTAP architecture saves costs

From the user’s perspective, it doesn’t matter whether a workload is a long or short query. To save costs, improve development efficiency, and create more value, users just want to get query results as soon as possible and reduce operation complexity as much as possible.

Cloud-native TiDB

We’re thrilled to release the beta version of TiDB Cloud, the fully managed TiDB service delivered by PingCAP. TiDB Cloud is the easiest, most economical, and most resilient way to unlock the full power of TiDB in the cloud, allowing users to deploy and run TiDB clusters with just a few clicks.

Two years ago, we began to develop TiDB Cloud. Today, it can seamlessly “dance in the cloud.”

TiDB Cloud

If you don’t want to install or maintain TiDB, you can try TiDB Cloud. Currently, TiDB Cloud supports two cloud platforms, Amazon Web Services (AWS) and Google Cloud Platform (GCP). If you’re using AWS or GCP, you can easily use TiDB with just a few clicks. It’s truly an “out of the box” solution. We’re also developing TiDB Cloud to make it support other cloud platforms.

Out-of-the-box TiDB Cloud

In TiDB 4.0, we introduce more than 70 new features. To learn more about them, you can read TiDB 4.0 GA, Gearing You Up for an Unpredictable World with a Real-Time HTAP Database.

TiDB Dashboard, a visual troubleshooting tool

TiDB 4.0 introduces TiDB Dashboard, a graphical interface with various built-in widgets that let you easily diagnose, monitor, and manage your clusters. In a single interface, you can check a distributed cluster’s runtime status and manage the cluster.

Even if you’re an inexperienced DBA, you can solve most cluster problems in the graphical interface. In TiDB Dashboard, you can identify hotspots and slow queries in the system and observe application load. With TiDB Dashboard, you can locate most of your system problems within 10 seconds!

TiDB Dashboard

TiDB performance: faster and faster

Performance is always an “exciting” issue. Compared with version 3.0, TiDB 4.0’s TPC-C performance improved by about 50%, and TPC-H performance increased by about 100%. For aggregate queries, compared with version 3.0, TiDB 4.0’s performance improved by 10 times—and in many scenarios even more.

These achievements are attributed to contributions from the TiDB open-source community. At the end of 2019, we launched the TiDB Challenge Program, an ongoing community effort to bring TiDB to a new level of stability, performance, and usability. A total of 165 participants joined this campaign, including 23 teams and 122 individual developers. We’d like to thank them for helping us shape TiDB into a competitive product in the database industry.

TiUP: get a TiDB cluster up in only one minute

Some users told us TiDB could be challenging to install. It could take them from several hours to an entire day to deploy the system. Now, this is about to change. TiDB 4.0 introduces TiUP, a component manager that streamlines installing and configuring a TiDB cluster into a few easy commands. With TiUP, you can get your cluster up in just one minute! To deploy a 15-node production cluster, it takes only 45 seconds. Whatever your need or experience level, TiUP will get your cluster up and running quickly with a minimal learning curve.

curl https://tiup-mirrors.pingcap.com/install.sh | sh && tiup playground nightly --monitor

Security matters!

TiDB security

Not only are more enterprises using TiDB, they are using TiDB in more critical scenarios. There’s a lot of focus on data security, so we provide security features to meet the security and privacy compliance requirements of each country.

In TiDB clusters (including the ecosystem tools), data is encrypted both in-flight and at-rest. Neither PingCAP nor any other cloud vendor can violate the data privacy or security of TiDB users. When TiDB runs in the cloud, no one can see the database, and no one can intercept the data from the communication process.

TiDB 4.0 is ready for production

You might wonder: is TiDB 4.0 really ready for production? Let’s see a real-world case.

TiDB 4.0 in Zhihu

Zhihu, which means “Do you know?” in classical Chinese, is the Quora of China: a question-and-answer website where all kinds of questions are created, answered, edited, and organized by the community of its users. It is China’s biggest knowledge sharing platform.

Last year, Zhihu adopted TiDB in their Moneta application (which stores posts users have already read), and they published a post that showed how they kept their query response times at millisecond levels despite having over 1.3 trillion rows of data.

Recently, Zhihu upgraded to TiDB 4.0. Their cluster has a capacity of 1 PB, and they’ve stored 471 TB of data in the cluster.

I was shocked when I first saw this, not only by the data scale, but also by Zhihu’s confidence in 4.0, which moved me. They upgraded to TiDB 4.0 only four days after the GA release. When we saw this, our confidence grew stronger. TiDB not only supports such a large data scale, but more importantly, it has greatly improved the Moneta application’s computing capability and reduced the system latency.

Reduced latency

As we can see in these diagrams, compared with TiDB 3.0, TiDB 4.0 has reduced latency by 40%. In other words, if Zhihu maintains the same latency as before, they can reduce their costs by 40%.

Why is TiDB so popular?

In the last year, we were often asked, “Why is TiDB so popular?”

The TiDB project’s stars on GitHub

We’re grateful for the success of TiDB around the world and thrilled that our customers find TiDB so valuable. But the credit doesn’t entirely belong to PingCAP. We gladly share it with the open-source community. After all, PingCAP is just part of that community. It’s because of developers around the world, at organizations such as Square, Azure, and Zoom in the United States and Dailymotion in France, who give us their feedback, file pull requests, and contribute code, that we can shape TiDB into what it is today and build TiDB’s active open-source community.

When 4.0 was released, we made a word cloud to show the organizations that contribute to TiDB. We discovered that many organizations continually contribute to the TiDB community:

TiDB contributors’ organizations

At the same time, what surprises me is the community’s creativity. For example, TiDB Contributor Dongpo Liu visualized the top 100 contributors like this:

TiDB’s top 100 contributors

If you want to learn more about TiDB, you can attend online or offline training courses through PingCAP University.

You might be ambitious and want to write a distributed database of your own. No problem. Our Talent Plan program offers courses in how to build a distributed database’s computing and storage layers step by step. There will be tutors from all over the world to help you review your code or tasks.

Bonus: Chaos Mesh®

Finally, let’s talk about our experience in chaos engineering. There is a common understanding in the software field that every foreseeable accident will eventually occur. We must accept that complex systems are unavoidable, and we must do our best to keep them stable and resilient. Today, complexity is not limited to the database. It extends to all parts of the business and ultimately shows up in the quality of service the system provides to users.

Amazon’s and Netflix’s microservices

The figure above graphically represents the connections within Amazon’s and Netflix’s microservices. These connections are actually much more complicated than spider webs. Therefore, we need a system to simulate all possible faults, let the faults happen continuously, and take precautions to enhance the robustness of the system.

Therefore, when we were developing TiDB, we built a system called Chaos Mesh®, a cloud-native Chaos Engineering platform that orchestrates chaos experiments in Kubernetes environments. It features all-around fault injection methods for complex systems in Kubernetes, covering faults in the Pod, the network, the file system, and even the kernel.

For example, Chaos Mesh can simulate a disk failure. In our test environment, we can break the disk every minute and isolate the network every minute. Although this situation rarely occurs in reality, if it does, it causes a catastrophic failure.

Chaos Mesh® designed for cloud-native systems

When we’re developing TiDB, we use Chaos Mesh to test TiDB. TiDB 4.0 received very good feedback from test users, partly due to Chaos Mesh’s “crazy and brutal” tests on TiDB. We invite you to use Chaos Mesh to test and improve your own systems.

Conclusion

TiDB is no longer just a database product. It has become the cornerstone of many systems. Before you use it, you can refer to other people’s experiences or solutions. We’ll be posting more articles about TiDB DevCon 2020, so stay tuned.

You’re welcome to try TiDB, join our community on Slack, and send us your feedback.
