TiSpark: More Data Insights, Less ETL

The motivation behind building TiSpark was to enable real-time analytics on TiDB without the delay and challenges of ETL. Extract, transform, and load (ETL)--a process to extract data from operational databases, transform that data, then load it into a database designed to support analytics--has been one of the most complex, tedious, error-prone, and therefore disliked tasks for many data engineers. However, it was a necessary evil to make data useful, because there haven’t been good solutions on the market to render ETL obsolete--until now.

Ele.me? TiDB At Your Service

With a fast-growing business comes soaring data size, which has placed tremendous pressure on Ele.me’s backend system, especially the database. Tackling the challenges that come with mounting data was a nightmare until we found TiDB, a MySQL-compatible distributed hybrid transactional and analytical processing (HTAP) database, and its distributed key-value storage engine TiKV, both built and supported by PingCAP. Finally, we can harness the power of our data and not be intimidated by it.

How To Spin Up an HTAP Database in 5 Minutes with TiDB + TiSpark

In this 5-minute tutorial for beginners, we will show you how to spin up a standard TiDB cluster using Docker Compose on your local computer, so you can get a taste of its hybrid power, before using it for work or your own project in production.

Implement Raft in Rust

As an open-source distributed scalable HTAP database, TiDB uses the Raft Consensus Algorithm in its distributed transactional key-value storage engine, TiKV, to ensure data consistency, auto-failover, and fault tolerance. TiDB has thus far been used by more than 200 companies in their production environments in a wide range of industries, from e-commerce and food delivery, to fintech, media, gaming, and travel.

TiDB 2.0 is Ready - Faster, Smarter, and Battle-Tested

TiDB 2.0 is released! We absorbed insights and feedback from our customers, listened to requests and issues from our community, and reflected internally on our ultimate vision of building a distributed hybrid transactional and analytical processing database that scales itself, heals itself, and lives in the cloud.

From Chaos to Order -- Tools and Techniques for Testing TiDB, A Distributed NewSQL Database

As an open source distributed NewSQL Hybrid Transactional/Analytical Processing (HTAP) database, TiDB contains the most important asset of our customers--their data. One of the fundamental and foremost requirements of our system is to be fault-tolerant. But how do you ensure fault tolerance in a distributed database? This article covers the top fault injection tools and techniques in Chaos Engineering, as well as how to execute Chaos practices in TiDB.

Blitzscaling the Largest Dockless Bikesharing Platform with TiDB’s Help

Mobike has been using the TiDB database in the production environment since early 2017. Now they have deployed TiDB in multiple clusters with close to 100 nodes, handling dozens of TBs of data for different application scenarios. This post will provide a deep dive on why Mobike chose TiDB over MySQL and its sharding solutions by illustrating how TiDB solves their pain points.

How to do Performance Tuning on TiDB, A Distributed NewSQL Database

Doing performance tuning on distributed systems is no joking matter. It’s much more complicated than on a single-node server, and bottlenecks can pop up anywhere, from system resources in a single node or subcomponent, to cooperation between nodes, to even network bandwidth. Performance tuning is a practice that aims to find these bottlenecks and address them, which in turn reveals further bottlenecks to address, until the system reaches an optimal performance level. In this article, I will share some best practices on how to tune "write" operations in TiDB to achieve maximum performance.

Bringing TiKV to Rust Devroom at FOSDEM 2018

At the crack of dawn on February 1, I landed in Brussels, Belgium, for the first time in my life. The goal of my trip wasn’t to taste the local cuisine, tour world-famous museums, or grab a pint of the local brew. It was to deliver a talk three days later at FOSDEM 2018 Rust Devroom about our experience at PingCAP using Rust to build TiKV, a distributed transactional Key-Value storage engine.

TiDB DevCon 2018 Recap - News, Latest Development, and Roadmap

On January 20th, 2018, more than 200 coders, hackers, and techies streamed into Garage Café, a chic coffee shop in the heart of Beijing’s tech hub, Zhongguancun. They were there to be part of TiDB DevCon 2018, a technology party for the developers, by the developers!

2017 Reflection and Gratitude

Thank you all, our beloved contributors, customers, and partners, for an amazing 2017! Hello, 2018!

Tick or Tock? Keeping Time and Order in Distributed Databases

At re:Invent 2017, Amazon Web Services (AWS) announced the Amazon Time Sync Service, a highly accurate and reliable time reference that is natively accessible from Amazon EC2 instances. It is much like Google's TrueTime, which was published in 2012. Why are both Google and AWS making the effort to provide a global time service? What can this teach us about building distributed databases? These questions are worth thinking about.

PingCAP Plants its Seed in Silicon Valley

PingCAP, a cutting-edge distributed Hybrid Transactional/Analytical Processing (HTAP) database company, is excited to announce the opening of its Silicon Valley office, located at the GSV Labs in Redwood City, California.

A TiKV Source Code Walkthrough – Raft Optimization

Paxos or Raft is frequently used to ensure data consistency in distributed computing. But Paxos is known for its complexity and is rather difficult to understand, while Raft is much simpler. Therefore, many emerging databases use Raft as the consensus algorithm at their bottom layer. TiKV is no exception.

PingCAP Launches TiDB 1.0

TiDB is compatible with MySQL, strongly consistent, and highly available.

Scale the Relational Database with NewSQL

This is the speech Li SHEN gave at the 3rd NEXTCON.

Why did we choose Rust over Golang or C/C++ to develop TiKV?

Every developer has his/her favorite programming language. For the TiKV team members, it's Rust.

RocksDB in TiKV

This is the speech Siddon Tang gave at the RocksDB meetup on August 28, 2017.

Futures and gRPC in Rust

This is the speech Siddon Tang gave at the Bay Area Rust Meetup in August 2017.

How we Hunted a Data Corruption bug in RocksDB

Data was corrupted. A cluster panicked. The crime scene was compromised. What happened? Detective Huang went to great lengths to locate the criminal and solved the case once and for all.

When TiDB Meets Jepsen

What happens when TiDB meets Jepsen?

The Design and Implementation of Multi-raft

The goal of TiKV is to support 100 TB+ of data, which is impossible for a single Raft group to handle. We need to use multiple Raft groups instead, an approach called Multi-raft.

How TiDB tackles fast data growth and complex queries for yuanfudao.com

This document is a use case that details why yuanfudao.com chose TiDB as its backend database solution to tackle its fast data growth and complex queries.

A TiKV Source Code Walkthrough - Raft in TiKV

TiKV uses the Raft algorithm to implement strong consistency of data in a distributed environment. This blog post introduces the details of how Raft is implemented.

TiDB Best Practices

This article summarizes some best practices in using TiDB, mainly including SQL usage, OLAP/OLTP optimization techniques and especially TiDB's exclusive optimization switches.

TiDB Internal (III) - Scheduling

This is the third of three blog posts introducing TiDB internals.

TiDB Internal (II) - Computing

This is the second of three blog posts introducing TiDB internals.

TiDB Internal (I) - Data Storage

This is the first of three blog posts introducing TiDB internals.

Refactoring the Built-in Functions in TiDB

To accelerate expression evaluation, we recently refactored the expression evaluation framework. This tutorial will show you how to use the new computational framework to rewrite or add a built-in function in TiDB.

Rust in TiKV

This is the speech Siddon Tang gave at the 1st Rust Meetup in Beijing on April 16, 2017.

A Brief Introduction of TiDB

This is the speech Edward Huang gave at Percona Live Open Source Database Conference 2017.

Migration from MySQL to TiDB to handle tens of millions of rows of data per day

This document is a use case that details the performance of MySQL and TiDB with tens of millions of rows of data per day.

About the TiDB Source Code

The target audience of this document is the contributors in the TiDB community. The document aims to help them understand the TiDB project. It covers the system architecture, the code structure, and the execution process.

Adding Built-in Functions

The TiDB code has been updated, and the procedure for adding built-in functions is greatly simplified. This document describes how to add built-in functions to TiDB.

Subquery Optimization in TiDB

Subquery optimization, especially rewriting correlated subqueries, is a very difficult part of SQL query optimization. To be compatible with MySQL, TiDB enables users to write subqueries anywhere they want. TiDB evaluates uncorrelated subqueries in advance; for correlated subqueries, TiDB removes the correlations as much as possible. For example, TiDB can rewrite a correlated subquery to `SemiJoin`. This article focuses on introducing the correlated subquery optimization methods in TiDB.
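
To give a rough feel for what rewriting a correlated subquery to a semi join means, here is a minimal Python sketch. It is an illustration only, not TiDB code: the `customers` and `orders` tables, their columns, and both function names are hypothetical, and plain lists stand in for tables. The first function re-runs the subquery for every outer row; the second collects the inner keys once and probes them, which is the general idea behind a hash-based semi join.

```python
# Illustrative only: plain Python lists stand in for tables; this is not TiDB code.
# Query being modeled (hypothetical schema):
#   SELECT * FROM customers c
#   WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)

customers = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cid"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 3},
]


def correlated_exists(outer, inner):
    """Naive plan: scan the inner table again for every outer row."""
    result = []
    for c in outer:
        if any(o["customer_id"] == c["id"] for o in inner):
            result.append(c)
    return result


def semi_join(outer, inner):
    """Decorrelated plan: collect the inner keys once, then probe per outer row.

    Each outer row is emitted at most once, even if it has several matches.
    """
    keys = {o["customer_id"] for o in inner}
    return [c for c in outer if c["id"] in keys]


naive = correlated_exists(customers, orders)
rewritten = semi_join(customers, orders)
```

Both plans return the same rows, but the semi join scans the inner table once instead of once per outer row, which is why removing the correlation matters for performance.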


This document gives an overview of MVCC implementation in TiKV.

Travelling Back in Time and Reclaiming the Lost Treasures

This document introduces the History Read feature in TiDB.

A Deep Dive into TiKV

This document introduces how TiKV works as a Key-Value database.

How we build TiDB

This is the speech Max Liu gave at Percona Live Open Source Database Conference 2016.