Engineering Archives | TiDB

TiDB Performance Hotspots: How to Identify and Fix Issues Using Top SQL

Hotspots are silent performance killers in distributed databases. They rarely trigger alerts — instead, they quietly erode throughput, increase tail latency, and leave your engineering team guessing. If you’re running TiDB at scale and suddenly one TiKV node is running hot while others sit idle, you’re likely facing a hotspot. In this post, we’ll walk […]

Write Latency, Solved: TiKV’s Journey to Smoother Performance

Whether you’re processing thousands of concurrent writes per second or scaling out infrastructure to meet a burst in demand, latency spikes can undermine user experience, reliability, and trust. At PingCAP, we obsess over these details. Our mission is to help teams build and scale confidently on distributed SQL. This post highlights how a subtle performance […]

Optimizing Backup Verification: How to Enhance Performance and Reliability in TiDB

With the release of TiDB 8.5, TiDB BR (Backup & Restore) has made a significant change: Full-table checksum verification is now turned off by default during backups. This update boosts backup efficiency by cutting unnecessary overhead while keeping data integrity intact. In this post, we’ll explain how TiDB has optimized backup verification, the expected performance […]

Accelerating Query Performance: The Benefits of TiDB’s In-Memory Engine (IME)

If you’ve ever struggled with slow queries or high resource consumption in your database, you’re not alone. Many databases, including those built on Multi-Version Concurrency Control (MVCC), face query performance degradation over time. While MVCC is essential for managing concurrent access, excessive historical data scanning can slow down queries and overload system resources — a […]

Time’s Up! How TiDB Efficiently Handles Expired Data

Managing large-scale data efficiently is a critical challenge for modern databases, especially when dealing with time-sensitive data that can quickly become outdated. Introduced in TiDB 6.5 and generally available since TiDB 7.0, TTL (Time To Live) automates the deletion of expired data, offering a powerful, customizable solution for maintaining data freshness while minimizing operational […]

Unleashing 50x Performance: In-Depth Analysis of TiDB DDL Framework Optimizations

Managing schema changes in traditional databases often leads to downtime, blocking, and operational complexity. TiDB has long simplified this process with its online DDL capabilities, allowing developers to evolve their databases without disrupting applications. As user bases and data volumes have surged, however, index creation was increasingly becoming a performance bottleneck. To address this, we […]

Blazing-Fast Cluster Recovery: How TiDB 8.1 Redefines Large-Scale Data Restoration

Backup and restore are critical for ensuring business continuity, with the Recovery Time Objective (RTO) serving as a key metric for assessing restore performance. As TiDB continues to grow in popularity for its scalability, many users have datasets reaching hundreds of terabytes (TBs). That means the challenge of ensuring a fast RTO for such large […]

Navigating Business Growth: How TiDB Scales Petabyte-Level Data Volumes

Business growth is an exciting milestone, bringing more users, transactions, and opportunities to innovate. However, with growth come significant technical challenges. In this blog, we’ll explore how TiDB, an open-source distributed SQL database, addresses these business growth challenges. We’ll also walk through real-world examples from companies such as Bolt and Flipkart. Bolt: A […]

Effective Online DDL: Making Critical Database Schema Changes with Zero Downtime

Online Data Definition Language (DDL) is a crucial feature for modern databases. It allows schema changes without significant downtime or locking that could disrupt database operations. This means these operations are carried out while the database remains available for reads and writes, avoiding disruption to ongoing activities. Online DDL is particularly […]

Multi-Tenant Architecture: Enhancing Database Scalability with TiDB

In the era of cloud computing and Software as a Service (SaaS), it’s essential to optimize resource use and scalability in databases. Multi-tenant architecture meets these needs by allowing a single database instance to serve multiple customers, or tenants. This ensures each tenant’s data remains isolated and secure, leading to enhanced cost efficiency, simplified management, […]

Web3Bench: A New HTAP Benchmark for Web3 Workloads

This blog introduces Web3Bench, a hybrid transaction/analytical processing (HTAP) benchmark that addresses earlier limitations. Web3Bench is based on real-world Web3 use cases that utilize HTAP. Our data model is a simplified version of the decentralized blockchain Ethereum. We leverage a sample data set from Ethereum to build a scale factor-based data generator. The workload in […]

What is Database Sharding? An Architecture Pattern for Increased Database Performance

Database sharding is a data architecture strategy that increases database performance by splitting up data into chunks and then spreading these chunks “intelligently” across multiple database servers (or database instances). These chunks of data are called shards, and each shard contains a subset of the data. Together, all shards represent the entire set of data, and […]