
TiDB is a powerful, open-source distributed SQL database built for handling both transactional and analytical workloads. With its MySQL compatibility, horizontal scalability, and real-time HTAP capabilities, it’s designed for modern, cloud-native architectures. In this distributed SQL tutorial, we’ll walk you through setting up a local TiDB cluster for development and testing in just a few simple steps. 

For a fully-managed experience with a free tier, you can also explore TiDB Cloud Serverless.

Set Up Your Environment

This tutorial demonstrates examples using macOS; they will also work on Linux with minor adaptations.

Install TiUP

TiUP is TiDB’s package manager. Install it with the following command:

curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh

You will see an output like this:

...
Successfully set mirror to https://tiup-mirrors.pingcap.com
Detected shell: zsh
Shell profile:  ~/.zshrc
~/.zshrc has been modified to add tiup to PATH
open a new terminal or source ~/.zshrc to use it
Installed path: ~/.tiup/bin/tiup

===============================================
Have a try:     tiup playground
===============================================

The script automatically modifies your PATH. To apply the changes, use the source command shown in the installation output or open a new terminal. In this example, run:

source ~/.zshrc

Start Your Database Playground

Start your local TiUP playground environment with:

tiup playground

When the playground starts, you’ll see:

...
TiDB Playground Cluster is started, enjoy!

Connect TiDB:    mysql --host 127.0.0.1 --port 4000 -u root
TiDB Dashboard:  http://127.0.0.1:2379/dashboard
Grafana:         http://127.0.0.1:3000

Keep this terminal open and proceed to the next steps in a new terminal.

Connect to TiDB and Run Your First Queries

Let’s connect to the database server and run your first queries using TiUP’s built-in client. While this client only works with TiUP playground environments, it’s the quickest way to get started. Notice that the TiDB Playground output also shows how to connect using the standard MySQL client; we’ll cover that later.

Run the built-in client with:

tiup client

When prompted to choose an endpoint, simply press Enter to connect. Since we started tiup playground without extra arguments, there’s only one endpoint available.

You will see a prompt like this:

...
Starting component client: ~/.tiup/components/client/v1.16.1/tiup-client
Connected with driver mysql (8.0.11-TiDB-v8.4.0)
Type "help" for help.

my:root@127.0.0.1:4000=> 

Then, try these sample SQL commands:

CREATE DATABASE hello;
USE hello;

CREATE TABLE hello_tidb (
    id INT,
    name VARCHAR(50)
);

INSERT INTO hello_tidb VALUES (1, 'Hello World');

SELECT * FROM hello_tidb;

The SQL commands above demonstrate basic interactions with the TiDB database. First, a new database hello and a table hello_tidb are created. Then, a row with id = 1 and name = 'Hello World' is inserted. The SELECT * FROM hello_tidb command retrieves and displays all rows in the table, resulting in the following output:

 id |    name     
----+-------------
  1 | Hello World 
(1 row)

This confirms that the data has been successfully inserted and queried from the database. To exit the TiUP client, type \q or exit and press Enter. This will close the interactive session and return you to the terminal.

Understanding What Just Happened

You just started a fully functional distributed database cluster with:

  • TiDB Server: SQL layer that processes queries and speaks MySQL protocol
  • TiKV: Distributed storage layer that replicates data automatically and handles transactions across nodes
  • PD (Placement Driver): Cluster orchestrator that works behind the scenes to coordinate TiDB, TiKV, and TiFlash components, managing cluster metadata and ensuring data is distributed optimally

Additionally, tiup playground installed optional components:

  • TiFlash: Real-time analytics engine that accelerates complex queries and reporting without slowing down your live application
  • Built-in monitoring stack with Prometheus and Grafana

You can verify your running database cluster components with:

tiup playground display

The command above displays your local cluster nodes, showing the running processes, their roles, and their uptime:

Pid    Role     Uptime
---    ----     ------
56624  pd       2m33.91622925s
56625  tikv     2m33.755937666s
56626  tidb     2m33.73377625s
56646  tiflash  2m25.408932208s

You can clean up the playground environment you just started by simply returning to the terminal where you started tiup playground, and pressing Ctrl+C.

Congratulations! You now have a fully functional distributed database environment at your fingertips. This is a great moment to pause and explore the basics on your own. If you’re eager to dive deeper, let’s continue with this guide: we’ll learn more about TiUP playground, create a sample application and workload, observe workload balancing across your distributed database, scale your cluster, and leverage TiFlash to accelerate your queries.

Enhance Your Development Workflow

The previous steps gave you an ephemeral environment: perfect for first experiments, but everything is lost when you stop the playground. Let’s set up an environment better suited for development and testing.

Connect to TiDB Using MySQL Client

Since TiDB is MySQL compatible, you can also leverage the extensive MySQL ecosystem of tools and connectors. Unlike tiup client, which is restricted to the playground environment, the standard MySQL client is recommended for all TiDB deployment options and for more advanced usage. Let’s set it up now and use it for the rest of this guide.

If you already have MySQL installed, verify it by running:

mysql --version

If you don’t have MySQL installed or need only the client with a specific version, you can install MySQL Client 8.4 on macOS with:

brew install mysql-client@8.4

Note: When installing the MySQL client using Homebrew on macOS, we recommend version 8.4 due to a known issue (not solved at the time of this writing) where newer versions cannot load the mysql_native_password plugin correctly.

After installation, you should see an output similar to this:

...
mysql-client@8.4 is keg-only, which means it was not symlinked into /opt/homebrew, because this formula is mainly used by MySQL server.

If you need to have mysql-client@8.4 first in your PATH, run:
  echo 'export PATH="/opt/homebrew/opt/mysql-client@8.4/bin:$PATH"' >> ~/.zshrc

To make mysql-client available globally, update your PATH as per the instructions above and reload the rc file:

echo 'export PATH="/opt/homebrew/opt/mysql-client@8.4/bin:$PATH"' >> ~/.zshrc

source ~/.zshrc

Once the client is available, make sure your playground cluster is running (if you stopped it earlier, start it again with tiup playground). Then connect to TiDB using:

mysql --host 127.0.0.1 --port 4000 -u root

You will see a prompt like this:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3126853646
Server version: 8.0.11-TiDB-v8.4.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 8.0 compatible

Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

This confirms that the MySQL client is successfully connected to the TiDB cluster. You can re-run SELECT * FROM hello.hello_tidb; to verify that the data you inserted earlier is still there. To exit the MySQL client, type \q or exit and press Enter.

Persistent and Customized Environments

With TiUP you can create tagged playgrounds that persist data between restarts. You can also use playground options to specify different TiDB versions, or components that better meet your development and testing goals. 

If your playground cluster is still running, first stop it by pressing Ctrl+C in the terminal where you started it and wait for all processes to stop. Now we can create, for example, a tagged local environment with 2 TiDB servers, 3 TiKV, 1 PD, and no TiFlash:

tiup playground --tag sample-app-env --db 2 --kv 3 --pd 1 --tiflash 0

The --tag flag preserves your data across playground restarts and lets you run multiple environments under different tags (e.g., sample-app-env, dev-env).

You can list all your playground environments with:

tiup status

You will see an output similar to this:

Name            Component  PID    Status  ... Args
sample-app-env  playground 30661  RUNNING ... tiup-playground --tag sample-app-env --db 2 --kv 3 --pd 1 --tiflash 0

If you stop a tagged playground cluster by pressing Ctrl+C in the terminal where you started it, you can start it again without losing data. To restart the environment, you simply run tiup playground with the same arguments as seen in Args in the tiup status output above. 

Tags also make the clean-up process more explicit. For example, first, stop the playground cluster by pressing Ctrl+C in the terminal where you started it and wait a few seconds for TiUP to gracefully stop all processes, then run:

tiup clean sample-app-env

See all TiUP playground options with:

tiup playground --help

Common options:

  • --*.host: Bind to specific network interface
  • --*.port: Change default ports
  • --without-monitor: Start without monitoring stack

You can even customize existing environments by changing some arguments. This means that you have the flexibility to add and remove cluster components like monitoring, TiFlash, or TiProxy without losing data.

Add a Load Balancer

Before we demonstrate distributed SQL with a sample application, let’s add TiProxy to help us visualize how queries are distributed across TiDB servers. TiProxy will automatically balance our application’s database connections, making it easier to observe distributed query processing in action. 

If your playground cluster is still running, first stop it by pressing Ctrl+C in the terminal where you started it and wait for all processes to stop. Then, to add a TiProxy instance in a new or existing sample-app-env, start the environment with:

tiup playground --tag sample-app-env --db 2 --kv 3 --pd 1 --tiflash 0 --tiproxy 1

You should see the cluster starting normally, with an output like this:

...
TiDB Playground Cluster is started, enjoy!

Connect TiDB:    mysql --comments --host 127.0.0.1 --port 4001 -u root
Connect TiDB:    mysql --comments --host 127.0.0.1 --port 4000 -u root
Connect TiProxy: mysql --comments --host 127.0.0.1 --port 6000 -u root
TiDB Dashboard:  http://127.0.0.1:2379/dashboard
Grafana:         http://127.0.0.1:62007

Note: If tiup playground is taking longer than usual to start or shows errors, check for running instances with tiup status and stop them if needed.

Now you can connect through TiProxy (port 6000):

mysql --host 127.0.0.1 --port 6000 -u root

By adding a TiProxy instance in the environment, you’ll be enhancing TiDB with a new layer. TiProxy sits between your applications and TiDB servers, intelligently distributing database connections and managing traffic to improve scalability and reduce connection overhead. In the next section, we’ll simulate a sample workload and see the SQL queries automatically distributed across the TiDB server nodes.

Build a Sample Scalable Application

Let’s create a simple weather monitoring system that stores data from weather stations worldwide. Using the environment we just created with TiProxy and the MySQL client, connect to TiDB and run the following SQL statements: 

-- Create the schema/database for the weather monitoring app
CREATE DATABASE sample_weather;
USE sample_weather;

CREATE TABLE stations (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    location VARCHAR(100),
    latitude DECIMAL(8,6),
    longitude DECIMAL(9,6)
);

CREATE TABLE readings (
    station_id INT,
    temperature DECIMAL(4,1), -- in Celsius
    humidity INT, -- percentage
    pressure INT, -- in hPa
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_station_time(station_id, recorded_at)
);

You will see a series of Query OK responses. Now insert sample data:

-- Insert some weather stations from different cities
INSERT INTO stations (name, location, latitude, longitude) VALUES
    ('Sydney Harbor', 'Sydney, Australia', -33.8688, 151.2093),
    ('Shibuya Weather', 'Tokyo, Japan', 35.6595, 139.7006),
    ('Beijing Station', 'Beijing, China', 39.9040, 116.4275),
    ('Singapore Central', 'Singapore', 1.3521, 103.8198),
    ('Mumbai Central', 'Mumbai, India', 18.9710, 72.8194),
    ('Cairo Station', 'Cairo, Egypt', 30.0629, 31.2474),
    ('Berlin Central', 'Berlin, Germany', 52.5251, 13.3694),
    ('London City', 'London, UK', 51.5074, -0.1278),
    ('Sao Paulo Central', 'Sao Paulo, Brazil', -23.5505, -46.6333),
    ('Central Park', 'New York, USA', 40.7829, -73.9654);

The response Query OK, 10 rows affected confirms successful data insertion. Now we’re ready to generate sample workloads in the next section.
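As an aside, the DECIMAL(8,6) and DECIMAL(9,6) precisions chosen for the schema comfortably cover the valid latitude (±90) and longitude (±180) ranges. Here is a small, purely illustrative Python check over the coordinates we just inserted:

```python
# The ten stations inserted above: (name, latitude, longitude)
stations = [
    ("Sydney Harbor", -33.8688, 151.2093),
    ("Shibuya Weather", 35.6595, 139.7006),
    ("Beijing Station", 39.9040, 116.4275),
    ("Singapore Central", 1.3521, 103.8198),
    ("Mumbai Central", 18.9710, 72.8194),
    ("Cairo Station", 30.0629, 31.2474),
    ("Berlin Central", 52.5251, 13.3694),
    ("London City", 51.5074, -0.1278),
    ("Sao Paulo Central", -23.5505, -46.6333),
    ("Central Park", 40.7829, -73.9654),
]

# DECIMAL(8,6) stores values up to 99.999999 and DECIMAL(9,6) up to
# 999.999999, so any valid latitude/longitude fits without truncation.
all_in_range = all(
    -90 <= lat <= 90 and -180 <= lon <= 180
    for _, lat, lon in stations
)
```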

Generate Sample Workload

Open two new terminal windows and run the following shell commands to generate write and read workloads through TiProxy.

Terminal 1: Write Workload

Run this command in a new terminal to continuously insert weather readings:

while true; do
    mysql -h 127.0.0.1 -P 6000 -u root -D sample_weather -vv -e "INSERT INTO readings
    (station_id, temperature, humidity, pressure) SELECT id as station_id, ROUND(10 + (RAND() *
    30), 1), ROUND(40 + (RAND() * 60)), ROUND(980 + (RAND() * 40)) FROM stations;"
    sleep 0.5
done

This command continuously inserts simulated readings from all stations every 0.5 seconds. You will see output showing successful insertions like Query OK, 10 rows affected, confirming that readings are being recorded for all 10 stations.
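The randomized values in that INSERT … SELECT can be sketched in plain Python. This is an illustrative stand-in for the SQL RAND() expressions, assuming station ids 1 through 10:

```python
import random

def simulated_reading(station_id):
    # Mirrors the SQL expressions: temperature 10-40 °C,
    # humidity 40-100 %, pressure 980-1020 hPa
    return {
        "station_id": station_id,
        "temperature": round(10 + random.random() * 30, 1),
        "humidity": round(40 + random.random() * 60),
        "pressure": round(980 + random.random() * 40),
    }

# One batch: a reading per station, like the INSERT ... SELECT above
batch = [simulated_reading(sid) for sid in range(1, 11)]
```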

Terminal 2: Read Workload

Run this command in another new terminal to continuously query the data:

while true; do
    mysql -h 127.0.0.1 -P 6000 -u root -D sample_weather -e "SELECT s.name, s.location,
    r.temperature, r.humidity, r.recorded_at FROM stations s JOIN readings r ON s.id = r.station_id
    WHERE r.recorded_at >= NOW() - INTERVAL 5 MINUTE ORDER BY r.recorded_at DESC LIMIT 5;"
    sleep 1
    mysql -h 127.0.0.1 -P 6000 -u root -D sample_weather -e "SELECT s.name,
    ROUND(AVG(r.temperature), 1) as avg_temp, ROUND(AVG(r.humidity)) as avg_humidity,
    COUNT(*) as num_readings FROM stations s JOIN readings r ON s.id = r.station_id WHERE
    r.recorded_at >= NOW() - INTERVAL 1 HOUR GROUP BY s.id, s.name ORDER BY avg_temp DESC;"
    sleep 3
done

This command alternates between showing the most recent readings and hourly statistics. You’ll see tables updating with the latest temperatures, humidity levels, and running averages from all stations.
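The second statistics query groups readings per station and averages them. Its aggregation logic can be mirrored in a few lines of Python over hypothetical in-memory rows (illustrative only; the real data lives in TiDB):

```python
from collections import defaultdict

# Hypothetical sample rows: (station_name, temperature, humidity)
rows = [
    ("Central Park", 21.5, 60),
    ("Central Park", 22.5, 62),
    ("London City", 15.0, 80),
]

# GROUP BY station name, then AVG per group
groups = defaultdict(list)
for name, temp, hum in rows:
    groups[name].append((temp, hum))

stats = sorted(
    (
        (
            name,
            round(sum(t for t, _ in vals) / len(vals), 1),   # avg_temp
            round(sum(h for _, h in vals) / len(vals)),      # avg_humidity
            len(vals),                                       # num_readings
        )
        for name, vals in groups.items()
    ),
    key=lambda row: row[1],
    reverse=True,  # ORDER BY avg_temp DESC
)
```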

Keep both terminals running as we explore monitoring and scaling in the next sections.

Monitor Your Application

Now that you have a running application with active read and write workloads, let’s use TiDB’s built-in monitoring tools to understand what’s happening.

TiDB Dashboard: SQL Insights

Access TiDB Dashboard by checking its URL in the terminal where TiUP playground is running (typically http://127.0.0.1:2379/dashboard – user: root, no password) to:

  1. View your running queries:
    • Go to “SQL Statements” to see your weather data queries
    • Check execution times, and additional optional columns, and click on the statement for more execution details
    • Go to “Slow Queries” to identify queries that might need optimization
  2. Monitor instance status:
    • Check “Overview” and “Cluster Info” for general topology, and resource usage
    • Check “Monitoring” to see the database load and other relevant metrics
    • Check “Search Logs” for any query errors
TiDB Dashboard Overview

Grafana: System Performance

Access Grafana by checking its URL in the terminal where TiUP playground is running – the port may change between restarts (look for something like Grafana: http://127.0.0.1:<port>). Log in with user admin, password admin, and either change the password or click Skip. You can visualize the built-in dashboards by clicking on the Search icon and the folder “Test-Cluster”:

  1. Detailed TiDB metrics:
    • Search for “Overview”, and select the “Test-Cluster-Overview” dashboard for general cluster health status
  2. TiProxy Load Balancing:
    • Search for “TiProxy”, and check the “Balance” row
Built-in Grafana Dashboards

Tip: Keep these monitoring pages open while experimenting with different data volumes or query patterns to understand how your application behaves under load.

Scale Your Cluster

As your weather monitoring system grows, you’ll need to handle more stations, readings, and queries. Let’s scale the cluster while observing the process in real time.

Prepare Monitoring Views

Before scaling, set up these monitoring views to watch the scaling process:

  1. Open TiDB Dashboard:
    • Go to “Cluster Info”, “Store Topology”
  2. Open Grafana:
    • Find the “TiKV-Summary” dashboard
    • Look for the “Cluster” panel:
      • Store, Available, and Capacity size charts

Add Storage Capacity (TiKV)

With monitoring in place, let’s add two additional TiKV nodes online (scale-out). Open a new terminal and run:

tiup playground scale-out --kv 2

Watch the expansion:

  • In TiDB Dashboard, “Cluster Info”, “Store Topology”, click the Refresh button. You should see the new TiKV nodes in the topology.
  • In Grafana, watch the TiKV size increase.
  • In your application terminals, verify continuous data ingestion and reads without any disruption.
TiDB Cluster Store Topology After TiKV Scale-Out
TiDB Cluster Increased Size After TiKV Scale-Out

Remove Storage Capacity (TiKV)

Let’s remove one TiKV node. For these scale-in operations, we specify the exact node we want to remove using its PID. First, in your available terminal, check the TiKV nodes’ PIDs:

tiup playground display

You will see an output similar to:

Pid     Role    Uptime
...
73976  tikv     1m44.587040583s
73977  tikv     1m44.566830708s
...

You can scale in your cluster by removing one TiKV node of your choice using its PID. For example, select one of the TiKV nodes and run the following, replacing the --pid value with your TiKV PID:

tiup playground scale-in --pid 73977

Check TiDB Dashboard and Grafana again, and you will see the cluster is smaller now.

TiDB Cluster Store Topology After TiKV Scale-In
TiDB Cluster Decreased Size After TiKV Scale-In

Scale Query Processing (TiDB)

Now let’s scale the query layer by adding a TiDB node while watching query distribution:

tiup playground scale-out --db 1

The output you will see is similar to this:

... 
To connect new added TiDB: mysql --comments --host 127.0.0.1 --port 64477 -u root -p (no password)

Watch the query layer scaling:

  • TiDB Dashboard, “Cluster Info”, “Instances”. Check the new TiDB instance.
  • Grafana, search for “TiProxy-Summary”, and expand the “Query Summary” or “Traffic”. You should see a new instance in the backend (in this example, with port 64477) and the traffic is being balanced across them.
TiDB Cluster Distributed Query Processing After Scale-Out

Add Analytical Processing (TiFlash)

TiDB supports real-time analytics through its columnar storage engine, TiFlash. Let’s add it to our playground cluster.

Add TiFlash node:

tiup playground scale-out --tiflash 1

Enable a TiFlash replica for the readings table. Using the MySQL client or TiUP client, run:

USE sample_weather;
ALTER TABLE readings SET TIFLASH REPLICA 1;

Check replication progress:

SELECT table_schema, table_name,
    replica_count, available, progress
FROM information_schema.tiflash_replica;

Once the replica is ready, the query returns available = 1 and progress = 1, and analytical queries will automatically use TiFlash when the optimizer finds it beneficial:

+----------------+------------+---------------+-----------+----------+
| table_schema   | table_name | replica_count | available | progress |
+----------------+------------+---------------+-----------+----------+
| sample_weather | readings   |             1 |         1 |        1 |
+----------------+------------+---------------+-----------+----------+
1 row in set (0.00 sec)
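As an illustration, a full-scan aggregation over the readings table, such as the hypothetical report below, is the kind of query the optimizer may choose to execute on TiFlash:

```sql
-- Illustrative analytical query over the sample weather schema
SELECT s.location,
       ROUND(AVG(r.temperature), 1) AS avg_temp,
       MIN(r.pressure) AS min_pressure,
       MAX(r.pressure) AS max_pressure,
       COUNT(*) AS num_readings
FROM stations s
JOIN readings r ON s.id = r.station_id
GROUP BY s.location
ORDER BY num_readings DESC;
```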

Check the TiDB Dashboard’s “SQL Statements” page and its “#Plans” column. If a new plan appears, inspect the execution details to see whether a task was performed by mpp[tiflash].

Query Execution Plan Accelerated by TiFlash

You may not notice much of a performance difference because this is a local test environment with low concurrency. In a production environment with real workloads, TiFlash significantly improves analytical query performance while isolating these resource-intensive operations from your transactional workload. This separation enables real-time analytics without impacting your operational applications.

If you are interested in testing in a production-like environment, see Deploy a TiDB Cluster Using TiUP or Deploy a TiDB Cluster on Kubernetes.

Clean Up (Optional)

If you want to clean up after experimenting:

  1. Stop running workloads (Ctrl+C in Terminal 1 and 2)
  2. Stop the playground cluster (Ctrl+C in the playground terminal)
  3. Clean up the environment:
tiup clean sample-app-env

To clean everything (all environments and components), run:

tiup clean --all

Conclusion

In this distributed SQL tutorial, you’ve learned how to set up and experiment with a fully functional TiDB distributed database environment on your local machine. Using TiUP Playground, you:

  • Gained hands-on experience with TiDB’s core architecture components: TiDB servers for SQL processing, TiKV for distributed storage, and PD for cluster coordination
  • Mastered basic TiUP commands to manage and control your local cluster
  • Set up a development environment
  • Created a sample weather monitoring application
  • Explored monitoring capabilities and real-time scaling
  • Learned how to scale your cluster by adding or removing TiDB and TiKV nodes and load balancing with TiProxy
  • Added analytical processing with TiFlash

You now have the foundation to continue exploring TiDB’s capabilities for both transactional and analytical workloads. 

Next Steps

Continue your learning journey! Whether you’re interested in development, operations, or data analytics, check out our comprehensive resources for additional distributed SQL tutorials, guides, and insights.

Need Help?

  • TiDB AI (tidb.ai) – get instant, accurate answers about any TiDB topic, with cited sources from our documentation and community resources
