Authors: Zhiyuan Liang, Qizhi Wang (Senior Software Engineers at PingCAP)
Editors: Fendy Feng, Tom Dewan
The database is a crucial component for building applications. It stores and analyzes the huge amounts of data coming in from outside tools or generated when people use the application. By retrieving data that fits certain criteria, queries can help you make the best use of that data.
TiDB is an open source, distributed, and highly-scalable NewSQL database. It has an established record in supporting many data-intensive applications. In this tutorial, we’ll use TiDB as an example and show you how to query data from a relational SQL database with SQL or Java.
Before you get started
Before getting started to query data, you need to build a TiDB cluster, import data into this cluster, and then connect to TiDB.
- Build a TiDB cluster. You can either create a local TiDB cluster or a TiDB Cloud free cluster, which is a fully-managed Database-as-a-Service (DBaaS).
- To create a local TiDB cluster, see Deploy a TiDB Cluster Using TiUP.
- To create a TiDB Cloud free cluster, see Build a TiDB Cluster in TiDB Cloud.
- Import data into TiDB. We’ll use the sample data from the Bookshop application, a popular virtual bookstore. If you’ve built a local TiDB cluster, use TiUP to import this data; if you’ve built a TiDB Cloud cluster, just click the Import button on the TiDB Cloud page and follow the instructions that followed to finish the data importing process. For a detailed guide, see Import Table Structures and Data from Bookshop to TiDB.
- Connect to TiDB. TiDB is fully MySQL compatible, so you can choose any SQL client drivers or object relational mapping frameworks to connect to TiDB. For a detailed guide on TiDB connection, see Connect to TiDB.
After you’ve fully prepared, you can start to make queries. In the following sections, we’ll elaborate on how to query data from TiDB via SQL or Java.
Execute a simple query
The Bookshop application dataset includes an authors
table, which stores basic information about their writers. To query author information from TiDB, use the SELECT ... FROM …
SQL statement.
For example, by executing the following SQL statement in a MySQL client, you can get the id
and name
of all the authors stored in TiDB.
SELECT id, name FROM authors;
The results TiDB returns are:
+------------+--------------------------+
| id | name |
+------------+--------------------------+
| 6357 | Adelle Bosco |
| 345397 | Chanelle Koepp |
| 807584 | Clementina Ryan |
| 839921 | Gage Huel |
| 850070 | Ray Armstrong |
| 850362 | Ford Waelchi |
| 881210 | Jayme Gutkowski |
| 1165261 | Allison Kuvalis |
| 1282036 | Adela Funk |
...
| 4294957408 | Lyla Nitzsche |
+------------+--------------------------+
20000 rows in set (0.05 sec)
In Java, to get the same results:
- You should first define how to store the author’s basic information by declaring an
Author
class. - Select the appropriate data type from the Java language according to the data types and value range TiDB supports to store the corresponding data. For example, you can use a variable of:
- Type
Int
to store data of typeint
. - Type
Long
to store data of typebigint
. - Type
Short
to store data of typetinyint
. - Type
String
to store data of typevarchar
.
- Type
public class Author {
private Long id;
private String name;
private Short gender;
private Short birthYear;
private Short deathYear;
public Author() {}
// Skip the getters and setters.
}
3.After connecting to TiDB using the JDBC driver, create a Statement
object with conn.createStatus()
.
4.Call stmt.executeQuery("query_sql")
to initiate a database query request to TiDB.
5.The query results will be stored in a ResultSet
object. With theResultSet
traversed, the returned results can be mapped to the Author
object.
public class AuthorDAO {
// Omit initialization of instance variables...
public List<Author> getAuthors() throws SQLException {
List<Author> authors = new ArrayList<>();
try (Connection conn = ds.getConnection()) {
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("SELECT id, name FROM authors");
while (rs.next()) {
Author author = new Author();
author.setId( rs.getLong("id"));
author.setName(rs.getString("name"));
authors.add(author);
}
}
return authors;
}
}
Filter results
To filter query results, use the WHERE
statement.
For example, by executing the following SQL statement, you can filter the authors born in 1998.
SELECT * FROM authors WHERE birth_year = 1998;
In Java, you can use the same SQL statement to handle data query requests with dynamic parameters. To deal with such queries, we recommend that you use a prepared statement as shown below instead of a normal statement.
public List<Author> getAuthorsByBirthYear(Short birthYear) throws SQLException {
List<Author> authors = new ArrayList<>();
try (Connection conn = ds.getConnection()) {
PreparedStatement stmt = conn.prepareStatement("""
SELECT * FROM authors WHERE birth_year = ?;
""");
stmt.setShort(1, birthYear);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
Author author = new Author();
author.setId( rs.getLong("id"));
author.setName(rs.getString("name"));
authors.add(author);
}
}
return authors;
}
Sort results
To sort results from tables stored in TiDB, you can use the ORDER BY
statement.
For example, by executing the SQL statement below, you can sort all authors in the authors
table in descending order (DESC
) according to their birth year.
SELECT id, name, birth_year
FROM authors
ORDER BY birth_year DESC;
You can also use the ORDER BY
statement in Java:
public List<Author> getAuthorsSortByBirthYear() throws SQLException {
List<Author> authors = new ArrayList<>();
try (Connection conn = ds.getConnection()) {
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("""
SELECT id, name, birth_year
FROM authors
ORDER BY birth_year DESC;
""");
while (rs.next()) {
Author author = new Author();
author.setId(rs.getLong("id"));
author.setName(rs.getString("name"));
authors.add(author);
}
}
return authors;
}
In both cases, TiDB returns:
+-----------+------------------------+------------+
| id | name | birth_year |
+-----------+------------------------+------------+
| 83420726 | Terrance Dach | 2000 |
| 57938667 | Margarita Christiansen | 2000 |
| 77441404 | Otto Dibbert | 2000 |
| 61338414 | Danial Cormier | 2000 |
| 49680887 | Alivia Lemke | 2000 |
| 45460101 | Itzel Cummings | 2000 |
| 38009380 | Percy Hodkiewicz | 2000 |
| 12943560 | Hulda Hackett | 2000 |
| 1294029 | Stanford Herman | 2000 |
| 111453184 | Jeffrey Brekke | 2000 |
...
300000 rows in set (0.23 sec)
Limit the number of query results
To limit the number of query results, you can use the LIMIT
statement.
For example, to get the id
and name
of the 10 youngest authors and display them in descending order of their birth year, execute the following SQL statement:
SELECT id, name, birth_year
FROM authors
ORDER BY birth_year DESC
LIMIT 10;
You can also use the LIMIT
statement in Java:
public List<Author> getAuthorsWithLimit(Integer limit) throws SQLException {
List<Author> authors = new ArrayList<>();
try (Connection conn = ds.getConnection()) {
PreparedStatement stmt = conn.prepareStatement("""
SELECT id, name, birth_year
FROM authors
ORDER BY birth_year DESC
LIMIT ?;
""");
stmt.setInt(1, limit);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
Author author = new Author();
author.setId(rs.getLong("id"));
author.setName(rs.getString("name"));
authors.add(author);
}
}
return authors;
}
In both cases, TiDB returns:
+-----------+------------------------+------------+
| id | name | birth_year |
+-----------+------------------------+------------+
| 83420726 | Terrance Dach | 2000 |
| 57938667 | Margarita Christiansen | 2000 |
| 77441404 | Otto Dibbert | 2000 |
| 61338414 | Danial Cormier | 2000 |
| 49680887 | Alivia Lemke | 2000 |
| 45460101 | Itzel Cummings | 2000 |
| 38009380 | Percy Hodkiewicz | 2000 |
| 12943560 | Hulda Hackett | 2000 |
| 1294029 | Stanford Herman | 2000 |
| 111453184 | Jeffrey Brekke | 2000 |
+-----------+------------------------+------------+
10 rows in set (0.11 sec)
Using the LIMIT
statement cut the query time by more than 50%—from 0.23 s to 0.11 s.
Aggregate queries
To better understand the overall data situation, you can combine the GROUP BY
statement with an aggregate function like the COUNT
function to aggregate query results.
For example, to know how many authors were born in each year and which year has the most authors, execute the following statement:
SELECT birth_year, COUNT (DISTINCT id) AS author_count
FROM authors
GROUP BY birth_year
ORDER BY author_count DESC;
You can also use the Java Group BY
statement and COUNT
function:
public class AuthorCount {
private Short birthYear;
private Integer authorCount;
public AuthorCount() {}
// Skip the getters and setters.
}
public List<AuthorCount> getAuthorCountsByBirthYear() throws SQLException {
List<AuthorCount> authorCounts = new ArrayList<>();
try (Connection conn = ds.getConnection()) {
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("""
SELECT birth_year, COUNT(DISTINCT id) AS author_count
FROM authors
GROUP BY birth_year
ORDER BY author_count DESC;
""");
while (rs.next()) {
AuthorCount authorCount = new AuthorCount();
authorCount.setBirthYear(rs.getShort("birth_year"));
authorCount.setAuthorCount(rs.getInt("author_count"));
authorCounts.add(authorCount);
}
}
return authorCount;
}
In both cases, TiDB returns:
+------------+--------------+
| birth_year | author_count |
+------------+--------------+
| 1932 | 317 |
| 1947 | 290 |
| 1939 | 282 |
| 1935 | 289 |
| 1968 | 291 |
| 1962 | 261 |
| 1961 | 283 |
| 1986 | 289 |
| 1994 | 280 |
...
| 1972 | 306 |
+------------+--------------+
71 rows in set (0.00 sec)
In addition to the COUNT
function, TiDB also supports other aggregate functions. For more information, see Aggregate (GROUP BY) Functions.
Wrapping Up
In this tutorial, you’ve learned how to query data from TiDB with SQL statements or Java commands. I hope you find this guide helpful. If you have any questions, contact us through Twitter or our Slack channel. You can also join TiDB Internals to share your thoughts and feedback.
If you want to know more developer guides related to TiDB or TiDB Cloud, see TiDB Developer Guide or TiDB Cloud Developer Guide.
Keep reading:
Building a Web Application with Spring Boot and TiDB
Using Retool and TiDB Cloud to Build a Real-Time Kanban in 30 Minutes
Analytics on TiDB Cloud with Databricks
Want to explore TiDB without installing any software? Go to TiDB Playground
Experience modern data infrastructure firsthand.
TiDB Cloud Dedicated
A fully-managed cloud DBaaS for predictable workloads
TiDB Cloud Serverless
A fully-managed cloud DBaaS for auto-scaling workloads