Over the past decade, MySQL 5.7 has been the go-to open-source database for many enterprises and developers. However, with Oracle ceasing support for MySQL 5.7 this month, many users are confronted with a critical decision: upgrade to MySQL 8.0 or miss out on essential support and updates.
MySQL 8.0 not only introduces a slew of exciting features such as window functions, Common Table Expressions (CTEs), and enhanced JSON support but also offers superior performance via refined indexing and optimized query execution.
TiDB is an advanced open-source, distributed SQL database that has embraced the MySQL ecosystem since its inception. TiDB is wire compatible with MySQL’s protocol and syntax commands, which means that MySQL clients, MySQL drivers, and some of MySQL’s utilities can run directly on TiDB. For the vast majority of applications running on MySQL, there is little to no code modification required.
In alignment with the advancements in MySQL 8.0, TiDB 7.4, released on Oct 12, has officially extended MySQL compatibility beyond MySQL 5.7 and embraced many of MySQL 8.0’s standout features. This enhanced compatibility significantly streamlines the migration process for MySQL 8.0 applications.
This post will dive into key MySQL 8.0 features that are now supported by TiDB 7.4 Development Milestone Release (DMR):
- Common Table Expressions (CTE)
- Window Functions
- Resource Management
- Multi-valued Indexes
- Hint SET_VAR()
Common Table Expressions
Common Table Expressions, or CTEs, provides a mechanism to define temporary named result sets that can be reused multiple times within a single SQL statement. By using CTEs, you can enhance the readability and maintainability of intricate SQL queries and potentially boost their performance.
TiDB has supported CTEs since version v5.0, aligned with the ANSI SQL 99 standard and its recursive syntax. In TiDB 7.4, this compatibility also extends to TiFlash, the columnar storage engine of TiDB.
In the following example, let’s consider two tables: authors (containing information about authors) and book_authors (recording the relationship between author IDs and the book IDs).
SQL
mysql> desc authors;
+------------+--------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+------+---------+-------+
| id | bigint(20) | NO | PRI | NULL | |
| name | varchar(100) | NO | | NULL | |
| gender | tinyint(1) | YES | | NULL | |
| birth_year | smallint(6) | YES | | NULL | |
| death_year | smallint(6) | YES | | NULL | |
+------------+--------------+------+------+---------+-------+
mysql> desc book_authors;
+-----------+------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------+------+------+---------+-------+
| book_id | bigint(20) | NO | PRI | NULL | |
| author_id | bigint(20) | NO | PRI | NULL | |
+-----------+------------+------+------+---------+-------+
To illustrate the advantages of using CTEs in SQL, let’s construct a query that retrieves the number of books written by each of the 50 oldest authors:
SQL
mysql> WITH top_50_eldest_authors_cte AS (
-> SELECT a.id, a.name, (IFNULL(a.death_year, YEAR(NOW())) - a.birth_year) AS age
-> FROM authors a
-> ORDER BY age DESC
-> LIMIT 50
-> )
-> SELECT
-> ANY_VALUE(ta.id) AS author_id,
-> ANY_VALUE(ta.age) AS author_age,
-> ANY_VALUE(ta.name) AS author_name,
-> COUNT(*) AS books
-> FROM top_50_eldest_authors_cte ta
-> LEFT JOIN book_authors ba ON ta.id = ba.author_id
-> GROUP BY ta.id;
+-----------+------------+----------------------+-------+
| author_id | author_age | author_name | books |
+-----------+------------+----------------------+-------+
| 524470241 | 80 | Alexie Kirlin | 7 |
| 67511645 | 80 | Bridgette Tromp | 9 |
...
| 48355494 | 80 | Audrey Mayert | 7 |
+-----------+------------+----------------------+-------+
50 rows in set (0.23 sec)
The query provides a tabulated result showing the author_id, age, name, and the total number of books written by each of the 50 oldest authors. The result is efficiently achieved in a time of 0.23 seconds.
This query exemplifies the power and utility of CTEs in managing and processing complex datasets while maintaining a structured and readable SQL statement.
Window Functions
Window functions, frequently referred to as analytic functions, play a pivotal role in data analytics, aggregation, and sorting processes. Tasks such as data grouping, sorting, and identifying trends often require the use of window functions. These functions facilitate SQL users in addressing complex data processing challenges without resorting to cumbersome self-joins or subqueries.
TiDB 7.4 provides full support for MySQL 8.0’s window functions. Most of these functions can be pushed down to TiFlash. With TiFlash’s architecture tailored for real-time analytics, using window functions in TiDB 7.4 offers users a powerful combination of SQL expressiveness and execution efficiency.
Resource Management
Effective resource management is crucial in ensuring the optimal performance and stability of large-scale databases. With the introduction of resource management in v7.1, TiDB took significant strides in improving the allocation and utilization of cluster resources. The benefits include:
- Effectively minimizing the impact of application load changes from other applications
- Addressing the impact of batch jobs and background tasks on the core business
- Mitigating unexpected SQL performance issues that slow down the entire cluster,
While the implementation mechanisms may differ between TiDB and MySQL, TiDB ensures compatibility with MySQL’s syntax for defining resource groups and associated hints. This means that users familiar with MySQL can seamlessly transition to TiDB without a steep learning curve or additional migration costs.
Furthermore, TiDB’s approach to resource isolation provides a robust mechanism to control vital I/O resources. This capability often results in performance enhancements, sometimes even surpassing what’s achievable in MySQL.
The following example demonstrates how to implement resource control for a user usr1, ensuring they don’t exceed 500 Requests Units (RUs) per second.
- Estimate cluster capacity.
mysql> CALIBRATE RESOURCE
- Create app1 resource group with a resource quota of 500 requests/second.
mysql> CREATE RESOURCE GROUP IF NOT EXISTS app1 RU_PER_SEC = 500;
- Associate user user1 with resource group app1 to impose the resource limits on this user.
mysql> ALTER USER usr1 RESOURCE GROUP app1;
You can also modify the resource group of the session:
mysql> SET RESOURCE GROUP `app1`;
Or, you can use hint RESOURCE_GROUP() to specify the resource group for a statement:
mysql> SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t limit 10;
Enhanced utf8mb4 Character Set
One of the noteworthy changes in MySQL 8.0 is the transition of the default character set to the all-encompassing utf8mb4. In addition, the default collation is changed to utf8mb4_0900_ai_ci. TiDB 7.4 has added utf8mb4_0900_ai_ci collation to make system migration easier.
TiDB’s dedication to compatibility is evident through its support for the MySQL-compatible variable default_collation_for_utf8mb4. This feature provides the flexibility to adjust the default sorting methodology of the utf8mb4 character set, creating a seamless transition between distinct MySQL versions. Consequently, TiDB users can effortlessly customize their database settings in alignment with specific application requirements.
When migrating to TiDB, the collation setting largely depends on your originating MySQL version:
- If you are migrating from MySQL 8.0, set the default sorting method for utf8mb4 to utf8mb4_0900_ai_ci:
set global default_collation_for_utf8mb4='utf8mb4_0900_ai_ci';
- If you are migrating from MySQL 5.7, set the default sorting method of utf8mb4 to utf8mb4_general_ci:
set global default_collation_for_utf8mb4='utf8mb4_general_ci';
Multi-valued Indexes
Multi-valued Index in MySQL 8.0 is a powerful feature that redefines how developers index and access JSON arrays. At its core, a Multi-valued Index is an evolved form of the traditional index structure. What makes it particularly striking is its capability to index an array of JSON types. This unique capability facilitates rapid retrieval of JSON data, thereby enhancing query performance and efficiency.
TiDB 7.4 offers full support for Multi-valued Indexes identical to MySQL’s implementation. Key functions like MEMBER OF(), JSON_CONTAINS(), and JSON_OVERLAPS() seamlessly integrate with Multi-value indexes, which greatly simplifies the querying process.
For teams considering migrating to TiDB, this support ensures a hassle-free transition. There’s no need to reconfigure data modeling or adjust applications. Users can retain their existing practices and continue manipulating JSON data with methods they’re accustomed to.
For example, consider a scenario with a customer info table that encapsulates all the details within a JSON type column. The column employs an array structure to capture the various cities where a customer might reside.
Without using multi-valued index, the query requires a full table scan if we want to retrieve customers located in Beijing:
SELECT name FROM customer
WHERE 'beijing' MEMBER OF $.city;
After we create a multi-value index for the city array, the above query can use the index to quickly pinpoint all records that match:
ALTER TABLE customers add index idx_city (name, (CAST(custinfo->'$.city' AS char(20) ARRAY)))
Note: Due to various reasons, the database’s query optimizer might not automatically leverage a Multi-valued Index. In such cases, you can use the optimizer hint USE_INDEX() or USE_INDEX_MERGE() to “nudge” the optimizer to utilize the specified indexes.
Hint SET_VAR()
SET_VAR ()
is a versatile hint for fine-tuning the execution of specific SQL statements without affecting global or session variables. TiDB 7.4 supports this hint, which allows users to easily customize system variables for specific SQL statements. This provides granular control, ensuring that changes don’t inadvertently impact other operations.
For example, when analyzing and processing large tables, it is possible to increase the concurrency of SQL execution with the hint /*+ set_var(tidb_executor_concurrency=20 */
:
SELECT /*+ set_var(tidb_executor_concurrency=20) */
l_orderkey,
SUM(
l_extendedprice * (1 - l_discount)
) AS revenue,
o_orderdate,
o_shippriority
FROM
customer,
orders,
lineitem
WHERE
c_mktsegment = 'BUILDING'
AND c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND o_orderdate < DATE '1996-01-01'
AND l_shipdate > DATE '1996-02-01'
GROUP BY
l_orderkey,
o_orderdate,
o_shippriority
ORDER BY
revenue DESC,
o_orderdate
limit 10;
You can use the hint /*+ set_var(tidb_isolation_read_engines='tidb,tiflash') */
to force the query you just made to select TiFlash while leaving other queries unchanged:
SELECT /*+ set_var(tidb_isolation_read_engines='tidb,tiflash') */
l_orderkey,
SUM(
l_extendedprice * (1 - l_discount)
) AS revenue,
o_orderdate,
o_shippriority
FROM
customer,
orders,
lineitem
WHERE
c_mktsegment = 'BUILDING'
AND c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND o_orderdate < DATE '1996-01-01'
AND l_shipdate > DATE '1996-02-01'
GROUP BY
l_orderkey,
o_orderdate,
o_shippriority
ORDER BY
revenue DESC,
o_orderdate
limit 10;
Refer to TiDB documentation for more details.
Migrate to TiDB for a Scalable, Cloud-native MySQL Alternative
In summary, TiDB 7.4 further reinforces its position as a powerful database solution that maintains high compatibility with both MySQL 5.7 and 8.0. By supporting the latest features in MySQL 8.0, TiDB provides users with a comprehensive platform that combines advanced capabilities with familiar functionality. This consistent commitment to compatibility and innovation makes TiDB an appealing choice for organizations looking to leverage the best of both database worlds.
We encourage you to test your MySQL 8.0 workloads with TiDB and let us know if you have any compatibility issues. Our team is continually improving TiDB’s MySQL compatibility and performance with each new release. For instructions on how to migrate, check out our MySQL to TiDB migration guide.
Experience modern data infrastructure firsthand.
TiDB Cloud Dedicated
A fully-managed cloud DBaaS for predictable workloads
TiDB Cloud Serverless
A fully-managed cloud DBaaS for auto-scaling workloads