
PingCAP is an enterprise-grade software service provider committed to delivering an open-source, cloud-native, one-stop database for growth-oriented clients. PingCAP is a certified AWS Qualified Software Partner.

TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. Designed for the cloud, TiDB provides flexible scalability, reliability, and security on the cloud platform. Cloud service users can easily implement remote disaster recovery over TiDB, further improving business continuity and availability.

TiDB Cloud, the fully-managed database service for TiDB, is also available on AWS.

TiDB with AWS: maximizing the value of data for cloud service users

With more and more TiDB users building their technology stacks on the cloud, the combined solution of TiDB + AWS allows cloud service users to easily implement remote disaster recovery on TiDB, further improving business continuity and availability.

TiDB’s cloud-native design combined with AWS’s professional cloud services enables users to automate scaling on the cloud. Through close collaboration between the two parties, TiDB can reach more customers more quickly, while continuing to hone its products to meet customers’ needs for digital transformation.

Currently, TiDB supports more than 60 types of Amazon EC2 instances on AWS, with 24/7 enterprise-grade support that includes:

  • Online ticketing and telephone support that cover product, database operation, and maintenance-related issues
  • Media download of products, product upgrade packages, hotfixes, and patches
  • Product security-related alarms and notifications
  • Dedicated emergency response support channel for P1-grade faults

In addition to the products and services, the ecosystems of TiDB and AWS are also well integrated to maximize the value of data.

Success story with TiDB + AWS from Xiaohongshu

Xiaohongshu is a popular social media and e-commerce platform in China. The Xiaohongshu app allows users to post and share product reviews, travel blogs, and lifestyle stories via short videos and photos. Xiaohongshu receives more than 100 million rows of data every day.

A growing business—and growing challenges

For data reports, Xiaohongshu previously used a Hadoop data warehouse to pre-aggregate the data, then aggregated the high-dimensional data and stored it in MySQL for querying. However, with the rapid growth of their business, the types of reports became more diverse. The scalability of MySQL was also a challenge.

The sharded MySQL database solution introduced high complexity and maintenance difficulty, making it hard to run complex and distributed queries.

In the anti-fraud data analytics scenario, the traditional data warehouse in T+1 mode delayed the time to insight. They needed a database solution with real-time analytical processing capabilities.

The TiDB HTAP solution

To solve the challenges mentioned above, Xiaohongshu introduced the TiDB HTAP solution into their application architecture.

In the data report scenario, TiDB replaced MySQL. The horizontal scalability of TiDB solved the complicated problem of scaling out MySQL as the business grew.

For complex queries, Xiaohongshu replicated MySQL data to TiDB via Binlog in real time and merged the tens of thousands of sharded tables into one large table in TiDB. Complicated queries, ACID transactions, and join operations could be done in real time on TiDB without affecting the primary MySQL database.
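As a rough illustration of what merging sharded tables into one large table involves, the sketch below routes a replicated row from a shard to a single target table. It is not taken from Xiaohongshu's setup or from any TiDB tool: the `<base>_<digits>` shard naming convention, the function names, and the added `shard_id` column are all assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Hypothetical naming convention: shards are named <base>_<digits>, e.g. orders_0042.
SHARD_PATTERN = re.compile(r"^(?P<base>[a-z][a-z0-9_]*?)_(?P<shard>\d+)$")


def merged_table_name(source_table: str) -> Optional[str]:
    """Return the merged target table for a shard, or None if the name is not a shard."""
    match = SHARD_PATTERN.match(source_table)
    return match.group("base") if match else None


def route_row(source_table: str, row: dict) -> Optional[Tuple[str, dict]]:
    """Map a replicated row to (target_table, row), tagged with its source shard.

    Recording the shard number keeps rows distinguishable when different
    shards reuse the same auto-increment IDs (an assumption of this sketch).
    """
    match = SHARD_PATTERN.match(source_table)
    if match is None:
        return None
    tagged = dict(row, shard_id=int(match.group("shard")))
    return match.group("base"), tagged
```

For instance, a row replicated from `orders_0042` would be routed to `orders` with `shard_id` set to 42, while a table name without a numeric suffix is left unrouted.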

For anti-fraud analytical processing, Xiaohongshu replaced the T+1 write mode with real-time writes using SQL statements in Apache Flink. Their peak queries per second (QPS) can reach 30,000 to 40,000. A single table may receive about 500 million rows of data per day. By bypassing the Hadoop data warehouse and running real-time queries in TiDB, an analyst can see the coupon distribution status within a few minutes.

Xiaohongshu aggregates other data into a data lake built upon Amazon S3 and EMR, and then loads it into a TiDB cluster for unified and efficient operational analysis.
