
Plaid is a financial services company that builds a data transfer network for applications to connect with users’ bank accounts. At HTAP Summit 2024, Zander Hill, Experienced Software Engineer at Plaid, and Andrew Chen, Engineering Manager at Plaid, shared an in-depth look into their company’s ambitious migration from Amazon Aurora to a distributed SQL database alternative.

In this blog, we’ll recap the major takeaways from their keynote presentation. We’ll explore the reasons behind Plaid’s migration, the challenges they faced, and the benefits realized since adopting distributed SQL. 

The Need for Change: Plaid’s Database Scaling Challenges

Plaid manages vast amounts of sensitive data, which necessitates a robust, scalable, and highly available database infrastructure. Initially, the company relied on Amazon Aurora, managing over 800 database servers to support its data-intensive workloads. However, as Plaid scaled to handle over 500,000 queries per second (QPS) across its infrastructure, its system began to encounter limitations.

Hill outlined that managing over 300 MySQL clusters became increasingly cumbersome, especially when performing upgrades or scaling operations. Plaid’s team had to dedicate significant engineering resources to maintaining database performance, particularly as they approached the limits of Amazon Aurora’s write throughput and scalability. The inability to make online schema changes on large tables, which ranged from 2 to 10 terabytes (TB), further slowed development and reduced engineering velocity.

Plaid Experienced Software Engineer Zander Hill on stage during his keynote at HTAP Summit 2024.

Plaid’s key pain points included:

  • Database Scalability Limits: Amazon Aurora’s inability to scale write operations efficiently led to performance bottlenecks.
  • High Maintenance Burden: Database maintenance consumed significant time, with upgrades taking up to six months of engineering effort.
  • Developer Velocity: Engineers struggled with slow schema changes, which delayed feature releases and impacted customer satisfaction.

Finding a Suitable Amazon Aurora Alternative

Plaid’s decision to migrate to distributed SQL was driven by three core objectives: improving reliability at scale, enhancing developer productivity, and reducing maintenance overhead. After evaluating viable alternatives such as Google Spanner, Plaid selected TiDB, an advanced distributed SQL database, for its MySQL compatibility, horizontal scalability, and ability to support distributed transactions without sacrificing performance.

Chen explained that TiDB’s distributed SQL architecture scales horizontally, which lets Plaid absorb unpredictable spikes in demand without taking systems offline. TiDB also supports online schema changes, so developers can modify large tables without impacting performance. These capabilities addressed Plaid’s critical pain points and provided a clear path forward.

Plaid Engineering Manager Andrew Chen on stage during his keynote at HTAP Summit 2024.

Plaid’s top reasons for choosing TiDB were:

  • Zero-Downtime Scaling: TiDB provided non-disruptive scaling and automated failover, essential for Plaid’s mission-critical workloads.
  • MySQL Compatibility: TiDB ensured smooth integration with existing systems without extensive rewrites.
  • Distributed Transactions: TiDB enabled consistent and reliable performance across a distributed environment.

Plaid’s Migration to an Amazon Aurora Alternative

The migration from Amazon Aurora to TiDB began in early 2023 and is expected to be complete by mid-2025. Plaid adopted a phased migration approach to minimize risks, starting with non-critical services before moving to high-throughput applications.

Some of the key steps in the company’s migration were:

  1. Identifying Compatibility Issues: Before migrating data, Plaid’s team focused on fixing common incompatibilities involving primary keys, foreign keys, and auto-increment IDs.
  2. Data Synchronization: Using tools like TiDB Lightning and TiCDC, Plaid synchronized data between Amazon Aurora and TiDB to ensure consistency during the transition.
  3. Validation and Testing: A blue-green-red deployment strategy was used to validate performance and correctness before fully switching over to TiDB.
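
As a rough illustration of the validation step, the sketch below compares two copies of a table by hashing rows in fixed-size chunks and reporting which chunks differ. It is not Plaid's tooling: the function names, the in-memory rows, and the chunk size are all assumptions made for the example, and a real check would read rows from Amazon Aurora and TiDB directly.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# Assumption: a row is a tuple of column values whose first item is the primary key.
Row = Tuple


def chunk_checksums(rows: Iterable[Row], chunk_size: int) -> Dict[int, str]:
    """Hash rows in primary-key order and return one digest per chunk."""
    digests: Dict[int, str] = {}
    hasher = hashlib.sha256()
    count = 0
    for index, row in enumerate(sorted(rows, key=lambda r: r[0])):
        hasher.update(repr(row).encode("utf-8"))
        count += 1
        if count == chunk_size:
            # A full chunk is complete: record its digest and start a new one.
            digests[index // chunk_size] = hasher.hexdigest()
            hasher = hashlib.sha256()
            count = 0
    if count:
        # Record the final, partially filled chunk.
        digests[len(digests)] = hasher.hexdigest()
    return digests


def mismatched_chunks(
    source: Iterable[Row], target: Iterable[Row], chunk_size: int = 1000
) -> List[int]:
    """Return the chunk numbers whose digests differ or exist on one side only."""
    src = chunk_checksums(source, chunk_size)
    dst = chunk_checksums(target, chunk_size)
    return sorted(k for k in src.keys() | dst.keys() if src.get(k) != dst.get(k))
```

Comparing per-chunk digests rather than one digest for the whole table means a mismatch points to a small key range, so only that range needs a row-by-row follow-up.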

Plaid used feature flags to perform cutovers, switching traffic between Amazon Aurora and TiDB with minimal downtime so that customers experienced no disruptions during the migration. The team still ran into some challenges along the way:

  • Query Performance Issues: Some queries optimized for Amazon Aurora required adjustments to perform efficiently on TiDB.
  • Tooling and Ecosystem: Plaid encountered difficulties with open-source tools related to binary primary keys and timestamp columns. This required close collaboration with TiDB’s support team to resolve.
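
To make the feature-flag cutover mentioned earlier more concrete, here is a minimal sketch of one way such a flag could route traffic. The talk recap does not describe Plaid's flag system, so the `CutoverFlag` class, the percentage-based rollout, and the backend names are hypothetical choices for illustration only.

```python
import hashlib


class CutoverFlag:
    """Hypothetical flag holding the share of traffic (0-100) sent to TiDB."""

    def __init__(self, tidb_percent: int = 0) -> None:
        self.set_percent(tidb_percent)

    def set_percent(self, tidb_percent: int) -> None:
        if not 0 <= tidb_percent <= 100:
            raise ValueError("tidb_percent must be between 0 and 100")
        self.tidb_percent = tidb_percent


def bucket_for(request_key: str) -> int:
    """Map a key (for example, a user ID) to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(flag: CutoverFlag, request_key: str) -> str:
    """Route a request to 'tidb' or 'aurora' based on the current flag value."""
    return "tidb" if bucket_for(request_key) < flag.tidb_percent else "aurora"
```

Because the bucket is derived from a hash of the key, a given key always lands on the same backend for a given flag value. Raising the percentage only moves additional keys to TiDB, and setting it back to 0 returns all traffic to Amazon Aurora.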

Plaid’s Results: Improvements in Database Scalability and Efficiency

Since adopting TiDB, Plaid has seen significant improvements in system performance and operational efficiency. Maintenance operations, such as scaling and configuration changes, can now be performed without taking services offline, leading to better uptime and reduced latency. 

The key benefits achieved were:

  • Reduced Maintenance Effort: The shift to TiDB reduced the operational burden, freeing up engineering resources to focus on innovation.
  • Improved Uptime: Maintenance operations are now non-disruptive, with zero downtime for most upgrades.
  • Cost Savings: By reducing the complexity of managing multiple MySQL clusters, Plaid has achieved cost efficiencies while improving system reliability.

Hill and Chen shared that TiDB’s multi-tenancy capabilities also enabled Plaid to optimize resource usage. This ensured workloads were effectively isolated to prevent noisy neighbors from impacting performance.

Conclusion: Scaling for the Future with an Amazon Aurora Alternative

Plaid’s journey to TiDB illustrates the transformative power of adopting a distributed SQL database for scaling modern SaaS applications. By moving away from the limitations of Amazon Aurora and embracing TiDB’s distributed SQL architecture, Plaid was able to overcome its scalability challenges, enhance developer productivity, and reduce operational complexity.

TiDB has enabled Plaid to modernize its data systems, ensuring that the company can support its rapid growth while maintaining high availability and performance. For organizations facing similar challenges, TiDB is an ideal Amazon Aurora alternative that balances scalability, reliability, and cost efficiency. As Plaid continues to migrate its remaining workloads, its experience offers a strong example of how other enterprises can use distributed SQL to future-proof data infrastructure.

Want to gain even more strategies for accelerating distributed SQL adoption within your own organization? Register to watch this entire keynote from the event for additional insights. Happy viewing!

