When building applications for global users, it is common for users to make requests to servers located far from the region where the application is deployed. This often leads to a poor user experience due to network latency. Therefore, reducing latency is vital in such scenarios.

OSS Insight, our open-source GitHub project insight platform, faces the same challenge. To improve the experience for our global audience, we explored a solution that combines Vercel’s Edge Functions with TiDB Serverless, a fully managed, MySQL-compatible Database as a Service (DBaaS). This approach aims to significantly reduce request latency for users worldwide.

This post details how we migrated the API Server from an EC2 instance to an Edge Function and, as a result, reduced global latency by 80%.

Figure 1. Latency Comparison

Maximizing Website Efficiency by Reducing Latency

Edge Functions operate on the edge node closest to the user, minimizing latency arising from cross-region network requests. By significantly reducing response times, we can deliver analyzed data to our users more efficiently, ensuring a seamless browsing experience on our site. 

Figure 2. Edge Function’s Advantage

However, moving to the Edge solution was not a seamless process. We faced the following key challenges:

  • TCP-based database driver does not work on Edge
  • Edge nodes may be far from the database

Challenge 1: TCP-based Database Driver Does Not Work on Edge

OSS Insight uses TiDB Serverless as its underlying database. Previously, we used node-mysql2 as the database driver and Prisma as the ORM (Object-Relational Mapping) Framework for database connectivity. 

However, Edge environments such as Cloudflare Workers and Vercel Edge Functions only provide the HTTP-based Fetch API; a TCP socket API is not available. This means that traditional database drivers built on the TCP protocol (for instance, mysql.js/mysql2) will not work, and fail with an error like the following:

Server Error
Error: Module not found: Can't resolve 'net'

Solutions for Database Connectivity in Edge Environments

After some investigation, we identified the following options to address the connectivity challenge:

  • Solution: Node.js drivers adaptation to edge runtime
    Note: Cloudflare Workers (upon which Vercel Edge Function is based) provides a new sockets API for establishing TCP connections on Edge environments. However, this requires the traditional database driver to adapt to the cloudflare:sockets API. The MySQL ecosystem doesn’t have an adapted driver yet.
    Options: node-mysql2 (WIP). Engineers from PingCAP are leading the adaptation effort: sidorares/node-mysql2#2179

  • Solution: Data proxy from ORM vendors
    Note: ORM vendors provide a proxy server with built-in connection pooling. You can indirectly access the database by connecting to the proxy server via HTTP on the Edge environment through the ORM Client. This approach may introduce additional cost and maintenance burden.
    Options: Prisma Accelerate (GA)

  • Solution: Serverless driver from database vendors
    Note: Serverless database vendors provide developers with a Fetch API-compatible database driver to execute raw SQL, which could be used on Edge runtime. Serverless Driver connects to the database via HTTPS protocol, which is more efficient than the MySQL protocol over TCP. Database vendors usually provide this service for free to developers.
    Options: TiDB Serverless Driver
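
To make the serverless-driver idea more concrete, here is a minimal sketch of how a SQL statement could be packaged as an HTTPS request that a Fetch-only runtime can send. The endpoint URL, the bearer token, the JSON body shape, and the helper name buildSqlRequest are all hypothetical and chosen for illustration; this post does not describe the actual wire format used by the TiDB Serverless Driver.

```typescript
// Hypothetical description of an HTTP request that carries one SQL statement.
interface SqlHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a request that a Fetch-only runtime could send.
// NOTE: the endpoint, the auth scheme, and the body shape are assumptions.
function buildSqlRequest(
  endpoint: string,
  token: string,
  sql: string,
  params: unknown[] = [],
): SqlHttpRequest {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    // Parameters travel separately from the SQL text.
    body: JSON.stringify({ query: sql, params }),
  };
}

const sqlRequest = buildSqlRequest(
  "https://db.example.com/v1/sql", // placeholder host, not a real service
  "example-token",
  "SELECT COUNT(*) AS stars FROM github_events WHERE repo_name = ?",
  ["pingcap/tidb"],
);
console.log(sqlRequest.method, sqlRequest.url);
```

In an edge function, a descriptor like this could be passed to fetch(sqlRequest.url, { method, headers, body }), which needs no TCP socket support from the runtime.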

After careful consideration, especially to avoid additional costs, we ultimately decided to utilize the TiDB Serverless Driver to connect to the TiDB Serverless database.

Connect with Serverless Driver

Follow these steps to establish a connection with TiDB Serverless using the serverless driver:

  1. Install the package:
    npm install @tidbcloud/serverless --save
  2. Write the code for the edge function.

    Use the connect function from the serverless package to establish a connection with the database, then execute SQL statements. Here’s an example using Next.js:

    import { NextRequest, NextResponse } from 'next/server';
    import { connect } from '@tidbcloud/serverless';

    // Run this route on the Edge runtime instead of Node.js.
    export const runtime = 'edge';

    // The serverless driver connects over HTTPS, so no TCP socket is needed.
    const conn = connect({
      url: process.env.DATABASE_URL,
    });

    export async function GET(req: NextRequest) {
      // Count the WatchEvent records (stars) of the pingcap/tidb repository.
      const rows = await conn.execute(`
        SELECT COUNT(*) AS stars
        FROM github_events
        WHERE type = 'WatchEvent' AND repo_name = 'pingcap/tidb';
      `);
      return NextResponse.json(rows);
    }

Validate the Endpoint

To validate the setup, we can simply send a test request:

curl http://localhost:3000/api/queries/repository/stars

The connection is working if we receive a response like the following:

[
    {
      "stars": 35171
    }
]
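
The same check can be automated. The following is a minimal sketch, assuming the endpoint returns exactly the JSON shape shown above; the helper name parseStarsResponse is hypothetical and not part of any library.

```typescript
// Shape of one row returned by the example endpoint.
interface StarsRow {
  stars: number;
}

// Parse the endpoint's JSON body and return the star count.
// Throws if the body does not match the expected shape.
function parseStarsResponse(body: string): number {
  const data: unknown = JSON.parse(body);
  if (!Array.isArray(data) || data.length !== 1) {
    throw new Error("expected an array with exactly one row");
  }
  const row = data[0] as Partial<StarsRow> | null;
  if (row === null || typeof row !== "object" || typeof row.stars !== "number") {
    throw new Error("expected a numeric 'stars' field");
  }
  return row.stars;
}

// The sample body mirrors the response shown above.
const sampleBody = '[{"stars": 35171}]';
const starCount = parseStarsResponse(sampleBody);
console.log(starCount); // prints 35171
```

A test script could fetch the endpoint, pass the response text to this helper, and fail fast when the shape changes.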

Challenge 2: Edge Nodes May be Far from Database

Deploying the API Server and database within the same region or Virtual Private Cloud (VPC) typically reduces network latency between these components. However, this configuration can introduce higher latency for users located globally, far from the regional API Server.

Edge Functions solve this challenge by operating on an entirely different principle. They execute at nodes nearest to the user, significantly minimizing latency. 

While this approach effectively reduces user-facing latency, it introduces a new challenge: the edge function may be located far from the database. Consequently, this distance can increase the latency between the application and its database.

Figure 3. Edge Nodes May be Far from Database
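
The trade-off can be illustrated with a rough latency model. This is only a sketch: the round-trip times below are hypothetical values picked for illustration, not measurements from OSS Insight, and the model ignores query execution time.

```typescript
// Rough model: one user-to-compute round trip plus one
// compute-to-database round trip per sequential query.
function estimateLatencyMs(
  userToComputeMs: number,
  computeToDbMs: number,
  sequentialQueries: number,
): number {
  return userToComputeMs + computeToDbMs * sequentialQueries;
}

// Hypothetical numbers for a user located far from the database region.
// Regional server: slow hop from the user, database next to the server.
const regionalServerMs = estimateLatencyMs(200, 2, 3);
// Edge function without caching: fast hop from the user, database far away.
const edgeNoCacheMs = estimateLatencyMs(10, 200, 3);

console.log(regionalServerMs); // prints 206
console.log(edgeNoCacheMs); // prints 610
```

Under these assumed numbers, an uncached edge function that issues several sequential queries can end up slower than a regional server, which is why the cross-region database hop needs to be addressed.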

Enable Edge Cache

To address the challenge of balancing latency between edge functions and databases, we can leverage HTTP cache control technology and the Edge Cache feature. We can significantly improve efficiency by caching the responses from edge functions close to the end users. This approach ensures that users in the same region receive responses directly from the edge cache when they make identical requests, effectively minimizing cross-region database queries.
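
To show why identical requests from the same region stop reaching the database, here is a small in-memory simulation of a per-region cache with a TTL. It is a sketch for illustration only and does not reflect how Vercel’s Edge Cache is implemented; the class name, the region labels, and the explicit clock parameter are assumptions.

```typescript
// Minimal in-memory TTL cache keyed by region and URL.
// A simulation for illustration, not Vercel's Edge Cache.
class RegionalTtlCache {
  private readonly ttlSeconds: number;
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  constructor(ttlSeconds: number) {
    this.ttlSeconds = ttlSeconds;
  }

  // Return the cached value if it is still fresh; otherwise call `load`
  // (the expensive origin query) and store its result.
  get(region: string, path: string, nowSeconds: number, load: () => string): { value: string; hit: boolean } {
    const key = `${region}|${path}`;
    const entry = this.entries.get(key);
    if (entry !== undefined && nowSeconds < entry.expiresAt) {
      return { value: entry.value, hit: true };
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: nowSeconds + this.ttlSeconds });
    return { value, hit: false };
  }
}

let originQueries = 0;
const queryOrigin = (): string => {
  originQueries += 1; // stands in for a cross-region database query
  return '[{"stars":35171}]';
};

const edgeCache = new RegionalTtlCache(3600);
const starsPath = "/api/queries/repository/stars";

const firstFra = edgeCache.get("fra1", starsPath, 0, queryOrigin); // miss: first request in this region
const secondFra = edgeCache.get("fra1", starsPath, 10, queryOrigin); // hit: same region, same request
const firstSfo = edgeCache.get("sfo1", starsPath, 10, queryOrigin); // miss: different region
const expiredFra = edgeCache.get("fra1", starsPath, 3600, queryOrigin); // miss: TTL has elapsed

console.log(firstFra.hit, secondFra.hit, firstSfo.hit, expiredFra.hit); // prints false true false false
console.log(originQueries); // prints 3
```

Only three of the four requests reach the origin here; with real traffic, most identical requests within the TTL would be served from the cache.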

Following the code sample in the previous section, we can adjust the response headers to enable Edge Cache:

  1. Set a Time-to-Live (TTL) of 60 seconds for clients.
  2. Configure downstream Content Delivery Networks (CDNs) with a TTL of 300 seconds.
  3. Apply a TTL of 3600 seconds for Vercel’s Edge Cache.

-     return NextResponse.json(rows);
+     return NextResponse.json(rows, {
+       status: 200,
+       headers: {
+         'Cache-Control': 'max-age=60',
+         'CDN-Cache-Control': 'max-age=300',
+         'Vercel-CDN-Cache-Control': 'max-age=3600',
+       },
+     });

For details of the headers above, please check CDN-Cache-Control Header.
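
If several routes need the same policy, the three headers can be produced by a small helper. This is a minimal sketch: the helper name buildCacheHeaders and the CacheTtls shape are hypothetical, while the header names and TTL values are the ones used in this post.

```typescript
// TTLs, in seconds, for each caching layer.
interface CacheTtls {
  clientSeconds: number; // browsers and other clients
  cdnSeconds: number; // downstream CDNs
  vercelSeconds: number; // Vercel's Edge Cache
}

// Build the three cache headers from the given TTLs.
// Rejects negative or non-integer values, which are not valid for max-age.
function buildCacheHeaders(ttls: CacheTtls): Record<string, string> {
  for (const seconds of [ttls.clientSeconds, ttls.cdnSeconds, ttls.vercelSeconds]) {
    if (!Number.isInteger(seconds) || seconds < 0) {
      throw new Error(`invalid TTL: ${seconds}`);
    }
  }
  return {
    "Cache-Control": `max-age=${ttls.clientSeconds}`,
    "CDN-Cache-Control": `max-age=${ttls.cdnSeconds}`,
    "Vercel-CDN-Cache-Control": `max-age=${ttls.vercelSeconds}`,
  };
}

const cacheHeaders = buildCacheHeaders({
  clientSeconds: 60,
  cdnSeconds: 300,
  vercelSeconds: 3600,
});
console.log(cacheHeaders["Vercel-CDN-Cache-Control"]); // prints max-age=3600
```

The resulting object can be passed as the headers option of NextResponse.json, in place of the inline literal.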

Validate the Endpoints

To assess the impact of our caching strategy, we first disable caching in the browser’s DevTools. This step ensures that our tests accurately simulate a user’s initial experience, without the influence of data already cached by the browser. Next, we replicate typical user browsing behavior to trigger the API requests.

Figure 4. Latency Comparison on Dashboard

In the Network panel of DevTools, we observed a noticeable decrease in network request latency after adopting the new caching strategy.

Conclusion

By leveraging the capabilities of the TiDB Serverless Driver and Edge Cache, we can establish a seamless connection to the database, initiate queries within the Edge Function environment, and cache the query result on globally distributed CDN nodes.

The result is a significant reduction in end-to-end latency of incoming global requests, providing our users with a much smoother and more enjoyable experience.

This approach also works well on Cloudflare Workers. For details, see the docs page Integrate TiDB Cloud with Cloudflare Workers.

