
Introduction

TiDB Serverless, part of TiDB Cloud, now includes built-in vector search integrated into a MySQL-compatible database. This simplifies database operations and makes it easier to develop AI applications. This post is a guide to using TiDB Serverless through LangChain: it walks through creating a cluster, connecting to it, and storing and loading data with TiDBLoader.

Prerequisites

Please make sure you have created a TiDB Serverless cluster with vector support enabled.

Join the waitlist for the private beta of built-in vector search in TiDB Serverless.


1. Sign up for TiDB Cloud.

2. Follow this tutorial to create a TiDB Serverless cluster with vector support enabled.

3. Navigate to the Clusters page, and then click the name of your target cluster to go to its overview page.

4. Click Connect in the upper-right corner.

5. In the connection dialog, select General from the Connect With dropdown and keep the default Endpoint Type setting of Public.

6. If you have not set a password yet, click Create password to generate a random password.

7. Save the connection parameters to a safe place. You will need them to connect to the TiDB Serverless cluster in the following examples.

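Before using TiDBLoader, install the required dependencies. Run the following command in a notebook to install LangChain together with the other packages the examples below import (langchain-community for TiDBLoader, SQLAlchemy, and the PyMySQL driver named in the connection string):

%pip install --upgrade --quiet langchain langchain-community pymysql sqlalchemy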

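Next, configure the connection to your TiDB cluster. Start from the connection string template provided by TiDB Cloud and replace the <USER>, <HOST>, and <DB> placeholders with the connection parameters you saved earlier. The code below prompts for the password at runtime, so it is never stored in the notebook, and substitutes it for <PASSWORD>.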

import getpass

# Replace placeholders with your TiDB credentials
tidb_connection_string_template = "mysql+pymysql://<USER>:<PASSWORD>@<HOST>:4000/<DB>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"
tidb_password = getpass.getpass("Input your TiDB password:")
tidb_connection_string = tidb_connection_string_template.replace(
    "<PASSWORD>", tidb_password
)
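
The snippet above substitutes only the password, and a password containing characters such as @ or / would break the URL unless it is percent-encoded. The following is a minimal sketch of one way to fill in every placeholder safely using only the standard library. The helper name build_tidb_connection_string and the sample values are hypothetical; they are not part of TiDB Cloud or LangChain.

```python
from urllib.parse import quote_plus

# Same template as above; the helper fills in every <...> placeholder.
TEMPLATE = (
    "mysql+pymysql://<USER>:<PASSWORD>@<HOST>:4000/<DB>"
    "?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"
)


def build_tidb_connection_string(user: str, password: str, host: str, db: str) -> str:
    """Fill in the template, percent-encoding the user name and password."""
    return (
        TEMPLATE.replace("<USER>", quote_plus(user))
        .replace("<PASSWORD>", quote_plus(password))
        .replace("<HOST>", host)
        .replace("<DB>", db)
    )


# Hypothetical sample values, for illustration only.
example = build_tidb_connection_string(
    "demo.root", "p@ss/word", "gateway.example.com", "test"
)
print(example)
```

The ssl_ca path in the template depends on your operating system, so adjust it if /etc/ssl/cert.pem does not exist on your machine.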

Store and Load Data with TiDB

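Next, load data from TiDB with TiDBLoader. The example below creates a sample table, inserts three rows, and reads them back as LangChain documents. The loader takes a connection string and a SQL query; page_content_columns selects the columns that make up each document's content, and metadata_columns selects the columns stored as document metadata.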

from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Connect to the TiDB database
engine = create_engine(tidb_connection_string)
metadata = MetaData()
table_name = "test_tidb_loader"

# Create a sample table for demonstration
test_table = Table(
    table_name,
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(255)),
    Column("description", String(255)),
)
metadata.create_all(engine)

# Populate the sample table with dummy data
# engine.begin() commits on success and rolls back if an exception is raised
with engine.begin() as connection:
    connection.execute(
        test_table.insert(),
        [
            {"name": "Item 1", "description": "Description of Item 1"},
            {"name": "Item 2", "description": "Description of Item 2"},
            {"name": "Item 3", "description": "Description of Item 3"},
        ],
    )

# Import TiDBLoader from LangChain
from langchain_community.document_loaders import TiDBLoader

# Set up TiDBLoader to retrieve data
loader = TiDBLoader(
    connection_string=tidb_connection_string,
    query=f"SELECT * FROM {table_name};",
    page_content_columns=["name", "description"],
    metadata_columns=["id"],
)

# Load data from TiDB using TiDBLoader
documents = loader.load()

# Display the loaded documents
for doc in documents:
    print("-" * 30)
    print(f"content: {doc.page_content}\nmetadata: {doc.metadata}")

# Clean up: drop the sample table
test_table.drop(bind=engine)
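
To make the roles of page_content_columns and metadata_columns concrete, here is a plain-Python approximation of how a row could be turned into a document. It assumes the selected content columns are joined as "column: value" lines and the metadata columns are copied into a dictionary. The function row_to_document is hypothetical and is not TiDBLoader's actual implementation; it needs no database connection.

```python
from typing import Any, Dict, List, Tuple


def row_to_document(
    row: Dict[str, Any],
    page_content_columns: List[str],
    metadata_columns: List[str],
) -> Tuple[str, Dict[str, Any]]:
    """Split one result row into (page_content, metadata)."""
    # Assumption: content columns are rendered as "column: value", one per line.
    page_content = "\n".join(f"{col}: {row[col]}" for col in page_content_columns)
    # Assumption: metadata columns are copied as-is into a dictionary.
    metadata = {col: row[col] for col in metadata_columns}
    return page_content, metadata


# Rows shaped like the sample table used above.
rows = [
    {"id": 1, "name": "Item 1", "description": "Description of Item 1"},
    {"id": 2, "name": "Item 2", "description": "Description of Item 2"},
]

for row in rows:
    content, metadata = row_to_document(row, ["name", "description"], ["id"])
    print("-" * 30)
    print(f"content: {content}\nmetadata: {metadata}")
```

Keeping identifiers such as id in the metadata rather than in the content keeps the document text focused on what a model should read, while still letting you trace each document back to its source row.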

Expanding AI Application Examples with TiDB Serverless Vector Storage

TiDB Serverless vector storage goes beyond traditional data management: combined with AI models, it can serve as the data layer for AI applications. Stay tuned for further advancements!
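
As a rough illustration of the idea behind vector search, the sketch below ranks a few stored embeddings by cosine distance to a query embedding in plain Python. Cosine distance is used here as one common metric, chosen as an assumption for the example. The three-dimensional vectors and document names are made up; a real application would get embeddings from a model and let the database perform the search.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float], store: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k stored items closest to the query, nearest first."""
    scored = [(name, cosine_distance(query, vec)) for name, vec in store.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Made-up embeddings, for illustration only.
store = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}

for name, distance in nearest([1.0, 0.0, 0.0], store):
    print(f"{name}: {distance:.4f}")
```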

Conclusion

In this tutorial, we explored how to use TiDB Serverless and LangChain to store and load data from a MySQL-compatible database: creating a cluster, building a connection string, populating a sample table, and reading its rows into LangChain documents with TiDBLoader. Stay tuned for more tutorials and updates on database management and AI application development.


Last updated May 20, 2024
