In the rapidly evolving landscape of machine learning and database technologies, combining the strengths of different tools can lead to innovative solutions. One such powerful combination is using Jina AI’s embedding capabilities with TiDB’s vector search functionality. This blog will guide you through building a semantic cache service using Jina AI Embeddings and TiDB Vector.
What is a Semantic Cache?
A semantic cache stores the results of expensive queries and reuses them when the same or similar queries are made. This type of cache uses semantic understanding rather than exact key matching, making it particularly useful in applications requiring natural language processing or similar complex data retrieval tasks.
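Before diving into the implementation, the core idea fits in a few lines of plain Python. The sketch below uses toy 3-dimensional vectors and a hand-picked threshold purely for illustration; the real service built later uses 768-dimensional Jina AI embeddings and TiDB's built-in vector search:

```python
import math

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity; 0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

def lookup(cache, query_vec, max_distance=0.1):
    # Find the cached entry whose key vector is closest to the query,
    # and return its value only if it is within max_distance
    best = min(cache, key=lambda entry: cosine_distance(entry["key_vec"], query_vec))
    if cosine_distance(best["key_vec"], query_vec) <= max_distance:
        return best["value"]
    return None

# Toy 3-dimensional "embeddings" for illustration only
cache = [{"key_vec": [1.0, 0.0, 0.0], "value": "cached answer"}]
print(lookup(cache, [0.99, 0.05, 0.0]))  # near-duplicate query: cache hit
print(lookup(cache, [0.0, 1.0, 0.0]))    # unrelated query: miss (None)
```

A semantically similar query lands close to the cached key in vector space and hits the cache, while an unrelated query exceeds the threshold and misses.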
Why Jina AI and TiDB?
- Jina AI: Provides robust embedding capabilities, converting text into high-dimensional vectors that capture semantic meaning.
- TiDB Vector: Extends the TiDB database to support efficient vector operations, enabling fast similarity searches on high-dimensional data.
Setting Up the Environment
Prerequisites
Ensure you have the following installed:
- Python 3.8 or higher
- A TiDB Serverless cluster set up and running
- An API key from Jina AI
Step-by-Step Implementation
1. Configuration

First, set up your environment configuration. Create a `.env` file to store your database URI and TTL (Time to Live) settings:

```dotenv
DATABASE_URI=mysql+pymysql://<username>:<password>@<host>:<port>/<database>?ssl_mode=VERIFY_IDENTITY&ssl_ca=/etc/ssl/cert.pem
TIME_TO_LIVE=604800 # Default is 1 week (in seconds)
```
2. Install Required Libraries

Install the necessary Python packages (uvicorn is included so you can run the FastAPI app later):

```shell
pip install fastapi uvicorn requests sqlmodel sqlalchemy python-dotenv tidb-vector
```
3. Define the Cache Model

Use SQLModel to define your cache model, incorporating a vector field and automatic timestamping:
```python
import os
from datetime import datetime
from typing import Optional

from dotenv import load_dotenv
from sqlmodel import SQLModel, Field, Column, DateTime, String, Text
from sqlalchemy import func
from tidb_vector.sqlalchemy import VectorType

load_dotenv()

TIME_TO_LIVE = int(os.getenv('TIME_TO_LIVE', '604800'))


class Cache(SQLModel, table=True):
    __table_args__ = {
        # Set the TTL (Time to Live) for cache entries; TiDB removes expired rows automatically
        'mysql_TTL': f'created_at + INTERVAL {TIME_TO_LIVE} SECOND',
    }

    id: Optional[int] = Field(default=None, primary_key=True)
    key: str = Field(sa_column=Column(String(255), unique=True, nullable=False))
    key_vec: Optional[list[float]] = Field(
        sa_column=Column(
            VectorType(768),  # 768 dimensions, matching the Jina AI embedding model
            default=None,
            comment="hnsw(distance=l2)",  # Build an HNSW (Hierarchical Navigable Small World) vector index
            nullable=False,
        )
    )
    value: Optional[str] = Field(sa_column=Column(Text))
    created_at: datetime = Field(
        sa_column=Column(DateTime, server_default=func.now(), nullable=False)
    )
    updated_at: datetime = Field(
        sa_column=Column(DateTime, server_default=func.now(), onupdate=func.now(), nullable=False)
    )
```
4. Create the Database Engine

Create the engine from the database URI and create the schema:

```python
from sqlmodel import create_engine

DATABASE_URI = os.getenv('DATABASE_URI')

# Create the engine using the database URI
engine = create_engine(DATABASE_URI)

# Create all tables in the database
SQLModel.metadata.create_all(engine)
```
5. FastAPI Setup

Set up the FastAPI application and the endpoints for setting and getting cache entries. Note that parameters without defaults (the request body and path parameters) must come before the `Depends(...)` parameters, and misses are reported with `HTTPException` so the client actually receives a 404 status:

```python
from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlmodel import Session, select

# Initialize FastAPI app
app = FastAPI()
security = HTTPBearer()


@app.post("/set")
def set_cache(
    cache: Cache,
    credentials: HTTPAuthorizationCredentials = Depends(security),
):
    # Generate embeddings for the given key using Jina AI
    cache.key_vec = generate_embeddings(credentials.credentials, cache.key)

    with Session(engine) as session:
        session.add(cache)
        session.commit()

    return {'message': 'Cache has been set'}


@app.get("/get/{key}")
def get_cache(
    key: str,
    max_distance: Optional[float] = 0.1,
    credentials: HTTPAuthorizationCredentials = Depends(security),
):
    # Generate embeddings for the given key using Jina AI
    key_vec = generate_embeddings(credentials.credentials, key)

    # Cap the distance threshold at 0.3
    max_distance = min(max_distance, 0.3)

    with Session(engine) as session:
        result = session.exec(
            select(
                Cache,
                Cache.key_vec.cosine_distance(key_vec).label('distance')
            ).order_by(
                'distance'
            ).limit(1)
        ).first()

        if result is None:
            raise HTTPException(status_code=404, detail="Cache not found")

        cache, distance = result
        if distance > max_distance:
            raise HTTPException(status_code=404, detail="Cache not found")

        return {
            "key": cache.key,
            "value": cache.value,
            "distance": distance
        }
```
6. Generate Embeddings

Implement a function that fetches embeddings from the Jina AI Embeddings API:

```python
import requests
from dotenv import load_dotenv

load_dotenv()


def generate_embeddings(jinaai_api_key: str, text: str):
    JINAAI_API_URL = 'https://api.jina.ai/v1/embeddings'
    JINAAI_HEADERS = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {jinaai_api_key}'
    }
    JINAAI_REQUEST_DATA = {
        'input': [text],
        'model': 'jina-embeddings-v2-base-en'  # This model produces 768-dimensional embeddings
    }
    response = requests.post(JINAAI_API_URL, headers=JINAAI_HEADERS, json=JINAAI_REQUEST_DATA)
    response.raise_for_status()  # Surface HTTP errors (e.g. an invalid API key) immediately

    # Extract and return the embedding from the response
    return response.json()['data'][0]['embedding']
```
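Because the `key_vec` column is declared with exactly 768 dimensions, an embedding of any other size will fail at insert time. A small defensive check can surface a model/schema mismatch with a clearer error; this `validate_embedding` helper is a hypothetical addition, not part of the example repo:

```python
def validate_embedding(vec, expected_dim=768):
    # Reject embeddings whose dimensionality doesn't match the vector column definition
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    return vec

validate_embedding([0.0] * 768)  # passes
```

You could call it on the result of `generate_embeddings` before assigning it to `cache.key_vec`.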
How to Use This App
Prerequisites
- A running TiDB Serverless cluster with vector search enabled
- Python 3.8 or later
- A Jina AI API key
Run the example
1. Clone this repo

```shell
git clone https://github.com/pingcap/tidb-vector-python.git
```
2. Create a virtual environment

```shell
cd tidb-vector-python/examples/semantic-cache
python3 -m venv .venv
source .venv/bin/activate
```
3. Install dependencies

```shell
pip install -r requirements.txt
```
4. Set the environment variables

Get the `HOST`, `PORT`, `USERNAME`, `PASSWORD`, and `DATABASE` values from the TiDB Cloud console, as described in the [Prerequisites](../README.md#prerequisites) section. Then set the following environment variable:

```shell
export DATABASE_URI="mysql+pymysql://<USERNAME>:<PASSWORD>@<HOST>:<PORT>/<DATABASE>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"
```

Alternatively, create a `.env` file containing the same variable.
5. Run this example

Start the semantic cache server:

```shell
uvicorn cache:app --reload
```
6. Test the API

Get the Jina AI API key from the Jina AI Embedding API page, and save it somewhere safe for later use.

POST `/set`

```shell
curl --location ':8000/set' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your jina token>' \
--data '{
    "key": "what is tidb",
    "value": "tidb is a mysql-compatible and htap database"
}'
```

GET `/get/<key>`

```shell
curl --location ':8000/get/what%27s%20tidb%20and%20tikv?max_distance=0.5' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your jina token>'
```
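Note that the cache key must be URL-encoded when it appears in the GET path (hence the `%27` and `%20` above). If you build requests programmatically, Python's standard library handles this:

```python
from urllib.parse import quote

# Encode a natural-language key for use in the /get/{key} path
key = "what's tidb and tikv"
path = f"/get/{quote(key)}"
print(path)  # -> /get/what%27s%20tidb%20and%20tikv
```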
Conclusion
By combining Jina AI’s powerful embedding capabilities with TiDB’s efficient vector operations, you can build a robust semantic cache service. This service is ideal for applications requiring fast, intelligent caching and retrieval of semantically similar data. Start experimenting with this setup to explore its full potential in your projects.
More Demos
The following examples show how to use tidb-vector-python to interact with TiDB Vector in different scenarios.
- OpenAI Embedding: use the OpenAI embedding model to generate vectors for text data, store them in TiDB Vector, and search for similar text.
- Image Search: use the OpenAI CLIP model to generate vectors for image and text, store them in TiDB Vector, and search for similar images.
- LlamaIndex RAG with UI: use LlamaIndex to build a RAG (Retrieval-Augmented Generation) application.
- Chat with URL: use LlamaIndex to build a RAG (Retrieval-Augmented Generation) application that can chat with a URL.
- GraphRAG: 20 lines of code using TiDB Serverless to build a Knowledge Graph based RAG application.
- GraphRAG Step by Step Tutorial: a step-by-step tutorial for building a Knowledge Graph based RAG application with a Colab notebook. In this tutorial, you will learn how to extract knowledge from a text corpus, build a Knowledge Graph, store it in TiDB Serverless, and search it.
- Vector Search Notebook with SQLAlchemy: use SQLAlchemy to interact with TiDB Serverless: connect to the database, index and store data, and then search vectors.
- Build RAG with Jina AI Embeddings: use Jina AI to generate embeddings for text data, store the embeddings in TiDB Vector Storage, and search for similar embeddings.
Happy coding!