Landing Your First Rust Pull Request in TiKV
This guide is intended to show how you can land your first Pull Request (PR) in Rust to contribute to TiKV in less than 30 minutes. But before we do that, here's some helpful background.
TiDB ("Ti" = Titanium) is an open-source, distributed, scalable Hybrid Transactional and Analytical Processing (HTAP) database, built by the company PingCAP (that's us!) and its active open-source community (that's you!). It's designed to provide horizontal scalability, strong consistency, and high availability with MySQL compatibility. It serves as a one-stop data warehouse for both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads.
What powers this experience is TiKV, a distributed transactional key-value store (all built in Rust!), which is now deployed in production at more than 200 companies (see the constantly updated list of adopters). One key reason why TiDB can process complex SQL queries so quickly is a Coprocessor API layer between TiDB and TiKV, which takes advantage of the distributed nature of the database to "push down" partial queries to TiKV in parallel, where partial results are generated and then reassembled for the client. This is a key differentiator between TiDB and other distributed databases.
So far, TiDB can only push down some simple expressions to TiKV to be processed, e.g. fetching the value in a column and doing comparison or arithmetic operations on simple data structures. To get more juice out of distributed computing resources, we need to include more expressions to push down. The first type is MySQL built-in functions. How do we accomplish that in short order? That's where you, our intrepid systems hacker, Rust lover, and distributed system geek, come in!
So follow along with this guide to contribute a built-in MySQL function to TiKV in Rust in 30 minutes. And when you do, you will receive some special gifts from us reserved just for our beloved contributors, as a small token of our gratitude. Let's get started!
How the Coprocessor Works
Before diving into our step-by-step guide on how to contribute, it's worth understanding how TiDB's Coprocessor works at a high level. After TiDB receives a SQL statement, it parses the statement into an abstract syntax tree (AST), then generates an optimal execution plan using its Cost-Based Optimizer. (Learn more details on how TiDB generates a query plan here.) The execution plan is split into multiple subtasks, and the Coprocessor API pushes these subtasks down to different TiKV nodes to be processed in parallel.
Here's an illustration of how a statement like select count(*) from t where a + b > 5 gets pushed down:
After TiKV receives these subtask expressions, the following steps are performed in a loop:
 Obtain the complete data of the next row, then parse and decode the data record based on the requested columns.
 Use the predicate specified in the where clause to filter data.
 If the data passes the filter predicate, compute the aggregation result.
After the TiKV nodes compute the results of their respective subtasks, they return them to TiDB. TiDB then aggregates all the results sent from TiKV and sends the final result to the client.
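To make the loop above concrete, here is a minimal sketch of how one node might evaluate select count(*) from t where a + b > 5 over its own rows, and how the partial counts could then be combined. The Row type and the function names are hypothetical and exist only for this illustration; they are not TiKV's actual types. The sketch also treats arithmetic overflow as NULL purely to stay short.

```rust
/// One decoded row of the hypothetical table `t`; `None` models SQL NULL.
struct Row {
    a: Option<i64>,
    b: Option<i64>,
}

/// Evaluates the predicate `a + b > 5` for a single row.
/// Returns `None` when either operand is NULL (or, as a simplification
/// made for this sketch only, when the addition overflows).
fn predicate(row: &Row) -> Option<bool> {
    let sum = row.a?.checked_add(row.b?)?;
    Some(sum > 5)
}

/// The partial `count(*)` that one storage node computes over its rows:
/// only rows for which the predicate is definitely true are counted.
fn partial_count(rows: &[Row]) -> u64 {
    rows.iter().filter(|r| predicate(r) == Some(true)).count() as u64
}

/// The final aggregation step: sum the partial counts from all nodes.
fn final_count(partials: &[u64]) -> u64 {
    partials.iter().sum()
}
```

Calling partial_count once per node and passing the collected results to final_count mirrors the split between the work done on each TiKV node and the final aggregation done by TiDB.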
How to add a MySQL built-in function to TiKV
Now that you have an overview of how the Coprocessor in TiDB/TiKV works, here's how to contribute MySQL built-in functions to further strengthen TiKV's coprocessing power!
Step 1: Select a function for pushdown
Go to the push down scalar functions issue page, choose a function you like from the unimplemented function signature list, then tell us so we can create an issue and assign it to you to prevent duplication of work.
Step 2: Find the logic of corresponding implementation in TiDB
Search for the related builtinXXXSig (where XXX is the function signature you want to implement) in the expression directory of TiDB.
Take MultiplyIntUnsigned, which we will use throughout this guide, as an example: you can find the corresponding function signature (builtinArithmeticMultiplyIntUnsignedSig) and its implementation.
Step 3: Define the function

The name of the file where the built-in function exists in TiKV should correspond to the same name in TiDB.
For example, since all the pushdown files in the expression directory in TiDB are named builtin_XXX, in TiKV the corresponding file name should be builtin_XXX.rs. In this example, the current function is in the builtin_arithmetic.go file in TiDB, so the function should be placed in builtin_arithmetic.rs in TiKV.
Note: If the corresponding file in TiKV does not exist, you need to create a new file in the corresponding directory with the same name as in TiDB.

The function name should follow the Rust snake_case naming conventions.
For this example, MultiplyIntUnsigned will be defined as multiply_int_unsigned.
For the return value, you can refer to the Eval functions which are implemented in TiDB and their corresponding return value types, as shown in the following table:

Eval Function in TiDB    Return Value Type in TiKV
evalInt                  Result<Option<i64>>
evalReal                 Result<Option<f64>>
evalString               Result<Option<Cow<'a, [u8]>>>
evalDecimal              Result<Option<Cow<'a, Decimal>>>
evalTime                 Result<Option<Cow<'a, Time>>>
evalDuration             Result<Option<Cow<'a, Duration>>>
evalJSON                 Result<Option<Cow<'a, Json>>>

Thus, since TiDB's builtinArithmeticMultiplyIntUnsignedSig implements the evalInt method, the return value type of the function multiply_int_unsigned should be Result<Option<i64>>.
All the arguments of the built-in function should be consistent with those of the eval function of the expression:
 The statement context is ctx: &mut EvalContext
 The value of each column in this row is row: &[Datum]
Putting all this together, the definition of the pushdown function multiply_int_unsigned should look like this:
pub fn multiply_int_unsigned(
    &self,
    ctx: &mut EvalContext,
    row: &[Datum],
) -> Result<Option<i64>>
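It may help to see how the three possible outcomes of such a function fit into a single Result<Option<i64>> value: a value, SQL NULL, or an error. The following is a minimal sketch; EvalError and EvalResult are hypothetical stand-ins introduced only for this illustration, not TiKV's real error and result types.

```rust
/// Hypothetical stand-in for the error type used by the expression code.
#[derive(Debug, PartialEq)]
struct EvalError(String);

/// Hypothetical stand-in for the crate-level `Result` alias.
type EvalResult<T> = std::result::Result<T, EvalError>;

/// Describes each of the three outcomes a built-in function can produce.
fn describe(outcome: &EvalResult<Option<i64>>) -> String {
    match outcome {
        // The expression evaluated to a concrete integer.
        Ok(Some(v)) => format!("value {}", v),
        // The expression evaluated to SQL NULL.
        Ok(None) => "NULL".to_string(),
        // Evaluation failed, for example because of an overflow.
        Err(EvalError(msg)) => format!("error: {}", msg),
    }
}
```

Folding NULL and failure into one return type lets callers propagate errors and handle NULL explicitly with an ordinary match.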
Step 4: Implement the function logic
Implement the function logic based on the corresponding logic in TiDB.
For this example, you can see that the implementation of builtinArithmeticMultiplyIntUnsignedSig in TiDB is:
func (s *builtinArithmeticMultiplyIntUnsignedSig) evalInt(row types.Row) (val int64, isNull bool, err error) {
    a, isNull, err := s.args[0].EvalInt(s.ctx, row)
    if isNull || err != nil {
        return 0, isNull, errors.Trace(err)
    }
    unsignedA := uint64(a)
    b, isNull, err := s.args[1].EvalInt(s.ctx, row)
    if isNull || err != nil {
        return 0, isNull, errors.Trace(err)
    }
    unsignedB := uint64(b)
    result := unsignedA * unsignedB
    if unsignedA != 0 && result/unsignedA != unsignedB {
        return 0, true, types.ErrOverflow.GenByArgs("BIGINT UNSIGNED", fmt.Sprintf("(%s * %s)", s.args[0].String(), s.args[1].String()))
    }
    return int64(result), false, nil
}
The same function, implemented in Rust for TiKV, looks like this:
pub fn multiply_int_unsigned(
    &self,
    ctx: &mut EvalContext,
    row: &[Datum],
) -> Result<Option<i64>> {
    let lhs = try_opt!(self.children[0].eval_int(ctx, row));
    let rhs = try_opt!(self.children[1].eval_int(ctx, row));
    let res = (lhs as u64).checked_mul(rhs as u64).map(|t| t as i64);
    // TODO: output expression in error when column's name pushed down.
    res.ok_or_else(|| Error::overflow("BIGINT UNSIGNED", &format!("({} * {})", lhs, rhs)))
        .map(Some)
}
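The core of the Rust version is the checked_mul call, which replaces the manual result/unsignedA != unsignedB overflow test in the Go code. As a standalone sketch, stripped of the TiKV-specific context, row, and error types, the arithmetic looks roughly like this (multiply_as_unsigned is a hypothetical helper name used only here):

```rust
/// Multiplies two `i64` values as if their bits were unsigned 64-bit
/// integers. Returns `None` when the product does not fit in a `u64`;
/// otherwise the product is reinterpreted back into an `i64`.
fn multiply_as_unsigned(lhs: i64, rhs: i64) -> Option<i64> {
    // `as u64` reinterprets the bits, so -1 becomes u64::MAX.
    // `checked_mul` yields `None` instead of wrapping on overflow.
    (lhs as u64).checked_mul(rhs as u64).map(|t| t as i64)
}
```

For instance, multiply_as_unsigned(-1, 2) yields None because -1 reinterpreted as an unsigned value is u64::MAX, while multiply_as_unsigned(i64::MIN, 1) succeeds and returns the same bit pattern it was given.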
Step 5: Add argument check
When TiKV receives a pushdown request, it first checks all the expressions, including the number of arguments of each expression.
In TiDB, there is a strict limit on the number of arguments of each built-in function. For the number of arguments, see builtin.go in TiDB.
To add the argument check:
 Go to scalar_function.rs in TiKV and find the check_args function of ScalarFunc.
 Add the check for the number of expression arguments, following what the already implemented signatures do.
Step 6: Add pushdown support
TiKV calls the eval function to evaluate a row of data, and the eval function executes the subfunction based on the returned value type. This operation is done in scalar_function.rs by dispatch_call.
For our example function, MultiplyIntUnsigned, the final return value type is Int, so look for INT_CALLS in dispatch_call. Then, using the other signatures in INT_CALLS as a reference, add MultiplyIntUnsigned => multiply_int_unsigned. This indicates that when the function signature MultiplyIntUnsigned is parsed, the implemented function multiply_int_unsigned will be called. The pushdown logic of MultiplyIntUnsigned is now implemented.
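The Signature => method mapping is easier to picture with a small model. The sketch below uses a declarative macro to expand such a mapping into a match expression. It is only an assumed, simplified analogue of the idea behind dispatch_call: the types, the dispatch_int macro, and the two-argument eval_int method are hypothetical and do not reproduce TiKV's real definitions.

```rust
/// Hypothetical, reduced set of signatures that return an integer.
#[derive(Debug, Clone, Copy)]
enum Sig {
    MultiplyIntUnsigned,
    PlusInt,
}

/// Hypothetical stand-in for a scalar function node.
struct ScalarFunc {
    sig: Sig,
}

impl ScalarFunc {
    fn multiply_int_unsigned(&self, lhs: i64, rhs: i64) -> Option<i64> {
        (lhs as u64).checked_mul(rhs as u64).map(|t| t as i64)
    }

    fn plus_int(&self, lhs: i64, rhs: i64) -> Option<i64> {
        lhs.checked_add(rhs)
    }
}

/// Expands a list of `Signature => method` pairs into an `eval_int`
/// method that matches on the signature and calls the mapped method.
macro_rules! dispatch_int {
    ($($sig:ident => $method:ident,)*) => {
        impl ScalarFunc {
            fn eval_int(&self, lhs: i64, rhs: i64) -> Option<i64> {
                match self.sig {
                    $(Sig::$sig => self.$method(lhs, rhs),)*
                }
            }
        }
    };
}

dispatch_int! {
    MultiplyIntUnsigned => multiply_int_unsigned,
    PlusInt => plus_int,
}
```

With a table like this, supporting a new signature is a one-line change, and the compiler reports any enum variant that the match does not cover.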
Step 7: Add at least one test
 Go to builtin_arithmetic.rs, where the multiply_int_unsigned function resides.
 Add the unit test for the function signature in the test module at the end of builtin_arithmetic.rs. Make sure that the unit test covers all the code added above. You can see the related test code in TiDB for reference.
For this example, the test code implemented in TiKV is as follows:
#[test]
fn test_multiply_int_unsigned() {
    let cases = vec![
        (Datum::I64(1), Datum::I64(2), Datum::U64(2)),
        (
            Datum::I64(i64::MIN),
            Datum::I64(1),
            Datum::U64(i64::MIN as u64),
        ),
        (
            Datum::I64(i64::MAX),
            Datum::I64(1),
            Datum::U64(i64::MAX as u64),
        ),
        (Datum::U64(u64::MAX), Datum::I64(1), Datum::U64(u64::MAX)),
    ];
    let mut ctx = EvalContext::default();
    for (left, right, exp) in cases {
        let lhs = datum_expr(left);
        let rhs = datum_expr(right);
        let mut op = Expression::build(
            &mut ctx,
            scalar_func_expr(ScalarFuncSig::MultiplyIntUnsigned, &[lhs, rhs]),
        ).unwrap();
        op.mut_tp().set_flag(types::UNSIGNED_FLAG as u32);
        let got = op.eval(&mut ctx, &[]).unwrap();
        assert_eq!(got, exp);
    }
    // test overflow
    let cases = vec![
        // -1 is u64::MAX when read as unsigned, so doubling it overflows.
        (Datum::I64(-1), Datum::I64(2)),
        (Datum::I64(i64::MAX), Datum::I64(i64::MAX)),
        (Datum::I64(i64::MIN), Datum::I64(i64::MIN)),
    ];
    for (left, right) in cases {
        let lhs = datum_expr(left);
        let rhs = datum_expr(right);
        let mut op = Expression::build(
            &mut ctx,
            scalar_func_expr(ScalarFuncSig::MultiplyIntUnsigned, &[lhs, rhs]),
        ).unwrap();
        op.mut_tp().set_flag(types::UNSIGNED_FLAG as u32);
        let got = op.eval(&mut ctx, &[]).unwrap_err();
        assert!(check_overflow(got).is_ok());
    }
}
Step 8: Run the test
Run make expression and ensure that all the test cases pass.
Step 9: File a PR for TiKV
After you finish the above steps, you can file a PR for TiKV! Once we merge it, you are an honored TiKV contributor! See our Contribution Guide for a more comprehensive rundown of becoming a contributor.
Wrapping Up
We hope this guide provides an easy entry point to contributing to our Coprocessor, one of TiDB and TiKV's core features. If you run into any issues or problems with this guide, please let us know on our Twitter, Reddit, Stack Overflow, or Google Group. We look forward to seeing your PR, and once it's merged, expect a special gift of gratitude from our team!