Backup and Restore


You can't be a DBA without knowing how to back up and restore. It is one of the fundamental skills of database administration, and in this lesson we are going to take a closer look at it.

But before we dive too deep, I want to take a step back and describe some of the fundamentals behind a High Availability and Disaster Recovery plan.

TiKV provides High Availability (HA) natively out of the box by retaining three copies of each Region. The same mechanism can also provide Disaster Recovery (DR): to differentiate between the two terms, DR commonly means having one of the copies located remotely. By default, TiDB itself will only read from the primary (leader) copy of a Region, so there is some PD configuration that we will look into later to assist in this setup. Additionally, range partitioning can be used so that the primaries are pushed to where the data is needed. An example is a database shared across many branches and partitioned by branch.

Now that we've made that digression, let's clarify that you still need a backup. Your HA plan may not have accounted for a particular catastrophe, and even perfect HA cannot protect against human error or malicious actions.

The most common case for restoring data is an accidental delete or update. For example, someone in your organization updated the wrong row, and information has been lost. Or a bug in a recently released update has started corrupting records, and you want to selectively restore earlier versions of the data. In MySQL, you have to start fixing this problem by restoring a backup to a new MySQL instance and then rolling forward through the binary logs to find the lost data. Then you usually dump the lost data out of the restored copy and insert/update it into your active customer-facing MySQL instance. Sometimes you will need to do the same in TiDB. However, you may not even need to go to a backup to find your lost data: TiDB supports the ability to flash back and read the data as it appeared at an earlier point in time, as far back as your retention window allows.
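To make the idea of reading data at an earlier point in time more concrete, here is a minimal sketch of the statements involved. It assumes the historical read is done through TiDB's tidb_snapshot session variable, which is one way to do it and not necessarily the method used in the lab. The helper name, the customers table, and the filter are hypothetical and only for illustration, and the helper only builds the statements rather than connecting to a cluster.

```python
from datetime import datetime
from typing import List


def historical_read_statements(table: str, where: str, as_of: datetime) -> List[str]:
    """Build the SQL for reading rows as they appeared at `as_of`.

    Hypothetical helper for illustration only: it interpolates its
    arguments directly, so it must never be fed untrusted input.
    """
    ts = as_of.strftime("%Y-%m-%d %H:%M:%S")
    return [
        # Point this session at a historical snapshot of the data.
        f"SET @@tidb_snapshot = '{ts}';",
        # Reads in this session now see the data as of that time.
        f"SELECT * FROM {table} WHERE {where};",
        # An empty value returns the session to reading current data.
        "SET @@tidb_snapshot = '';",
    ]


if __name__ == "__main__":
    # Assumed scenario: a row in a hypothetical `customers` table was
    # overwritten shortly after 09:30 on 2024-01-15.
    for stmt in historical_read_statements(
        "customers", "id = 42", datetime(2024, 1, 15, 9, 30, 0)
    ):
        print(stmt)
```

The timestamp has to fall inside the retention window. Once the old versions of a row have been garbage collected, a historical read can no longer return them and you are back to restoring from a backup.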

The success of any backup plan is usually measured against two key criteria: the Recovery Time Objective (RTO), meaning how quickly service can be restored after a failure has occurred, and the Recovery Point Objective (RPO), meaning how much data is lost between the last backup and the moment the disaster occurred. Backups in TiDB are hot, in that they can be taken on the running system without blocking either reads or writes.
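As a worked example of the two criteria, the sketch below computes the data-loss window and the downtime for a single incident and compares them with the objectives. All timestamps and targets are invented for illustration, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def recovery_metrics(
    last_backup: datetime, failure: datetime, service_restored: datetime
) -> Tuple[timedelta, timedelta]:
    """Return (data-loss window, downtime) for one incident."""
    # Everything written after the last backup is lost: compare with the RPO.
    data_loss_window = failure - last_backup
    # Time during which the service was unavailable: compare with the RTO.
    downtime = service_restored - failure
    return data_loss_window, downtime


if __name__ == "__main__":
    # Assumed incident: nightly backup at 02:00, failure at 11:30,
    # service back at 12:15 on the same day.
    loss, down = recovery_metrics(
        datetime(2024, 1, 15, 2, 0),
        datetime(2024, 1, 15, 11, 30),
        datetime(2024, 1, 15, 12, 15),
    )
    rpo = timedelta(hours=24)  # assumed objective
    rto = timedelta(hours=1)   # assumed objective
    print(f"data-loss window {loss}, within RPO: {loss <= rpo}")
    print(f"downtime {down}, within RTO: {down <= rto}")
```

In this invented incident, nine and a half hours of writes are lost and the service is down for 45 minutes, so both objectives are met. A tighter RPO would call for more frequent backups or for some way to recover changes made after the last backup.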

Let's jump forward and practice flashback in our first lab.