Importing a large database usually takes hours or days, and if such long running processes spuriously crashes, it can be very time-wasting to redo the previously completed tasks. To solve this, Lightning uses checkpoints to store the import progress, so that
tidb-lightning continues importing from where it lefts off after restarting.
This document describes how to enable, configure, store, and control checkpoints.
[checkpoint] # Whether to enable checkpoints. # While importing data, Lightning records which tables have been imported, so # even if Lightning or some other component crashes, you can start from a known # good state instead of redoing everything. enable = true # The schema name (database name) to store the checkpoints schema = "tidb_lightning_checkpoint" # Where to store the checkpoints. # - file: store as a local file (requires v2.1.1 or later) # - mysql: store into a remote MySQL-compatible database driver = "file" # The data source name (DSN) indicating the location of the checkpoint storage. # # For the "file" driver, the DSN is a path. If the path is not specified, Lightning would # default to "/tmp/CHECKPOINT_SCHEMA.pb". # # For the "mysql" driver, the DSN is a URL in the form of "USER:PASS@tcp(HOST:PORT)/". # If the URL is not specified, the TiDB server from the [tidb] section is used to # store the checkpoints. You should specify a different MySQL-compatible # database server to reduce the load of the target TiDB cluster. #dsn = "/tmp/tidb_lightning_checkpoint.pb" # Whether to keep the checkpoints after all data are imported. If false, the # checkpoints are deleted. Keeping the checkpoints can aid debugging but # might leak metadata about the data source. #keep-after-success = false
Lightning supports two kinds of checkpoint storage: a local file or a remote MySQL-compatible database.
driver = "file", checkpoints are stored in a local file at the path given by the
dsn setting. Checkpoints are updated rapidly, so we highly recommend placing the checkpoint file on a drive with very high write endurance, such as a RAM disk.
driver = "mysql", checkpoints can be saved in any databases compatible with MySQL 5.7 or later, including MariaDB and TiDB. By default, the checkpoints are saved in the target database.
While using the target database as the checkpoints storage, Lightning is importing large amounts of data at the same time. This puts extra stress on the target database and sometimes leads to communication timeout. Therefore, it is strongly recommended to install a temporary MySQL server to store these checkpoints. This server can be installed on the same host as
tidb-lightning and can be uninstalled after the importer progress is completed.
tidb-lightning exits abnormally due to unrecoverable errors (e.g. data corruption), it refuses to reuse the checkpoints until the errors are resolved. This is to prevent worsening the situation. The checkpoint errors can be resolved using the
This option allows you to restart importing the table from scratch. The schema and table names must be quoted with backquotes and are case-sensitive.
If importing the table
`schema`.`table` failed previously, this option executes the following operations:
`schema`.`table`from the target database, which means removing all imported data.
If there is no errors involving the table
`schema`.`table`, this operation does nothing.
It is the same as applying the above on every table. This is the most convenient, safe and conservative solution to fix the checkpoint error problem:
tidb-lightning-ctl --checkpoint-error-ignore='`schema`.`table`' tidb-lightning-ctl --checkpoint-error-ignore=all
If importing the table
`schema`.`table` failed previously, this clears the error status as if nothing ever happened. The
all variant applies this operation to all tables.
Use this option only when you are sure that the error can indeed be ignored. If not, some imported data can be lost. The only safety net is the final “checksum” check, and thus you need to keep the “checksum” option always enabled when using
tidb-lightning-ctl --checkpoint-remove='`schema`.`table`' tidb-lightning-ctl --checkpoint-remove=all
This option simply removes all checkpoint information about one table or all tables, regardless of their status.
This option dumps the content of the checkpoint into the given directory, which is mainly used for debugging by the technical staff. This option is only enabled when
driver = "mysql".