This document collects the frequently asked questions (FAQs) about TiDB Data Migration (DM).
What can I do when a replication task is interrupted with the invalid connection error returned?
The invalid connection error indicates that an anomaly has occurred in the connection between DM and the downstream TiDB database (such as a network failure, a TiDB restart, or a busy TiKV) and that part of the data for the current request has already been sent to TiDB.
Because DM replicates data to the downstream concurrently in replication tasks, several errors might occur when a task is interrupted. You can check these errors by using query-status.
If the invalid connection error occurs during the incremental replication process, DM retries the task automatically.
If DM does not retry the task automatically or the retry fails, use stop-task to stop the task and then use start-task to restart the task.
What can I do when a replication task is interrupted with the driver: bad connection error returned?
The driver: bad connection error indicates that an anomaly has occurred in the connection between DM and the downstream TiDB database (such as a network failure or a TiDB restart) and that the data of the current request has not yet been sent to TiDB at that moment.
When this type of error occurs in the current version, use stop-task to stop the task and then use start-task to restart the task. The automatic retry mechanism of DM will be improved later.
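The stop-task and start-task sequence described above can be scripted. The following is a minimal sketch, not an official tool: it assumes that dmctl is on the PATH and accepts the command as trailing arguments, that stop-task takes the task name, and that start-task takes the path of the task configuration file. The master address, task name, and file path are hypothetical placeholders.

```python
import subprocess
from typing import List


def restart_commands(master_addr: str, task_name: str, task_config: str) -> List[List[str]]:
    """Build the two dmctl invocations that stop and then restart a task.

    Assumption: dmctl is called non-interactively with --master-addr
    followed by the command and its argument.
    """
    base = ["dmctl", "--master-addr", master_addr]
    return [
        base + ["stop-task", task_name],     # stop the interrupted task
        base + ["start-task", task_config],  # start it again from its config file
    ]


def restart_task(master_addr: str, task_name: str, task_config: str) -> None:
    """Run both commands in order; raise if either one exits with an error."""
    for cmd in restart_commands(master_addr, task_name, task_config):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values; this only prints the commands and does not run them.
    for cmd in restart_commands("127.0.0.1:8261", "my-task", "./task.yaml"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it possible to review the exact commands before running them against a live cluster.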
What can I do when a replication task is interrupted with errors such as get binlog error ERROR 1236 (HY000) and binlog checksum mismatch, data may be corrupted returned?
During the DM incremental replication process, this error might occur if the binlog file in the upstream exceeds 4 GB, and DM encounters a replication interruption when processing this binlog file (including interruption caused by anomalies in ordinary pause or stop tasks).
Cause: DM needs to store the replicated binlog position, and MySQL officially uses uint32 to store it, so the binlog position of a file whose offset exceeds 4 GB overflows and an incorrect binlog position is stored. After the task or DM-worker is restarted, this incorrect binlog position is used to re-parse the binlog or relay log.
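The overflow can be illustrated with plain arithmetic. The sketch below is not DM or MySQL code; it only shows what happens to a byte offset when it is forced into an unsigned 32-bit field. The function names are made up for this illustration.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def stored_position(true_offset: int) -> int:
    """Return the value left after truncating a byte offset to uint32."""
    return true_offset & UINT32_MAX


def overflows(true_offset: int) -> bool:
    """Return True if the offset no longer fits in a uint32."""
    return true_offset > UINT32_MAX


if __name__ == "__main__":
    four_gib = 4 * 1024**3          # 4 GiB, which equals 2**32
    true_offset = four_gib + 1234   # an event located just past the 4 GiB mark
    print(overflows(true_offset))        # True
    print(stored_position(true_offset))  # 1234
```

A position saved this way points near the start of the file instead of the real offset, which is why re-parsing from it after a restart fails.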
In this case, manually recover replication using the following solution:
Determine whether the error occurs while the relay log is being written or while the Binlog replication/Syncer unit is replicating data (according to the component information in the log error message).
Take the following steps according to where the error occurs.
4, and restart the DM-worker to re-pull the relay log. If the relay log is written without error, the replication automatically resumes from the checkpoint after the task is restarted.
4. Note that you also need to adjust the binlog position of both the global checkpoint and each table checkpoint. Set the safe-mode of the task to true to ensure reentrant execution. Then, you can restart the replication task and observe the status. The task resumes after the oversized (larger than 4 GB) file is replicated.