TiDB Binlog FAQ

This document collects the frequently asked questions (FAQs) about TiDB Binlog.

When can I pause or close a Pump or Drainer node?

Refer to TiDB Binlog Cluster Operations to learn the description of the Pump or Drainer state and how to start and exit the process.

Pause a Pump or Drainer node when you need to temporarily stop the service. For example:

  • Version upgrade

    Use the new binary to restart the service after the process is stopped.

  • Server maintenance

    When the server needs a downtime maintenance, exit the process and restart the service after the maintenance is finished.

Close a Pump or Drainer node when you no longer need the service. For example:

  • Pump scale-in

    If you do not need too many Pump services, close some of them.

  • Cancelling replication tasks

    If you no longer need to replicate data to a downstream database, close the corresponding Drainer node.

  • Service migration

    If you need to migrate the service to another server, close the service and re-deploy it on the new server.

How can I pause a Pump or Drainer process?

  • Directly kill the process.

    Note:

    Do not use the kill -9 command. Otherwise, the Pump or Drainer node cannot process signals.

  • If the Pump or Drainer node runs in the foreground, pause it by pressing Ctrl+C.

  • Use the pause-pump or pause-drainer command in binlogctl.

Can I use the update-pump or update-drainer command in binlogctl to pause the Pump or Drainer service?

No. The update-pump or update-drainer command directly modifies the state information saved in PD without notifying Pump or Drainer to perform the corresponding operation. Misusing the two commands can interrupt data replication and might even cause data loss.

Can I use the update-pump or update-drainer command in binlogctl to close the Pump or Drainer service?

No. The update-pump or update-drainer command directly modifies the state information saved in PD without notifying Pump or Drainer to perform the corresponding operation. Misusing the two commands interrupts data replication and might even cause data inconsistency. For example:

  • When a Pump node runs normally or is in the paused state, if you use the update-pump command to set the Pump state to offline, the Drainer node stops pulling the binlog data from the offline Pump. In this situation, the newest binlog cannot be replicated to the Drainer node, causing data inconsistency between upstream and downstream.
  • When a Drainer node runs normally, if you use the update-drainer command to set the Drainer state to offline, the newly started Pump node only notifies Drainer nodes in the online state. In this situation, the offline Drainer fails to pull the binlog data from the Pump node in time, causing data inconsistency between upstream and downstream.

When can I use the update-pump command in binlogctl to set the Pump state to paused?

In some abnormal situations, Pump fails to correctly maintain its state. Then, use the update-pump command to modify the state.

For example, when a Pump process is exited abnormally (caused by directly exiting the process when a panic occurs or mistakenly using the kill -9 command to kill the process), the Pump state information saved in PD is still online. In this situation, if you do not need to restart Pump to recover the service at the moment, use the update-pump command to update the Pump state to paused. Then, interruptions can be avoided when TiDB writes binlogs and Drainer pulls binlogs.

When can I use the update-drainer command in binlogctl to set the Drainer state to paused?

In some abnormal situations, the Drainer node fails to correctly maintain its state, which has influenced the replication task. Then, use the update-drainer command to modify the state.

For example, when a Drainer process is exited abnormally (caused by directly exiting the process when a panic occurs or mistakenly using the kill -9 command to kill the process), the Drainer state information saved in PD is still online. When a Pump node is started, it fails to notify the exited Drainer node (the notify drainer ... error), which cause the Pump node failure. In this situation, use the update-drainer command to update the Drainer state to paused and restart the Pump node.

How can I close a Pump or Drainer node?

Currently, you can only use the offline-pump or offline-drainer command in binlogctl to close a Pump or Drainer node.

When can I use the update-pump command in binlogctl to set the Pump state to offline?

You can use the update-pump command to set the Pump state to offline in the following situations:

  • When a Pump process is exited abnormally and the service cannot be recovered, the replication task is interrupted. If you want to recover the replication and accept some losses of binlog data, use the update-pump command to set the Pump state to offline. Then, the Drainer node stops pulling binlog from the Pump node and continues replicating data.
  • Some stale Pump nodes are left over from historical tasks. Their processes have been exited and their services are no longer needed. Then, use the update-pump command to set their state to offline.

For other situations, use the offline-pump command to close the Pump service, which is the regular process.

Warning:

Do not use the update-pump command unless you can tolerate binlog data loss and data inconsistency between upstream and downstream, or you no longer need the binlog data stored in the Pump node.

Can I use the update-pump command in binlogctl to set the Pump state to offline if I want to close a Pump node that is exited and set to paused?

When a Pump process is exited and the node is in the paused state, not all the binlog data in the node is consumed in its downstream Drainer node. Therefore, doing so might risk data inconsistency between upstream and downstream. In this situation, restart the Pump and use the offline-pump command to close the Pump node.

When can I use the update-drainer command in binlogctl to set the Drainer state to offline?

Some stale Drainer nodes are left over from historical tasks. Their processes have been exited and their services are no longer needed. Then, use the update-drainer command to set their state to offline.

Can I use SQL operations such as change pump and change drainer to pause or close the Pump or Drainer service?

No. For more details on these SQL operations, refer to Use SQL statements to manage Pump or Drainer.

These SQL operations directly modifies the state information saved in PD and are functionally equivalent to the update-pump and update-drainer commands in binlogctl. To pause or close the Pump or Drainer service, use the binlogctl tool.

What can I do when some DDL statements supported by the upstream database cause error when executed in the downstream database?

To solve the problem, follow these steps:

  1. Check drainer.log. Search exec failed for the last failed DDL operation before the Drainer process is exited.
  2. Change the DDL version to the one compatible to the downstream. Perform this step manually in the downstream database.
  3. Check drainer.log. Search for the failed DDL operation and find the commit-ts of this operation.
  4. Modify the drainer.toml configuration file. Add the commit-ts in the ignore-txn-commit-ts item and restart the Drainer node.