Key Monitoring Metrics of PD

If you use Ansible to deploy the TiDB cluster, the monitoring system is deployed at the same time. For more information, see Overview of the Monitoring Framework.

The Grafana dashboard is divided into a series of sub-dashboards, including Overview, PD, TiDB, TiKV, Node_exporter, and Disk Performance. Each sub-dashboard provides metrics that help you diagnose cluster problems.

You can get an overview of the status of the PD component from the PD dashboard, where the key metrics are displayed. This document provides a detailed description of these key metrics.

Key metrics description

To understand the key metrics displayed on the PD dashboard, check the following table:

| Service | Panel name | Description | Normal range |
| :-- | :-- | :-- | :-- |
| Cluster | PD role | The role of the current PD | |
| Cluster | Storage capacity | The total capacity size of the cluster | |
| Cluster | Current storage size | The current storage size of the cluster | |
| Cluster | Current storage usage | The current storage usage ratio of the cluster | |
| Cluster | Normal stores | The count of healthy stores | |
| Cluster | Abnormal stores | The count of unhealthy stores | The normal value is 0. If the number is greater than 0, at least one instance is abnormal. |
| Cluster | Current peer count | The current peer count of the cluster | |
| Cluster | Number of Regions | The total number of Regions of the cluster | |
| Cluster | PD scheduler config | The list of PD scheduler configurations | |
| Cluster | Region label isolation level | The number of Regions in different label levels | |
| Cluster | Label distribution | The distribution status of the labels in the cluster | |
| Cluster | pd_cluster_metadata | The metadata of the PD cluster, including the cluster ID, the timestamp, and the generated ID | |
| Cluster | Region health | The health status of Regions, indicated by the count of unusual Regions, including pending peers, down peers, extra peers, offline peers, missing peers, learner peers, and incorrect namespaces | The number of pending peers should be less than 100. The number of missing peers should not be persistently greater than 0. |
| Statistics - Balance | Store capacity | The capacity size per TiKV instance | |
| Statistics - Balance | Store available | The available capacity size per TiKV instance | |
| Statistics - Balance | Store used | The used capacity size per TiKV instance | |
| Statistics - Balance | Size amplification | The size amplification ratio per TiKV instance, which is equal to (Store Region size)/(Store used capacity size) | |
| Statistics - Balance | Size available ratio | The size availability ratio per TiKV instance, which is equal to (Store available capacity size)/(Store capacity size) | |
| Statistics - Balance | Store leader score | The leader score per TiKV instance | |
| Statistics - Balance | Store Region score | The Region score per TiKV instance | |
| Statistics - Balance | Store leader size | The total leader size per TiKV instance | |
| Statistics - Balance | Store Region size | The total Region size per TiKV instance | |
| Statistics - Balance | Store leader count | The leader count per TiKV instance | |
| Statistics - Balance | Store Region count | The Region count per TiKV instance | |
| Statistics - Hotspot | Leader distribution in hot write Regions | The total number of leader Regions in hot write on each TiKV instance | |
| Statistics - Hotspot | Peer distribution in hot write Regions | The total number of peer Regions in hot write on each TiKV instance | |
| Statistics - Hotspot | Leader written bytes in hot write Regions | The total bytes of hot write on leader Regions per TiKV instance | |
| Statistics - Hotspot | Peer written bytes in hot write Regions | The total bytes of hot write on peer Regions per TiKV instance | |
| Statistics - Hotspot | Leader distribution in hot read Regions | The total number of leader Regions in hot read on each TiKV instance | |
| Statistics - Hotspot | Peer distribution in hot read Regions | The total number of non-leader Regions in hot read on each TiKV instance | |
| Statistics - Hotspot | Leader read bytes in hot read Regions | The total bytes of hot read on leader Regions per TiKV instance | |
| Statistics - Hotspot | Peer read bytes in hot read Regions | The total bytes of hot read on peer Regions per TiKV instance | |
| Scheduler | Running schedulers | The currently running schedulers | |
| Scheduler | Balance leader movement | The leader movement details among TiKV instances | |
| Scheduler | Balance Region movement | The Region movement details among TiKV instances | |
| Scheduler | Balance leader event | The count of balance leader events | |
| Scheduler | Balance Region event | The count of balance Region events | |
| Scheduler | Balance leader scheduler | The inner status of the balance leader scheduler | |
| Scheduler | Balance Region scheduler | The inner status of the balance Region scheduler | |
| Scheduler | Namespace checker | The namespace checker's status | |
| Scheduler | Replica checker | The replica checker's status | |
| Scheduler | Region merge checker | The merge checker's status | |
| Operator | Schedule operator create | The number of newly created operators per type | |
| Operator | Schedule operator check | The number of checked operators per type. The check mainly determines whether the current step is finished; if it is, the next step to be executed is returned. | |
| Operator | Schedule operator finish | The number of finished operators per type | |
| Operator | Schedule operator timeout | The number of timed-out operators per type | |
| Operator | Schedule operator replaced or canceled | The number of replaced or canceled operators per type | |
| Operator | Schedule operators count by state | The number of operators per state | |
| Operator | 99% Operator finish duration | The operator finish duration (P99) | |
| Operator | 50% Operator finish duration | The operator finish duration (P50) | |
| Operator | 99% Operator step duration | The operator step duration (P99) | |
| Operator | 50% Operator step duration | The operator step duration (P50) | |
| gRPC | Completed commands rate | The rate per command type at which gRPC commands are completed | |
| gRPC | 99% Completed commands duration | The time consumed per command type for completing gRPC commands (P99) | |
| etcd | Transaction handling rate | The rate at which etcd handles transactions | |
| etcd | 99% transactions duration | The time consumed for handling etcd transactions (P99) | |
| etcd | 99% WAL fsync duration | The time consumed for writing WAL into the persistent storage (P99) | Less than 1s |
| etcd | 99% Peer round trip time seconds | The network latency for etcd (P99) | Less than 1s |
| etcd | etcd disk wal fsync rate | The rate of writing WAL into the persistent storage | |
| etcd | Raft term | The current term of Raft | |
| etcd | Raft committed index | The last committed index of Raft | |
| etcd | Raft applied index | The last applied index of Raft | |
| TiDB | Handled requests count | The count of TiDB requests | |
| TiDB | Request handling duration | The time consumed for handling TiDB requests | Less than 100ms (P99) |
| Heartbeat | Region heartbeat report | The count of heartbeats reported to PD per instance | |
| Heartbeat | Region heartbeat report error | The count of heartbeats with the error status | |
| Heartbeat | Region heartbeat report active | The count of heartbeats with the ok status | |
| Heartbeat | Region schedule push | The count of corresponding schedule commands sent from PD per TiKV instance | |
| Heartbeat | 99% Region heartbeat latency | The heartbeat latency per TiKV instance (P99) | |
| Region storage | Syncer index | The maximum index in the Region change history recorded by the leader | |
| Region storage | History last index | The last index where the Region change history is synchronized successfully with the follower | |
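
The two ratio formulas and the "Normal range" column in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StoreStats`, the function names, and the parameter names are hypothetical and are not part of PD or Grafana, and "persistently greater than 0" is interpreted here as "greater than 0 in every recent sample", which is an assumption rather than a documented rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StoreStats:
    """Sizes for one TiKV instance, in bytes (hypothetical container)."""
    capacity: float      # "Store capacity" panel
    available: float     # "Store available" panel
    used: float          # "Store used" panel
    region_size: float   # "Store Region size" panel


def size_amplification(store: StoreStats) -> float:
    # (Store Region size) / (Store used capacity size)
    if store.used <= 0:
        raise ValueError("used capacity must be positive")
    return store.region_size / store.used


def size_available_ratio(store: StoreStats) -> float:
    # (Store available capacity size) / (Store capacity size)
    if store.capacity <= 0:
        raise ValueError("capacity must be positive")
    return store.available / store.capacity


def check_normal_ranges(
    abnormal_stores: int,
    pending_peers: int,
    missing_peers_history: Sequence[int],
    wal_fsync_p99_s: float,
    peer_rtt_p99_s: float,
    tidb_request_p99_s: float,
) -> List[str]:
    """Return one warning per value outside the normal range in the table."""
    warnings: List[str] = []
    if abnormal_stores > 0:
        warnings.append("abnormal stores > 0: at least one instance is abnormal")
    if pending_peers >= 100:
        warnings.append("pending peers should be less than 100")
    # Assumption: "persistently" means every recent sample is above 0.
    if missing_peers_history and all(v > 0 for v in missing_peers_history):
        warnings.append("missing peers persistently greater than 0")
    if wal_fsync_p99_s >= 1.0:
        warnings.append("99% WAL fsync duration should be less than 1s")
    if peer_rtt_p99_s >= 1.0:
        warnings.append("99% peer round trip time should be less than 1s")
    if tidb_request_p99_s >= 0.1:
        warnings.append("TiDB request handling duration (P99) should be less than 100ms")
    return warnings
```

For example, a store with `capacity=1000`, `available=250`, `used=500`, and `region_size=750` has a size amplification of 1.5 and a size available ratio of 0.25. In practice the input values would be read from the corresponding Grafana panels or the underlying monitoring data; how they are collected is outside the scope of this sketch.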

PD dashboard interface

Cluster

PD Dashboard - Cluster metrics

Statistics - Balance

PD Dashboard - Statistics - Balance metrics

Statistics - Hotspot

PD Dashboard - Statistics - Hotspot metrics

Scheduler

PD Dashboard - Scheduler metrics

Operator

PD Dashboard - Operator metrics

gRPC

PD Dashboard - gRPC metrics

etcd

PD Dashboard - etcd metrics

TiDB

PD Dashboard - TiDB metrics

Heartbeat

PD Dashboard - Heartbeat metrics

Region storage

PD Dashboard - Region storage

"Key Monitoring Metrics of PD" was last updated Nov 10 2019: v3.1/*: add the v3.1 docs directory (#1639) (d142ada)
© 2019 PingCAP. All Rights Reserved.
