
Indexes are essential for optimizing database query performance, reducing the need to scan large amounts of data. However, as applications evolve, business logic changes, and data volume grows, indexing inefficiencies emerge. This leads to:
- Unused Indexes: These indexes were once relevant but are no longer selected by the query optimizer, consuming storage and adding unnecessary overhead to write operations.
- Inefficient Indexes: Some indexes are used by the optimizer but scan more data than expected, increasing disk I/O and slowing down query performance.
Left unaddressed, these indexing issues can cause higher storage costs, degraded performance, and operational inefficiencies. That’s why regular index audits are crucial for maintaining an optimized database.
In a distributed SQL database like TiDB, indexing inefficiencies have an even greater impact due to the scale of distributed queries and the complexity of multi-node coordination. Proactively identifying and optimizing indexes helps:
- Reduce Storage Overhead: Removing unused indexes frees up disk space and reduces long-term storage costs.
- Improve Write Performance: Write-heavy workloads (INSERT, UPDATE, DELETE) perform better when unnecessary index maintenance is eliminated.
- Optimize Query Execution: Efficient indexes reduce the number of rows scanned, improving query speed and response times.
- Streamline Database Management: Fewer, well-optimized indexes simplify backups, recovery, and schema changes.
TiDB 8.0 introduces dedicated system views to help DBAs and developers track index usage patterns and make data-driven decisions. This blog post explores the tools needed to detect and eliminate unused or inefficient indexes, improving TiDB’s performance and stability.
Why Index Optimization Matters
Unused and inefficient indexes are more than just wasted space — they directly impact database performance and scalability. Proactively managing indexes in TiDB can significantly enhance system efficiency.
Key Benefits of TiDB Index Optimization
- Reduce Storage Costs: Every index consumes disk space. As databases grow, maintaining unused indexes leads to unnecessary storage expansion. Removing them frees up space and lowers operational costs.
- Improve Write Performance: Every INSERT, UPDATE, and DELETE operation must update associated indexes. Redundant indexes slow down these operations, particularly in high-concurrency workloads. Optimizing index usage reduces write latency and improves overall system responsiveness.
- Enhance Query Performance: Inefficient indexes can cause excessive data scans, increasing disk I/O and query latency. By ensuring indexes are highly selective and well-optimized, query execution becomes faster and more efficient.
- Streamline Database Administration: A cluttered database with too many indexes complicates backups, recovery, and schema changes. Reducing unnecessary indexes simplifies database management, making maintenance smoother.
Making Index Optimization a Habit
Because indexes evolve with changing business logic, regular index audits should be a standard part of database maintenance. TiDB provides built-in observability tools to help users detect, evaluate, and optimize indexes without risk.
In the next section, we’ll explore how TiDB 8.0’s TIDB_INDEX_USAGE and schema_unused_indexes views help DBAs efficiently track and optimize indexes.
TiDB Index Optimization: A Data-Driven Approach
Indexes are essential for query performance, but removing them without proper analysis can lead to unexpected regressions or even system instability. To ensure safe and effective index management, TiDB provides built-in observability tools that allow users to:
- Track Index Usage in Real-Time: Identify how often an index is accessed and whether it contributes to performance improvements.
- Detect Unused Indexes: Locate indexes that have not been used since the database was last restarted.
- Assess Index Efficiency: Evaluate whether an index filters data effectively or causes excessive I/O overhead.
- Safely Test Index Removal: Temporarily make an index invisible before deleting it to ensure no queries depend on it.
TiDB simplifies index optimization by introducing three powerful tools:
- TIDB_INDEX_USAGE: Monitors index usage patterns and query frequency.
- schema_unused_indexes: Lists indexes that have not been used since the last TiDB restart.
- Invisible Indexes: Allows DBAs to test the impact of removing an index before permanently deleting it.
By using these observability tools, TiDB users can confidently clean up redundant indexes without risking performance degradation.
TiDB Index Optimization: Tracking Index Usage with TIDB_INDEX_USAGE
Introduced in TiDB 8.0, the TIDB_INDEX_USAGE system table provides real-time insights into how indexes are used, helping DBAs optimize query performance and remove unnecessary indexes.
Why Use TIDB_INDEX_USAGE?
This system table enables users to:
- Detect Unused Indexes: Identify indexes that have not been accessed by queries, helping determine which ones can be safely removed.
- Analyze Index Efficiency: Track how frequently an index is used and whether it contributes to efficient query execution.
- Evaluate Query Patterns: Understand how indexes affect read operations, data scans, and key-value (KV) requests.
Since TiDB 8.4, the table also includes primary keys in clustered tables, offering deeper visibility into index performance.
Key Metrics in TIDB_INDEX_USAGE
Column | Description |
---|---|
TABLE_SCHEMA | Database name containing the index. |
TABLE_NAME | Table name containing the index. |
INDEX_NAME | Index name. |
QUERY_TOTAL | Total number of queries that accessed the index. If 0, the index is unused. |
KV_REQ_TOTAL | Total KV requests generated by the index. |
ROWS_ACCESS_TOTAL | Total number of rows scanned using the index. |
PERCENTAGE_ACCESS_0 | Number of times no rows were accessed. |
PERCENTAGE_ACCESS_0_1 | Number of times 0%-1% of table rows were accessed. |
PERCENTAGE_ACCESS_1_10 | Number of times 1%-10% of table rows were accessed. |
PERCENTAGE_ACCESS_10_20 | Number of times 10%-20% of table rows were accessed. |
PERCENTAGE_ACCESS_20_50 | Number of times 20%-50% of table rows were accessed. |
PERCENTAGE_ACCESS_50_100 | Number of times 50%-100% of table rows were accessed. |
PERCENTAGE_ACCESS_100 | Number of times the entire table was scanned. |
LAST_ACCESS_TIME | Timestamp of the most recent query using the index. |
How to Identify Unused and Inefficient Indexes using TIDB_INDEX_USAGE
- Unused Indexes:
  - If QUERY_TOTAL = 0, the index has not been used by any queries.
  - If LAST_ACCESS_TIME is a long time ago, the index may no longer be relevant.
- Inefficient Indexes:
  - High values in PERCENTAGE_ACCESS_100 suggest full index scans, which may indicate an inefficient index.
  - Comparing ROWS_ACCESS_TOTAL / QUERY_TOTAL helps determine whether the index scans too many rows relative to its usage.
By leveraging TIDB_INDEX_USAGE, TiDB users can gain detailed insights into index performance, making it easier to remove unnecessary indexes and optimize query execution.
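The checks above can be sketched as a small Python helper that labels one row fetched from TIDB_INDEX_USAGE. This is only an illustration: the function name and all three thresholds (`stale_after_days`, `max_rows_per_query`, `max_full_scan_share`) are assumptions to tune per workload, and each row is assumed to be a dict keyed by the column names in the table above.

```python
from datetime import datetime, timedelta


def classify_index(row, now, stale_after_days=30,
                   max_rows_per_query=10_000, max_full_scan_share=0.2):
    """Return a list of findings for one TIDB_INDEX_USAGE row (a dict).

    All thresholds are illustrative assumptions, not TiDB defaults.
    """
    queries = row["QUERY_TOTAL"]
    if queries == 0:
        # No statement has used this index since the node started.
        return ["unused"]

    findings = []
    last = row.get("LAST_ACCESS_TIME")
    if last is not None and now - last > timedelta(days=stale_after_days):
        findings.append("stale")

    # Average number of rows read per query that used the index.
    if row["ROWS_ACCESS_TOTAL"] / queries > max_rows_per_query:
        findings.append("high rows per query")

    # Share of recorded accesses that fell into the 100% bucket.
    bucket_total = sum(value for key, value in row.items()
                       if key.startswith("PERCENTAGE_ACCESS_"))
    full_scans = row.get("PERCENTAGE_ACCESS_100", 0)
    if bucket_total and full_scans / bucket_total > max_full_scan_share:
        findings.append("frequent full scans")
    return findings
```

An empty result means none of the heuristics fired. It does not prove the index is well designed, so findings are best treated as a shortlist for manual review of execution plans.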
Handling Index Usage Data Efficiently
Delayed Data Updates
- To minimize performance impact, TIDB_INDEX_USAGE does not update instantly. Index usage metrics may be delayed by up to 5 minutes, so users should account for this when analyzing queries.
Index Usage Data is Not Persisted
- TIDB_INDEX_USAGE stores data in memory, meaning it does not persist across node restarts.
- If a TiDB node is restarted, all index usage statistics from that node will be cleared.
Planned Enhancements for Historical Tracking
- TiDB is developing a Workload Repository to periodically snapshot index usage data, allowing users to review trends over time instead of relying only on real-time metrics.
- Until this feature is available, DBAs can periodically export index usage snapshots using:
SELECT * FROM INFORMATION_SCHEMA.TIDB_INDEX_USAGE INTO OUTFILE '/backup/index_usage_snapshot.csv';
This allows for historical tracking by comparing snapshots over time to detect trends in index usage and make more informed pruning decisions.
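As a rough sketch of that comparison, the Python function below computes how much QUERY_TOTAL grew for each index between two snapshots. It assumes each snapshot has already been loaded into a list of dicts keyed by column name (for example with `csv.DictReader`, using whatever delimiter the export produced); the function name `usage_delta` is hypothetical. Because the counters are held in memory and reset on restart, a counter that went backwards is reported as `None` rather than as a negative number.

```python
def usage_delta(older, newer):
    """Per-index growth in QUERY_TOTAL between two snapshots.

    Each snapshot is a list of dicts with TABLE_SCHEMA, TABLE_NAME,
    INDEX_NAME and QUERY_TOTAL. Values may be strings (as read from a
    file) or integers. A result of None means the counter decreased,
    which suggests a node restart between the two snapshots.
    """
    def index_key(row):
        return (row["TABLE_SCHEMA"], row["TABLE_NAME"], row["INDEX_NAME"])

    before = {index_key(row): int(row["QUERY_TOTAL"]) for row in older}
    deltas = {}
    for row in newer:
        key = index_key(row)
        current = int(row["QUERY_TOTAL"])
        previous = before.get(key, 0)  # index absent earlier: count from 0
        deltas[key] = current - previous if current >= previous else None
    return deltas
```

An index whose delta stays at 0 across several snapshots that together span a representative period is a stronger pruning candidate than one judged from a single reading.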
TiDB Index Optimization: Consolidating Index Usage Data Across TiDB Nodes
Since TiDB is a distributed SQL database, query workloads are spread across multiple nodes. Each TiDB node tracks its own local index usage, but for a global view of index performance, TiDB provides the CLUSTER_TIDB_INDEX_USAGE system table.
This view consolidates index usage data from all TiDB nodes, ensuring that distributed query workloads are fully accounted for when optimizing indexing strategies.
How CLUSTER_TIDB_INDEX_USAGE Helps
Unlike TIDB_INDEX_USAGE, which provides insights at the node level, this cluster-wide view allows users to:
- Detect inconsistencies in index usage: for example, an index may be frequently used on some nodes but unused on others.
- Analyze global index patterns for distributed queries, ensuring indexing decisions reflect real-world workload distribution.
- Optimize indexing strategies across all nodes, improving query efficiency for multi-node deployments.
Different TiDB nodes may experience different query workloads, so an index that appears unused on some nodes may still be critical elsewhere. To break down index usage by instance, run:
SELECT INSTANCE, TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, SUM(QUERY_TOTAL) AS total_queries
FROM INFORMATION_SCHEMA.CLUSTER_TIDB_INDEX_USAGE
GROUP BY INSTANCE, TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
ORDER BY total_queries DESC;
This helps determine whether an index is truly unused across all nodes or only for specific instances, allowing DBAs to make informed decisions on index removal.
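To make the "unused everywhere versus unused on some nodes" distinction concrete, here is a minimal Python sketch that post-processes rows fetched from CLUSTER_TIDB_INDEX_USAGE. The function name `summarize_cluster_usage` and the shape of the result are assumptions for illustration; rows are assumed to be dicts carrying INSTANCE, TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, and an integer QUERY_TOTAL.

```python
from collections import defaultdict


def summarize_cluster_usage(rows):
    """Group per-instance usage rows by index.

    Returns {(schema, table, index): summary}, where each summary holds
    the cluster-wide query count, the instances that never used the
    index, and whether every reporting instance left it unused.
    """
    per_index = defaultdict(dict)
    for row in rows:
        key = (row["TABLE_SCHEMA"], row["TABLE_NAME"], row["INDEX_NAME"])
        by_instance = per_index[key]
        instance = row["INSTANCE"]
        by_instance[instance] = by_instance.get(instance, 0) + row["QUERY_TOTAL"]

    summary = {}
    for key, by_instance in per_index.items():
        idle = sorted(name for name, total in by_instance.items() if total == 0)
        summary[key] = {
            "total_queries": sum(by_instance.values()),
            "idle_instances": idle,
            "unused_everywhere": len(idle) == len(by_instance),
        }
    return summary
```

Only indexes with `unused_everywhere` set to true are reasonable candidates for the invisible-index test. An index that is idle on one instance but busy on another is simply serving a workload routed elsewhere.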
Key Differences from TIDB_INDEX_USAGE
Feature | TIDB_INDEX_USAGE | CLUSTER_TIDB_INDEX_USAGE |
---|---|---|
Scope | Tracks index usage on a single TiDB node | Collects index usage from all nodes in the TiDB cluster |
Index Tracking | Data is local to each TiDB node | Centralized cluster-wide view |
Primary Use Case | Debugging index usage at the individual node (instance) level | Analyzing global index patterns and multi-node behavior |
Using CLUSTER_TIDB_INDEX_USAGE Effectively
Since this system table consolidates data from multiple nodes, consider the following:
Delayed Data Updates
- The data is refreshed periodically to minimize performance impact. If index usage is analyzed immediately after a query execution, allow time for metrics to update.
Memory-Based Storage
- Like TIDB_INDEX_USAGE, this system table does not persist data across node restarts. If a node goes down, its recorded index usage data will be lost.
Future Enhancements for Historical Tracking
- TiDB is introducing a Workload Repository that will periodically snapshot index usage metrics, allowing DBAs to analyze trends over time instead of relying solely on real-time data.
By leveraging CLUSTER_TIDB_INDEX_USAGE, TiDB users can gain a global perspective on index behavior, ensuring indexing strategies are aligned with distributed query workloads.
TiDB Index Optimization: Easily Identifying Unused Indexes with schema_unused_indexes
Manually analyzing index usage data can be time-consuming. To simplify this process, TiDB provides schema_unused_indexes, a system view that lists indexes that haven’t been used since the database was last restarted.
This provides a quick way for DBAs to:
- Identify indexes that are no longer in use, reducing unnecessary storage costs.
- Speed up DML operations by eliminating indexes that add overhead to INSERT, UPDATE, and DELETE queries.
- Streamline index audits without needing to manually analyze query patterns.
How schema_unused_indexes Works
The schema_unused_indexes view is derived from TIDB_INDEX_USAGE, meaning it automatically lists only the indexes that have recorded no query activity since the last TiDB restart.
To retrieve a list of unused indexes, simply run:
SELECT * FROM sys.schema_unused_indexes;
This will return a result set similar to:
object_schema | object_name | index_name |
---|---|---|
bookshop | users | nickname |
bookshop | ratings | uniq_book_user_idx |
Important Considerations When Using schema_unused_indexes
Indexes Are Considered “Unused” Only Since the Last Restart
- If a TiDB node restarts, the usage tracking data is reset.
- Ensure the system has been running long enough to capture a representative workload before relying on this data.
Not All Unused Indexes Should Be Dropped Immediately
- Some indexes may be rarely used but still essential for specific queries, batch jobs, or reporting tasks. Before dropping an index, consider whether it supports:
- Rare but essential queries (e.g., monthly reports, analytics)
- Batch processing jobs that don’t run daily
- Ad-hoc troubleshooting queries used by DBAs
- If the index appears in important but infrequent queries, consider keeping it or making it invisible first.
- Use Invisible Indexes (covered in the next section) to safely test whether an index can be removed without impacting performance.
By leveraging schema_unused_indexes, TiDB users can quickly identify unnecessary indexes and reduce database overhead with minimal effort.
TiDB Index Optimization: Safely Testing Index Removal with Invisible Indexes
Removing an index without proper validation can lead to unexpected performance issues, especially if the index is infrequently used but still critical for certain queries. To mitigate this risk, TiDB provides Invisible Indexes, allowing DBAs to temporarily disable an index without deleting it.
What Are Invisible Indexes?
An Invisible Index remains in the database but is ignored by the TiDB optimizer. This allows users to test whether an index is truly unnecessary without permanently removing it.
Key benefits include:
- Safe Index Testing – Queries will no longer use the index, but it can be quickly restored if needed.
- Zero Disruption to Index Storage – The index remains intact, ensuring no need for costly re-creation.
- Performance Monitoring – DBAs can observe query behavior without the index before making a final decision.
Using Invisible Indexes in TiDB
To make an index invisible (without dropping it), use:
ALTER TABLE bookshop.users ALTER INDEX nickname INVISIBLE;
Monitor Query Performance
After making the index invisible, observe the system’s query performance:
- If performance remains unchanged, the index is likely unnecessary and can be safely removed.
- If query latency increases, the index may still be needed, and removal should be reconsidered.
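When several candidate indexes are under test, it can help to generate the hide and restore statements together so that a rollback is always at hand. The Python sketch below only builds SQL strings and executes nothing; the helper names `quote_ident` and `visibility_statement` are hypothetical, and the backtick quoting assumes MySQL-style identifiers as used by TiDB.

```python
def quote_ident(name):
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def visibility_statement(schema, table, index, visible):
    """Build an ALTER TABLE ... ALTER INDEX statement for one index.

    visible=False hides the index from the optimizer; visible=True
    restores it. The index data itself is left untouched either way.
    """
    state = "VISIBLE" if visible else "INVISIBLE"
    return (f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
            f"ALTER INDEX {quote_ident(index)} {state};")


# Example: prepare both directions for the index used in this post.
hide = visibility_statement("bookshop", "users", "nickname", visible=False)
restore = visibility_statement("bookshop", "users", "nickname", visible=True)
```

If latency rises after running `hide`, running `restore` makes the index available to the optimizer again without rebuilding it.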
Best Practices for Using Invisible Indexes
- Test During Off-Peak Hours – Monitor performance impact in a controlled environment.
- Use Query Monitoring Tools – Analyze query execution plans before and after marking an index as invisible.
- Confirm Over Multiple Workloads – Ensure the index isn’t needed for specific reports or scheduled queries.
By leveraging Invisible Indexes, TiDB users can validate index removal decisions without risk, ensuring a more controlled and predictable database optimization process.
How Long Should an Index Remain Invisible?
- OLTP workloads: Monitor for at least one week to account for daily variations.
- Batch processing/ETL workloads: Allow one full reporting cycle (e.g., a monthly financial report run).
- Ad-hoc analytical queries: Use query logs to confirm the index isn’t needed before dropping it.
For safety, keep the index invisible for at least one full business cycle to ensure all workloads have been tested before making a final decision.
System Table schema_unused_indexes
To make it easier for users to get the results, TiDB also provides a MySQL-compatible view sys.schema_unused_indexes, which lists indexes that have not been used since all TiDB instances were last started. The data for this view comes from TIDB_INDEX_USAGE. Note that since TIDB_INDEX_USAGE is cleared after a TiDB node restart, you should ensure that the node has been running for a sufficient amount of time before making decisions.
For clusters upgraded from older versions to TiDB 8.0 and higher, the sys schema and the included views must be manually created. Please refer to the official documentation for instructions.
Summary & Key Takeaways
Effective index management is crucial for maintaining database performance in TiDB. By leveraging TiDB’s built-in observability tools, DBAs can easily identify, evaluate, and optimize indexes without risking system stability.
By following the best practices below, TiDB users can keep their databases optimized, reduce unnecessary resource consumption, and maintain peak query performance.
- Monitor Index Usage Regularly
  - Use TIDB_INDEX_USAGE to track index query activity.
  - Use CLUSTER_TIDB_INDEX_USAGE for a cluster-wide view of index behavior.
- Identify Unused Indexes with Confidence
  - Use schema_unused_indexes to list indexes that haven’t been used since the last restart.
  - Be cautious: some indexes may be used infrequently but remain critical for specific queries.
- Safely Test Index Removal with Invisible Indexes
  - Mark an index as INVISIBLE before dropping it to validate its necessity.
  - Restore visibility if query performance is negatively affected.
- Optimize Indexes to Reduce Overhead
  - Avoid redundant or low-selectivity indexes that consume storage and slow down write operations.
  - Optimize index structures to improve query filtering efficiency.
- Prioritize Ongoing Index Maintenance
  - Regularly audit indexes after schema changes, application updates, or workload shifts.
  - Use TiDB’s execution plan analysis tools to ensure indexes are used effectively.
If you have any questions about TiDB index optimization, please feel free to connect with us on Twitter, LinkedIn, or through our Slack Channel.