导读

作者:Gopal Shankar

翻译:徐晨亮

Coinciding with the new native data dictionary in MySQL 8.0, we have made a number of useful enhancements to our  INFORMATION_SCHEMA  subsystem design in MySQL 8.0. In this post I will first go through our legacy implementation as it has stood since MySQL 5.1, and then cover what’s changed.

与MySQL 8.0原生数据字典一致,在MySQL 8.0的 INFORMATION_SCHEMA 子系统设计中,我们做了一些很有用的增强。在这篇文章中,我将会介绍自MySQL 5.1以来的旧的实现方式,然后介绍我们做了什么改变。

Background

INFORMATION_SCHEMA  was first introduced into MySQL 5.0, as a standards compliant way of retrieving meta data from a running MySQL server. When we look at the history of  INFORMATION_SCHEMA  there have been a number of complaints about the performance of certain queries, particularly in the case that there are many database objects (schemas, tables etc).

INFORMATION_SCHEMA 首次引入MySQL 5.0,作为一种从正在运行的MySQL服务器检索元数据的标准兼容方式。当我们回顾 INFORMATION_SCHEMA 的历史时,对于某些特定查询性能总是有很多的抱怨,特别是在有许多数据库对象(schema,表等)的情况下。

In an effort to address these reported issues, since MySQL 5.1 we have made a number of performance optimizations to speed up the execution of  INFORMATION_SCHEMA  queries. The optimizations are described in the MySQL manual, and apply when the user provides an explicit schema name or table name in the query.

为了解决这些上报的问题,从MySQL 5.1开始,我们进行了许多性能优化来加快 INFORMATION_SCHEMA 查询的执行速度。MySQL手册 <链接1> 中描述了这些优化,当用户在查询中提供显式schema名称或表名时,将会应用这些。

Alas, despite these improvements  INFORMATION_SCHEMA  performance is still a major pain point for many of our users. The key reason behind these performance issues in the current  INFORMATION_SCHEMA  implementation is that  INFORMATION_SCHEMA  tables are implemented as temporary tables that are created on-the-fly during query execution. These temporary tables are populated via:



Meta data from files, e.g. table definitions from .FRM files.

Details from storage engines, e.g. dynamic table statistics.

Data from global data structures in the MySQL server.



尽管有这些改进, INFORMATION_SCHEMA的 性能仍然是我们许多用户的主要痛点。在当前 INFORMATION_SCHEMA 实现方式下产生的性能问题背后的关键原因是, INFORMATION_SCHEMA 表的查询实现方式是在查询执行期间创建临时表。这些临时表通过以下方式填充:



元数据来自文件,例如:表定义来自FRM文件

细节来自于存储引擎,例如:动态表的统计信息

来自MySQL server层中全局数据结构的数据





For a MySQL server having hundreds of database, each with hundreds of tables within them, the  INFORMATION_SCHEMA query would end-up doing lot of I/O reading each individual FRM files from the file system. And it would also end-up using more CPU cycles in effort to open the table and prepare related in-memory data structures. It does attempt to use the MySQL server table cache (the system variable ‘ table_definition_cache ‘), however in large server instances it’s very rare to have a table cache that is large enough to accommodate all of these tables.

对于一个MySQL实例来说可能有上百个库,每个库又有上百张表, INFORMATION_SCHEMA 查询最终会从文件系统中读取每个单独的FRM文件,造成很多I/O读取。 并且最终还会消耗更多的CPU来打开表并准备相关的内存数据结构。 它确实尝试使用MySQL server层的表缓存(系统变量 table_definition_cache  ),但是在大型实例中,很少有一个足够大的表缓存来容纳所有的表。

One can easily face the above mentioned performance issue if the optimization is not used by the  INFORMATION_SCHEMA query. For example, let us consider the two queries below

如果 INFORMATION_SCHEMA 查询未使用优化,则可以很容易碰到上面的性能问题。 例如,让我们考虑下面的两个查询

EXPLAIN
WHERE
TABLE_SCHEMA
TABLE_NAME
FRM
FRM
EXPLAIN
WHERE
TABLE_SCHEMA
TABLE_NAME
FRM
FRM
FRM
FRM
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
TABLE_SCHEMA
TABLE_NAME
TABLE_TYPE
ENGINE
AUTO_INCREMENT
AVG_ROW_LENGTH
DATA_FREE
DATA_FREE
TABLE_SCHEMA
TABLE_NAME
TABLE_TYPE
ENGINE
AUTO_INCREMENT
AVG_ROW_LENGTH
DATA_FREE
DATA_FREE
information_schema_stats
cached
information_schema_stats=latest
information_schema_stats
缓存
information_schema_stats = latest
ANALYZE TABLE
ANALYZE TABLE
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA
INFORMATION_SCHEMA的
INFORMATION_SCHEMA