MySQL 字符串前缀索引

本文讨论了在 MySQL 中如何为字符串列创建前缀索引。

在 MySQL 中,您可以为字符串列的指定长度的前缀创建前缀索引。

相比于为整个字符串列创建索引,前缀索引能减少磁盘的使用量,并提高索引的写入速度。

MySQL 前缀索引语法

要向表中添加一个前缀索引,请使用 CREATE INDEX 语句:

CREATE INDEX index_name
ON table_name (column_name(length));

要在创建表时定义一个前缀索引,请使用 CREATE TABLE 语句:

CREATE TABLE table_name(
  column_list,
  INDEX(column_name(length))
);

说明:

  • 如果列为 CHARVARCHARTEXT 等非二进制字符类型, length 指定用于前缀索引的字符数。
  • 如果列为 BINARYVARBINARYBLOB 等二进制字符类型, length 指定用于前缀索引的字节数。

MySQL 前缀索引实例

我们将使用Sakila 示例数据库中的 actor 表进行演示。

看一下 actor 表的定义:

DESC actor;
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
| Field       | Type              | Null | Key | Default           | Extra                                         |
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
| actor_id    | smallint unsigned | NO   | PRI | NULL              | auto_increment                                |
| first_name  | varchar(45)       | NO   |     | NULL              |                                               |
| last_name   | varchar(45)       | NO   | MUL | NULL              |                                               |
| last_update | timestamp         | NO   |     | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP |
+-------------+-------------------+------+-----+-------------------+-----------------------------------------------+
4 rows in set (0.00 sec)

看一下 actor 表的中的索引:

SHOW INDEXES FROM actor;
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| actor |          0 | PRIMARY             |            1 | actor_id    | A         |         200 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| actor |          1 | idx_actor_last_name |            1 | last_name   | A         |         122 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.00 sec)

以下查询查找 first name 以字符串 GE 开头的演员:

SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|       96 | GENE       | WILLIS    | 2006-02-15 04:34:33 |
|      134 | GENE       | HOPKINS   | 2006-02-15 04:34:33 |
|      151 | GEOFFREY   | HESTON    | 2006-02-15 04:34:33 |
|      177 | GENE       | MCKELLEN  | 2006-02-15 04:34:33 |
+----------+------------+-----------+---------------------+
4 rows in set (0.00 sec)

由于该 first_name 列没有索引 ,查询优化器必须扫描所有行以返回结果,如 EXPLAIN 以下语句的输出所示:

EXPLAIN
SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | actor | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  200 |    11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

如果您经常需要按照前缀搜索,可以考虑按照前缀创建一个索引:

CREATE INDEX idx_first_name_prefix_2
ON actor(first_name(2));

然后,通过 EXPLAIN 语句查看执行计划

EXPLAIN
SELECT *
FROM actor
WHERE first_name LIKE 'GE%';
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys           | key                     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | actor | NULL       | range | idx_first_name_prefix_2 | idx_first_name_prefix_2 | 10      | NULL |    4 |   100.00 | Using where |
+----+-------------+-------+------------+-------+-------------------------+-------------------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

现在, MySQL 查询优化器已经使用索引了。

创建前缀索引的原则

创建前缀索引时,确定前缀长度需要一定的技巧。

我们确定前缀长度时, 应该尽量让字符串前缀保持唯一性。 唯一性越强,则索引的效率越高。

结论

MySQL 前缀索引有利于提高索引的字符串列的索引效率。