Posted on 1 Comment

On partial indexes for string columns

After reading Fernando Ipar’s interesting post on partial indexes for string columns, there were two things I wanted to note:

First, this trick works quite well, but only if your like clauses only ever use the wildcard on the right hand side (or not at all). MySQL will not be able to use the index if the like contains a wildcard on the left.

Consider the following table definition:

mysql> show create table people\G
*************************** 1. row ***************************
Table: people
Create Table: CREATE TABLE `people` (
`person_id` int(15) NOT NULL default '0',
`username` varchar(255) default NULL,
`email` varchar(255) default NULL,
PRIMARY KEY (`person_id`),
KEY `people_username` (`username`(5))
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Now, see the following queries:


mysql> explain select username from people where username like 'jo%';
+----+-------------+--------+-------+-----------------+-----------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+-----------------+-----------------+---------+------+------+-------------+
| 1 | SIMPLE | people | range | people_username | people_username | 8 | NULL | 394 | Using where |
+----+-------------+--------+-------+-----------------+-----------------+---------+------+------+-------------+
1 row in set (0.00 sec)


mysql> explain select username from people where username like '%jo';
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | people | ALL | NULL | NULL | NULL | NULL | 128928 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)


mysql> explain select username from people where username like '%jo%';
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | people | ALL | NULL | NULL | NULL | NULL | 128928 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

As you can see, the first one nicely uses the index. The second and third though, cannot use it at all.
Now, we add an index on the full username field:


mysql> create index people_username_full on people(username);
Query OK, 128928 rows affected (6.72 sec)
Records: 128928 Duplicates: 0 Warnings: 0

Then, rerunning the queries we see the result changing:


mysql> explain select username from people where username like 'jo%';
+----+-------------+--------+-------+--------------------------------------+----------------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+--------------------------------------+----------------------+---------+------+------+--------------------------+
| 1 | SIMPLE | people | range | people_username,people_username_full | people_username_full | 258 | NULL | 143 | Using where; Using index |
+----+-------------+--------+-------+--------------------------------------+----------------------+---------+------+------+--------------------------+
1 row in set (0.01 sec)


mysql> explain select username from people where username like '%jo';
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
| 1 | SIMPLE | people | index | NULL | people_username_full | 258 | NULL | 128928 | Using where; Using index |
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)


mysql> explain select username from people where username like '%jo%';
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
| 1 | SIMPLE | people | index | NULL | people_username_full | 258 | NULL | 128928 | Using where; Using index |
+----+-------------+--------+-------+---------------+----------------------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)

Notice how all queries are now using the index. Even the first query now uses the longer but more detailed index.

The second suggestion I wanted to make is one that is exploiting this index option for right-oriented ‘like’ searches. A common use is for email addresses. Often, they are being searched by e.g. domain name. If you add an extra field to your table that stores the reverse of the email field, you can then use a partial index to index the email field from the right. You will need the MySQL function REVERSE() for this:


mysql> alter table people add email_reverse varchar(255) null;
Query OK, 128928 rows affected (4.63 sec)
Records: 128928 Duplicates: 0 Warnings: 0

mysql> desc people;
+---------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| person_id | int(15) | NO | PRI | 0 | |
| username | varchar(255) | YES | MUL | NULL | |
| email | varchar(255) | YES | | NULL | |
| email_reverse | varchar(255) | YES | | NULL | |
+---------------+--------------+------+-----+---------+-------+
4 rows in set (0.00 sec)


mysql> update people set email_reverse = REVERSE(email);
Query OK, 128928 rows affected (1.36 sec)
Rows matched: 128928 Changed: 128928 Warnings: 0


mysql> create index people_email_rev on people(email_rev(10));
Query OK, 128928 rows affected (6.72 sec)
Records: 128928 Duplicates: 0 Warnings: 0

Now, you can execute queries with right handed ‘like’ clauses using the partial index:


mysql> explain select username from people where email_reverse like 'moc.oohay%';
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
| 1 | SIMPLE | people | range | people_email_rev | people_email_rev | 13 | NULL | 1 | Using where |
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select username from people where email_reverse like REVERSE('%yahoo.com');
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
| 1 | SIMPLE | people | range | people_email_rev | people_email_rev | 13 | NULL | 1 | Using where |
+----+-------------+--------+-------+------------------+------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

Posted on 1 Comment