Using partial indexes in MySQL

After reading Fernando Ipar’s interesting post on partial indexes for string columns, I wanted to note two things:

First, this trick works quite well, but only if your LIKE patterns use the wildcard on the right-hand side only (or not at all). MySQL cannot use the partial index if the pattern starts with a wildcard.

Consider the following table definition:

mysql> show create table people\G
*************************** 1. row ***************************
Table: people
Create Table: CREATE TABLE `people` (
  `person_id` int(15) NOT NULL default '0',
  `username` varchar(255) default NULL,
  `email` varchar(255) default NULL,
  PRIMARY KEY (`person_id`),
  KEY `people_username` (`username`(5))
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Now, see the following queries:
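As an illustration, queries of the following shape exercise the three wildcard positions. The search string 'jo' and the choice to select only username are assumptions made for this sketch, not the exact statements from the test run, and no output is shown because it depends on the data and the MySQL version.

```sql
-- Sketch only: 'jo' is a hypothetical search string.

-- Wildcard on the right: the pattern has a fixed prefix, so the
-- partial index people_username (first 5 characters) is a candidate.
EXPLAIN SELECT username FROM people WHERE username LIKE 'jo%';

-- Wildcard on the left: there is no fixed prefix to look up,
-- so the partial index is not expected to help.
EXPLAIN SELECT username FROM people WHERE username LIKE '%jo';

-- Wildcards on both sides: same situation as the previous query.
EXPLAIN SELECT username FROM people WHERE username LIKE '%jo%';
```

The `key` column of each EXPLAIN result shows which index, if any, the optimizer chose.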

mysql> create index people_username_full on people(username);
Query OK, 128928 rows affected (6.72 sec)
Records: 128928  Duplicates: 0  Warnings: 0

Then, rerunning the queries, we see the results change:

Notice how all queries are now using the index. Even the first query now uses the longer but more detailed index.

The second suggestion exploits partial indexes for LIKE searches that match on the right-hand end of a string. A common case is email addresses, which are often searched by domain name. If you add an extra column to your table that stores the reverse of the email column, you can then use a partial index on it to index the email from the right. You will need the MySQL function REVERSE() for this:

mysql> alter table people add email_reverse varchar(255) null;
Query OK, 128928 rows affected (4.63 sec)
Records: 128928  Duplicates: 0  Warnings: 0

mysql> update people set email_reverse = REVERSE(email);
Query OK, 128928 rows affected (1.36 sec)
Rows matched: 128928  Changed: 128928  Warnings: 0

mysql> create index people_email_rev on people(email_reverse(10));
Query OK, 128928 rows affected (6.72 sec)
Records: 128928  Duplicates: 0  Warnings: 0

Now you can search for emails by their ending: reverse the search string as well, so that the wildcard lands on the right and the query can use the partial index:
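A minimal sketch of such a query follows. The domain '@example.com' is a hypothetical value chosen for illustration; the column email_reverse is the one added above.

```sql
-- Hypothetical search: all people whose email ends in '@example.com'.
-- REVERSE('@example.com') evaluates to 'moc.elpmaxe@', so the pattern
-- becomes 'moc.elpmaxe@%' with the wildcard on the right-hand side.
EXPLAIN
SELECT person_id, email
FROM people
WHERE email_reverse LIKE CONCAT(REVERSE('@example.com'), '%');
```

Because the index only covers the first 10 characters of email_reverse, MySQL still has to check the remaining characters against the rows themselves. It is worth confirming with EXPLAIN that the index is actually chosen on your data. Keep in mind that email_reverse has to be updated whenever email changes.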

1 thought on “On partial indexes for string columns”

Fernando Ipar
August 14, 2009

That’s a great use of the reverse function, I never thought of it 🙂

As for the full index, you had me testing that query for a while until I realized it only uses the index if you only project that very field, i.e., your very specific query is the one that uses the index (select from table where LIKE ‘%string%’).

Of course, it still needs to process all of the tuples (if I don’t misunderstand EXPLAIN’s output, and I hope I don’t, since I don’t think there’s a way for any database to optimize that kind of query unless we go for something like a FULLTEXT index, or move it out of the database altogether with Sphinx or Lucene), but doing it with an index is certainly more efficient.

Now, in order to take advantage of this index while projecting other columns from the result, this is all I came up with (I had to change the example because I already had data on this table):

mysql> explain select * from tasks where city in (select city from tasks where city like '%fi%')\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: tasks
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 150000
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: tasks
type: index_subquery
possible_keys: city,idx_full_city
key: city
key_len: 253
ref: func
rows: 1
Extra: Using index; Using where
2 rows in set (0.00 sec)

What do you think? Is it any use? Is there a better way? 🙂

Regards
