Predictive caching in a MySQL-backed infrastructure

Sounds a bit far-fetched (pun intended ;-), but we’re doing it. This does not happen inside the MySQL server, but rather in the overall application design. Let me run you through the logic…

Some key aspects to scaling are: not doing unnecessary queries, and caching what you can. Just a quick baseline. The fastest query is the one you don’t do, or the one you’ve already done before – the latter being caching.

A simple yet brilliant example of this is the YouTube trick where a script reads the relay log, converting updates into appropriate selects and running them so that the InnoDB cache will have the blocks in memory when the slave SQL thread executes the actual update. Maatkit now has a tool for this, so it’s publicly available. It’s not quite predictive, but it’s a neat trick anyway that sometimes comes in handy. Search engines use similar tricks.
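To make the "updates into selects" idea a bit more concrete, here is a minimal sketch in Python. It is not how the Maatkit tool or the original script works; the function name `to_prefetch_select` is made up for illustration, and the regular expressions only cope with simple single-table UPDATE and DELETE statements that have a WHERE clause. The point is just that the SELECT touches the same rows the write is about to touch.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration: they only handle simple
# single-table statements and would be fooled by, say, the word WHERE
# inside a string literal. Real relay-log events are far more varied.
_UPDATE_RE = re.compile(
    r"^\s*UPDATE\s+(?P<table>[`\w.]+)\s+SET\s+.+?\s+WHERE\s+(?P<where>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_DELETE_RE = re.compile(
    r"^\s*DELETE\s+FROM\s+(?P<table>[`\w.]+)\s+WHERE\s+(?P<where>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def to_prefetch_select(statement: str) -> Optional[str]:
    """Return a SELECT that reads the rows a write would touch, or None.

    Running the SELECT ahead of the write pulls the relevant pages into
    the buffer pool, so the write itself no longer waits on disk reads.
    """
    for pattern in (_UPDATE_RE, _DELETE_RE):
        match = pattern.match(statement)
        if match:
            return f"SELECT * FROM {match['table']} WHERE {match['where']}"
    # Statements we do not understand (INSERT, multi-table, ...) are skipped.
    return None


print(to_prefetch_select("UPDATE users SET last_login = NOW() WHERE id = 42;"))
```

In a real setup the resulting SELECT would be run on the slave slightly ahead of the SQL thread; anything the rewrite cannot handle is simply skipped, since a missed prefetch only costs speed, not correctness.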

Extending on this, with certain applications you can actually tell what is likely to happen next, sometimes for a particular user and often for many users. Individual user behaviour may sometimes appear random, but as a group it can be highly predictable. The analysis needs to be done properly though, otherwise averaging will make certain interesting behavioural patterns disappear.
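One simple way to look for such group patterns is to count which action tends to follow which across many user sessions. The sketch below is only an illustration under assumptions of my own: the session data, the action names, and the functions `build_transitions` and `likely_next` are all hypothetical, and a real analysis would likely segment users or time of day first, precisely so that averaging does not wash the patterns out.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count, for every action, which action followed it across all sessions."""
    transitions = defaultdict(Counter)
    for actions in sessions:
        for current, following in zip(actions, actions[1:]):
            transitions[current][following] += 1
    return transitions


def likely_next(transitions, action, min_share=0.5):
    """Return the most common follow-up action, or None if nothing dominates.

    min_share is an assumed threshold: the follow-up must account for at
    least this fraction of observed transitions to count as predictable.
    """
    counts = transitions.get(action)
    if not counts:
        return None
    following, hits = counts.most_common(1)[0]
    return following if hits / sum(counts.values()) >= min_share else None


# Made-up sessions, purely for illustration.
sessions = [
    ["login", "inbox", "read"],
    ["login", "inbox", "compose"],
    ["login", "settings"],
]
transitions = build_transitions(sessions)
print(likely_next(transitions, "login"))
```

Here two out of three sessions go from "login" to "inbox", so that step is predictable, whereas what happens after "inbox" is split evenly and a stricter threshold would report nothing.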

Anyway, if you can identify these patterns you can take appropriate measures, such as running some queries ahead of time so they get cached, and/or scheduling other relevant actions (so it’s more than just caching, but it’s a reasonably suitable name anyway). This allows the app to deal with higher peak load, as well as improving response time for individual users.
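As a rough sketch of the "run the queries before they are needed" part: the class below keeps results in memory with an expiry time and offers a `warm` method that a prediction step could call. Everything here is an assumption for illustration, including the class name `WarmableCache`, the key format, and the `fetch` callable, which stands in for whatever actually queries MySQL. A real deployment would more likely sit in front of something like memcached than a Python dict.

```python
import time
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple


class WarmableCache:
    """Tiny in-memory cache with expiry that can be filled ahead of demand."""

    def __init__(self, fetch: Callable[[Hashable], Any], ttl_seconds: float = 60.0):
        self._fetch = fetch          # stand-in for the real database query
        self._ttl = ttl_seconds
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.backend_calls = 0       # counts how often we hit the backend

    def _load(self, key: Hashable) -> Any:
        self.backend_calls += 1
        value = self._fetch(key)
        self._entries[key] = (time.monotonic() + self._ttl, value)
        return value

    def _is_fresh(self, key: Hashable) -> bool:
        entry = self._entries.get(key)
        return entry is not None and entry[0] > time.monotonic()

    def get(self, key: Hashable) -> Any:
        """Serve from cache when fresh, otherwise query the backend."""
        if self._is_fresh(key):
            return self._entries[key][1]
        return self._load(key)

    def warm(self, keys: Iterable[Hashable]) -> None:
        """Load predicted keys now, so later get() calls are cache hits."""
        for key in keys:
            if not self._is_fresh(key):
                self._load(key)


# Hypothetical usage: a user just logged in, so their inbox is likely next.
cache = WarmableCache(lambda key: f"result for {key}")
cache.warm(["inbox:42"])
print(cache.get("inbox:42"), cache.backend_calls)
```

The warming call can run in a background worker during quiet moments, which is where the peak-load benefit comes from: the expensive query has already happened by the time the user asks.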

I might do a talk or article on the predictive caching concept some time, as I appreciate that the short description may appear a bit abstract or obscure. But I assure you it’s entirely practical and real.

It’s one example of how Open Query helps its clients scale well, by design. We focus on preventing emergencies, which includes not just scenarios where stuff fails (and does a safe failover), but also the “oh dear we suddenly have so many more users than a minute ago” type of happening, which should actually be an occasion to enjoy, not stress about.