An insightful exploration by Domas (Facebook) on how some of the MySQL optimiser’s decision logic is sometimes naive, in this case regarding ORDER BY optimisation.
Quite often, “simple” logic can work better than complex logic as chasing all the corner cases can just make things worse – but sometimes, logic can be too simple.
Everything must be made as simple as possible, but no simpler.
— Albert Einstein / Roger Sessions
You might find a warning like the below in your error log:
130522 17:54:18 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. Statements writing to a table with an auto-increment column after selecting from another table are unsafe because the order in which rows are retrieved determines what (if any) rows will be written. This order cannot be predicted and may differ on master and the slave.
Statement: INSERT INTO tbl2 SELECT * FROM tbl1 WHERE col IN (417,523)
What do MariaDB and MySQL mean with this warning? The server can’t guarantee that this exact query, with STATEMENT based replication, will always yield identical results on the slave.
Does that mean that you have to use ROW based (or MIXED) replication? Possibly, but not necessarily.
For this type of query, it primarily refers to the fact that without ORDER BY, rows have no order and thus a result set may show up in any order the server decides. Sometimes it’s predictable (depending on storage engine and index use), but that’s not something you want to rely on. You don’t have to ponder that, as an ORDER BY is never harmful.
Would ORDER BY col solve the problem? That depends!
If col is unique, yes. If col is not unique, then multiple rows could result and they’d still have a non-deterministic order. So in that case you’d need to ORDER BY col,anothercol to make it absolutely deterministic. The same of course applies if the WHERE clause only referred to a single col value: if multiple rows can match, then it’s not unique and it will require an additional column for the sort.
There are other query constructs where going to row based or mixed replication is the only way. But, just because the server tells you it can’t safely replicate a query with statement based replication, that doesn’t mean you can’t use statement based replication at all… there might be another way.