I recently wrote about non-deterministic queries in the replication stream. That’s resolved by using either MIXED or ROW based replication rather than STATEMENT based.
Another thing that’s not fully handled by STATEMENT based replication is temporary tables. Imagine the following:
- Master: CREATE TEMPORARY TABLE rpltmpbreak (i INT);
- Wait for slave to replicate this statement, then stop and start mysqld (not just STOP/START SLAVE)
- Master: INSERT INTO rpltmpbreak VALUES (1);
- Slave: SHOW SLAVE STATUS \G
If for any reason a slave server shuts down and restarts after the temp table creation, replication will break because the temporary table will no longer exist on the restarted slave server. It’s obvious when you think about it, but nevertheless it’s quite annoying.
A long time ago (early 2007, when I was still working at MySQL AB) I filed a bug report on this. It’s important to realise that back then, row based replication did exist but was so buggy that you wouldn’t recommend it, so the topic was quite relevant. For some reason the bug has remained open for over 6 years until some recent activity.
It is not an issue with determinism and most temporary table constructs are technically regarded as “safe” to replicate via statement based replication, so if you use MIXED you will still find replication broken with the above scenario. Important to realise!
http://dev.mysql.com/doc/refman/5.5/en/replication-features-temptables.html (the obvious place to look) doesn’t really explain this well, but http://dev.mysql.com/doc/refman/5.5/en/replication-rbr-usage.html correctly states that ROW based replication doesn’t suffer from this problem as it replicates the values from the temporary table on the master rather than the statement, thus the slave doesn’t have to deal with the temporary table at all. I’ve suggested that the bug be changed to a documentation issue, updating the page on replication and temporary tables to properly explain the issue and point clearly and explicitly to the solution.
So, why would you ever use STATEMENT or MIXED rather than ROW based replication?
- Well, as I mentioned, earlier row based wasn’t particularly reliable. At that time, for non-deterministic scenarios we recommended mixed as a compromise (that only uses row based information in the replication stream when it’s necessary, and statements the rest of the time). Many issues have been fixed over time and now we can generally say that row based replication is ok in recent versions of MySQL and MariaDB (5.5 or above, just to be sure). So if you’re replicating from an older master, STATEMENT or MIXED might still be preferable, as long as you know that the limitations are.
- Non-local replication (outside the datacenter) is vastly more efficient with STATEMENT based replication: if you’re updating 100,000 rows, it’s a single statement whereas it’s a 100,000 row updates. So depending on bandwidth/cost and such, that might also be a relevant.
If none of those considerations apply, ROW based replication might be the way to go now. But the really important thing to realise is that for each of the choices of STATEMENT, MIXED and ROW, there are advantages and consequences.
Do you have any other reasons for using STATEMENT or MIXED in your environment?