Posted on

Replication and “the lost binlog”

Unless you set sync_binlog = 1, a system crash on the master will likely fail any slave with an “Client requested master to start replication from impossible position” error. Generally, this kind of situation requires manual intervention. When we see this, we make sure things indeed failed “past the end” of a binlog (i.e. the bit that didn’t get to the physical platter before the crash), reposition the slave to the next binlog, and use the Maatkit tools to ensure the slave is properly synced.

sync_binlog=1 is a problem in itself, because it makes the server not just do one fsync per commit, but several and that’s serious overhead. sync_binlog is actually not a boolean but a “fsync binlog every N commits” where 0 meaning “never”. So you could set it to 10 (fsync every 10 commits) and thus reduce the loss a little bit while not doing too much harm to performance. But it’s not ideal and won’t always prevent the above error.

Posted on