
What a Hosting Provider did Today

I found Dennis the Menace: he now has a job as a system administrator for a hosting company. Scenario: a client has a problem with a server becoming unavailable (cause unknown) and has it restarted. Afterwards, MySQL had some page corruption in the InnoDB tablespace.

The hosting provider, being really helpful, goes in as root and deletes first ib_logfile* and then ib* in /var/lib/mysql. He later says “I am sorry if I deleted it. I thought I deleted the log only. Sorry again.” Now this may appear nice, but people who know what they’re doing with MySQL will realise that deleting the ib_logfile* files destroys data too: they are InnoDB’s redo logs, not diagnostic logs. MySQL of course screams loudly that while it has FRM files it can’t find the tables. No kidding!

Then, after he’s been told not to touch anything any more, and while I’m trying to see if I can recover the deleted files on the ext3 filesystem (yes, there are tools for that), he goes in again and puts an ibdata1 file back. No, not the logfiles – but he had those somewhere else too. The files get restored and turn out to be two months old (no info on how they were made in the first place, but that’s a minor detail in this grand scheme). All the extra write activity on the partition will also have made recovery of the deleted files more difficult or impossible.

This story will still get a “happy” ending, using a recent mysqldump to load a new server at a different hosting provider. Really – some helpfulness is not what you want. Secondary lesson: pick your hosting provider with care. Feel free to ask us for recommendations as we know some excellent providers and have encountered plenty of poor ones.
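For anyone who does need to get files out of the way in a data directory, here is a minimal sketch of the less destructive habit: look at what the wildcard actually matches, then move the files aside so the change can be undone. It is an illustration only, not what was run on this server. The function names (`preview`, `move_aside`, `undo`) and the quarantine directory are made up for the example, and it assumes mysqld has already been stopped.

```python
import glob
import shutil
import time
from pathlib import Path


def preview(datadir, pattern):
    """Return the files the wildcard would match, without touching them."""
    return sorted(glob.glob(str(Path(datadir) / pattern)))


def move_aside(datadir, pattern, quarantine_root):
    """Move matching files into a timestamped directory instead of deleting.

    Returns (original, new_location) pairs so the move can be reversed.
    Assumption: the database server is stopped before this is called.
    """
    matches = preview(datadir, pattern)
    if not matches:
        return []
    # Hypothetical layout: one subdirectory per intervention.
    dest = Path(quarantine_root) / time.strftime("aside-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=False)
    moved = []
    for path in matches:
        target = dest / Path(path).name
        shutil.move(path, str(target))
        moved.append((path, str(target)))
    return moved


def undo(moved):
    """Put every moved file back where it came from."""
    for original, target in moved:
        shutil.move(target, original)
```

Calling `preview("/var/lib/mysql", "ib*")` first would have shown that the pattern matches ibdata1 as well as the ib_logfile* files. Keeping the quarantine directory on the same filesystem matters too: there the move is a plain rename, whereas across filesystems `shutil.move` copies the data and then removes the original.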

21 thoughts on “What a Hosting Provider did Today”

  1. I would have to suggest that putting the text ‘log’ in those files is a silly naming convention.

    1. How’s that, Adam? Do you regard something with “log” in the name as disposable?

  2. I regard it as a log file: something that would not be needed if it had to go.
    I do know that it is NOT a disposable file (I learned that when trying to fix an InnoDB corruption 6 months ago).

    But it is a poor naming convention because if I thought it was a log file originally (and by convention disposable) then others would too.

  3. Adam, anyone who has any background in any sort of transactional database tech will recognise the word “log” as the transactional log, which shouldn’t be deleted in normal circumstances.

    In a way I feel sorry for the hosting provider. They seem quite inexperienced but tried to do the right thing by helping their customer, and it went badly (I’m thinking a mix of Linux inexperience and poor google-fu). When they realised they had cocked up badly, they probably just wanted to restore the data, undo their mistake and not destroy their customer relationship.

    Moral of the story is always know your limits, and know when it is better to stop trying to fix a bad situation that you could make worse.

  4. @Adam – you’ve just illustrated the difference between an experienced, competent sysadmin, and… the rest.

  5. It’s a log file by nature. Any file, whether labeled log or not, shouldn’t be deleted before it’s understood. We never ‘just’ delete files in the data directory; try moving them away or renaming them so rollbacks can be achieved. The problem here is education, and the error of the service provider in letting someone without the depth of knowledge near a customer’s data. The outcome of this: hopefully the SA will be more careful when working with customers’ files. He’ll probably be less inclined to wildcard rm commands before checking what globbing will be done, and maybe he’s going to find some time to read about the MySQL product if he’s working closely with it every day.

    1. Or if they’re time/resource poor, outsource dealing with those technologies… which is exactly what both your company and mine work on (at different scales).

  6. I’ve had a similar experience with a helpful sysadmin deleting log.* (BerkeleyDB write-ahead logs), thus making recovery from backup necessary!

  7. @Adam Sorry, but *no* log files are disposable.

    Binaries can be replaced; the log files tell you the who/what/why/when of what went wrong.

  8. Regardless of the naming convention(s) used by apps running on the box, if the “hosting” provider is not contracted to arbitrarily delete files or restore some files from out-of-date backups that they weren’t contracted to make, then they shouldn’t be doing it. And after being explicitly told not to touch any more … “He needed killin’.”

  9. Agreed with Adam, anything with “log” in the name is by convention disposable but useful for diagnosis. That doesn’t make Dennis the Menace’s actions good (such files should be renamed, not deleted, when they seem to be causing problems), but it points to a bug in InnoDB as a compounding factor.

  10. I’ve been there and done that, having to try to recover data for a customer’s server when one of our tech support guys decided to delete those files, thinking they were just logs.

    In that particular case there wasn’t much I could do, there were no backups to pull from. In this case the customer was on a dedicated server hosting package and it was clearly stated as part of their terms that they were responsible for backups and data security, the company provided the hardware, the base install and remote hands.

    I’m rather inclined to agree with Adam, though my habit is to compress log files rather than delete them, just in case. logfile is a potentially confusing file name asking for trouble.

  11. So what hosting provider would you suggest? I have a start up and I am really looking out for a good one.

    1. If you need local DCs in Australia, New Zealand and several other countries, NZ based has a good clue – let them know Arjen says hi.
      More globally, is excellent; note that they now have DCs in London and Tokyo also (Tokyo being closer ping-wise to AU than a US based server), although I would stay away from the Fremont DC as that’s where Hurricane Electric lives and they’ve had both power issues as well as DDoS to the DC disrupting access to Linode servers.

  12. If you don’t know what it is, then assuming that you can delete/fix/move it because of its name is worse than stupid, it’s actually harmful. It’s a log of transactions, what do you want it to be called?

    Still, another good lesson on keeping backups.

  13. “Agreed with Adam, anything with “log” in the name is by convention disposable”

    When transactional databases are involved, only under very specific circumstances do you do anything with logs – ie all transactions stored in the logs must have been committed, and you should have a full (tested – backups are worthless, it’s restores that count) consistent backup of the data too. And even then, log files for transactional databases are generally truncated, not deleted.

    1. As others have noted, logs are important for finding out about problems, so they’re definitely not disposable. They could be shifted elsewhere, but never just deleted.
      With regard to a transactional database, the redo log is where the commits go and reside until data pages are flushed to the table space. So the tablespace and the redo logs *combined* contain the current committed dataset. InnoDB uses static length pre-allocated files for its redo logs, so they don’t get fragmented on a filesystem.

  14. I’ve had to look after up to 300 simultaneous Linux servers all operating in different locations for over 10 years. I would call myself experienced. A lack of competency would have shown up years before.
    One of the big rules in IT system administration is to remove any potential problem areas.
    The naming convention for those innodb files is poor and will lead to problems.
    I never lost any mysql data myself, but I did wonder what those ‘log’ files were and they were not what I imagined.
    Log files do build up and unless there is meaningful data in them you can remove them. Basic housekeeping.

  15. @sabik, like Adam, you need to fix your thinking.

  16. @pdf, I think you missed the whole point. The naming convention is going to cause people to make mistakes.
    It is a poor sys admin who is not risk averse.

  17. Hi Arjen,

    Thanks for the suggestion. I am going to give your name as a reference. I might end up getting some discount 🙂

Comments are closed.