Category Archives: Good practice / Bad practice

Hint of the day: Warning level in Error Log to see Aborted Connections

log_warnings = 2

Yields useful information in the MariaDB or MySQL error log file (or syslog on Debian/Ubuntu) you don’t want to miss out on.

You will know about aborted connections, which are otherwise only visible through global status as Aborted_connects (lost connection before they completed authentication) and Aborted_clients (cut fully authenticated connection).

It looks like

130523 2:14:05 [Warning] Aborted connection 173629 to db: 'unconnected' user: 'someapp' host: '10.2.0.50' (Unknown error)

You will know when, where from, and if for instance a wrong password was used you’ll see the username. Basically you’ll get as much info as the server has available at that point. Useful.

 

InnoDB without PRIMARY KEY

Having an InnoDB table without a PRIMARY KEY is not good. Many have known this for years, but exact opinions as to why have differed. From observation, it was clear to me that it impacted performance.

InnoDB stores its row data in the leaf nodes of the primary key B+tree structure, that means that it can’t work without… so if you don’t specify a PK, it makes one up. Seems pretty innocuous and shouldn’t actually perform any worse than an auto-inc field. Except that in reality the performance can be much much worse. Annoying. Naturally we recommend clients to always have a PK (auto-inc, a composite of foreign keys, or if need be a natural key) but production systems cannot always be quickly changed, depending on the app code adding a column is not something you can just do at the DBA level.

Recently my good friend and former colleague Jeremy Cole, who has been delving into the depths of InnoDB, asked me if I had any open questions on the topic. So I mentioned the above, and after a brief look at the relevant code caught he was definitely interested in exploring the issue. The result is this blog post: http://blog.jcole.us/2013/05/02/how-does-innodb-behave-without-a-primary-key/ for your enlightenment and enjoyment.

Now we know what goes on internally. And it’s clear that performance is negatively affected, and why. Useful.

Thanks Jeremy!

Query pattern: OR across different tables

When a query uses a construct like

SELECT ... FROM a JOIN b ON (...) WHERE a.c1 = X OR b.c2 = Y

execution will inevitably degrade as the dataset grows.

The optimiser can choose to use an index merge when dealing with two relevant indexes over a single table, but that’s obviously of no use in this scenario as the optimiser has to choose which table to access first. And regardless of which table is accessed first, the other one might yield a result. Thus the query will never be efficient.

The real answer is that the query construct is wrong, a JOIN is used inappropriately. The correct approach for this type of query is using a UNION:

SELECT ... FROM a WHERE a.c1 = X
UNION [ALL]
SELECT ... FROM b WHERE b.c2 = Y

This mistake occurs relatively often, because while on the one hand people try to reduce the number of queries necessary to achieve their objective, they are (somewhat) familiar with JOINs and either don’t know about UNION or use it so rarely that they don’t think of it when the question calls for its use. So, this is your reminder ;-)

Serving Clients Rather than Falling Over

Dawnstar Australis (yes, nickname – but I know him personally – he speaks with knowledge and authority) updates on The Real Victims Of The Click Frenzy Fail: The Australian Consumer after his earlier post from a few months ago.

Colourful language aside, I believe he rightfully points out the failings of the organising company and the big Australian retailers. From the Open Query perspective we can just review the situation where sites fall over under load. Contrary to what they say, that’s not a cool indication of popularity. Let’s compare with the real world:

  1. Brick & Mortar store does something that turns out popular and we see a huge queue outside, people need to wait for hours. The people in the queue can chat, and overall the situation can be regarded as positive: it shows passers-by that there’s something special going on, and that’s cool. If you don’t want to be in the crowd, you’ll come back later.
  2. Website is unresponsive/inaccessible. There’s nothing cool or positive about this, as the cause is not only unknown, but in fact irrelevant in the context. Each potential client is on their own. Things fail, so they go elsewhere (if there are substitutes) or potentially away completely (concert, it’ll sell out). The bad taste sticks, so if there are alternatives they will not only move there, but be quite vocal about it so others move also.

So you see, you really don’t want your site to go down because of popularity, or for any other reason. Slashdot years ago created a “degrade gracefully” mechanism, where parts of the site would go static. So where normally users would be able to comment and rate posts, they’d just be able to read. In the worst case, only the front page would remain active. On Sept 11 2001, Slashdot was one of the few big sites that actually remained accessible and provided regular news that people could then read even though the topic was not really in its normal scope. The point is, they proved the approach multiple times.

Contrarily, companies like Ticketek have surely got Enterprise Design architecture, however their site has been seen to fall over with events such as The Wiggles. They might be able to get away with this since they’re essentially a monopoly provider: if you want a ticket for this particular event, you need to go to them. But it’s not good. Generally they acted surprised, even though the huge load was entirely predictable. Is that just naive, or a hope to mislead the public, or negligent? You decide.

It’s really a failure in design of sorts. As to where exactly, only an architectural review would show, and it’ll be different for different sites. However, the real lesson is that it’s not about “Enterprise Design” at all, nor about using any particular high-profile hosting provider or involvement of other buzzwords. It’s about proper architecture and deployment and the database is only one aspects of this. It doesn’t have to end up particularly expensive either, it just has to be done right and there’s no single magical approach – each case is unique. Looking at this is best done early on (it tends to also work our better and cheaper), but we’ve helped clients out at much later stages also.  Ideally, we do like to help before there’s a raging fire.

One-way Password Crypting Flaws

I was talking with a client and the topic of password crypting came up. From my background as a C coder, I have a few criteria to regard a mechanism to be safe. In this case we’ll just discuss things from the perspective of secure storage, and validation in an application.

  1. use a digital fingerprint algorithm, not a hash or CRC. A hash is by nature lossy (generates evenly distributed duplicates) and a CRC is intended to identify bit errors in transmitted data, not compare potentially different data.
  2. Store/use all of the fingerprint, not just part (otherwise it’s lossy again).
  3. SHA1 and its siblings are not ideal for this purpose, but ok. MD5 and that family of “message digests” has been proven flawed long ago, they can be “freaked” to create a desired outcome. Thus, it is possible to manufacture a source string that generates an MD5 of course.
  4. Add a salt of reasonable length (extra string added to password), otherwise dictionary attacks are way to easy. In addition, not using a salt means that two users who have the same password end up with the same encrypted password which is another case of “too much info” for people. Salt should of course be different for each user. Iterate.
  5. Even if someone were to capture your user/pwd table, they should not be able to decode the passwords within a reasonable amount of time. Flaws in any of the above issues can make such attacks easy.

The below code, used in a variety of ecommerce packages (osCommerce prior to v2.3.0, ZenCartCRE Loaded / Loaded Commerce, and other derivatives and descendants of oscommerce) on the surface appears to do something quite smart. Note that this code does not use an external salt (such as the username or other separate field) but instead generates it and adds it to the encrypted password. This enables it to be used in applications where no username or other login constant other than the password is available, although I’d consider that quite rare.

function validateAdminPassword($plain, $encrypted) {
  if (!$plain && !$encrypted) {
    return false;
  }

  $stack = explode(':', $encrypted);
  if (sizeof($stack) != 2)
    return false;

  if (md5($stack[1] . $plain) == $stack[0]) {
    return true;
  }

  return false;
}

function encryptAdminPassword($plain) {
  $password = '';

  for ($i=0; $i<10; $i++) {
    $password .= rand();
  }

  // arjen comment: so the 2 is what you need to increase,
  // as well as the length of the relevant database column.
  $salt = substr(md5($password), 0, 2);

  $password = md5($salt . $plain) . ':' . $salt;

  return $password;
}

This code is flawed. Apart from being confusing (using the $password variable name when calculating the salt) the main problem is that the salt ends up too short. The code generates 10 pseudo-random characters (PHP tends to initialise the random generator from time, so it can be somewhat predictable which is a potential attack vector – for instance when the creation time of the user record is also stored) but then it’s run through MD5() after which only the two first characters of the resulting message digest are used for the actual salt. Since the MD5 comes out as hex digits, the range of each of the two characters is [0-9a-f] and so the total number of possibilities for the salt string is 256. That’s not a lot!

The effort involved in pre-calculating the MD5s (including all salt permutations) is not that high, it’s merely 256 times the size of the dictionary used. Wouldn’t take that much disk space. Since this code is used by lots of sites, the potential for a successful attack is rather high in that sense also. Combined with the lack of iteration, this just makes an attack all too easy.

Finally, if the user table were captured from a site with a large number of users, the chance of finding colliding encrypted passwords is quite a bit higher than it should be. But the above mentioned approach already has sufficient potential for a damaging security breach.

If this code is active on your site, a quick patch would be to increase the length of the salt by changing the substr() call, and make it do iterations. Obviously you’ll also need to similarly increase the length of the password storage column in your database. You then get old and new crypted passwords in your table and you can work out which version by checking the length of the crypted password string. On login you can replace old for new for each user as you’ll have their plain password at that point (since they just filled it in and sent it to your app). That way you can create a clean transition. I grant you it’s not perfect but it’s at least improving an otherwise very insecure situation.

If, rather than pragmatically fixing up an existing environment that you didn’t write, you want to do all this properly, read Password Storage Cheat Sheet from OWASP (the Open Web Application Security Platform). It lists a range of considerations (and reasoning) beyond my basic pragmatic list. If you’re going to code from scratch, please do it right.

Edit: 20120717: added maximum affected osCommerce version thanks to Harald Ponce de Leon. osCommerce as of v2.3.0 uses phpass like WordPress

What a Hosting Provider did Today

I found Dennis the Menace, he now has a job as system administrator for a hosting company. Scenario: client has a problem with a server becoming unavailable (cause unknown) and has it restarted. MySQL had some page corruption in the InnoDB tablespace.

The hosting provider, being really helpful, goes in as root and first deletes ib_logfile* then ib* in /var/lib/mysql. He later says “I am sorry if I deleted it. I thought I deleted the log only. Sorry again.”  Now this may appear nice, but people who know what they’re doing with MySQL will realise that deleting the iblogfiles actually destroys data also. MySQL of course screams loudly that while it has FRM files it can’t find the tables. No kidding!

Then, while he’s been told to not touch anything any more, and I’m trying to see if I can recover the deleted files on ext3 filesystem (yes there are tools for that), he goes in again and puts an ibdata1 file back. No, not the logfiles – but he had those somewhere else too. The files get restored and turn out to be two months old (no info on how they were made in the first place but that’s minor detail in this grand scheme). All the extra write activity on the partition would’ve also made potential deleted file recovery more difficult or impossible.

This story will still get a “happy” ending, using a recent mysqldump to load a new server at a different hosting provider. Really – some helpfulness is not what you want. Secondary lesson: pick your hosting provider with care. Feel free to ask us for recommendations as we know some excellent providers and have encountered plenty of poor ones.

When Clever Goes Wrong & How Etsy Overcame – Arstechnica

In 2007, Etsy made a big bet on homegrown middleware to help with the site’s scalability. A half-year after it was taken live, the company decided to abandon it. As a senior software engineer at Etsy put it, “if you’re doing something ‘clever,” you’re probably doing it wrong.”

Read the full article at Arstechnica.com

I want to focus on the important lessons from this article, about middleware and using stored procedures in this fashion for a public web application, creating unscalable design complexity (smart and “proper” according to the old enterprise design teachings…) – causing infrastructure, development and maintenance hassles.

In the process they did replace PostgreSQL with MySQL but that’s not the critical change that made all the difference. PostgreSQL is a fine database system also.

On Password Strength

XKCD (as usual) makes a very good point – this time about password strength, and I reckon it’s something app developers need to consider urgently. Geeks can debate the exact amount of entropy, but that’s not really the issue: insisting on mixed upper/lower and/or non-alpha and/or numerical components to a user password does not really improve security, and definitely makes life more difficult for users.

So basically, the functions that do a “is this a strong password” should seriously reconsider their approach, particularly if they’re used to have the app decide whether to accept the password as “good enough” at all.

Update: Jeff Preshing has written an xkcd password generator. Users probably should choose their own four words, but it’s a nice example and a similar method could be used by an app to give “password suggestions” that are still safe.

MySQL data backup: going beyond mysqldump

A user on a linux user group mailing list asked about this, and I was one of the people replying. Re-posting here as I reckon it’s of wider interest.

> […] tens of gigs of data in MySQL databases.
> Some in memory tables, some MyISAM, a fair bit InnoDB. According to my
> understanding, when one doesn’t have several hours to take a DB
> offline and do dbbackup, there was/is ibbackup from InnoBase.. but now
> that MySQL and InnoBase have both been ‘Oracle Enterprised’, said
> product is now restricted to MySQL Enterprise customers..
>
> Some quick searching has suggested Percona XtraBackup as a potential
> FOSS alternative.
> What backup techniques do people employ around these parts for backups
> of large mixed MySQL data sets where downtime *must* be minimised?
>
> Has your backup plan ever been put to the test?

You should put it to the test regularly, not just when it’s needed.
An untested backup is not really a backup, I think.

At Open Query we tend to use dual master setups with MMM, other replication slaves, mysqldump, and XtracBackup or LVM snapshots. It’s not just about having backups, but also about general resilience, maintenance options, and scalability. I’ll clarify:

  • XtraBackup and LVM give you physical backups. that’s nice if you want to recover or clone a complete instance as-is. But if anything is wrong, it’ll be all stuffed (that is, you can sometimes recover InnoDB tablespaces and there are tools for it, but time may not be on your side). Note that LVM cannot snapshot between multiple volumes consistently, so if you have your InnoDB ibdata/IBD files and iblog files on separate spindles, using LVM is not suitable.
  • mysqldump for logical (SQL) backups. Most if not all setups should have this. Even if the file(s) were to be corrupted, they’re still readable since it’s plain SQL. You can do partial restores, which is handy in some cases. It’ll be slower to load so having *only* an SQL dump of a larger dataset is not a good idea.
  • some of the above backups can and should *also* be copied off-site. that’s for extra safety, but in terms of recovery speed it may not be optimal and should not be relied upon.
  • having dual masters is for easier maintenance without scheduled outages, as well as resilience when for instance hardware breaks (and it does).
  • slaves. You can even delay a slave (Maatkit has a tool for this), so that would give you a live correct image even in case of a user error, provided you get to it in time. Also, you want enough slack in your infra to be able to initialise a new slave off an existing one. Scaling up at a time when high load is already occurring can become painful if your infra is not prepared for it.

A key issue to consider is this… if the dataset is sufficiently large, and the online requirements high enough, you can’t afford to just have backups. Why? Because, how quickly can you deploy new suitable hardware, install OS, do restore, validate, put back online?

In many cases one or more aspects of the above list simply take too long, so my summary would be “then you don’t really have a backup”. Clients tend to argue with me on that, but only fairly briefly, until they see the point: if a restore takes longer than you can afford, that backup mechanism is unsuitable.

So, we use a combination of tools and approaches depending on needs, but in general terms we aim for keeping the overall environment online (individual machines can and will fail! relying on a magic box or SAN to not fail *will* get you bitten) to vastly reduce the instances where an actual restore is required.
Into that picture also comes using separate test/staging servers to not have developers stuff around on live servers (human error is an important cause of hassles).

In our training modules, we’ve combined the backups, recovery and replication topics as it’s clearly all intertwined and overlapping. Discussing backup techniques separate from replication and dual master setups makes no sense to us. It needs to be put in place with an overall vision.

Note that a SAN is not a backup strategy. And neither is replication on its own.