
Serving Clients Rather than Falling Over

Dawnstar Australis (yes, a nickname – but I know him personally, and he speaks with knowledge and authority) follows up on The Real Victims Of The Click Frenzy Fail: The Australian Consumer, after his earlier post from a few months ago.

Colourful language aside, I believe he rightly points out the failings of the organising company and the big Australian retailers. From the Open Query perspective we can review the situation where sites fall over under load. Contrary to what the companies involved claim, that’s not a cool indication of popularity. Let’s compare with the real world:

  1. A brick & mortar store does something that turns out to be popular, and a huge queue forms outside; people have to wait for hours. The people in the queue can chat, and overall the situation can be regarded as positive: it shows passers-by that something special is going on, and that’s cool. If you don’t want to be in the crowd, you’ll come back later.
  2. A website is unresponsive or inaccessible. There’s nothing cool or positive about this: the cause is not only unknown to the visitor, it is irrelevant in this context. Each potential client is on their own. When things fail they go elsewhere (if there are substitutes) or possibly away completely (a concert will simply sell out). The bad taste sticks, so if there are alternatives they will not only move there, they will be quite vocal about it so others move too.

So you see, you really don’t want your site to go down because of popularity, or for any other reason. Slashdot years ago created a “degrade gracefully” mechanism, where parts of the site would go static. So where normally users would be able to comment and rate posts, they’d just be able to read. In the worst case, only the front page would remain active. On Sept 11 2001, Slashdot was one of the few big sites that actually remained accessible and provided regular news that people could then read even though the topic was not really in its normal scope. The point is, they proved the approach multiple times.
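The same idea is straightforward to apply in most web stacks. Below is a minimal sketch in Python of load-based feature degradation; the thresholds, metric and function names are made up for illustration and are not Slashdot’s actual mechanism:

```python
# Hypothetical sketch of "degrade gracefully": feature availability driven by load.
# Thresholds and names are illustrative only.
import os

def current_load():
    """1-minute load average per core; a stand-in for whatever health metric you track."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)

def allowed_features(load):
    if load < 1.0:
        return {"read", "comment", "rate"}   # normal operation
    if load < 2.0:
        return {"read", "comment"}           # drop the expensive rating queries
    if load < 4.0:
        return {"read"}                      # serve read-only/static pages
    return {"front_page"}                    # worst case: only the front page stays live

# In a request handler (pseudo-usage):
# if "comment" not in allowed_features(current_load()):
#     render_static_notice("Commenting is temporarily disabled due to high load")
```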

By contrast, companies like Ticketek surely have “Enterprise Design” architecture, yet their site has been seen to fall over with events such as The Wiggles. They may be able to get away with this since they’re essentially a monopoly provider: if you want a ticket for a particular event, you need to go to them. But it’s not good. They generally acted surprised, even though the huge load was entirely predictable. Is that naivety, an attempt to mislead the public, or negligence? You decide.

It’s really a failure in design of sorts. As to where exactly, only an architectural review would show, and it will be different for different sites. However, the real lesson is that it’s not about “Enterprise Design” at all, nor about using a particular high-profile hosting provider or other buzzwords. It’s about proper architecture and deployment, and the database is only one aspect of this. It doesn’t have to end up particularly expensive either; it just has to be done right, and there’s no single magical approach – each case is unique. Looking at this is best done early on (it tends to work out better and cheaper that way), but we’ve helped clients out at much later stages also. Ideally, we like to help before there’s a raging fire.

Jetpants: a toolkit for huge MySQL topologies

From a Tumblr engineering blog post:

Tumblr is one of the largest users of MySQL on the web. At present, our data set consists of over 60 billion relational rows, adding up to 21 terabytes of unique relational data. Managing over 200 dedicated database servers can be a bit of a handful, so naturally we engineered some creative solutions to help automate our common processes.

Today, we’re happy to announce the open source release of Jetpants, Tumblr’s in-house toolchain for managing huge MySQL database topologies. Jetpants offers a command suite for easily cloning replicas, rebalancing shards, and performing master promotions. It’s also a full Ruby library for use in developing custom billion-row migration scripts, automating database manipulations, and copying huge files quickly to multiple remote destinations.

Dynamically resizable range-based sharding allows you to scale MySQL horizontally in a robust manner, without any need for a central lookup service or massive pre-allocation of tiny shards. Jetpants supports this range-based model by providing a fast way to split shards that are approaching capacity or I/O limitations. On our hardware, we can split a 750GB, billion-row pool in half in under six hours.

Jetpants can be obtained via GitHub or RubyGems.

Good work Tumblr, excellent move to open up your tools: you’re bound to get good feedback and bug catches/fixes from users in other environments now, making your toolset even better!
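To make the range-based sharding concept a bit more concrete, here is a small conceptual sketch in Python (purely illustrative; Jetpants itself is a Ruby toolkit and its real API differs). Each shard owns a contiguous id range, the tiny range map lives in application configuration rather than a central lookup service, and splitting a full shard is essentially halving its range and copying the data:

```python
import bisect

# Each shard owns a contiguous id range; the map is small enough to keep
# in application config, so no central lookup service is needed.
# (Names and ranges are made up for illustration.)
shards = [
    {"min_id": 1,             "max_id": 1_000_000_000, "pool": "db-shard-01"},
    {"min_id": 1_000_000_001, "max_id": 2_000_000_000, "pool": "db-shard-02"},
]

def shard_for(row_id):
    """Route a row id to the pool whose range contains it."""
    starts = [s["min_id"] for s in shards]
    idx = bisect.bisect_right(starts, row_id) - 1
    s = shards[idx]
    assert s["min_id"] <= row_id <= s["max_id"]
    return s["pool"]

def split(shard):
    """Split a shard approaching capacity into two halves.
    The data for each half is then copied to its new pool."""
    mid = (shard["min_id"] + shard["max_id"]) // 2
    return [
        {"min_id": shard["min_id"], "max_id": mid,              "pool": shard["pool"] + "a"},
        {"min_id": mid + 1,         "max_id": shard["max_id"],  "pool": shard["pool"] + "b"},
    ]

print(shard_for(1_500_000_000))   # -> db-shard-02
```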

MySQL Cluster on Raspberry Pi

Earlier this week, Andrew Morgan wrote a piece on running MySQL Cluster on Raspberry Pi. Since the term “Cluster” is hideously overloaded, I’ll note that we’re talking about the NDB cluster storage engine here, a very specific architecture originally acquired by MySQL AB from Ericsson (telco).

Raspberry Pi is a new single-board computer based on the ARM processor series (the same stuff that powers most mobile phones these days), and it can run Linux without any fuss. Interfaces include Ethernet, USB, and HDMI video, and the cost is $25-50. I’m looking to use one as the front-end of a MythTV setup (digital video recorder and TV system); I can just strap the Raspberry Pi to the back of a TV or monitor to do its job.

As Andrew already notes, in practical terms you’re not likely to use Raspberry Pi for a cluster – perhaps for development and certain testing, and it’d be a neat solid state management server. Primarily, it’s “techie cool”.

Knowing the NDB architecture, one of the key issues is that all nodes need to communicate with each other (N×N), so the system is very network-intensive and network latency significantly affects performance. Commonly, a cluster would therefore have at least separate interfaces for direct connections to its sibling nodes (no switch in between), and possibly Dolphin Interconnect cards to provide a link with much lower latency than regular Ethernet offers. You can do neither with a Raspberry Pi.

However, there are important positive lessons in this setup:

  • Thanks to the open source nature of the software, it can be used in a new environment with only minimal tweaks. Not everybody needs or wants to tweak, but the ability to do so is critical to innovation.
  • Overall, scaling out rather than up makes sense. There are cost, power-efficiency and other factors involved. More cheap, relatively low-powered systems can deliver an architecture that would otherwise be unaffordable (and the expensive construct might not scale anyway).
  • Affordable resilience (redundancy).

What if you needed lots of MySQL slaves with a fairly small dataset? Raspberry Pi could well be the solution. Not everybody is “big” or “high performance” in the same way.

When Clever Goes Wrong & How Etsy Overcame – Arstechnica

In 2007, Etsy made a big bet on homegrown middleware to help with the site’s scalability. Half a year after it went live, the company decided to abandon it. As a senior software engineer at Etsy put it, “if you’re doing something ‘clever’, you’re probably doing it wrong.”

Read the full article at Arstechnica.com

I want to focus on the important lessons from this article: middleware and stored procedures used in this fashion for a public web application create unscalable design complexity (smart and “proper” according to the old enterprise design teachings…), causing infrastructure, development and maintenance hassles.

In the process they did replace PostgreSQL with MySQL, but that was not the critical change that made the difference; PostgreSQL is a fine database system too.

Ladies and gentlemen, check your assumptions

I spent some time earlier this week trying to debug a permissions problem in Drupal.

After a lot of head-scratching, it turned out that Drupal assumes that when you run INSERT queries sequentially on a table with an auto_increment integer column, the values assigned to that column will also be sequential, i.e. 1, 2, 3, …

This might be a valid assumption when you are the only user doing inserts on a single MySQL server, but unfortunately that is not always the situation in which an application runs.

I run MySQL in a dual-master setup, which means that two sequential INSERT statements will never return sequential integers. The value will always be determined by the auto_increment_increment and auto_increment_offset settings in the configuration file.

In my case, one master will only assign even numbers, the other only odd ones.
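For reference, these are the two server settings involved; a minimal sketch for two masters (the values are illustrative):

```ini
# master A (my.cnf) – hands out 1, 3, 5, ...
auto_increment_increment = 2
auto_increment_offset    = 1

# master B (my.cnf) – hands out 2, 4, 6, ...
auto_increment_increment = 2
auto_increment_offset    = 2
```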

My patch was accepted, so this problem is now fixed in the Drupal 7 (and hopefully soon in 6 as well) codebase.

The moral of the story is that your application should never make such assumptions about auto_increment columns.  A user may run the application on a completely different architecture, and it may break in interesting and subtle ways.

If you want to use defined integers like Drupal does, make sure you explicitly insert them. Otherwise, you can retrieve the assigned number via the mysql_insert_id() function in PHP or via SELECT LAST_INSERT_ID() in MySQL itself.
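As a quick sketch of the correct pattern from Python (using PyMySQL purely for illustration; the table, column and credentials are placeholders, and the PHP mysql_insert_id() route works the same way):

```python
import pymysql

# Connection details and table are placeholders for illustration.
conn = pymysql.connect(host="localhost", user="app", password="secret", database="appdb")
try:
    with conn.cursor() as cur:
        cur.execute("INSERT INTO items (name) VALUES (%s)", ("example",))
        # Never assume the new id is previous_id + 1; ask the server what it assigned.
        new_id = cur.lastrowid                  # client-side, per connection
        cur.execute("SELECT LAST_INSERT_ID()")  # server-side equivalent
        (same_id,) = cur.fetchone()
    conn.commit()
finally:
    conn.close()
```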

Have you checked your code today?

libmemcached packages

Ronald Bradford last week posted about memcached not being multi-threaded on Ubuntu, something he discovered via the small utilities that are bundled with Brian Aker’s libmemcached.

When I noticed there were no Ubuntu packages for libmemcached (or the CLI tools) I decided to create some.

For your enjoyment: http://ubuntu.cafuego.net/dists/jaunty-cafuego/memcached/ (Source debs are included)

The repository also contains a memcached that has been re-compiled with multithreading enabled.
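As a quick check (a sketch only; the flag values are arbitrary), you can start memcached with a worker-thread count and then verify threading via its stats command:

```sh
# Start memcached as a daemon with 64MB of memory and 4 worker threads
# (threading only takes effect on a build compiled with it enabled)
memcached -d -m 64 -p 11211 -t 4

# The stats output includes a "threads" line showing the worker-thread count
printf 'stats\nquit\n' | nc 127.0.0.1 11211 | grep threads
```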

Predictive caching in a MySQL-backed infrastructure

Sounds a bit far-fetched (pun intended ;-), but we’re doing it. This is not inside the MySQL server, but rather part of the overall application design. Let me run you through the logic…

Some key aspects of scaling are not doing unnecessary queries, and caching what you can. Just a quick baseline: the fastest query is the one you don’t do, or the one you’ve already done before – the latter being caching.

A simple yet brilliant example of this is the YouTube trick where a script reads the relay log, converting updates into appropriate selects and running them so that the InnoDB buffer pool will have the relevant blocks in memory by the time the slave SQL thread executes the actual update. Maatkit now has a tool for this, so it’s publicly available. It’s not quite predictive, but it’s a neat trick that sometimes comes in handy. Search engines use similar tricks.
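As a toy illustration of the rewrite step only (the real tool parses statements far more carefully than a single regex), here is a sketch in Python:

```python
import re

def update_to_select(update_sql):
    """Naive rewrite of 'UPDATE <table> SET ... WHERE <cond>' into a cache-warming SELECT.
    Purely illustrative; a production prefetch tool handles far more statement forms."""
    m = re.match(r"(?is)^\s*UPDATE\s+(\S+)\s+SET\s+.*?\s+WHERE\s+(.+)$", update_sql)
    if not m:
        return None
    table, where = m.groups()
    # Running the SELECT on the slave pulls the affected pages into the buffer pool,
    # so the SQL thread's actual UPDATE hits warm data instead of disk.
    return "SELECT * FROM {} WHERE {}".format(table, where)

print(update_to_select("UPDATE posts SET views = views + 1 WHERE post_id = 42"))
# -> SELECT * FROM posts WHERE post_id = 42
```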

Extending on this, with certain applications you can actually tell what is likely to happen next, sometimes for a particular user and often for many users. Individual user behaviour may sometimes appear random, but as a group it can be highly predictable. The analysis needs to be done properly though, otherwise averaging will make certain interesting behavioural patterns disappear.

Anyway, if you can identify these patterns you can take appropriate measures, such as running some queries ahead of time so their results get cached, and/or scheduling other relevant actions (so it’s more than just caching, but it’s a reasonably suitable name anyway). This allows the app to deal with higher peak load, as well as improving response times for individual users.
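As a contrived sketch of the idea (the page names, prediction table and cache/db interfaces here are hypothetical; the real analysis and plumbing are entirely application-specific):

```python
# Hypothetical sketch of predictive caching: when a user views a page, pre-run
# the queries that users on that page are statistically likely to need next,
# so the results are already cached (in memcached, or simply as warm InnoDB pages).

LIKELY_NEXT = {
    # Derived offline from aggregate user behaviour, not per-user guesses.
    "product_page": ["related_products", "shipping_options"],
    "cart":         ["payment_methods"],
}

def on_page_view(page, user_id, cache, db):
    """cache and db are stand-ins for your caching layer and query runner."""
    for query_name in LIKELY_NEXT.get(page, []):
        key = "{}:{}".format(query_name, user_id)
        if cache.get(key) is None:
            # Run the query now, off the critical path, so the follow-up request
            # finds a warm cache instead of hitting the database cold.
            cache.set(key, db.run(query_name, user_id), 60)
```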

I might do a talk or article on the predictive caching concept some time, as I appreciate that the short description may appear a bit abstract or obscure. But I assure you it’s entirely practical and real.

It’s one example of how Open Query helps its clients scale well, by design. We focus on preventing emergencies, which includes not just scenarios where stuff fails (and a safe failover kicks in), but also the “oh dear, we suddenly have many more users than a minute ago” type of event, which should actually be an occasion to enjoy, not stress about.