Posted on

Modular vs Integrated

There’s actually no single “correct” answer! It all depends on

  1. where in a stack the component lives;
  2. the state of the market for that component region;
  3. sometimes even geographic location of the user comes into play.

Yes, for OSS projects modularity is handy in terms of handling contributions, but modularity may not be the best way to deal with a problem in a certain market state and situation!

Research has shown (see, for example, “The Innovator’s Solution” by Clayton Christensen) that the “integrated” region over time actually shifts to a subcomponent of an original integrated component that has since gone modular. An interesting example of this for MySQL its pluggable storage engine interface since version 5.1. MySQL is more modular now, but individual storage engines are tightly integrated for performance reasons, and in some cases they are even proprietary. It’s important to realise that this too is just a phase and not a final state.

But sticking with OSS for this story, distributed version control with (among other things) decent branching/merging make contributions to the core are entirely feasible and even easy. It’s a matter of using a toolchain that supports the most suitable work process, rather than hinder it or force another unsuitable wok process. Good distributed version control systems: bzr (Bazaar), hg (Mercurial), git.

(note: svn/subversion, even with some “distributed” hacks on top of it, does *not* qualify as it’s architecturally unsuitable. The other tools have excellent migration plugins available to get your work process -and your code- out of svn hell. It’s a bit of work but can be a stepwise process to unwarp your work processes and get a sane and more productive development workflow.)

At Open Query we particularly like bzr, because with Launchpad it provides an excellent environment for tracking of bugs, features and ongoing work, merge reviews, and so on. The admin side of Launchpad has some user interface issues, but it is workable.

Sometimes the core of a particular component does need fixing – but the time may not be right for modularity, and it’s important to not get distracted by that geeky desire of “everything modular” as swimming against market forces can be very painful.

Posted on

Measuring HD latency in ways relevant to MySQL

As I described yesterday, Open Query is doing some tests on SSDs and other devices pretending to be harddisks (SANs, battery-backed RAID controllers, etc). To aid this, I wrote a small tool to test the different kind of I/O operations MySQL would/could do, which is not quite the same as what other general purpose apps would do, and also not what other test tools measure. For instance, it tries Direct I/O as well as fsync() after each write, and also it a range of different I/O block sizes.

In a nutshell, it’s aimed to do what MySQL does, without MySQL! Testing lots of different setups for this particular purpose (even with fantastic tools like MySQL Sandbox) is a complete pest, and changing InnoDB page size requires a recompile. While Percona has tried a larger page size in the past and decided it wasn’t worth it (the default is 16K), I thought it worthwhile to include such a test as the situation may change over time with different devices.

So, this is a little tool for a very specific purpose, and it should not grow beyond that – but do feel free to abuse it for whatever other purpose you reckon fits a similar approach. Oh, and it outputs CSV for easy graphing. To grab the code, go to the hdlatency project on Launchpad. It’s plain C, and GPLv3 licensed.