Daniel was tracking down what appeared to be a networking problem:
- The server reported error 113 (No route to host).
- However, an strace did not show the networking stack ever returning that error.
- On the other side, IP packets were actually received.
- When confronted with mysteries like this, I get suspicious – mainly of (fellow) programmers.
- I suggested a grep through the source code, which revealed return -EHOSTUNREACH;
- Mystery solved, which allowed us to find what was actually going on.
- Don’t just believe or presume the supposed origin of an error.
- Programmers often take shortcuts that cause grief later. I fully appreciate how the above code came about, but I still think it was wrong. Mapping a “similar” situation onto an existing error code is convenient, but when an error occurs, the most important thing is that people can track down the root cause. Reporting this error outside its original context (an error code reported by the network stack) is clearly unhelpful: it misdirects people and makes them waste time tracking it down, as happened above.
- Hooray once again for Open Source, which makes it so much easier to figure these things out. While possibly briefly embarrassing for the programmer, more eyes allow code to improve better and faster and, perhaps, also encourage better coding practices from the outset (I can hope!).
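To make the point about borrowed error codes concrete, here is a minimal sketch in C. It is not the code we found: the "peer rejected by policy" scenario, the function names and the APP_ERR_PEER_REJECTED value are all hypothetical, made up purely for illustration. The first function shows the shortcut, the second reports the same local decision under its own code, and the third turns either into a message that names the real origin.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical application-level codes, kept well outside the errno range
 * so they cannot be mistaken for an error from the network stack. */
enum app_error {
    APP_OK = 0,
    APP_ERR_PEER_REJECTED = -10001  /* assumption: peer refused by local policy */
};

/* The shortcut: a local decision disguised as a network error. */
int check_peer_misleading(int peer_allowed)
{
    if (!peer_allowed)
        return -EHOSTUNREACH;  /* the caller will read "No route to host" */
    return 0;
}

/* Alternative sketch: the same decision, reported under its own code. */
int check_peer_explicit(int peer_allowed)
{
    if (!peer_allowed)
        return APP_ERR_PEER_REJECTED;
    return APP_OK;
}

/* Translate a return value into a message that names where it came from. */
const char *describe_error(int err)
{
    switch (err) {
    case APP_OK:
        return "success";
    case APP_ERR_PEER_REJECTED:
        return "peer rejected by application policy (not a network error)";
    default:
        return strerror(-err);  /* genuine negative-errno values from the OS */
    }
}
```

With something like this, a log line built from describe_error(check_peer_explicit(0)) points at the application's own policy, whereas the misleading variant sends the reader off to inspect routing tables. Whether a separate code range, a distinct errno value or simply a clear log message fits best depends on the codebase.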
What do you think?
Clayton Christensen has some excellent insights on Modularity vs Integration in “The Innovator’s Solution”. I wrote about this for Upstarta.biz. Particularly in the realm of Open Source, modularity is regarded as a panacea – a product, service or design must be modular. But modularity is not better (or worse) than integration. Like tools, they each have their place, depending on the state of the market/ecosystem where the process/product/service operates. Part of a system can be in a modular phase, where another part of the same system needs integration!
In this context, think of an Open Source project’s or company’s ability to handle contributions. If the process of interaction between a contributor and the core is not (for whatever reason) clearly defined and predictable, it won’t work. Jamming an additional [in this case external, but that's irrelevant to the issue] interface for contributions somewhere into existing business processes is likely to fail.
We see the results of this in many projects that are Open Source, but find themselves unable to process contributions, or just don’t get any contributions. It’s quite likely that the underlying cause is not apathy (from the contributor’s end) or malice (from the recipient’s end), but it’s important to understand the underlying processes at work. It’s not necessarily the modularity of the software itself that’s an issue (tightly integrated code can receive contributions too!), but the surrounding business processes.
I had this realisation while camping with my good friend Steve Dalton and our kids this weekend. So a big thanks to Steve! I think it may help with understanding why Sun/MySQL (and MySQL AB before it) have had such difficulty dealing with contributions. And proper understanding could help resolve the problem. Good intent on its own does not suffice, otherwise it’d have been highly effective long ago!