
Amazon vs Linode in case of hardware failure

One of our clients received an email like this:

Dear Amazon EC2 Customer,

We have important news about your account (…). EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance (instance-ID: …) in the […] region. Due to this degradation, your instance could already be unreachable. After […] UTC your instance, which has an EBS volume as the root device, will be stopped.

You can see more information on your instances that are scheduled for retirement in the AWS Management Console

  • How does this affect you?
    Your instance’s root device is an EBS volume and the instance will be stopped after the specified retirement date. You can start it again at any time. Note that if you have EC2 instance store volumes attached to the instance, any data on these volumes will be lost when the instance is stopped or terminated as these volumes are physically attached to the host computer
  • What do you need to do?
    You may still be able to access the instance. We recommend that you replace the instance by creating an AMI of your instance and launch a new instance from the AMI.
  • Why retirement?
    AWS may schedule instances for retirement in cases where there is an unrecoverable issue with the underlying hardware.

Great to be notified of such things. That's not the problem. What I find curious is that Amazon tosses the problem resolution entirely at their clients. Time and effort (cost) are required, even just to create an AMI (if you don't already have one) and restart elsewhere from that.
Could it be done differently? I think so, because it has been done for years at Linode (for example). If something like this happens on Linode, they'll migrate the entire VM+data to another host, quickly and at no cost, with just a bit of downtime (often <30 mins). They'll even do this on request if you (the client) suspect there is some problem on the host and would just like to make sure by moving to another host.
So… considering how much automation and user-convenience Amazon produces, I would just expect better.

Of course it's nice to have everything scripted so that new nodes can be spun up quickly. In that case you can just destroy an old instance and start a new one, which might then be very low effort. But some systems and individual instances are (for whatever reason) not set up like that, and then a migration like the one that Linode does is eminently sensible and very convenient.
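For what it's worth, the "create an AMI, launch a replacement" dance is easy enough to script. As a rough sketch with the current AWS command line tools (the instance/AMI IDs, instance type and availability zone below are placeholders, adapt to taste):

$ aws ec2 create-image --instance-id i-0123456789abcdef0 --name "pre-retirement-backup" --no-reboot
# wait until the AMI shows as 'available', then launch a replacement on fresh hardware
$ aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type m1.large --placement AvailabilityZone=us-east-1b
# and stop the instance that is scheduled for retirement
$ aws ec2 stop-instances --instance-ids i-0123456789abcdef0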


6to4: Easing the IPv6 transition

With the exhaustion of IPv4 address space looming sometime in 2012 (probably earlier rather than later), it makes sense to ease on into IPv6 land. Without straying into tunnel brokering and endpoint shenanigans, 6to4 is a method of wrapping up IPv6 inside of IPv4.

6to4 performs three functions:

  1. Allocates an IPv6 address block to any host/network that has a global IPv4 address.
  2. Wraps IPv6 packets inside IPv4 packets for transmission over IPv4 using 6in4 (the IPv6 traffic is sent inside IPv4 packets whose IP headers have the IP protocol number set to 41: IPv6-in-IPv4). Like 6in4, 6to4 uses IP protocol 41, but instead of statically configured endpoints, the endpoint IPv4 address is derived from the destination IPv6 address in the IPv6 packet header.
  3. Routes traffic between 6to4 and “native” IPv6 networks.

As such it's pretty easy to implement, especially on our good friend Debian (and its better looking cousin Ubuntu).

I am going to step through setting up a Debian host at Linode.

Step 1 Check your Kernel

Now, the first caveat is that you must be running a 2.6.20+ kernel (at the time of writing the latest Linode kernel for Debian was 2.6 Paravirt, 2.6.34-x86_64-linode). The default 'Etch' release kernel (2.6.18) supports IPv6 but implements IPv6 stateful connection tracking woefully, which is just not good enough for a decent firewall. If you have a look under your Linode Configuration Profile you can see which kernel you are running, and change it to one that is supported; obviously a reboot would be in order if you change it. The Linode kernels have IPv6 support compiled in.

But here is the quick way to check whether IPv6 is compiled in; if the following fails, IPv6 is either not compiled in or the module has not been loaded:

 $ cat /proc/net/if_inet6
00000000000000000000000000000001 01 80 10 80       lo
fe80000000000000fcfd4afffecff19f 02 40 20 80     eth0
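If that file doesn't exist, a couple of generic checks will tell you whether it's a kernel version problem or just a module that hasn't been loaded (nothing 6to4-specific here, just the usual tools):

$ uname -r                         # needs to be 2.6.20 or later
$ test -d /proc/sys/net/ipv6 && echo "IPv6 enabled" || echo "IPv6 missing"
$ sudo modprobe ipv6               # only relevant if IPv6 was built as a module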

WARNING UBUNTU Pre 10.04LTS !!!

Modprobe is kind of janky and will stop your interfaces coming up if you follow this guide to the letter. You will need to do this first:

As root (or with sudo), divert the old modprobe; this means that any subsequent upgrade won't blow away your script:

dpkg-divert --add --rename --divert /sbin/modprobe.real /sbin/modprobe

Create a replacement /sbin/modprobe script:

#!/bin/bash
# Wrapper around the diverted modprobe: run the real binary, but treat
# any call made with "-Q" as a success, so a failing quiet probe does
# not stop interfaces from coming up.
/sbin/modprobe.real "$@"
ret=$?

if [ "$1" == "-Q" ] ; then
        exit 0
fi

exit $ret
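The replacement also needs to be executable, otherwise module loading (and with it interface bring-up) will break outright:

sudo chmod 755 /sbin/modprobe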

Step 2 Calculate your new IPv6 address

Any IPv6 address that begins with the 2002::/16 prefix is known as a 6to4 address, as opposed to a native IPv6 address which does not use that prefix. The Internet Assigned Numbers Authority (IANA: www.iana.org) has set aside this address space just for 6to4. IPv6 addresses are assigned based upon your IPv4 address: each octet becomes two hex digits after the 2002: prefix, so for instance 74.207.254.16 becomes 2002:4acf:fe10::/48.

We need some tools to help us calculate our IPv6 address; luckily there is a package for this:

$ sudo apt-get install ipv6calc

Now it's a matter of plugging your IPv4 address into ipv6calc to determine your reserved IPv6 address range.

$ ipv6calc -q --action conv6to4 --in ipv4 74.207.254.16 --out ipv6

and voila, your IPv6 address range appears:

2002:4acf:fe10:: (/48)

You are given an address range with a prefix length of 48 bits, which leaves room for a 16-bit subnet field and a 64-bit host address within each subnet.
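Just to show there is no magic involved: each IPv4 octet simply becomes two hex digits after the 2002: prefix, so you can do the same conversion by hand with nothing more than printf:

$ printf '2002:%02x%02x:%02x%02x::/48\n' 74 207 254 16
2002:4acf:fe10::/48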

Step 3 Update your interface configuration

You now need to edit your network configuration file, /etc/network/interfaces:

# make sure this interface comes up on boot
auto tun6to4
iface tun6to4 inet6 v4tunnel
# first host in this address range
    address 2002:4acf:fe10::1
    netmask 16
# gateway is the special anycast address for the nearest 6to4 relay (2002:c058:6301::)
    gateway ::192.88.99.1
    endpoint any
# local is this host's public IPv4 address
    local 74.207.254.16
# the normal Ethernet MTU (1500) minus the headers used on the tunnel
    mtu 1472
    ttl 255

You can restart networking entirely (not recommended):

$ sudo /etc/init.d/networking restart

If you want to be a little bit more careful and not wipe out all networking if something goes wrong (e.g. you are using Ubuntu or IPv6 is not available), you could just bring up the new interface:

        $ sudo ifup tun6to4
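Either way, before moving on it's worth checking that the tunnel actually came up and that the 6to4 routes are in place; the usual iproute2 commands will show the 2002: address on the interface and a default route via the tunnel:

$ ip -6 addr show dev tun6to4
$ ip -6 route show | grep tun6to4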

Step 4 Update IPv6 Firewall script/rules

Now it's just as important (read: critical) to firewall IPv6 traffic as it is IPv4. Here is a small sample of a firewall that will at the very least not leave you hanging in the breeze. Needless to say you can add your own rules and make this as complex as you need.

#!/bin/sh
# Initialize all the chains by removing all the rules
iptables --flush
iptables -t nat --flush
iptables -t mangle --flush
ip6tables --flush
ip6tables -t mangle --flush
# The loopback interface should accept all traffic
iptables -A INPUT  -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT
ip6tables -A INPUT -i lo -j ACCEPT
ip6tables -A OUTPUT -o lo -j ACCEPT
# Allow encapsulated IPv6 (protocol 41) packets in and out on eth0 (this is the 6to4 tunnel traffic)
iptables -A INPUT -p ipv6 -i eth0 -j ACCEPT
iptables -A OUTPUT -p ipv6 -o eth0 -j ACCEPT
# Allow outbound DNS queries from the FW and the replies too
iptables -A OUTPUT -p udp -o eth0 --dport 53 --sport 1024:65535 -j ACCEPT
iptables -A INPUT -p udp -i eth0 --sport 53 --dport 1024:65535  -j ACCEPT
ip6tables -A OUTPUT -p udp -o tun6to4 --dport 53 --sport 1024:65535 -j ACCEPT
ip6tables -A INPUT -p udp -i tun6to4 --sport 53 --dport 1024:65535  -j ACCEPT
# Accept and reply to ICMP ping
iptables -A OUTPUT -p icmp --icmp-type echo-request -j ACCEPT
iptables -A OUTPUT -p icmp --icmp-type echo-reply -j ACCEPT
iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT
iptables -A INPUT -p icmp --icmp-type echo-reply -j ACCEPT
# IMPORTANT!!!! Allow all icmpv6 because they make IPV6 work
ip6tables -A OUTPUT -p icmpv6 -j ACCEPT
ip6tables -A INPUT -p icmpv6 -j ACCEPT
# Allow previously established connections
iptables -A OUTPUT -o eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT
ip6tables -A OUTPUT -o tun6to4 -m state --state ESTABLISHED,RELATED -j ACCEPT
# Allow inbound port 80 (www), 443 (https) and 51515 (SSH) connections to the firewall
iptables -A INPUT -p tcp -i eth0 --dport 51515 --sport 1024:65535 -m state --state NEW -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --dport 443 --sport 1024:65535 -m state --state NEW -j ACCEPT
iptables -A INPUT -p tcp -i eth0 --dport 80 --sport 1024:65535 -m state --state NEW -j ACCEPT
ip6tables -A INPUT -p tcp -i tun6to4 --dport 51515 --sport 1024:65535 -m state --state NEW -j ACCEPT
ip6tables -A INPUT -p tcp -i tun6to4 --dport 443 --sport 1024:65535 -m state --state NEW -j ACCEPT
ip6tables -A INPUT -p tcp -i tun6to4 --dport 80 --sport 1024:65535 -m state --state NEW -j ACCEPT
# Allow outbound port 80 (www), 443 (https) and 51515 (SSH) connections from the firewall
iptables -A OUTPUT -j ACCEPT -m state --state NEW,ESTABLISHED,RELATED -o eth0 -p tcp -m multiport --dport 51515,80,443 -m multiport --sport 1024:65535
ip6tables -A OUTPUT -j ACCEPT -m state --state NEW,ESTABLISHED,RELATED -o tun6to4 -p tcp -m multiport --dport 51515,80,443 -m multiport  --sport 1024:65535
# Allow previously established connections
iptables -A INPUT -j ACCEPT -m state --state ESTABLISHED,RELATED -i eth0 -p tcp
ip6tables -A INPUT -j ACCEPT -m state --state ESTABLISHED,RELATED -i tun6to4 -p tcp
# The default should be to drop everything else
iptables -A INPUT -j DROP
iptables -A OUTPUT -j DROP
iptables -A FORWARD -j DROP
ip6tables -A INPUT -j DROP
ip6tables -A OUTPUT -j DROP
ip6tables -A FORWARD -j DROP

I usually create a directory called /etc/iptables (owner root:root, permissions 750) and drop firewall up and down scripts in there.
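For completeness, a minimal firewall_down.sh to go with the rules above could simply flush everything back to an accept-all state; the file names are arbitrary, they just have to match what you reference in the interface hooks below:

#!/bin/sh
# Flush all rules and fall back to accepting everything (IPv4 and IPv6)
iptables --flush
iptables -t nat --flush
iptables -t mangle --flush
ip6tables --flush
ip6tables -t mangle --flush
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT
ip6tables -P INPUT ACCEPT
ip6tables -P OUTPUT ACCEPT
ip6tables -P FORWARD ACCEPT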

Then it is a simple matter of adding the following lines to the bottom of your eth0 interface definition stanza in /etc/network/interfaces to invoke them on boot or whenever the interface goes up or down:

pre-up /etc/iptables/firewall_up.sh
post-down /etc/iptables/firewall_down.sh

IMPORTANT: just a quick note, don't block ICMPv6, because it is the glue that holds IPv6 together.

Step 5 Set up Forward DNS

I am not going to over-explain this one because everyone has an opinion on how to set up DNS, but in essence you need to add a line like this to your zone file. There are plenty of articles outlining this stuff.

hyosine			AAAA	2002:4acf:fe10::1

Step 6 Set up Reverse DNS

You now need to set up reverse DNS for your address. Using our example prefix of 2002:4acf:fe10::/48, you will have to configure the zone "0.1.e.f.f.c.a.4.2.0.0.2.ip6.arpa" in your name servers. The zone should have PTR records for your hosts just like an in-addr.arpa zone for IPv4, but with the hex digits of the IPv6 address backwards, separated by dots. Using our example, the 6to4 host has a ::1 suffix, so its reverse DNS record looks like:

1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.1.e.f.f.c.a.4.2.0.0.2.ip6.arpa. PTR hyosine.openquery.com.

You will need to register this zone and its servers with the 6to4 reverse zone authority, e.g. https://6to4.nro.net/
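Purely as a sketch (the nameserver, contact address and serial below are placeholders, and your DNS software may lay things out differently), a BIND-style zone file for that reverse zone could look like this:

$ORIGIN 0.1.e.f.f.c.a.4.2.0.0.2.ip6.arpa.
$TTL 86400
@   IN  SOA ns1.example.com. hostmaster.example.com. (
            2010090101 ; serial
            3600       ; refresh
            900        ; retry
            604800     ; expire
            86400 )    ; negative caching TTL
    IN  NS  ns1.example.com.
; 2002:4acf:fe10::1 -- the remaining 20 nibbles, reversed, relative to $ORIGIN
1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0  IN  PTR  hyosine.openquery.com.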

Step 7 Test

The ping6 utility is probably the best way to test whether your host is now working. Try the IPv6 address first:


$ ping6 2002:4acf:fe10::1 
 PING 2002:4acf:fe10::1(2002:4acf:fe10::1) 56 data bytes 
 64 bytes from 2002:4acf:fe10::1: icmp_seq=1 ttl=60 time=1.59 ms 
 64 bytes from 2002:4acf:fe10::1: icmp_seq=2 ttl=60 time=1.42 ms 

Now you can try with the DNS name you just set up:

$ ping6 hyosine.openquery.com
PING hyosine.openquery.com(2002:4acf:fe10::1) 56 data bytes
64 bytes from 2002:4acf:fe10::1: icmp_seq=1 ttl=60 time=1.41 ms
64 bytes from 2002:4acf:fe10::1: icmp_seq=2 ttl=60 time=1.34 ms

Lastly, as noted in Step 6, you need to register the reverse zone and its servers with the 6to4 reverse zone authority. Note that when you visit that site, you'll get an SSL certificate warning; this is normal. You need to visit the site over IPv6, from within the actual 6to4 prefix you're trying to register. Follow the form to set up the nameservers for the zone and that's it!


Dogfood: making our systems more resilient

This is a “dogfood” type story (see below for explanation of the term)… Open Query has ideas on resilient architecture which it teaches (training) and recommends (consulting, support) to clients and the general public (blog, conferences, user group talks). Like many other businesses, when we first started we set up our infrastructure quickly and on the cheap, and it’s grown since. That’s how things grow naturally, and is as always a trade-off between keeping your business running and developing while also improving infrastructure (business processes and technical).

Quite a few months ago we also started investing (mostly time) in the technical infrastructure, slowly moving the various systems across to new servers and splitting things up along the way. Around the same time, the main webserver frequently became unresponsive. I'll spare you the details; we know what the problem was and it was predictable, but since it wasn't our system there was only so much we could do. However, systems accumulate dependencies over time, and thus it was actually quite complicated to move. In fact, apart from our mail, the public website was the last thing we moved, and that was through necessity, not desire.

Of course it's best for a company when its public website works, and it's quite likely you have noticed some glitches in ours over time. Now that we're running on the new infra, I happened to take a quick peek at our Google Analytics data and noticed an increase in average traffic numbers of about 40%. Great big ouch.

And I'm telling this because I think it's educational, and the world is generally not served by companies keeping problems and mishaps secret. Nasties grow organically and without malicious intent, improvements are a step-wise process, all that… but in the end, the net results of improvements can be more amazing than just general peace of mind! And of course it's very important to not just watch things happen, but to actively work on those incremental improvements, ongoing.

Our new infra has dual master MySQL servers (no surprise there 😉), but they are based in separate data centres, which makes the setup a bit more complicated (MMM doesn't deal with that setup). Other "new" components we use are lighttpd, haproxy, and Zimbra (new in the sense that our old external infra used different tech). Most systems (not all, yet) are redundant/expendable and run on a mix of Linode instances and our own machines. Doing these things for your own infra is particularly educational; it provides extra perspective. The result is, I believe, pretty decent: failures generally won't cause major disruption any more, if at all. Of course, it's still a work in progress.

Running costs of this “farm”? I’ll tell later, as I think it’s a good topic for a poll and I’m curious: how much do you spend on server infrastructure per month?

Background for non-Anglophones: “eating your own dogfood” refers to a company doing themselves what they’re recommending to their clients and in general. Also known as “leading by example”, but I think it’s also about trust and credibility. On the other hand, there’s the “dentist’s tooth-ache” which refers to the fact that doctors are their own worst patients 😉


Your opinion on EC2 and other cloud/hosting options

EC2 is nifty, but it doesn’t appear suitable for all needs, and that’s what this post is about.

For instance, a machine can just "disappear". You can set things up to automatically start a new instance to replace it, but if you just committed a transaction it's likely to be lost: MySQL replication is asynchronous, EBS is slower if you commit your transactions on it, and EBS snapshots are only periodic (you'd have to add extra handling on the application end). This adds complexity, and thus the question arises whether EC2 is the best solution for systems where this is a concern.

When pondering this, there are two important factors to consider: a database server needs cores, RAM and reasonably low-latency disk access, and application servers should be near their database server. This means you shouldn't split app and db servers across different hosting/cloud providers.

We’d like to hear your thoughts on EC2 in this context, as well as options for other hosting providers – and their quirks. Thanks!
