Fastmail.FM Sux0rs
So, my email over at Fastmail.FM just went down for three days. Note that I pay these people $100 a year to give me essentially the same service as Gmail. I have stuck with them simply because I still haven’t grokked Google’s don’t-sort-it-we’ll-search-it-for-you way of dealing with email.
Their explanation?
The drives we use have a guaranteed lifetime of 3 years and were only 15 months old. Given that RAID 6 can support up to 2 drives in an array failing, the chance of any 2 drives failing at the same time is an extremely rare occurrence. However in this particular case, 3 drives all failed within a remarkably short period of time! At that moment, we had effectively lost access to all data on the unit, and had to resort to our disaster recovery scenario, our daily incremental backups.
“Extremely rare,” indeed. Given their ratings, the chance that three of these drives would fail within the same six-hour period is 1 in 84,000,000,000. That’s assuming failures are evenly distributed over the drives’ rated lifetime. If failure is more likely at the end of a duty cycle (as expected), their explanation is even less plausible. As soon as Google offers IMAP support and assurances that their nascent AI isn’t feeding on my email, my hundred dollars is going there.
I know my probability calculations are wonky, btw (i.e. 3 drives failing together vs. 3 drives out of X drives failing together). Suffice it to say that it was an unlikely thing.
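For the curious, here is a rough Python sketch of how a figure like 1 in 84,000,000,000 can be reproduced. It assumes each drive fails exactly once at a uniformly random moment during its 3-year guaranteed lifetime, and that the drives fail independently; that is a guess at the arithmetic, not a proper reliability model. The second function takes a stab at the "3 out of X drives" version, and the 12-drive array size used with it is purely hypothetical, since the explanation never says how many drives were in the unit.

```python
from fractions import Fraction
from math import comb

# Assumption: "guaranteed lifetime of 3 years" is treated as the span over
# which each drive's single failure is uniformly distributed.
lifetime_hours = 3 * 365 * 24      # 26,280 hours
window_hours = 6                   # the six-hour period from the post

# Chance that one given drive fails inside one given six-hour window.
p_single = Fraction(window_hours, lifetime_hours)   # 1/4380

# Chance that three specific drives all fail in that same window,
# assuming the failures are independent.
p_three = p_single ** 3


def p_at_least_three(n_drives, p=p_single):
    """Chance that 3 or more of n_drives fail in the same fixed window."""
    p_fewer_than_three = sum(
        comb(n_drives, k) * p**k * (1 - p) ** (n_drives - k)
        for k in range(3)
    )
    return 1 - p_fewer_than_three


print(f"three specific drives: 1 in {int(1 / p_three):,}")

# Hypothetical array size; the real number of drives is not given.
n = 12
print(f"3+ of {n} drives: about 1 in {float(1 / p_at_least_three(n)):,.0f}")
```

With more drives in the array, the odds get noticeably less astronomical, though they stay tiny. Neither number accounts for a shared cause such as a bad batch or a power event, which would break the independence assumption entirely.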

November 16th, 2005 at 12:10 pm
Didn’t this happen at 43 a few years back? I thought 3 of 4 drives in Ben’s RAID array crapped out? Maybe not as rare as presumed.
November 16th, 2005 at 12:19 pm
I suppose not. In reality, there’s often a common, event-based cause for the failures (like a power surge or someone pissing in the server.) The probabilities I calculated were for random failure due to operation stresses. So, I guess my accusation is that the Fastmail.FM sysadmins pissed in their server racks or something.
Certainly, that’s what happened at 43.