I spend a lot of time thinking about data preservation and backup redundancy. It’s my job. I recently read Mat Honan’s data recovery post-mortem on his well-publicized hacking, and it scared me into taking a few hours tonight to re-evaluate my personal backup strategy.
I have a MacBook Pro with a 256 GB SSD as my main drive and a 750 GB ATA drive in the DVD slot.
The partitions are laid out like this:
The 256 GB SSD is divided into main (200 GB) and bootcamp (50 GB).
The 750 GB ATA is divided into delorean (375 GB) and media (375 GB).
|bootcamp||Windows 7 with Steam for playing games|
|delorean||Time Machine backup of main|
|media||Music, Movies, Archived Photos, Virtual Machines|
My initial thinking when I first got this laptop was “How do I mitigate against the SSD dying?”, so I split the 750 GB drive in half and dedicated a whopping 375 GB of it (delorean) to being a backup of main. The other half is my music collection and whatever movies and TV shows I’m watching at the time.
I’m pretty paranoid about my music collection because of the ridiculous amount of time I’ve spent acquiring and curating it, so it’s been replicated on a NAS and three other hard drives, including one at my parents’ house. I have an old image of bootcamp, and since the only thing that changes in that partition is me installing or removing games from Steam, that one is pretty covered.
Assuming that the backup strategy of media is “roll the dice” and we’re okay with losing the data on it, what’s the most glaring problem here?
If the laptop is lost, stolen, or damaged, then I lose everything.
This scenario has happened to me before. My laptop bag was stolen, and it held both my laptop and the external hard drive that I backed it up to.
"The cloud" mostly saved me that time. My photos were on Flickr, my code was on Github, and my important documents were on Dropbox. Even the code that I hadn’t checked in yet was on our dev server.
I formatted an external hard drive I had lying around and made partitions of the same size as main and media. Then I cloned both of them to it using SuperDuper and configured it to automatically sync when I connect my laptop to the hard drive. Since the hard drive is hooked up to the USB hub that has my keyboard and mouse, it syncs every time I dock my laptop.
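For anyone who would rather script the "sync when docked" idea than rely on SuperDuper, here is a minimal sketch of the same logic. It swaps in rsync for SuperDuper, so it mirrors files rather than producing a SuperDuper-style clone. The mount points in CLONE_MAP and the function names are hypothetical, chosen only for illustration.

```python
import subprocess
from pathlib import Path

# Hypothetical mount points: source volume -> clone partition on the external drive.
CLONE_MAP = {
    "/Volumes/main": "/Volumes/main-clone",
    "/Volumes/media": "/Volumes/media-clone",
}

def plan_sync(clone_map):
    """Return one rsync command per pair whose source and target both exist."""
    commands = []
    for source, target in clone_map.items():
        if Path(source).is_dir() and Path(target).is_dir():
            # Trailing slashes make rsync mirror the directory contents;
            # --delete removes files from the clone that are gone from the source.
            commands.append(["rsync", "-a", "--delete", f"{source}/", f"{target}/"])
    return commands

def run_sync(clone_map):
    """Run every planned command and return how many volumes were synced."""
    commands = plan_sync(clone_map)
    for command in commands:
        subprocess.run(command, check=True)
    return len(commands)
```

Calling run_sync(CLONE_MAP) does nothing when the external drive is not mounted, which is the point: the script can be triggered on every dock without any manual step.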
I originally wanted to have Time Machine back up to it, but Time Machine doesn’t natively support multiple drives, so you have to change the preferences to switch which drive it’s backing up to. In case it’s not obvious, a backup solution that has a manual component to it is not a real backup solution. Some people have gotten around this by changing the UUID of the drives to match so it thinks it’s the same drive, but I can’t do that because one of those drives is always in my machine.
The revised strategy is a definite improvement, but it’s far from complete. If my apartment is broken into and someone steals both my laptop and the hard drive, I’m screwed. If there’s a fire, I’m screwed. I should probably have two external hard drives and alternate them weekly, and maybe even a third that I bring over to a friend’s house occasionally (although I guess you would really need four, because when you go over there you would be swapping drives). This isn’t really a problem if you have an office, because you can keep one external drive there and one at home and configure SuperDuper to sync with both, but I’m currently working out of coffee shops.
I don’t use any cloud-based backup systems for my hard drive like Backblaze, Mozy, or S3 (Jungle Disk, Arq) because it seems like restoring from them would take longer than just a normal rebuild, even counting the option of Backblaze mailing you a disk.
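A rough back-of-the-envelope check of that hunch, as a sketch only: the 200 GB figure is the size of main from the layout above, while the link speeds are assumptions, not measurements. It also ignores any throttling on the provider’s side.

```python
def restore_hours(data_gb, download_mbps):
    """Hours needed to download data_gb gigabytes at download_mbps megabits per second."""
    megabits = data_gb * 1000 * 8  # decimal units: 1 GB = 1000 MB = 8000 megabits
    seconds = megabits / download_mbps
    return seconds / 3600

# Assumed link speeds, purely for illustration.
for mbps in (5, 20, 100):
    print(f"{mbps:>3} Mbps: {restore_hours(200, mbps):.1f} hours")
# Prints:
#   5 Mbps: 88.9 hours
#  20 Mbps: 22.2 hours
# 100 Mbps: 4.4 hours
```

On a slow connection a full restore runs into days, which is the scenario where a local clone or a mailed disk wins.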
An amusing thought about the lifecycle of a photo I take of a sandwich
If I’m sitting in a restaurant and I take a photo of a sandwich, this is what happens:
I unlock my iPhone, open up Instagram, take a photo, and post it.
Since I have Instagram linked to Foursquare, it creates a checkin on Foursquare and attaches the photo to the checkin.
Since I have Instagram linked to Flickr, it also posts it to Flickr.
I have IFTTT set up to copy all Flickr photos to Facebook, so it will also be posted to Facebook.
Meanwhile, I have Instagram set up to save both filtered and original photos to my Camera Roll. Dropbox monitors my camera roll and automatically sends a copy up to Dropbox.
My computer syncs with Dropbox, so it pulls down a copy onto its hard drive. Because of my revised backup strategy outlined above, Time Machine copies this to the secondary hard drive, and when I come home and dock my computer, SuperDuper syncs this to my external hard drive.
Meanwhile, back in the cloud, Recollect pulls a copy of the photo from Instagram, Flickr, and Foursquare, including the metadata. Eventually I will download my Recollect archive, which goes on my hard drive, which propagates to my secondary hard drive and external drive.
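The steps above can also be written down as a small graph and walked programmatically. This is only a sketch: the node names are labels taken from the steps, "main" stands for the laptop’s primary partition, and the edges encode my reading of who forwards the photo to whom.

```python
from collections import deque

# Each key forwards the photo to the places in its list, following the steps above.
FLOWS = {
    "Instagram": ["Foursquare", "Flickr", "Camera Roll", "Recollect"],
    "Foursquare": ["Recollect"],
    "Flickr": ["Facebook", "Recollect"],
    "Camera Roll": ["Dropbox"],
    "Dropbox": ["main"],
    "Recollect": ["main"],
    "main": ["delorean", "external drive"],
}

def reachable(flows, start):
    """Breadth-first walk returning every place the photo reaches, in discovery order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen

places = reachable(FLOWS, "Instagram")
print(len(places), "places:", ", ".join(places))
# Prints: 10 places: Instagram, Foursquare, Flickr, Camera Roll, Recollect,
#         Facebook, Dropbox, main, delorean, external drive  (on one line)
```

The count is of distinct locations, not files. Some of them hold more than one copy, such as the filtered and original versions in the Camera Roll.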
Confused? Here’s a diagram:
I think that sandwich photo is going to be okay.