Three weeks ago I started to write a blog post about Crashplan. This is not how I expected it to turn out.
This is likely to be quite long, so I'll put the conclusions at the front, and then the information I've used to draw those conclusions follows.
If you're a Crashplan user (quite possibly because I've recommended it to you in the past) you need to be aware that:
- Previous versions of Crashplan have silently corrupted data that has been backed up.
- The team at Crashplan are aware of this. More recent versions of the software do not have this problem.
- However, more recent versions of the software do not fix, acknowledge, or in any way indicate that some of the files in the backup are corrupt.
- Crashplan support appear to be wholly unconcerned with this, in a manner that means I no longer have faith in the product or their support.
I leave you to determine the course of action that's right for you.
I've been an enthusiastic user of the Crashplan backup software for something like two and a half years. I forget how I found it -- probably some blog post or mailing list -- but it seemed to me to be a great example of software that just works. It was flexible enough to handle my backup needs, and easy enough to use that I recommended it to family, friends, and work colleagues. I'm a paying customer, and have purchased Crashplan licenses to give to other people as gifts to encourage them to back up their important data safely.
So for more than two years my main computer at home has been backed up using Crashplan, initially to a locally attached USB drive, and latterly also to a colleague who I convinced to run Crashplan for his backup needs.
One of Crashplan's more useful features is that the software will auto-update, prompting you when a new version is released. So during this period I've very closely tracked whatever the most recent version of Crashplan is.
A couple of weeks ago I purchased a new PC, and the plan was that once I'd gone through the somewhat tedious business of reinstalling my software, restoring all my data, and so forth, I would decommission the old one. To that end, once the new PC was up and running, one of the first things I did was install Crashplan on it, make sure the old PC was 100% backed up to the USB drive, and then plug the USB drive in to the new PC.
When you do this, Crashplan can "attach" to the backup. Even though the files in the backup weren't from the new PC I just had to enter the password for the backup so it could decrypt them and restore them to the new PC. I thought this would be the simplest (and probably fastest) way of migrating my data from the old to the new PC.
I let Crashplan chug along doing the restore, which took several hours because of the volume of data. And then, at the end of the process, I saw a warning that 140 files had failed the "integrity check" during the restore, and couldn't be properly restored. All of them were digital photos.
Now this is a bit odd. One of the things that the Crashplan team champion on the website is the following claim:
Once your files are backed up, CrashPlan continuously checks that your files are 100% healthy and ready to restore when you need them. If it finds any problems, CrashPlan fixes them.
For me, this is a big benefit. One of the things you should do when backing up data is periodically try and restore it, to ensure that the backup is actually working. The fact that Crashplan tries to do this in the background was an important part of choosing the software.
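If you want to do that kind of test restore yourself, it doesn't need anything fancy. Here's a minimal sketch in Python, assuming you still have the original files and have restored the backup to a separate folder. This is my own illustration, not anything Crashplan provides, and the function names `file_digest` and `compare_trees` are made up for the example. It simply hashes each original file and its restored copy and reports anything missing or different.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original_root, restored_root):
    """Compare every file under original_root with its restored copy.

    Returns (missing, corrupt): two lists of paths relative to
    original_root. "missing" means no restored copy exists; "corrupt"
    means the restored copy's contents differ from the original.
    """
    original_root = Path(original_root)
    restored_root = Path(restored_root)
    missing, corrupt = [], []
    for src in sorted(original_root.rglob("*")):
        if not src.is_file():
            continue  # skip directories
        rel = src.relative_to(original_root)
        dst = restored_root / rel
        if not dst.is_file():
            missing.append(str(rel))
        elif file_digest(src) != file_digest(dst):
            corrupt.append(str(rel))
    return missing, corrupt
```

You'd call it as something like `missing, corrupt = compare_trees("D:/Photos", "E:/Restored/Photos")` (both paths are hypothetical). Note that it only works while the originals still exist, which is exactly why it's worth running before you need the backup in anger.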
Now I knew the backup was complete -- I'd verified it before I unplugged it from the old PC, so this is one of those things that should just never happen.
I sent an e-mail to the Crashplan support address. This generated ticket #20145 in their queue, and my message went like this:
About 5h30m later (which is, by the way, fine: we're in very different time zones, so that sort of response time is not only perfectly acceptable, it's probably above and beyond what I would normally expect) I get a reply from Renee at Crashplan, asking if I can send logs from the destination computer, with instructions on how to do that. I do so, and over the course of a few days (a short vacation intervened) I send logs from the source computer (i.e., the one that's been doing all the backups over the last few years) as well.
A day and a half after I send the necessary logs I get a reply from Bret at Crashplan. He says:
I do some digging, and reply about five hours later, with
It goes quiet for two days, and then Matt Genelin takes over the ticket, saying:
"[redacted]" was the name of the remote destination I also back up to -- since it's a colleague's name I've removed it from the above.
I should mention at this point that none of my data has been irretrievably lost. My original PC is still here, and with some faffing around I can retrieve the missing files from it (or download them from the [redacted] offsite backup). But that is purely by luck. If all my backups had the same problem, which is not an unreasonable assumption, this data (140 digital photos) would have been lost forever.
I wasn't sure that I'd quite understood Matt correctly. In particular, with the reference to an older version of Crashplan I thought that perhaps he'd misunderstood, and assumed that the backup I was restoring from was only created with an older version of Crashplan. So we had the following exchange. First, me:
At this point I'm still not quite convinced that I have this right. In particular, he's not correcting my assertion that this is a problem they've known about, and fixed with no notice in the release notes, and no mechanism to fix existing-but-broken backups. After all, this is a company that sells backup software (and sells an optional service whereby they'll host external backups for you). They wouldn't be that cavalier about the integrity of their customers' data, would they?
So I replied:
A couple of points here. Matt's skipped over my "Why doesn't Crashplan warn about this, and/or fix the problem automatically?" question. He also seems to think that you can quantify the effectiveness of a backup solution by taking the number of files and dividing that by the number that failed to restore, as if that were some sort of useful metric. That takes no account of the relative importance of the files (these were photos, and irreplaceable), nor of the absolute volume of data lost.
He also assumes that I can restore the files from the [redacted] site. While that may be possible (I haven't tried, because I haven't needed to), that backup was created by taking a copy of my local backup archive and giving it to my colleague, so it's entirely possible that that archive has the same problem as my local one.
And finally, the Crashplan site is quite explicit, "100% healthy and ready to restore". There's no equivocating around some-number-of-9s availability. They claim 100%.
My final message to Matt asked:
If not, when will it be fixed?
2. Why wasn't this problem called out in any of the release notes for versions released after the problem was detected?
3. Will you inform existing customers of this problem, and the need to wipe and restart existing backups if they're older than date
All, I think, reasonable questions, which are ducked in Matt's final reply.
Looking back through this discussion, those three questions are not answered.
- There's no commitment that future versions of Crashplan will detect and fix this problem.
- There's no answer as to why Crashplan weren't honest about this problem in the release notes of the software once they detected and fixed it.
- And there's nothing to suggest that they'll inform existing customers of the problem.
You might also want to start thinking about trusting your data to a different organisation, and in particular one that values honesty when it notices and fixes a mistake that leads to data loss.
Has anyone got any recommendations?