Crashplan

[See Crashplan Part Two for the follow-up]

Three weeks ago I started to write a blog post about Crashplan. This is not how I expected it to turn out.

This is likely to be quite long, so I'll put the conclusions at the front, and then the information I've used to draw those conclusions follows.

If you're a Crashplan user (quite possibly because I've recommended it to you in the past) you need to be aware that:
  1. Previous versions of Crashplan have silently corrupted data that has been backed up.
  2. The team at Crashplan are aware of this. More recent versions of the software do not have this problem.
  3. However, more recent versions of the software do not fix, acknowledge, or in any way indicate that some of the files in the backup are corrupt.
  4. Crashplan support appear to be wholly unconcerned with this, in a manner that means I no longer have faith in the product or their support. I leave you to determine the course of action that's right for you.
With that out of the way, some background, and the events that led me to the four points above.

I've been an enthusiastic user of the Crashplan backup software for something like two and a half years. I forget how I found it -- probably some blog post or mailing list -- but it seemed to me to be a great example of software that just works. It was flexible enough to handle my backup needs, and easy enough to use that I recommended it to family, friends, and work colleagues. I'm a paying customer, and have purchased Crashplan licenses to give to other people as gifts to encourage them to back up their important data safely.

So for more than two years my main computer at home has been backed up using Crashplan, initially to a locally attached USB drive, and latterly also to a colleague who I convinced to run Crashplan for his backup needs.

One of Crashplan's more useful features is that the software will auto-update, prompting you when a new version is released. So during this period I've very closely tracked whatever the most recent version of Crashplan is.

A couple of weeks ago I purchased a new PC. The plan was that once I'd gone through the somewhat tedious business of reinstalling my software, restoring all my data, and so forth, I would decommission the old one. To that end, once the new PC was up and running, one of the first things I did was install Crashplan on it, make sure the old PC was 100% backed up to the USB drive, and then plug the USB drive in to the new PC.

When you do this, Crashplan can "attach" to the backup. Even though the files in the backup weren't from the new PC I just had to enter the password for the backup so it could decrypt them and restore them to the new PC. I thought this would be the simplest (and probably fastest) way of migrating my data from the old to the new PC.

I let Crashplan chug along doing the restore, which took several hours because of the volume of data. And then, at the end of the process, I saw a warning that 140 files had failed the "integrity check" during the restore, and couldn't be properly restored. All of them were digital photos.

Now this is a bit odd. One of the things that the Crashplan team champion on the website is the following claim:
Once your files are backed up, CrashPlan continuously checks that your files are 100% healthy and ready to restore when you need them. If it finds any problems, CrashPlan fixes them.
Source: http://www.crashplan.com/consumer/features.html
For me, this is a big benefit.  One of the things you should do when backing up data is periodically try and restore it, to ensure that the backup is actually working. The fact that Crashplan tries to do this in the background was an important part of choosing the software.

Now I knew the backup was complete -- I'd verified it before I unplugged it from the old PC, so this is one of those things that should just never happen.

I sent an e-mail to the Crashplan support address. This generated ticket #20145 in their queue, and my message went like this:
Hi,
I'm migrating to a replacement PC. I decided to migrate my data across by plugging the external hard drive that the original PC backs up to using Crashplan+, and then restoring from that archive on the new PC, running version 3.8.2010. 
29,742 files restored correctly. 140 failed, listing in the History tab as - Integrity check failed for  
First, it would be very useful if I could cut/paste the contents of the History tab. It would make it much easier to figure out which files I'll need to copy over by hand. 
Second, and much more importantly, I'm very concerned by this. From http://b1.crashplan.com/consumer/features.html:

Once your files are backed up, CrashPlan continuously checks that your files are 100% healthy and ready to restore when you need them. If it finds any problems, CrashPlan fixes them. 
This does not appear to have happened. How do I find out what went wrong in this instance, and how do I fix it?
About 5h30m later (which is, by the way, fine; we're in very different time zones, so that sort of response time is not only perfectly acceptable, it's probably above and beyond what I would normally expect) I get a reply from Renee at Crashplan, asking if I can send logs from the destination computer, with instructions on how to do that. I do so, and over the course of a few days (a short vacation intervened) I send logs from the source computer (i.e., the one that's been doing all the backups over the last few years) as well.

A day and a half after I send the necessary logs I get a reply from Bret at Crashplan. He says:
Unfortunately these logs don't point to a clear source of this error. A copy of the restored file was preserved with a modified name; it may be useful for you to review this modified file and let us know if the file that was restored appears to be correct or is non-functional. For example, the following file: 
C:/Documents and Settings/Nik Clayton/My Documents/My Pictures/2006/2006 07 14 All Things Gothic/IMG_2301.JPG 
was restored to the following location: 
C:\Users\nik\Documents\My Documents\My Pictures\2006\2006 07 14 All Things Gothic\restore.failed-checksum.IMG_2301.JPG 
Can you attempt to open this file and verify that it is a well-formed JPEG file?
I do some digging, and reply about five hours later with:

It's not a valid file.  Windows Photo Viewer refuses to open it.

The restore.failed-checksum.* files have suspicious file sizes:

09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2296.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2297.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2298.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2299.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2301.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2302.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2303.JPG
09/09/2008  19:42           917,504 restore.failed-checksum.IMG_2305.JPG
09/09/2008  19:42           917,504 restore.failed-checksum.IMG_2306.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2307.JPG
09/09/2008  19:42           917,504 restore.failed-checksum.IMG_2308.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2309.JPG
09/09/2008  19:42           786,432 restore.failed-checksum.IMG_2310.JPG
09/09/2008  19:42           655,360 restore.failed-checksum.IMG_2311.JPG

They're all exact multiples of 1,024, and far too small.  Compare and contrast with the same files that I restored by direct sync from the source PC to the target PC:

09/09/2008  19:42         2,999,745 IMG_2296.JPG
09/09/2008  19:42         3,029,664 IMG_2297.JPG
09/09/2008  19:42         3,102,390 IMG_2298.JPG
09/09/2008  19:42         2,923,048 IMG_2299.JPG
09/09/2008  19:42         2,939,522 IMG_2301.JPG
09/09/2008  19:42         3,077,000 IMG_2302.JPG
09/09/2008  19:42         2,707,091 IMG_2303.JPG
09/09/2008  19:42         3,478,028 IMG_2305.JPG
09/09/2008  19:42         3,509,851 IMG_2306.JPG
09/09/2008  19:42         2,627,625 IMG_2307.JPG
09/09/2008  19:42         3,169,280 IMG_2308.JPG
09/09/2008  19:42         2,859,546 IMG_2309.JPG
09/09/2008  19:42         2,924,675 IMG_2310.JPG
09/09/2008  19:42         2,518,022 IMG_2311.JPG
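The pattern in those truncated sizes is easy to check. Here is a minimal Python sketch; the sizes are copied from the first listing above, while the 128 KiB block size is purely my own guess at a possible truncation boundary, not something Crashplan support confirmed.

```python
# Sizes (in bytes) of the restore.failed-checksum.* files listed above.
failed_sizes = [
    786432, 786432, 786432, 786432, 786432, 786432, 786432,
    917504, 917504, 786432, 917504, 786432, 786432, 655360,
]

KIB = 1024
BLOCK = 128 * KIB  # assumption: a hypothetical 128 KiB block size


def is_multiple(size, unit):
    """Return True if size is an exact multiple of unit."""
    return size % unit == 0


all_kib_multiples = all(is_multiple(s, KIB) for s in failed_sizes)
all_block_multiples = all(is_multiple(s, BLOCK) for s in failed_sizes)

# How many whole 128 KiB blocks each distinct size corresponds to.
blocks_per_file = sorted({s // BLOCK for s in failed_sizes})

print(all_kib_multiples, all_block_multiples, blocks_per_file)
```

Every size is a multiple not just of 1,024 bytes but of 131,072 bytes (five, six, or seven blocks of 128 KiB). That is consistent with each file having been cut short on a block boundary, though it proves nothing about the cause.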
It goes quiet for two days, and then Matt Genelin takes over the ticket, saying:
Let me step in here. Thank you for the log files. After checking with several engineers on our staff, our best causation of the 140 files missing / corrupt is as follows: 
The 140 files stored on your external hard drive are inaccessible because they were stored with an older version of the CrashPlan Application that has a known issue with incorrectly checksum-ing stored files in a backup archive. We have corrected this issue in the last 12 months, and the current version of the CrashPlan Client Application backs up files with the correct checksum information. 
Moving forward here, the best recommendation we can make is: 
1. Restore your complete archive from your other backup destination (I believe this is [redacted]).
(verify that your restore is successful.) Then proceed to step 2: 
2. Shutdown the CrashPlan Backup engine on [redacted] like this:
http://support.crashplan.com/doku.php/recipe/stop_and_start_engine 
3. Erase, delete or replace the backup archive that is stored on your external drive named "Folder: External 320G". Simply perform a file copy from [redacted] to your external drive. 
Please note that since your external drive was created on 12/12/2008 and your archive on [redacted] was created on 2/7/2009, you will loose any file version information that was made between December 2008 and Feb. 2009. 
4. Restart (start) the backup on [redacted], again:
http://support.crashplan.com/doku.php/recipe/stop_and_start_engine
If this seems like an unreasonable fix to this issue, please let me know.
"[redacted]" was the name of the remote destination I also back up to -- since it's a colleague's name I've removed it from the above.

I should mention at this point that none of my data has been irretrievably lost. My original PC is still here, and with some faffing around I can retrieve the missing files from it (or download them from the [redacted] offsite backup). But that is purely by luck. If all my backups had the same problem, which is not an unreasonable assumption, this data (140 digital photos) would have been lost forever.

I wasn't sure that I'd quite understood Matt correctly. In particular, with the reference to an older version of Crashplan I thought that perhaps he'd misunderstood, and assumed that the backup I was restoring from was only created with an older version of Crashplan. So we had the following exchange. First, me:

While it is the case that I first started backing up to "External 320G" using an older version of Crashplan, the Crashplan version has been regularly (auto)updated since then.  The specific set of steps I carried out to do the restore was:

1.  Power up PC #1 (runs XP SP3, Crashplan+, and is the machine that "External 320G" has been plugged in to for the last few years).

2.  Verify (through the Crashplan UI) that Crashplan thinks that the backup of PC #1 to "External 320G" is complete.  This is using the latest version of Crashplan (3.8.2010) because it auto updated earlier in the month.

3.  Power down PC #1, power off the external drive, power up PC #2 (Windows 7), plug the external drive in to PC #2 and power it up.  Install the latest version of Crashplan from crashplan.com, import the backup from the external drive, and attempt the restore.

That then generated the checksum errors for 140 files upon restore.

Are you saying that backups that were started with the older version of Crashplan may have this problem, and that simply using the newer version is not sufficient to correct the issue -- the corrupt backups need to be wiped, and the backup started afresh?

I've just reviewed the release notes going back to 12.10.2008, and don't see this mentioned.
Matt's reply:
Correct. The Backups being the backup archive on your External Drive. I am recommending: 
1. Verifying the [redacted] Backup.
2. Wiping the External Drive.
3. Coping the [redacted] backup archive over to the External Drive. 
Seem reasonable?
At this point I'm still not quite convinced that I have this right. In particular, he's not correcting my assertion that this is a problem they've known about, and fixed with no notice in the release notes, and no mechanism to fix existing-but-broken backups. After all, this is a company that sells backup software (and sells an optional service whereby they'll host external backups for you). They wouldn't be that cavalier about the integrity of their customers' data, would they?

So I replied:

Well, I don't need to do that, because I've moved the data from the old machine by other means -- restoring from the backup was (supposed) to be the simplest way to do this.

However, I want to make sure that I understand you correctly.  Are you saying that the following sequence of events:

1. Install Crashplan in 2008.
2. Tell Crashplan to backup to an external drive.
3. Let Crashplan autoupdate throughout 2008, 2009, and 2010, and continue to backup to the external drive throughout this period.

is sufficient to cause this corruption?  This was not an external backup that I created once using an old version of Crashplan, and then put away -- the external drive has been attached to this PC almost continuously, and Crashplan (from the earlier 2008 version to the most recent March 2010 version) has been backing up to it on pretty much a daily basis.

I must ask why Crashplan doesn't warn about this -- big red flashing letters saying "Warning: You created this backup with a version of Crashplan that had checksum errors.  You must delete this backup and start afresh".

Better still, why don't newer versions of Crashplan detect this and correct it automatically?  http://b4.crashplan.com/consumer/features.html is quite explicit:

Once your files are backed up, CrashPlan continuously checks that your files are 100% healthy and ready to restore when you need them. If it finds any problems, CrashPlan fixes them.

This does not appear to have happened here.

I'm very concerned that based on what I've been told so far it seems as though an older version of Crashplan corrupted my backup, you released a fixed version without noting the fix in the release notes, but the fixed version does not correct prior instances of the problem.

Right now I do not have a warm fuzzy feeling about continuing to trust Crashplan with my data.
Matt's reply:
Yes, that is correct. This is what I am stating here. 
I am also explaining that an older version of CrashPlan has a known issue -- that has been corrected in our newer versions of the CrashPlan Client. This known issue appears to have passed our nightly archive maint. check: 
And only appears when you attempt to restore files. 
Normally our website is correct; once you back a file up, there is no need to worry about your files. In your case, it appears from your archive that some of your files were backed up with a version of the CrashPlan Client with a known issue, and the newer versions CrashPlan Client's nightly archive maint. did not detect the problem in your archive. The problem here surfaced when you went to restore your external drive's archive, that is 99.996% fine, but 0.004% corrupted. 
I am suggesting a course of action that brings you back to 100% fine, and throws away the archive that is 0.004% corrupted. 


I can understand. Your feelings on CrashPlan are a conclusion you will need to come to on your own. 
Let's keep in mind the facts here: 
* Only one of our multiple-destination archives is having issues here.
* The one archive that has issues restored 29,742 correctly and failed to restore 140 files. That's a failure rate of 0.004%. 
I agree -- this is not perfect. Perfection would be 100% data recovery. This is why CrashPlan Allows you to backup to multiple destinations. You should be able to achieve perfection of recovery by using your second archive; on your [redacted] computer.
A couple of points here. Matt's skipped over my "Why doesn't Crashplan warn about this, and/or fix the problem automatically?" question. He also seems to think that you can quantify the effectiveness of a backup solution by dividing the number of files that failed to restore by the total number of files, as though that were some sort of useful metric. That takes no account of the relative importance of the files (these were photos, and irreplaceable), nor of the absolute volume of data lost.
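For what it's worth, the quoted figure does not survive a quick calculation either. A minimal sketch, assuming the 140 failures are in addition to the 29,742 files that restored correctly (the variable names are mine):

```python
restored_ok = 29_742  # files that restored correctly, per the support ticket
failed = 140          # files that failed the integrity check

# Assumption: the two counts do not overlap, so the total is their sum.
total = restored_ok + failed

failure_fraction = failed / total
failure_percent = failure_fraction * 100

print(f"fraction: {failure_fraction:.4f}")
print(f"percent:  {failure_percent:.2f}%")
```

That works out to roughly 0.47%, not 0.004%. The quoted number looks like the raw fraction with a percent sign attached, although that is only a guess on my part.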

He also assumes that I can restore the files from the [redacted] site. While that may be possible (and I haven't tried, I haven't needed to) that backup was created by taking a copy of my local backup archive and giving it to my colleague, so it's entirely possible that that archive has the same problem as my local one.

And finally, the Crashplan site is quite explicit, "100% healthy and ready to restore". There's no equivocating around some-number-of-9s availability. They claim 100%.

My final message to Matt asked:
1.  Will the next release of Crashplan detect this problem and fix it.
If not, when will it be fixed? 
2.  Why wasn't this problem called out in any of the release notes for versions released after the problem was detected?

3.  Will you inform existing customers of this problem, and the need to wipe and restart existing backups if they're older than date ?
All, I think, reasonable questions, which are ducked in Matt's final reply.
It has been a pleasure working with you. It's clear that the technical recommendation I have made for you will correct the issue at hand here, and that the quoting of text back and fourth is leading our conversation in a circle. I want to bring you to a place that moves you forward, and the best way to do this is to end our conversation now. 
I believe I have answered your questions repeatedly, and your questions are deviating away from solving your technical problem. By closing this conversation, I am hoping that you will take my recommendation in good faith, and apply it to your unique situation to move your backups with CrashPlan forward.
Looking back through this discussion, those three questions are not answered.
  • There's no commitment that future versions of Crashplan will detect and fix this problem.
  • There's no answer as to why Crashplan weren't honest about this problem in the release notes of the software once they detected and fixed it.
  • And there's nothing to suggest that they'll inform existing customers of the problem.
So, if you're using Crashplan you should definitely make sure that your backup is 100% readable by the current client. And if it isn't you'll need to wipe it and start the backup from scratch again.
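One way to make that check, independent of Crashplan itself, is to restore to a scratch folder and compare the result against the originals. What follows is a minimal sketch under that assumption. The function names are hypothetical, it uses only the Python standard library rather than anything Crashplan provides, and it only helps while the original files still exist.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large photos need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original_root, restored_root):
    """Return (missing, mismatched) relative paths, treating the originals as the reference."""
    original_root = Path(original_root)
    restored_root = Path(restored_root)
    missing, mismatched = [], []
    for original in sorted(original_root.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(original_root)
        restored = restored_root / relative
        if not restored.is_file():
            # The file never made it out of the backup.
            missing.append(str(relative))
        elif sha256_of(original) != sha256_of(restored):
            # The file was restored, but its content differs from the original.
            mismatched.append(str(relative))
    return missing, mismatched
```

Calling `compare_trees` with the folder that was backed up and the folder the restore was written to returns two lists: files that are absent from the restore, and files whose content differs. Anything in either list is a file the backup could not give back intact.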

You might also want to start thinking about trusting your data to a different organisation; and in particular one that values honesty when it notices and fixes a mistake that leads to data loss.

Has anyone got any recommendations?

Useful Android Apps: Listen

Podcasts are something I only recently started to listen to. I can read information far faster than I can listen to it, so spending time listening to (what I imagined would be) not much more than someone reading their blog entries didn't really appeal. And then some friends started recommending a few podcasts, and I decided it was time to find out more about them.

I'll cheerfully admit that this can be a bit tricky on Android phones, certainly trickier than on an iPhone or iPod. There's only really one way to get audio on to an iPhone (and associated iPods), and that's via iTunes. I use iTunes to manage my own music library, but it doesn't sync to Android, so getting music and podcasts on to the phone involves a bit of drag-and-drop, and trying to remember what you have and haven't already synced. That's more effort than I really want to put in to ephemeral entertainment.

Enter Listen, from Google Labs. It lets you subscribe to, download, and listen to arbitrary podcasts without needing to sync the content from a desktop or laptop.


Once installed you'll see this screen. I can search for new podcasts to subscribe to, review the episodes I can listen to ("My listen items") and see the podcasts I'm subscribed to ("My subscriptions"), as well as see popular searches to find new and hopefully interesting podcasts. At the bottom of the screen you can see I'm part way through listening to an episode of "Wait Wait... Don't Tell Me".

 

Subscribing to new podcasts is easy. I can either search from the main screen, or tap through to "Manage my subscriptions" and enter a feed URL directly, as shown here.


Tapping "My listen items" shows me items I've queued up as well as items that are freshly downloaded. Listen normally plays through the queued items first, before moving on to fresh content, so you can control the order in which episodes are played by adding them to the queue. In this example many of these podcasts are new to me, so I've queued up a number of their back-episodes which I'm slowly working my way through.

 

Pressing through to any of the existing subscriptions allows you to unsubscribe, or queue entries, or mark them all listened or unlistened as necessary. Tapping on an episode lets you add it to the queue, listen to it immediately, and manage the subscription.


Actually playing an episode shows a screen like this, where as well as playing or otherwise navigating through the episode some of the options to manage the subscription to this podcast are also provided.


Finally, Listen has a number of thoughtful configuration options, which amongst other things can be used to ensure that you don't rack up a large bill downloading content without a WiFi connection, or consume too much power doing so.

Listen certainly makes it easy to keep up with your podcasts. It's very refreshing to start the morning commute knowing that there's almost certainly something new to listen to that's come in overnight, and the UI does a good job of getting out of your way.

In a stroke of genius, or idiocy, depending on which way you look at it, recent versions of Listen store your podcast feed subscriptions in a folder in Google Reader. This has the distinct advantage that your subscriptions are not tied to the phone, they're tied to your Google account, so they'll follow you as you upgrade your phone, and you can use Reader as another client to manage your subscriptions. You can also use the recommendation feature in Reader to find new podcasts to listen to. The downside to this is that if you're not careful you may inadvertently mark one of these podcasts as 'read' in Reader, at which point Listen will ignore it.

The only thing I still have to get used to is the notion that these aren't automatically available on my home media system. The rest of my audio collection is, and so at the moment I've subscribed to these podcasts twice, once to download so that I can play them when I'm at home, and again for when I'm on the move.

Listen is available for free. Search for it in Android Market, or scan this QR code with your phone to install it. And for more details (including FAQ and discussion groups) see http://listen.googlelabs.com/.



And if you'd like to listen to what I'm listening to, here are the podcasts I'm subscribed to at the moment.

Useful Android Apps: NewsRob

Do you have an Android phone? I've been using one for about 15 months now -- first it was a G1, and now the Google[1] Nexus 1 -- and in that time I've discovered a number of applications that have proven indispensable, and over the coming entries I'll share my recommendations.

The first of these is NewsRob. It's a Google Reader client, syncing the content of your news feeds from Google Reader to the phone. It can also sync the web page content of the articles so that they are easily accessible when you're out and about -- handy for RSS feeds that don't include the full content of a post in the feed, or where you want to see comments that were left on the post on the original site.

With a slick, simple UI that gets out of the way, and a wealth of battery- and wallet-friendly syncing options ("only on WiFi", "only when charging", and so on) that can be set on a per-feed basis if necessary, it's easy to stay in control of your feeds. NewsRob can also share and star interesting posts via Google Reader.

 

On opening NewsRob you're presented with a list of your Google Reader top-level categories that have unread items. Tapping a category (in this case, "tech news") shows you the web site sources in that category that contain unread items. Note the handy "all articles" items at the top of the list, making it easy to just work through all unread articles, or all unread articles in a particular category.

 

On the left is the default article view, showing the article from the selected source. Notice the lack of UI elements for moving through the articles. Tapping an unused portion of the screen brings up 'floating' arrows that allow you to navigate the feed, without constantly consuming precious screen real estate. Notice, too, the star in the top navigation bar. Tapping that marks this entry with a star in Google Reader.

Tapping the article title takes you to the article as it appears on the originating web site, the right-hand screen shot in this example. This is done within NewsRob (your default web browser is not launched), so you can jump to the original site and read the content in context there, while still being able to star the item or use the floating UI to quickly move to the next article.

NewsRob is available for free (ad-supported), or you can pay for the ad-free version. Search for it in Android Market, or scan this QR code with your phone to install it.

NewsRob QR Code

1 My employer. This and other entries are my opinion, not theirs.