Friday, 17 August 2012

Choosing a new file backup "system"

"Hey, you're a computer person.  How do you back up your stuff?"

"I hate to admit it, but less often and in different ways than I should."
Yep, it's time to get going with rsync, or a package that uses it.  (This is a Linux blog, but Windows and Mac users will find the following useful as well.)

In the past, I've duplicated important data to a secondary hard drive on occasion.  At some point, I burned some of it to CD.  There are problems with this approach.

For those of you who aren't well-versed in data storage and backup theory:


  • backups should be done regularly
  • backups must be verifiable (a backup that isn't working is useless)
  • backups should probably be done automatically (or else we forget to do them and cry in our pillows later)
  • backups should be done on a reliable medium (CDs/DVDs may not suffice, and neither will a USB stick if you let your cat use it as a toy)
  • backups that save only the new stuff since the last backup are OK (aka incremental backups; a differential backup saves everything changed since the last full backup), but occasionally you should save the whole thing too
  • for the paranoid, back up on two different media in different locations (if you copy stuff from your hard drive to another hard drive, a CD and a USB stick and then the house catches fire, you're still screwed)
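
To make the "verifiable" point concrete, here is a minimal sketch in Python of what a check could look like. It assumes the backup is a plain copy of a folder (not a compressed archive), and the function names are made up for illustration. It compares SHA-256 checksums of every file in the source against the copy and lists anything missing or different.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list[str]:
    """Return a list of problems: files missing from the backup or differing from the source."""
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue  # skip directories
        relative = src_file.relative_to(source)
        copy = backup / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(src_file) != file_digest(copy):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every source file has an identical copy in the backup.  Note that this only checks one direction: files that exist only in the backup are not reported.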
For home use, you may want to consider tradeoffs for some circumstances.  You may accept some risk for the sake of convenience.  For example,
  • If you don't have much that's important, then maybe it's not worth the effort of backing up off-site.  Acknowledge that the risk of the hard drive or CD dying is much higher than the risk of the computer and disc catching fire.
  • Using CDs for backup can be risky.  CDs/DVDs burned at home have a limited shelf life.  Some authorities on the matter estimate 5-10 years before the disc degrades or data loss happens, and a crappy disc from a lesser-known brand will likely be even more susceptible to failure.  Sometimes you'll get lucky and it will last a long time, but you never know until you need to use it a few years down the road.  If you burn to disc, caveat emptor.
Don't worry: pressed discs from a manufacturer are different and last a lot longer.  They're not made via the same methods you use at home.

Cloud/Internet storage raises other concerns that a lot of people don't realize.  Whether you email something to yourself, save something in SkyDrive, Google Docs, or elsewhere, or use Dropbox, beware the fine print and the risk.

- you're trusting another entity to "keep it safe" for you
- some items are illegal to store (movies, music, etc.)
- some items you really shouldn't store (really private info)
- storage can cross international boundaries, so your data may be subject to another country's laws.  As a Canadian, I have to give this more consideration, as much of my online presence resides on servers in the United States.

Even if it doesn't cross boundaries, there may be other circumstances you don't expect (e.g. a Slashdot user noted that Dropbox will decrypt your data and hand it over to the law, possibly without a warrant).

Some cloud storage providers claim they encrypt (privatize) the contents so no one can see them.  However, consider that these places also scan the content to check for illegal material.  Think about that for a minute.  That means the data is private from the general public, but not from the thousands who work at that company.  And if you picked a poor password, or if the company has a security flaw that leaves it open to hacking, your stuff REALLY isn't private at all.  If you're considering internet/cloud, beware the risk (and make sure you pick a good password).

My recommendations for the average schmo:
  • copy exceptionally important information to a trusted family member, or to media you keep in a safe deposit box (A friend of mine uses Quickbooks, as does his Dad.  They swap backup files to each other's computers via USB stick on occasion.)
  • use tape media when possible (tape is the gold standard for industry - it's slow but resilient).  Remember, though, that tape is susceptible to magnetic fields, so keep it safe from anything that generates a field.
  • use hard drives over CDs.  If buying a hard drive, do your research first: some models from certain years are notorious for kicking the bucket way too early.  If buying CDs, buy brand-name, not some weird crap at the dollar store.  Actually, stay away from the dollar store for media unless you don't mind it being potentially temporary.  Ask yourself how the dollar store got it so cheap in the first place... that's right, either the factory doesn't have high standards, or some batches from a better-known company didn't meet quality standards.
  • use USB sticks if you're not a schmuck who always loses stuff.  Buy one that you keep attached to something, like your keys.  Buy one that has an integrated cap that you can't lose (protect the plug end when possible).  And for the Love of God, NEVER wiggle a USB stick when inserting it or pulling it out.  Usually the side that's labelled with a light or the USB logo is meant to face up when you insert the thing.  If you break the plastic around the four metal prongs, you can easily end up bending the prongs and shorting your computer.
The USB sticks that have "U3", "protection" or some advertised variant on them technically do protect the average person from finding your lost stick and accessing the contents.  However, they're hackable by anyone who knows what they're doing, and very annoying otherwise (unnecessary software installed on every computer it touches).  Some are so annoying they actually don't work on a machine you don't have administrative privileges on (e.g. your work computer).  So it's up to you, but I think they're crap.  Then again, I don't ever put anything too personal on my USB drive.  Nah, they're still crap.

  • Store less important stuff in the cloud.  I use Google Docs for stuff I might need to work on elsewhere that isn't too personal.  The most personal thing I had on there was documentation about an apartment I used to have (I had to initiate terminating the rental contract.)  Since I was in the middle of moving, I decided creating that document on Google Docs was probably more secure than moving my computer to a new place and having it dropped, etc.  I needed the document quickly accessible in case of a dispute.  If it's of moderate importance, store a copy on your computer back home every now and then.
  • I've only known one person to have one of the newer all-in-one hands-off backup systems.  It's basically a box you plug into your computer, and it comes with some software.  She used it to back up her karaoke files, but then it seemed to die.  I'm not sure if she doesn't know how to use it, or if it actually died.  This system may be an option, but as I am unfamiliar with them and don't even know what brand/model she had, I have no opinion on their usefulness.

With any backup system, realize that it's not infallible.  Maybe we'll happen to pick a bad hard drive or bad DVD despite our research.  What you have to decide is how much effort you're going to put in to mitigate any risks of the medium or methods you choose.

New Methods

I need to get serious about my methods, but I'm still a Canadian with a cheap Ukrainian background, so I won't be running out to buy a tape backup system.  Up to now, I've been using what I call the PUP method (procrastinate until panic).

Current methods:
  • burned some music a few years ago to DVD
  • burned some old coursework I like to think I should keep to DVD (but haven't touched since then)
  • manually copied a personal 700+ entry LibreOffice/OpenOffice database to my secondary hard drive a few months ago
  • some things I like to keep readily accessible are in the cloud.  None of it is too personal.
Incidentally, the database is what really precipitated this blog post.  I went to use it the other day and got the message
"The connection to the data source "" could not be established. The driver class " could not be loaded".
SQL Status: HY000
The connection to the external data source could not be established. No SDBC driver was found for the given URL. 

It turned out the .odb database was corrupt; I had to use a tool to extract my table from the corrupt file.  I had to rebuild my SQL queries and forms, but at least I had an old backup for that.  If the tool hadn't worked, I would have lost 1/4 of my data table.

New methods:
  • move and organize files to one hard drive with a good file folder and tagging system
  • set up rsync or an rsync-based program to use my secondary hard drive to automatically back up those files
  • generate occasional reports that are emailed to me automatically so I know it's working well.
  • maybe burn some stuff on DVD and store it off-site somewhere for a duplicate backup
The next post will delve into rsync and possibly ready-to-go programs such as GAdmin-Rsync, Grsync, luckyBackup, Unison, Back-In-Time, File Backup Manager, Deja Dup, Nepomuk etc.
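
In the meantime, here is a rough sketch of what the rsync step from the "New methods" list could look like when driven from Python.  It is only an illustration under assumptions: the function names and any folder paths you pass in are hypothetical, rsync must be installed, and the script would still need to be scheduled (e.g. with cron) to be truly automatic.  The flags used are standard rsync options: --archive preserves permissions and timestamps, --itemize-changes lists what changed, and --dry-run shows what would happen without copying anything.

```python
import subprocess
from pathlib import Path


def build_rsync_command(source: Path, destination: Path, dry_run: bool = True) -> list[str]:
    """Build an rsync command that copies the contents of `source` into `destination`."""
    command = ["rsync", "--archive", "--itemize-changes"]
    if dry_run:
        command.append("--dry-run")  # report what would change without copying
    # A trailing slash on the source tells rsync to copy the folder's contents,
    # not the folder itself.
    command += [f"{source}/", str(destination)]
    return command


def run_backup(source: Path, destination: Path, dry_run: bool = True) -> str:
    """Run rsync and return its output, which could serve as the body of a report."""
    result = subprocess.run(
        build_rsync_command(source, destination, dry_run),
        capture_output=True,
        text=True,
        check=True,  # raise an error if rsync exits with a failure code
    )
    return result.stdout
```

The dry run is on by default so a first attempt can't overwrite anything.  Once the itemized output looks right, calling run_backup with dry_run=False performs the actual copy, and the returned text is the sort of thing an emailed report could contain.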