Cold storage with Btrfs
Sep 13, 2017

Although external 3.5” disks are no longer much in use here (too noisy, power hogs), I still have a bunch of 500G and 750G drives, as well as a <cough> “huge” 2T drive.

They are still useful for keeping old junk files around, off-line. But hard drives fail, and when not used for a long time, they can even refuse to spin up. Bye bye, old data …

Losing that data would not be the end of the world (everything of real value is still on my laptop, with triple rotating backups), but for nostalgic reasons I’d still like to hang on to it. One way to try and postpone problems is to occasionally copy disks to refresh their magnetic storage. Which is exactly what I’ve been doing these days.

So now I have a 2 TB drive, full of old stuff, and 2x 750G + 1x 500G of duplicate data. Good for at least another few years, I hope.

While waiting for all these copies to finish, I stumbled upon Btrfs (usually pronounced as “Butter FS”), a modern copy-on-write (COW) filesystem for Linux (a bit like Apple’s upcoming APFS).

Btrfs has been in development for many years, but then again so has just about any file system. The demands on such software are very high - not only must it work properly under all circumstances, it also has to deal with a huge range of disk sizes and usage scenarios to be of general use. FWIW, Btrfs appears to be stable now.

The perfect setup?

Due to its design, Btrfs could be just the tool I’ve been waiting for. But instead of describing all the features (check its wiki), let me simply describe the setup I created.

Use case: I tend to collect lots of files, but usually I can’t make up my mind how to organise ‘em. So while the content is often stable, the folder structure on disk gets changed quite a bit over time - splitting up things before they become too large, and merging or renaming others.

In a nutshell: I want to save stuff and be reasonably confident it won’t get lost (or even accidentally deleted by me), without having to immediately decide on where to save it or how to name all the new info.

Here’s the setup I’ve created

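In case you want to create a similar setup, I’ll describe the (few!) steps it took below, but let me start by describing the result: two disks combined into one mirrored (RAID1) Btrfs volume, with a live working area mounted on “/keep/” and dated read-only snapshots of it collected under “/archive/”.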

One benefit of this approach is that I can continue to reorganise “/keep/” as I like, without creating copies. In fact, even copies can be made without using extra disk space (via cp --reflink).
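
As a minimal sketch of such a space-free copy, assuming GNU cp and a throwaway scratch directory (on the real setup you would work somewhere under /keep instead). The file names are made up for the example:

```shell
# Scratch directory for the demo; on the real setup this would live under /keep.
demo=$(mktemp -d)
printf 'some old junk\n' > "$demo/original.txt"

# --reflink=auto: share the data blocks where the filesystem supports it (Btrfs does),
# and quietly fall back to a normal copy everywhere else.
cp --reflink=auto "$demo/original.txt" "$demo/copy.txt"

# Both names hold identical content; on Btrfs they share storage until one is modified.
cmp "$demo/original.txt" "$demo/copy.txt" && echo "copies match"
```

With --reflink=always, cp fails instead of silently making a full copy, which is a handy way to check that sharing really happens.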

Thanks to the snapshots, no data that was present when a snapshot was taken ever gets lost. It’s always available in that snapshot, under a name which will never change again.

A hard disk failure requires immediate replacement, but this can be done without taking the data off-line (btrfs replace).
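
A sketch of what such a replacement could look like, assuming the failing disk is /dev/sdc and the new one shows up as /dev/sdd (both device names are placeholders, check yours first). By default the function only prints the commands; set BTRFS=btrfs and run as root to execute them:

```shell
# Preview mode by default: commands are echoed, not executed.
# Set BTRFS=btrfs (as root) to run them for real.
BTRFS=${BTRFS:-echo btrfs}

# replace_disk OLD NEW MOUNTPOINT
# Starts copying everything from OLD to NEW while the filesystem stays mounted,
# then asks for the progress of that operation.
replace_disk() {
    $BTRFS replace start "$1" "$2" "$3" &&
    $BTRFS replace status "$3"
}

# /dev/sdd is a hypothetical name for the new disk.
replace_disk /dev/sdc /dev/sdd /archive
```

If the old disk is already completely gone, the volume has to be mounted with -o degraded first, and the missing device is then referred to by its numeric devid instead of a path.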

An occasional (background) btrfs scrub will verify that all stored data is readable, and will even “heal” bad files by replacing them with a copy from the other disk.
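
For illustration, a scrub run could look roughly like this. The function name is my own, and it only prints the commands unless BTRFS=btrfs is set (and it is run as root):

```shell
# Preview mode by default; set BTRFS=btrfs (as root) to run for real.
BTRFS=${BTRFS:-echo btrfs}

# scrub_archive MOUNTPOINT
# -B keeps the scrub in the foreground, so the status shown afterwards is the final one.
# "device stats" prints the per-device error counters (read, write, corruption, ...).
scrub_archive() {
    $BTRFS scrub start -B "$1" &&
    $BTRFS scrub status "$1" &&
    $BTRFS device stats "$1"
}

scrub_archive /archive
```

Without -B the scrub runs in the background, and "btrfs scrub status" can be used to peek at its progress.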

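Tools such as duperemove and rmlint can be used to scan for redundant file segments and de-duplicate them (i.e. adopt a single copy on disk) to reduce disk usage. This is done after the fact rather than at write time, but it can be run periodically, while the volume is mounted and in use, and it is transparent to whatever reads the files.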
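
A possible duperemove invocation, as a sketch only. The hash file location is an arbitrary choice of mine, and the command is merely printed unless DUPEREMOVE=duperemove is set:

```shell
# Preview mode by default; set DUPEREMOVE=duperemove to run for real.
DUPEREMOVE=${DUPEREMOVE:-echo duperemove}

# dedupe_keep DIRECTORY
# -r: recurse into subdirectories
# -d: actually de-duplicate (without it, duplicates are only reported)
# --hashfile: remember block hashes between runs, so later runs only rescan changed files
dedupe_keep() {
    $DUPEREMOVE -dr --hashfile=/var/tmp/keep.hash "$1"
}

dedupe_keep /keep
```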

There’s also snapper to automate the process of taking periodic snapshots.
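
I haven't set this up here, but a first try with snapper might look as follows. The config name "keep" and the description are arbitrary, and the commands are only printed unless SNAPPER=snapper is set:

```shell
# Preview mode by default; set SNAPPER=snapper (as root) to run for real.
SNAPPER=${SNAPPER:-echo snapper}

# One-time setup: create a config called "keep" for the subvolume mounted on /keep.
$SNAPPER -c keep create-config /keep

# Take a manual snapshot with a description, then list what exists.
$SNAPPER -c keep create --description "before reorganising"
$SNAPPER -c keep list
```

Note that snapper keeps its snapshots in a ".snapshots" subvolume inside /keep, so it doesn't use the /archive layout described in this post.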

And lastly: I’m not auto-mounting these disks or keeping that machine powered up all the time (Wake-on-LAN is your friend). This data is easy to reach, even if mostly off-line.

Getting started

Here are the commands to set this up. Feel free to adapt as needed and season to taste.

Everything below is done as root (use “sudo -i” to get a root shell):

Create a RAID1 Btrfs volume of two disks, connected as /dev/sdb and /dev/sdc:

mkfs.btrfs -f -d raid1 -m raid1 /dev/sdb /dev/sdc

Create the mount points:

mkdir /archive /keep

Mount the main volume (if it doesn’t pick up both drives, run btrfs device scan):

mount /dev/sdb /archive
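
To check that the volume really is mirrored, the output of "btrfs filesystem df" can be inspected. Here is a small sketch with a helper of my own (is_raid1, not a Btrfs command), fed with sample text that only imitates the real output, using made-up sizes:

```shell
# Reads "btrfs filesystem df" output on stdin and succeeds only if both
# data and metadata use the RAID1 profile.
is_raid1() {
    awk '/^Data, RAID1:/ { d = 1 } /^Metadata, RAID1:/ { m = 1 } END { exit !(d && m) }'
}

# Real use (as root):  btrfs filesystem df /archive | is_raid1 && echo "mirrored"
# The sample below just mimics the shape of that output; the sizes are invented.
sample='Data, RAID1: total=1.00GiB, used=512.00KiB
System, RAID1: total=8.00MiB, used=16.00KiB
Metadata, RAID1: total=1.00GiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B'

printf '%s\n' "$sample" | is_raid1 && echo "mirrored"
```

"btrfs filesystem show" is the other command worth running at this point: it lists the devices which make up the volume.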

Create a subvolume, for snapshotting:

btrfs subvolume create /archive/KEEP

Mount that subvolume as well:

mount -o subvol=KEEP /dev/sdb /keep

That’s it, more or less. You can now place whatever you like on /keep/. To create a dated snapshot, use something like:

btrfs subvolume snapshot -r /keep "/archive/$(date +%Y%m%d-%H%M)"
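
Since a read-only snapshot is just a directory tree under /archive, getting back an accidentally deleted file is an ordinary copy. A small sketch, where restore_file is a hypothetical helper of mine and the snapshot name and file path in the example are made up:

```shell
# restore_file SNAPSHOT_DIR RELATIVE_PATH [DEST_ROOT]
# Copies one file out of a snapshot back into the live tree (default: /keep).
# -a preserves timestamps and permissions; --reflink=auto lets Btrfs share the
# blocks where it can, and falls back to a normal copy otherwise (GNU cp assumed).
restore_file() {
    snap=$1; rel=$2; dest=${3:-/keep}
    mkdir -p "$(dirname "$dest/$rel")" &&
    cp -a --reflink=auto "$snap/$rel" "$dest/$rel"
}

# Example (snapshot name and path are invented):
#   restore_file /archive/20170913-1200 photos/2004/img_0001.jpg
```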

One last step is to make all this persistent, i.e. to simplify remounting after a reboot. First, let’s label the volume:

btrfs filesystem label /archive KEEP-2T+2T

Now let’s add these lines to “/etc/fstab“:

LABEL=KEEP-2T+2T  /archive  btrfs  noauto,noatime,user              0 0
LABEL=KEEP-2T+2T  /keep     btrfs  noauto,noatime,user,subvol=KEEP  0 0

The noauto flag prevents the volume from being mounted on startup and, far more importantly, keeps the boot from breaking if the drives happen to be disconnected at that time.

The noatime flag disables saving file access times, which improves disk performance (on a COW filesystem every change, even updating a timestamp, causes new blocks to be written).

The user flag allows you to mount and unmount these volumes without sudo:

mount /keep
...
umount /keep
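
To double-check which options an fstab entry ended up with, something like the following can help. Both helper functions are my own invention for this sketch, not standard tools:

```shell
# fstab_options MOUNTPOINT [FSTAB_FILE]
# Prints the option field of the (non-comment) fstab entry for MOUNTPOINT.
fstab_options() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $4 }' "${2:-/etc/fstab}"
}

# has_option OPTION MOUNTPOINT [FSTAB_FILE]
# Succeeds if OPTION appears in the comma-separated option list of that entry.
has_option() {
    fstab_options "$2" "${3:-/etc/fstab}" | tr ',' '\n' | grep -qx "$1"
}

# Example:
#   has_option noauto /keep && echo "/keep will not be mounted at boot"
```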

You only have to mount the /archive area to access or create snapshots. For day-to-day use, “mount /keep” is all you’ll need.
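
Putting this together, taking a snapshot could be wrapped in a little function which mounts /archive only for as long as needed. This is just a sketch: snapshot_now is a made-up name, /keep is assumed to be mounted already, and by default the commands are printed rather than executed (set RUN to an empty string, as root, to really run them):

```shell
# Preview mode by default: RUN=echo prints the commands.
# Call with RUN= (empty) to execute them for real.
RUN=${RUN-echo}

snapshot_now() {
    stamp=$(date +%Y%m%d-%H%M)
    $RUN mount /archive &&
    $RUN btrfs subvolume snapshot -r /keep "/archive/$stamp"
    status=$?
    # Unmount again, even if the snapshot failed.
    $RUN umount /archive
    return $status
}

snapshot_now
```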

There’s an excellent 17-min video by Bubba Lichvar, showing how simple Btrfs is to use.

Time will tell how well it performs, but so far I’m really pleased with this Btrfs setup. Safe and flexible cold storage - at last!

Weblog © Jean-Claude Wippler.