Cold storage with Btrfs
Sep 13, 2017
Although external 3.5” disks are no longer much in use here (too noisy, power hogs), I still have a bunch of 500G and 750G drives, as well as a <cough> “huge” 2T drive.
They are still useful for keeping old junk files around, off-line. But hard
drives fail, and when not used for a long time, they can even refuse to spin up.
Bye bye, old data …
Losing that data would not be the end of the world (everything of real value is still on my laptop, with triple rotating backups), but for nostalgic reasons I’d still like to hang on to it. One way to try and postpone problems is to occasionally copy disks to refresh their magnetic storage. Which is exactly what I’ve been doing these days.
So now I have a 2 TB drive, full of old stuff, and 2x 750G + 1x 500G of duplicate data. Good for at least another few years, I hope.
While waiting for all these copies to finish, I stumbled upon Btrfs (usually pronounced as “Butter FS”), a modern COW (copy-on-write) filesystem for Linux (a bit like Apple’s upcoming APFS).
Btrfs has been in development for many years, but then again so has just about any file system. The demands on such software are very high - not only must it work properly under all circumstances, it also has to deal with a huge range of disk sizes and usage scenarios to be of general use. FWIW, Btrfs appears to be stable now.
The perfect setup?
Due to its design, Btrfs could be just the tool I’ve been waiting for. But instead of describing all the features (check its wiki), let me simply describe the setup I created.
Use case: I tend to collect lots of files, but usually I can’t make up my mind how to organise ‘em. So while the content is often stable, the folder structure on disk gets changed quite a bit over time - splitting up things before they become too large, and merging or renaming others.
In a nutshell: I want to save stuff and be reasonably confident it won’t get lost (or even accidentally deleted by me), without having to immediately decide on where to save it or how to name all the new info.
Here’s the setup I’ve created:
- two external rotating 2.5” drives, each 2 TB, connected via USB 3
- formatted with Btrfs in RAID1 mode (no partition table, i.e. /dev/sdX)
- this disk set is labeled “KEEP-2T+2T”
- the volume is mounted as /archive/
- a “KEEP” subvolume is mounted as /keep/ - this is the main access path
In case you want to create a similar setup, I’ll describe the (few!) steps it took below, but let me start by describing the result:
- the /keep/ area is for… keeping stuff - I can copy, move, delete everything on there at will, any time
- every once in a while, I create a new snapshot, with time-stamped names such as “/archive/20170910-2207”
- snapshots are instant and read-only
One benefit of this approach is that I can continue to reorganise “/keep/” as I like, without creating copies. In fact, even copies can be handled without increasing disk space use (via cp --reflink).
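As a rough sketch of that last point (the function name and the paths are made up for illustration): GNU cp can be asked to share data blocks between the original and the copy. With --reflink=auto it quietly falls back to a normal copy on filesystems which can't do this, whereas --reflink=always would fail instead.

```shell
# reflink_copy SRC DEST - hypothetical helper, not part of any tool.
# Copies SRC to DEST, sharing data blocks where the filesystem supports it
# (Btrfs does). Elsewhere, --reflink=auto degrades to a plain copy.
reflink_copy() {
    cp --reflink=auto -- "$1" "$2"
}
```

Something like “reflink_copy /keep/photos.tar /keep/photos-copy.tar” would then create a second file without using extra data space, at least until one of the two gets modified.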
Due to the snapshots, no data ever gets lost. It’s always available in some snapshot, with a name which will never change again.
A hard disk failure requires immediate replacement, but this can be done without taking the data off-line (btrfs replace).
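For illustration, such a replacement might look roughly like this. The device names in the usage line are assumptions, and a disk which has vanished completely needs extra steps that this sketch doesn't cover.

```shell
# replace_disk OLD NEW MOUNTPOINT - hypothetical wrapper, run as root.
# Starts rebuilding onto NEW while the volume stays mounted, then prints
# the progress once (-1) instead of following it until completion.
replace_disk() {
    btrfs replace start "$1" "$2" "$3" &&
    btrfs replace status -1 "$3"
}
```

With /dev/sdc as the failing disk and /dev/sdd as its stand-in, that would be “replace_disk /dev/sdc /dev/sdd /archive”.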
An occasional (background) btrfs scrub will verify that all stored data is readable, and will even “heal” bad files by replacing them with a copy from the other disk.
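A minimal sketch of such a check, assuming the volume is mounted on /archive. Unlike the background variant, this one uses -B to wait for the scrub to finish, so that the error counters are printed afterwards (the function name is made up):

```shell
# scrub_and_report MOUNTPOINT - hypothetical wrapper, run as root.
# -B keeps the scrub in the foreground, so the per-device error counters
# printed afterwards reflect the finished run.
scrub_and_report() {
    btrfs scrub start -B "$1" &&
    btrfs device stats "$1"
}
```

Run it as “scrub_and_report /archive”. Without -B, the scrub returns right away and “btrfs scrub status /archive” shows how far along it is.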
Tools such as duperemove and rmlint can be used to scan for redundant file segments and de-duplicate them (i.e. adopt a single copy on disk) - periodically, on-the-fly, and transparently - to reduce disk usage.
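As a hedged example, a de-duplication pass over the working area could be wrapped like this (the function name is hypothetical, and I'm assuming duperemove is installed):

```shell
# dedupe_tree DIR - hypothetical wrapper around duperemove.
# -r descends into subdirectories, -d actually submits the duplicate
# extents for de-duplication (without -d they are only reported).
dedupe_tree() {
    duperemove -dr "$1"
}
```

For the setup described here, that would be “dedupe_tree /keep”.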
There’s also snapper to automate the process of taking periodic snapshots.
And lastly: I’m not auto-mounting these disks or keeping that machine powered up all the time (WOL is your friend). This data is easy to reach, even if mostly off-line.
Getting started
Here are the commands to set this up. Feel free to adapt as needed and season to taste.
Everything below is done as root (use “sudo -i” to get a root shell):
Create a RAID1 Btrfs volume of two disks, connected as /dev/sdb and /dev/sdc:
mkfs.btrfs -f -d raid1 -m raid1 /dev/sdb /dev/sdc
Create the mount points:
mkdir /archive /keep
Mount the main volume (if it doesn’t pick up both drives, run btrfs device scan):
mount /dev/sdb /archive
Create a subvolume, for snapshotting:
btrfs subvolume create /archive/KEEP
Mount that subvolume as well:
mount -o subvol=KEEP /dev/sdb /keep
That’s it, more or less. You can now place whatever you like on /keep/. To create a dated snapshot, use something like:
btrfs subvolume snapshot -r /keep /archive/$(date +%Y%m%d-%H%M)
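If you take snapshots often, the one-liner above could be wrapped in a small helper. This is only a sketch: both function names are made up, and it assumes /archive and /keep are mounted as described.

```shell
# snapshot_path [STAMP] - print the snapshot location for a time stamp,
# defaulting to the current minute, e.g. /archive/20170910-2207.
snapshot_path() {
    printf '/archive/%s\n' "${1:-$(date +%Y%m%d-%H%M)}"
}

# take_snapshot [STAMP] - hypothetical wrapper, run as root.
# Refuses to reuse a name, e.g. when called twice within the same minute.
take_snapshot() {
    dest=$(snapshot_path "${1:-}")
    if [ -e "$dest" ]; then
        echo "snapshot $dest already exists" >&2
        return 1
    fi
    btrfs subvolume snapshot -r /keep "$dest"
}
```

A plain “take_snapshot” uses the current date and time, while “take_snapshot 20170910-2207” lets you pick the name yourself.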
One last step is to make all this persistent, i.e. to simplify remounting after a reboot. First, let’s label the volume:
btrfs filesystem label /archive KEEP-2T+2T
Now let’s add these lines to “/etc/fstab”:
LABEL=KEEP-2T+2T /archive btrfs noauto,noatime,user 0 0
LABEL=KEEP-2T+2T /keep btrfs noauto,noatime,user,subvol=KEEP 0 0
The noauto flag prevents the volume from being mounted on startup, and - far more importantly - keeps the boot from breaking if the drives happen to be disconnected at that time.
The noatime flag disables saving file access times, which improves disk performance (every change, even updating a timestamp, causes COW disk copying).
The user flag allows you to mount and unmount these volumes without sudo:
mount /keep
...
umount /keep
You only have to mount the /archive area to access or create snapshots. For day-to-day use, “mount /keep” is all you’ll need.
There’s an excellent 17-min video by Bubba Lichvar, showing how simple Btrfs is to use.
Time will tell how well it performs, but so far I’m really pleased with this Btrfs setup. Safe and flexible cold storage - at last!