Music Library Management Chapter Four: Securing
Over many years of music curation, you may have purchased, stored and re-organised thousands of albums, taking many hours, days or even weeks of total work. It's therefore important to consider how your music library can be secured; that is, how you can stop accidental or intentional deletion or vandalism of your music library and, if the worst happens, how you can restore your music to its former glory.
This is the fourth in a series of posts serializing the Music Library Management ebook.
Threats to your music library come in many forms. The most obvious are those you hear about most often: viruses and hacking, with the intention to infect and possibly destroy your music files. It's also worth considering how 'idiot proof' you can make your music library, because it's also possible to damage your collection yourself in weaker moments, whether you are just tired or it's just "one of those days". There's also the possibility of hardware failure. Finally, because the worst can happen no matter how robust your plans, the backstop of data redundancy, otherwise known as backups, is the absolute minimum precaution you should take.
What's the worst that can happen?
Thinking through the scenarios that can befall your music library, there are a number of possibilities that you should take steps to guard against.
Perhaps most obvious is the mass deletion of your music files. This is most commonly the result of an accident: that "whoops" moment when you ignore a warning and say you really do want to delete that folder. It's only a few seconds later you realise that folder contained your music...
A common theme, though, is that the most obvious issues are not the ones you need to be careful about. Indeed, it's the insidious, partial and gradual destruction of your music library that you should be watchful for, precisely because its inconspicuousness leads to a lack of action and resolution on your part.
For this reason, partial deletion of files and the slow corruption of file contents are evils you need to watch out for. This can have many causes such as intentional or non-intentional effects of software, hardware failure and more.
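One way to make gradual loss visible is to record a fingerprint of every file and compare it against a later scan. The following is a minimal sketch of that idea, not a finished tool: the function names are illustrative, and it assumes the library is an ordinary folder tree that Python can read.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()


def build_manifest(library: Path) -> dict[str, str]:
    """Map each file's path (relative to the library root) to its digest."""
    return {
        path.relative_to(library).as_posix(): file_digest(path)
        for path in sorted(library.rglob("*"))
        if path.is_file()
    }


def compare_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report files that vanished, or whose contents changed, since `old`."""
    return {
        "missing": sorted(name for name in old if name not in new),
        "changed": sorted(
            name for name in old if name in new and old[name] != new[name]
        ),
    }
```

Bear in mind that a deliberate change, such as re-tagging a track, also alters the fingerprint, so a "changed" entry is a prompt to investigate rather than proof of corruption.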
It's not just your music files either. Your home music network is also made of music players that have their own configuration, normally separate to your files. This may contain music player specific information that you want to keep, such as play counts. It's also worth noting that music players normally operate their own music library database, their own internal view of your music library. This can sometimes help to mislead you as to the health of your music library, because if files are deleted it may take a while for this to be reflected in your music player, or at least until the next time it is synchronised against your library.
Stopping intruders, human and otherwise
Human intrusion into your home music network is a real possibility. In today's Internet connected world, your home music network is probably connected, at least physically, to the wider Internet. That's for good reason. Internet connectivity improves our music consumption even more by providing access to information about our favourite artists and albums. The Internet is the source of the information discussed in the information chapter, so it's invaluable for organising your library.
You need to sit down and think about all of the possible sources of attack, and how they can access your music.
The first place to start is with common computer interfaces. Ask yourself whether someone could, for example, log into your computer from outside your network. Once logged in, it's possible that an intruder could corrupt, steal or delete your music library. The possible routes into your network depend on the computers and devices on the network. A standard desktop or laptop will have familiar login screens, but NASes can also be accessed, as can music servers.
A three step plan for minimising intrusion into your network is thus:
- Block the paths into your network with firewalls
- Enforce adequate authorisation at each stage of your network
- Configure access control to your most precious files
Blocking the paths into your network means stopping network traffic before it even gets to the point where it could potentially access your music. The most absolute way of all to achieve this is to unplug your Internet connection, but since this is impractical in most cases the next best option is to install a firewall. Your router (your connection to the Internet) probably has a firewall, and it's probably pre-configured to prevent outside traffic from initiating a connection into your network. This is what you want. If you don't know whether you have a firewall or what its rules are, find out now.
So if you've blocked incoming traffic, that's everything sorted, right? Wrong. Human threats to your music library can exist within your network too, from your family members or people visiting the house. Hopefully these attacks are not intentional (if they are you are reading the wrong book) but mistakes can be just as costly as attacks.
The next step, then, is to ensure access to your network is authenticated and that operations on your music library (such as 'delete') are authorised. This means someone must log in successfully to be identified as a given user, and your music library must be guarded so that only certain users are able to access or change (or delete) your music. In computing, this is known as 'access control' and the standard way of achieving it is via 'file permissions'.
Any given file or folder can be granted permissions which say "WHO can perform WHAT operation on WHICH files", e.g. "Sally can read and write all music files" and "Brian can only read all music files". In computing, 'read' access means the files can be opened and their contents inspected. This is required for listening to music. 'Write' access means the ability to update a file's or folder's contents. This is required for tagging, file-level re-organisation and adding or removing music.
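As an illustration of the "read but not write" idea, the sketch below clears the write permission on every file and folder in a library so that day-to-day accounts can listen but not alter. It is only a sketch under assumptions: it presumes a Unix-style permission model (Linux, macOS or most NASes), that you own the files, and the name make_read_only is made up for this example.

```python
import os
import stat
from pathlib import Path

# The three 'write' permission bits: owner, group and everyone else.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(library: Path) -> int:
    """Clear the write bits on every file and folder under `library`.

    Read and execute bits are left untouched, so music can still be played.
    Returns the number of entries whose permissions were changed.
    """
    changed = 0
    for folder, _subfolders, files in os.walk(library):
        entries = [Path(folder) / name for name in files] + [Path(folder)]
        for entry in entries:
            mode = stat.S_IMODE(entry.stat().st_mode)
            if mode & WRITE_BITS:
                entry.chmod(mode & ~WRITE_BITS)
                changed += 1
    return changed
```

To tag or re-organise, you would restore write access from the account you reserve for maintenance, then lock the library again afterwards.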
That covers the standard computer interfaces: login and the file system. You should also consider other applications that have access to your music and are capable of altering your collection. Many pieces of software provide Web interfaces, for example. Without authorisation (such as a username and password challenge) intruders may use such software to attack your music library. The software may provide ways of deleting files or changing tags in files which could damage your music's organisation.
Also consider how synchronisation of your music library works. If you synchronise music from mobile players, for instance, synchronising with a music player without much of your music may delete your library if it is considered the 'authority copy'. This could happen when people visit and you attempt to synchronise some of their music with your own.
It's not just malicious humans or software that can cause trouble and wreck your hard work. Sometimes, accidents can occur with the best of intentions. Humans are liable to make mistakes, press the wrong key and wipe out data; and software can sometimes go berserk, deleting or corrupting data.
Human induced accidents come in different flavours but they all have one thing in common: to the computer or device on which your music files reside the human's behaviour looks intentional. Put another way: computers don't understand "whoops". Considering what problems can befall your collection works in a similar way as when thinking through the entrance points for malicious attack: start at the lowest level of abstraction (in this discussion, probably the music files themselves) and then think about what harm other services and applications could cause.
As mentioned, some accidental damage can be quite subtle, and not show up immediately. One example is character encoding and transfer between different devices. Our rich world of music means that artists may have names with characters not in the 'standard' western character sets. Over time, computers have started to support these different character sets, but not always in the same way. Transferring files from one device to another sometimes loses this information, particularly when the transfer is via USB keys and the like, which are often formatted with lowest common denominator filesystems. Your folder and file names which used to read like Esbjörn_Svensson_Trio now read Esbj?rn_Svensson_Trio or similar.
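A quick scan can show which names are at risk before a transfer, and which may already have been damaged. This is a rough sketch built on two assumptions of mine: that any character outside plain ASCII is "at risk", and that a literal question mark in a name suggests a character has already been lost. The function names are illustrative.

```python
from pathlib import Path


def is_risky(name: str) -> bool:
    """True if a name has non-ASCII characters or a literal '?'."""
    return not name.isascii() or "?" in name


def risky_names(library: Path) -> list[str]:
    """List the relative paths of files and folders whose own name is risky."""
    return [
        path.relative_to(library).as_posix()
        for path in sorted(library.rglob("*"))
        if is_risky(path.name)
    ]
```

Run it before copying to a USB key to see which names deserve a check on the other side, and again after the transfer to spot any new question marks.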
Just as with malicious attack, accidental problems can occur on different levels, the solutions can be developed on the same level, and thankfully the solution for malicious attack is often the same as that for accidental. Employing user authentication so that any given individual can be identified, and then using access control so various actions can be enabled/disabled on a per-user basis is a sensible approach. Making sure the ability to delete or even simply update files is only possible from a special account is a useful way to stop accidental damage on a day to day basis.
These approaches can support the prevention of nasty accidents. But what if a calamity does befall your music library? The absolute minimum you should do is to take regular, ideally automated (so they stay regular) backups. Read below for more on these.
Entropy and the dead hard drive
It's tempting to think of one of the benefits of digital music as being one of permanence. Unlike the days of vinyl, bits and bytes never wear out, right?
Well, conceptually speaking that's correct. But just considering the low level data stream as the be all and end all of your music collection is a mistake. There's much more to your musical enjoyment than that, and, put simply, each bit of it can break.
Most obviously, there's hardware failure. Music players can die and the various components of your music server can degrade over time. Probably the most common hardware failure is a hard drive that has come to the end of its life. Hard drives are mechanical devices and so inevitably fail, but that failure can occur abruptly or gradually over time.
The back-stop to protect you against hardware failure is redundancy. Redundancy basically means having duplicated devices or components, so that one can take over when another fails.
At this point it's important to consider practicalities. It's unlikely to be practical to have redundant music players ready to go if they cost a month's salary. Similarly for music servers... and you also have to consider the cost of maintaining the redundant copy. For instance, if you add a new piece of software to one server, you need to do it for your redundant servers too.
It's best to consider the cost of failure of your music system and marry that to the cost involved, both financial and effort, in maintaining redundant components. In most home music networks, this will lead to little, if any, redundancy because the cost is prohibitive to do otherwise. The most common forms of redundancy in these cases are low level redundancy for aspects of hardware, for instance using RAID to improve hard drive availability, and data redundancy in the case of backups. A discussion of RAID is beyond the scope of this book.
A recovery plan
I've said it before, but data backups are essential for a home music network. They are your ultimate saviour when things go wrong, whether it's hardware failure, errant software or malicious damage caused by hackers. Keeping a backup of your music library is something all digital music collectors should do. But what's the best way?
There are a number of approaches, but the underlying truism is that something is better than nothing.
On-site or off-site?
The minimum acceptable backup strategy is to make copies of your music library onto DVD, a removable hard disk, or some other removable media and store them in the same building as your music library.
If you store your backups on the same computer or on the same hard drive you do not protect yourself from many of the potential problems and disasters that may occur. Storing elsewhere in the same building at least protects from hard drive failure and the like, although it doesn't protect from something happening to the house (fire, flood, burglary etc).
Nowadays, more and more 'cloud' backup providers have appeared to make storing your music library online easier and easier (in case you were wondering, 'cloud' is really just a modern buzzword for the use of the Internet in more transparent, pervasive ways). You can choose generic storage providers or specific backup providers. Of course, this is not free. You incur an ongoing data cost and also sometimes bandwidth cost in the initial upload. The costs are relatively low, however, and having off-site backup is the gold standard, protecting you from yet more risks.
I mentioned in chapter 1 how lossless music is the best choice for serious music collectors. Well, uploading lossless music files to online backup is likely to take a long time because of the size of the files. You may find you have to use on-site backups until upload speeds improve, and data storage costs lower still further.
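To get a feel for "a long time", a little arithmetic helps. The sketch below estimates the upload time for a library of a given size; the 500 GB library and the 1 and 10 megabit per second upload speeds are made-up figures for illustration, and real uploads will be slower once overheads are included.

```python
def upload_days(library_gb: float, upload_mbps: float) -> float:
    """Days needed to upload `library_gb` gigabytes at `upload_mbps` megabits/s.

    Uses decimal units: 1 GB = 8 * 1000**3 bits, 1 Mbit/s = 1000**2 bits/s.
    """
    bits = library_gb * 8 * 1000**3
    seconds = bits / (upload_mbps * 1000**2)
    return seconds / 86_400  # seconds in a day


# Hypothetical figures: a 500 GB lossless library.
slow = upload_days(500, 1)    # about 46.3 days at 1 Mbit/s
fast = upload_days(500, 10)   # about 4.6 days at 10 Mbit/s
```

Plug in your own library size and the upload speed your provider quotes to decide whether an initial off-site upload is realistic.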
Automating your backups
It's better to automate your backups. The minimum backup strategy is to manually copy your music library from one source to a target backup location. But having to remember to do this is a pain, and if there's one thing humans are not good at it's committing to perform repetitive tasks. All too soon you'll begin to forget to take the backups and you'll gradually have a lower quality set of backups to use in the unfortunate case that disaster strikes.
It's best to set up automatic, scheduled tasks to back up your music library. This is possible on most computers. On Windows, look at Scheduled Tasks, and on Linux look at cron or anacron. There's a specific solution for Macs called Time Machine. These just cover the basic services provided by the OS, however; there are further products available for purchase that perform similar functions.
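If you would rather script the copy yourself and let the scheduler run it, something like the following could serve as a starting point. It is a sketch under assumptions: the function name and the dated folder layout are my own invention, and it copies everything each time.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def mirror_backup(library: Path, backup_root: Path, today: Optional[date] = None) -> Path:
    """Copy the whole library into a new folder named after the date.

    For example, a run on 19 May 2024 creates backup_root/2024-05-19.
    Raises FileExistsError if a copy for that date already exists.
    """
    stamp = (today or date.today()).isoformat()
    target = backup_root / stamp
    shutil.copytree(library, target)
    return target
```

On Linux, a crontab entry such as "0 3 * * 0 python3 /path/to/backup.py" (the path is a placeholder) would run a script at 03:00 every Sunday; that script would call mirror_backup with your own library and backup locations.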
How soon is now?
You should back up your collection on a schedule related to how often you update it. If you use automated backups, this decision costs you little because the backups are performed for you. Aiming for once per week, though, is probably good enough for most music collectors.
If you use off-site backups you should watch out for high bandwidth costs. Depending on the way your backups are modelled (see below) then backing up too often may use too much bandwidth and off-site storage space. In these cases, you should look for solutions that store only the difference between two different sets of backups. Such approaches are normally only supported by specific backup software.
Modelling your backups
Consider two main approaches to storing backups: mirrors and grandparent, parent, child.
Mirrors are the simplest approach. Simply copy the music library to the backup location, and store a certain number of previous backups. It's useful to keep a few old copies in case problems are introduced into backups which are then stored; you should then always be able to trace back until you find a copy where the problem does not exist.
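Keeping "a certain number" of old mirrors means removing the oldest as new ones arrive. A minimal sketch of that pruning step follows; it assumes each mirror is a folder named with an ISO date (YYYY-MM-DD) inside one backup folder, which is my own convention for the example rather than anything required.

```python
import shutil
from pathlib import Path


def prune_mirrors(backup_root: Path, keep: int) -> list[str]:
    """Delete all but the `keep` newest mirrors; return the names removed.

    Relies on ISO-dated folder names, so sorting by name sorts by age.
    """
    if keep < 1:
        raise ValueError("keep at least one mirror")
    mirrors = sorted(p for p in backup_root.iterdir() if p.is_dir())
    doomed = mirrors[:-keep]  # everything except the last `keep` entries
    for folder in doomed:
        shutil.rmtree(folder)
    return [folder.name for folder in doomed]
```

Because this deletes data, it is worth trying it on a scratch folder first and choosing a value of keep generous enough to outlast a problem you might not notice for a few weeks.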
Mirrors are a good first step, but they do have high bandwidth and storage costs. An alternative is to take differences between the backups.
Taking differences between backups saves on a lot of space. Consider the following music collection:
The Beatles/
    Abbey Road/
    Revolver/
The Rolling Stones/
    Let It Bleed/
If we store mirrors, the exact same data will be backed up each time, regardless of whether it has changed. If we store differences, however, if the library is unchanged there is nothing to be backed up, saving lots of space. If a new album is purchased:
The Beatles/
    Abbey Road/
    Revolver/
    The Beatles/
The Rolling Stones/
    Let It Bleed/
Then we need only store the additional album. This applies to files just as it applies to entire albums. If we change the tags in one of the tracks of an existing album, that update too will be backed up. This time, though, only the difference inside the file which encapsulates the tagging change will be stored.
Any way you look at it, storing differences saves space.
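To make the idea concrete, here is a simplified sketch that copies only new or changed files into a separate "increment" folder. Note the assumptions: it works at the level of whole files (a re-tagged track is stored again in full, not as a difference inside the file), it does not record deletions, and the function name is made up for this example. Dedicated backup software does considerably more.

```python
import filecmp
import shutil
from pathlib import Path


def incremental_backup(library: Path, last_full: Path, increment: Path) -> list[str]:
    """Copy into `increment` only files that are new or changed since `last_full`.

    Returns the relative paths of the files that were stored.
    """
    stored = []
    for source in sorted(library.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(library)
        previous = last_full / relative
        # shallow=False compares file contents, not just size and timestamp.
        if previous.is_file() and filecmp.cmp(source, previous, shallow=False):
            continue
        destination = increment / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
        stored.append(relative.as_posix())
    return stored
```

With the example collection, an unchanged library stores nothing, while buying the new Beatles album stores just the files in that one folder.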
Keeping it in the family
Grandparent, parent, child is a way of structuring your backups to keep multiple old copies of backups but have the most recent ones using differencing to lower storage cost.
First, decide your backup period. Let's say we back up once per week. Next, decide where the 'generations' fall. The parent backup can be designated once per month, and the grandparent once per year. On the same day each year the grandparent is taken as a full backup and copied to the backup location. The same goes for the parent on the same day each month. Child backups are taken as incremental backups, describing the difference between the child and the last parent. The result is you only ever have two full backups consuming space.
This approach is more complicated, and typically requires software support to implement it efficiently.
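The scheduling half of the scheme is simple enough to sketch. The function below decides which generation, if any, is due on a given day. The particular days are assumptions chosen for illustration: the grandparent on 1 January, the parent on the first of every other month, and the child every Sunday.

```python
from datetime import date
from typing import Optional


def backup_due(day: date) -> Optional[str]:
    """Return which generation of backup is due on `day`, or None.

    'grandparent' and 'parent' are full backups; 'child' is an incremental
    backup taken against the most recent parent.
    """
    if day.month == 1 and day.day == 1:
        return "grandparent"
    if day.day == 1:
        return "parent"
    if day.weekday() == 6:  # Monday is 0, so 6 is Sunday
        return "child"
    return None
```

A daily scheduled task could call this and then perform a full or incremental backup accordingly; producing the incremental backups efficiently is the part that usually needs proper backup software.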
The most important thing of all
... is to actually create a backup. There are lots of ways of performing backups but the most important thing is to have something, anything! If you don't have a copy of your music library on separate media to where your music library currently resides, do it, right now! You never know when problems can occur and you will be thankful for having the backup for all of your hard work and favourite music.
Finally, this might not need saying, but once you have created your backups, test them out! It's not much use having a backup that cannot be restored.
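One way to test a backup is a trial restore: copy it to a scratch location and check the result against your live library. The sketch below does this for a simple mirror-style backup (a plain copy of the library folder); the function name is illustrative, and it needs enough free temporary disk space to hold a copy of the backup.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def trial_restore(backup: Path, library: Path) -> list[str]:
    """Restore `backup` into a scratch folder and list files that fail the check.

    A file fails if it is absent from the restored copy or its contents differ
    from the live library. An empty list means the trial restore matched.
    """
    problems = []
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored"
        shutil.copytree(backup, restored)
        for original in sorted(library.rglob("*")):
            if not original.is_file():
                continue
            relative = original.relative_to(library)
            copy = restored / relative
            if not copy.is_file() or not filecmp.cmp(original, copy, shallow=False):
                problems.append(relative.as_posix())
    return problems
```

Files you have added or re-tagged since the backup was taken will also be listed, so the check is most informative straight after a backup completes.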
Thanks to david.nikonvscanon for the image above.