How digital music becomes disorganised


Disorganised music collections are an all too common problem. I've lost count of the number of iTunes libraries I've seen with compilation albums split by track artist. In my own case, I have a plethora of genres and, before I ran bliss, album art was a very hit-and-miss affair.

There are many examples of disorganisation:

The result is a music collection that is harder to navigate. It's more difficult to find and choose music to play. And, after all, having music is about playing it!

Music is organised by 'tags', stored within music files. It is possible to re-organise your music by changing the tags, and so make your music collection more consistent and complete. However, most music fans want to add to their music library over time. Each time you acquire digital music and transfer it to your computer, it may carry tags that are inconsistent with the music already in your library. That is what disorganises a library over time.

How music is [dis]organised

Digital music is organised by tags. Tags are textual information inside each music file. A tag will describe the name of a song, its artist and the album it belongs to.

Consider the following example of an album with five tracks (thanks to this track/band/album name generator!):

Track  Track name                  Album name       Artist name
1      Beside Tubes                Continental Imp  The Cool Turbulences
2      Fords Above Chicks          Continental Imp  The Cool Turbulences
3      Axe Headers                 Continental Imp  The Cool Turbulences
4      Beach Aggression Cacophony  Continental Imp  The Cool Turbulences
5      Goofy Perfumes              Continental Imp  The Cool Turbulences

This is straightforward; a music player will find it easy to display the artist and album for navigation, and to find the tracks within the album for playback.

Consider a new album being added:

Track  Track name                Album name         Artist name
1      Mugging Crushing Scroll   Medieval Warnings  The Cool Turbulencs
2      The Cattle Of Duo- Plucks Medieval Warnings  The Cool Turbulencs
3      Chesterfield Virgin Sr.   Medieval Warnings  The Cool Turbulencs
4      Rocket Skate Hullabaloos  Medieval Warnings  The Cool Turbulencs

The mistake in the artist name presents a problem to the music player. Should the mistake be represented verbatim, or should the artists somehow be merged? Understandably, music players take the first option.
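A player cannot safely merge artists on its own, but a script can at least flag names that look suspiciously alike so that a human can decide. The following is only a rough sketch, using Python's standard difflib module. The function name `similar_artists` and the 0.9 threshold are my own assumptions for illustration, not how any particular player or tagger works.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_artists(artists, threshold=0.9):
    """Return pairs of distinct artist names whose spelling is nearly identical.

    The threshold is an arbitrary, illustrative choice; tune it for your library.
    """
    names = sorted(set(artists))
    pairs = []
    for a, b in combinations(names, 2):
        # ratio() is 1.0 for identical strings and falls towards 0.0 as they diverge.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Artist tags as they might be read from the two albums above, plus an unrelated one.
library_artists = [
    "The Cool Turbulences",
    "The Cool Turbulencs",
    "Another Band",
]
print(similar_artists(library_artists))
```

Flagging rather than merging is deliberate: two genuinely different artists can have very similar names, so the final call is best left to the listener.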

Tags are powerful tools for managing at the micro level. The trouble is, it's easy to introduce macro level disorganisation just by having slight differences in the format of a tag.

The three forms of tag disorganisation

Very generally speaking, there are three ways in which tags can become disorganised.

Syntactic disorganisation. This is where the syntax of a tag is misused, or a common pattern for a tag's syntax is broken. Because tags are just textual data they can contain anything, and sometimes several pieces of data are stored in one tag. A common example is multi-disc albums. Specific tags exist for denoting multi-disc albums, but sometimes the album name is changed to include the disc number instead, for instance "All Things Must Pass (Disc 1/2)". If you have a consistent approach to multi-disc albums, this is OK. However, if some of your music uses one syntax and other music uses another (for instance, "All Things Must Pass (Disc II)") then this is syntactic disorganisation.
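As an illustration, the two album name variants above can be reduced to the same title and disc number with a regular expression. This is a minimal sketch under my own assumptions: the pattern only recognises a trailing bracketed "Disc"/"Disk" suffix with an Arabic or small Roman numeral, and `split_disc` is a hypothetical helper rather than part of any tagging tool.

```python
import re

# Matches a trailing suffix such as "(Disc 1/2)", "(Disc II)" or "[Disk 2]".
DISC_SUFFIX = re.compile(
    r"\s*[\(\[]\s*dis[ck]\s+(?P<num>\d+|[ivx]+)(?:\s*/\s*\d+)?\s*[\)\]]\s*$",
    re.IGNORECASE,
)

# Only small Roman numerals are handled; anything else is left untouched.
ROMAN = {"i": 1, "ii": 2, "iii": 3, "iv": 4, "v": 5,
         "vi": 6, "vii": 7, "viii": 8, "ix": 9, "x": 10}


def split_disc(album):
    """Return (album_title, disc_number); disc_number is None if no suffix is found."""
    match = DISC_SUFFIX.search(album)
    if not match:
        return album, None
    raw = match.group("num").lower()
    number = int(raw) if raw.isdigit() else ROMAN.get(raw)
    if number is None:
        # Unrecognised numeral: safer to leave the album name alone.
        return album, None
    return album[:match.start()], number


print(split_disc("All Things Must Pass (Disc 1/2)"))
print(split_disc("All Things Must Pass (Disc II)"))
```

Once both variants yield the same title, the disc number can be written to the dedicated disc number tag, or re-appended to the album name in whichever single format you prefer.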

Semantic disorganisation. This is where tags are used correctly and tag formats are consistent, but the meaning of the tag is inconsistent. I think the biggest example of this is genre. Genre can be represented at many different levels of granularity. 'Rock' could be seen as a super-genre of 'grunge', but music is not easy to find when some grunge is tagged 'grunge' and some is tagged 'rock'. Getting on top of semantic disorganisation means understanding what tag contents mean and representing them in forms comparable to each other. Without this, the music is less navigable.
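One way to make genres comparable is to decide on a level of granularity and map everything to it. The sketch below assumes a small hand-written parent map; the entries in `GENRE_PARENT` and the function `top_level_genre` are purely illustrative, and a real library would need a much larger mapping that reflects your own view of how genres nest.

```python
# Hypothetical parent map: each sub-genre points at its broader genre.
GENRE_PARENT = {
    "grunge": "rock",
    "britpop": "rock",
    "bebop": "jazz",
}


def top_level_genre(genre):
    """Walk up the parent map until a genre with no parent is reached."""
    current = genre.strip().lower()
    seen = set()
    # The 'seen' set guards against an accidental cycle in the map.
    while current in GENRE_PARENT and current not in seen:
        seen.add(current)
        current = GENRE_PARENT[current]
    return current


for tag in ["Grunge", "rock", " Bebop "]:
    print(repr(tag), "->", top_level_genre(tag))
```

The same map could be used the other way round, to find 'rock' tracks that deserve a more specific genre, but that needs knowledge about the music that the tags alone do not hold.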

Incomplete disorganisation. Several tags are almost always populated: track name, album name and artist name. Others, such as year, are useful for navigating and selecting music to play but are not always present. If the data doesn't exist, the music will not be navigable by these means.
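Incompleteness is the easiest of the three to check mechanically. Here is a minimal sketch that assumes each track's tags have already been read into a plain dictionary; the field names in `REQUIRED` and the helper `missing_tags` are my own choices for illustration, and reading tags from real files would need a tagging library that is not shown here.

```python
# Illustrative list of tags we want every track to carry.
REQUIRED = ("title", "album", "artist", "year")


def missing_tags(track, required=REQUIRED):
    """Return the names of required tags that are absent or blank in a track."""
    return [name for name in required
            if not str(track.get(name) or "").strip()]


# A track from the example album above, with no year tag.
track = {
    "title": "Goofy Perfumes",
    "album": "Continental Imp",
    "artist": "The Cool Turbulences",
}
print(missing_tags(track))
```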

Different sources, different problems

There are different sources of music, and each brings its own problems and can disorganise your music library.

When ripping CDs, tags are typically auto-populated by the ripping software. The software may not populate all tags, which introduces incomplete disorganisation. The software will look up whatever data it can find online. This information comes from different sources, some of which are unreliable and all of which are populated by human beings who manage their own music in different ways. This introduces semantic and syntactic disorganisation: other music fans may use a different disc number scheme to you, or employ different genres.

Matters are slightly better when purchasing online. Tags are generally complete (all the common ones, anyway), with scope for incompleteness if you have more exotic requirements. When you buy everything from one online source you have a greater chance of consistent tag formats, but semantic and syntactic disorganisation can still creep in if you purchase from multiple online sources.

On the bright side...

The only answer is eternal vigilance. I've written before about how digital music management is a little like gardening and I think the metaphor holds; the disorganisation introduced by acquiring new music is a little like weeds in your garden. Every so often they need to be tilled, and it's best not to let them seed and sprout new weeds! So, keep on top of your music library organisation, because then keeping it in good shape is much easier.

For a small music collection, you can use a tagger to enforce tagging policies yourself. However, for large collections I think rule-based automated management is the only scalable way to keep control of your music library.
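To give a flavour of what rule-based management means, here is a toy sketch in which each rule is a small function that inspects one track's tags and returns a list of problems. Everything here is an assumption for illustration: the rule names, the dictionary representation of a track and the `audit` function are hypothetical, the sample values are made up, and this is not a description of how bliss or any other tool is implemented.

```python
def year_present(track):
    """Rule: every track should carry a non-blank year tag."""
    return [] if track.get("year", "").strip() else ["year tag is missing"]


def no_disc_in_album(track):
    """Rule: disc numbers belong in their own tag, not in the album name."""
    album = track.get("album", "").lower()
    return ["album name contains a disc number"] if "(disc" in album else []


def audit(library, rules):
    """Return {track title: [problems]} for every track that breaks a rule."""
    report = {}
    for track in library:
        problems = [problem for rule in rules for problem in rule(track)]
        if problems:
            report[track.get("title", "<untitled>")] = problems
    return report


# Made-up sample data: one compliant track and one that breaks both rules.
library = [
    {"title": "Beside Tubes", "album": "Continental Imp", "year": "2010"},
    {"title": "Example Track", "album": "All Things Must Pass (Disc 1/2)", "year": ""},
]
for title, problems in audit(library, [year_present, no_disc_in_album]).items():
    print(title, problems)
```

Because the rules are just a list, the same audit can be re-run every time new music arrives, which is what makes the approach scale where manual checking does not.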

Thanks to Orin Zebest and Ryan Leighty for the images above.

The Music Library Management blog

Dan Gravell

I'm Dan, the founder and programmer of bliss. I write bliss to solve my own problems with my digital music collection.