Three causes of tag scheme inconsistency

"Tagging scheme" is the term I'm beginning to use to refer to the overall set of semantic, syntactic and structural rules you use to organise your music library. It's the set of guidelines and constraints which you interpret as rules in bliss to make sure your music library achieves one of the three Cs: consistency.

Consistency is less spoken about than completeness and correctness. Plenty of CD rippers and music taggers allow the finding and correcting of metadata and cover art, but fixing intra-collection consistency is generally relegated to a few functions hidden behind a menu.

To me, consistency is all-important, and so working out how inconsistency can be introduced is important to keeping control of your music collection. Here are three causes of inconsistency which you should consider when you work out your digital music workflow.

Not all tagging formats are equal

The most fundamental blocker to achieving a consistent tagging scheme is a tagging format that does not allow all of the tag fields you require.

The tag format is the way metadata tags are stored inside your music files. You may initially assume this is set by the audio file format, but in reality different audio file formats borrow from each other. ID3, for example, is the tagging format most famously used in MP3s, but other file formats also support ID3. FLAC files borrow from Ogg Vorbis files in using Vorbis Comments.

Some tag formats may not support all the metadata you wish to store. A commonly cited example is the ALBUM ARTIST field, which is not officially supported by ID3v2.x, yet is almost universally implemented, de facto, by overloading the TPE2 field.

In that example, you have a fallback. But two other problems remain: (1) what if the tagging format still doesn't provide a place to store the metadata you want, and (2) even if a replacement field is commonly agreed upon, what if you also want to use the replacement field in the "correct" way?

The answer to this is to use a custom field. The downside to this solution is having to maintain that field and the knowledge of the mapping from field to field. Music players will also have to be configured to show that field.
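As a rough sketch of what "maintaining the mapping" might look like, the snippet below resolves a logical field name to the place it would be stored in a given tag format, falling back to a custom field when there is no native home. The names FIELD_MAP and resolve_field, and the "TXXX:NAME" notation for an ID3v2 user-defined text frame, are assumptions made for this illustration and are not taken from bliss or any particular tagging library.

```python
# Hypothetical mapping from a logical field to its native home in each tag format.
FIELD_MAP = {
    "album_artist": {
        "id3v2": "TPE2",          # de facto use; officially band/orchestra/accompaniment
        "vorbis": "ALBUMARTIST",  # common Vorbis Comment convention
    },
}


def resolve_field(logical_name, tag_format):
    """Return where a logical field is stored, using a custom field as fallback."""
    native = FIELD_MAP.get(logical_name, {}).get(tag_format)
    if native is not None:
        return native
    custom_name = logical_name.upper()
    if tag_format == "id3v2":
        # ID3v2 keeps user-defined text in TXXX frames, keyed by a description.
        # "TXXX:NAME" is just this sketch's way of writing that down.
        return "TXXX:" + custom_name
    if tag_format == "vorbis":
        # Vorbis Comments accept arbitrary field names.
        return custom_name
    raise ValueError("unknown tag format: " + tag_format)


print(resolve_field("album_artist", "id3v2"))
print(resolve_field("mood_colour", "id3v2"))
```

Keeping the mapping in one place like this is the "knowledge" you have to maintain: every tool that reads or writes the custom field needs to agree with it.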

There's a final twist to differences between tag formats: syntactic differences, where the field is available in all tagging formats but its contents are formatted differently. An example is the release date of an album. The date syntax is generally not prescribed by the tagging format, so the same field can hold just the year in one file and the full date of release, say in DD/MM/YYYY format, in another.
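To make the release date example concrete, here is a minimal sketch that normalises a few date spellings into one form. The function name normalise_release_date is made up for this example, and it assumes that slash-separated dates are day-first (DD/MM/YYYY), which will not be true of every collection.

```python
import re
from datetime import datetime


def normalise_release_date(raw):
    """Return the date as YYYY or YYYY-MM-DD, or None if it is not recognised."""
    value = raw.strip()
    # A bare four-digit year is kept as it is.
    if re.fullmatch(r"\d{4}", value):
        return value
    # Assumption: slash-separated dates are day-first, i.e. DD/MM/YYYY.
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


for example in ("1999", "25/12/1999", "1999-12-25", "Dec 1999"):
    print(example, "->", normalise_release_date(example))
```

Anything the function cannot recognise comes back as None rather than a guess, so those files can be reviewed by hand.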

PEBCAK

But it's not all the machines' fault. Sometimes people get in the way too, and well-meaning, short-term re-organising of a music library can lead to problems later.

There are plenty of examples of this, but a common one is the release date for an album. Some collectors tag this with the release date of the specific release purchased, and some tag with the release date of the original release. Both have a use, but mixing them creates a clear inconsistency (which amounts to many decades in some cases!).

This is why it's important to see any such re-organisation through to completion, and also to document how and why your tag fields are used in the way they are. This is one of the reasons I work on bliss; it's a way of codifying and remembering how your music files are organised.

And in case you were wondering what PEBCAK means...

Over-zealous software

So we've looked at tag formats and user... "issues"... so far. The final part of this unholy trinity is software; what it does to music files, and how it interprets their metadata.

By and large software such as music taggers can be used as a tool for organising music libraries. As the software attempts to deal with the complexities of music library management, however, inadvertent changes can be introduced.

Fields can be written with values that disobey the semantic intent of the field, at least from the perspective of the collector. I often see COMPOSER tags of modern music being written with the name(s) of the songwriter(s). Semantically correct, maybe, but maybe not what the collector wants if the collector only wants to store COMPOSER information for classical recordings.
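A collector's rule like this can be written down and checked. The sketch below is purely illustrative: it assumes each track's tags have already been read into a plain dictionary, and the function name composer_violations and the choice of "Classical" as the only allowed genre are assumptions, not how bliss or any tagger works.

```python
def composer_violations(tracks, composer_genres=frozenset({"Classical"})):
    """Return tracks carrying a COMPOSER tag outside the genres that should have one."""
    return [
        track
        for track in tracks
        if track.get("COMPOSER") and track.get("GENRE") not in composer_genres
    ]


# Made-up example data; each dict stands in for the tags of one file.
library = [
    {"TITLE": "Symphony No. 5", "GENRE": "Classical", "COMPOSER": "Beethoven"},
    {"TITLE": "Modern Song", "GENRE": "Pop", "COMPOSER": "A. Songwriter"},
    {"TITLE": "Another Song", "GENRE": "Pop"},
]

for track in composer_violations(library):
    print(track["TITLE"])
```

Running a check like this after a tagger has touched your files is one way to spot fields it has filled in against your scheme.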

It's not just taggers updating fields; music players may also use tags in unexpected ways. This could either be an inability to understand the data in a field (e.g. not parsing a date in a specific date format) or the use of the tag in a semantically incorrect way (whether that be according to the collector's use of the tag field, or the more generally agreed way).


Beware of these sources of tag inconsistency!

Thanks to Libertas Academica and striatic for the images above.

The Music Library Management blog

Dan Gravell

I'm Dan, the founder and programmer of bliss. I write bliss to solve my own problems with my digital music collection.