Horrible edge cases when dealing with music

maxbendick · on April 3, 2022

My time to shine! I work in music distribution.

The Apple Music style guide is a great and accessible place to look if you're interested in music metadata: https://help.apple.com/itc/musicstyleguide/en.lproj/static.h...

DDEX is the set of metadata standards used across the music industry: https://ddex.net/

Also worth knowing that not all content on streaming services is music! Some of it is spoken word, ASMR, non-music field recordings, etc. The difference between sound and music is subjective of course though.

On an artistic note, music can of course be presented in an infinite number of ways. Not all of these can be represented (i.e., re-presented) on streaming platforms. Installations and generative music, for example. That's ok! To be able to represent all this music (and non-music) requires restrictions, otherwise we wouldn't be able to create programs for it in the first place.

Sometimes restrictions on sound content are to make the streaming services friendlier to use, like the formatting of featured artists in titles. Royalty laws put restrictions on the handling of different roles, like songwriters and composers.

BUT nothing is stopping you from building your own way of representing music digitally, so long as you follow relevant laws.

slaymaker1907 · on April 3, 2022

I still think it's pretty bad for classical music. An important question for various pieces is who plays a particular instrument, but there really isn't a nice way to encode this in the metadata (encoding the relation of both person and role where role could be anything).

I also usually want to know who the conductor was and what was the orchestra
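For what it's worth, the "person plus arbitrary role" relation described above can be sketched as a list of credits per recording. This is only a minimal Python illustration: the `Credit` and `Recording` classes, the role strings, and all names in the sample data are hypothetical placeholders, not any tagging standard or real service's model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credit:
    person: str  # placeholder display name; a stable ID would be better
    role: str    # free-form on purpose: "conductor", "orchestra", "double bass", ...


@dataclass
class Recording:
    title: str
    credits: list[Credit] = field(default_factory=list)

    def who(self, role: str) -> list[str]:
        """Return everyone credited with the given role on this recording."""
        return [c.person for c in self.credits if c.role == role]


def recordings_with(recordings: list[Recording], person: str, role: str) -> list[Recording]:
    """Find the recordings on which `person` is credited with `role`."""
    return [r for r in recordings if Credit(person, role) in r.credits]


# Hypothetical sample data.
symphony = Recording(
    title="Symphony No. 3",
    credits=[
        Credit("Conductor A", "conductor"),
        Credit("Orchestra B", "orchestra"),
        Credit("Player C", "double bass"),
    ],
)
quartet = Recording(title="String Quartet", credits=[Credit("Player C", "cello")])

print(symphony.who("conductor"))
print([r.title for r in recordings_with([symphony, quartet], "Player C", "double bass")])
```

Because the role is just a string attached to the credit, the conductor and the orchestra fall out of the same structure as any instrumentalist.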

inopinatus · on April 3, 2022

I'd expect publishers of classical music to know the difference between a composer, a conductor, a soloist, and an orchestra, but no, apparently not. They freely mash them up across fields. The difference between a symphony and an album is often mangled too, especially on releases containing multiple works. And for fuck's sake, no, a movement is not a song. I don't play, "like", flag, list, cross-fade, or reshuffle them independently.

Music metadata has been optimized to suit the marketing preferences of contemporary record labels. But more general classes of music won't shift units on Spotify so the message is loud and clear, a resounding "get fucked" from the assholes that run this industry.

(One side of my family is in the recording business. I can confirm the whole thing is run by assholes.)

efitz · on April 3, 2022

I like being able to address and like/favorite individual movements. Beethoven Piano Sonata #14 (“Moonlight”) is a masterpiece, but the second movement is <expletive>. I don’t know if I could listen to it hundreds of times like I have the first and third movements.

8bitsrule · on April 3, 2022

Many classical works have more than one 'track'. Musicians, critics, etc. usually refer to them by the tempo marking text. A lot of music software just doesn't get this. These tempo markings (by the composer; often in Italian) are the track names. So Moonlight has 3 movements:

I. Adagio sostenuto

II. Allegretto

III. Presto agitato

inopinatus · on April 3, 2022

This is perfectly highlighting that we've got no way to express our actual preferences, and are stuck treating something as a song when it isn't, but it's the only paradigm offered to us.

hulitu · on April 3, 2022

> Beethoven Piano Sonata #14 (“Moonlight”) is a masterpiece,

Which one? Played by Kissin? Or by Rubinstein? Or by ...

8bitsrule · on April 3, 2022

Most consider the composition to be a masterpiece.

Artists/conductors 'perform' 'interpretations' of works. Whether any interpretation is 'masterful' or 'virtuosic', etc., is up to the critical listener.

Three recordings by Horowitz may have been released ... which one is 'best'? Depends on the phase of the moon and your playback equipment!

The 'quality' of a particular performance may depend on the instrument provided to the artist. The year of recording can be very important (it reflects the technology used). The acoustics of the recording venue are also inherently part of the performance ... e.g. too much reverb.

azalemeth · on April 3, 2022

This, many times over. ID3 tags are one thing, but all decent classical record labels (most of whom have always been DRM-free) like Hyperion or Naxos give you a PDF CD jewel case booklet. I've got recordings of many friends and it's always a pleasure to see their names in this. I don't use streaming services and one of the unexpected joys of iTunes is that the album art can contain a PDF with a cover image. The rest of the UI mostly assumes that you're listening to non-classical music though. I'd love to be able to search for e.g. "Victoria's Requiem" and get albums with sensible names, even if they're in Latin, seamlessly grouped by performance.

eliaspro · on April 3, 2022

A few years ago, we started working on a platform for classical music [1] and took great pride in our metadata and our underlying schema which basically covered 99% of all the weird edge cases classical music has.

Unfortunately, we couldn't make it work economically even though we tried several approaches for monetization, and we had to shut down a while ago.

[1] https://grammofy.com/

wolverine876 · on April 3, 2022

Might you release the schema into the public domain? I know that's much more easily said than done, and it's your hard work and you deserve the benefits, so no argument from me if you don't.

Or maybe share some tips on what works with the corpus of classical metadata?

eliaspro · on April 3, 2022

While I'd love to see some of the quite spectacular internal tooling + data to be released under an open license, put to use for enhancing MusicBrainz' existing data etc., I doubt it's going to happen.

All team members moved on to new opportunities quite a while ago, and untangling all this, filtering out potentially copyrighted/licensed material etc. would take quite some effort.

999900000999 · on April 3, 2022

I primarily make hip hop music, but I would absolutely love a platform like this where I can listen to a classical piece, and download the score and MIDI files.

Are there any legal reasons preventing this? I don't have the ability to play classical music myself (my weighted 88-key sits taunting me), but I'd love to be able to take these compositions and integrate them into my own music.

colejohnson66 · on April 3, 2022

IANAL, but in the US, copyright on classical pieces tends to apply to the performances, not the music. For example, a Bach piece is in the public domain (the score and probably the MIDI derivative[a]) just from being old enough, but a recording of that public domain piece may be copyrighted itself.

As always, this isn’t universal. Check your local copyright laws to be sure.

8bitsrule · on April 3, 2022

IANAL either, but. For MIDI, a public-domain work might be just mechanically converted into notes of a 'MIDI score'. Probably very flat-sounding, but technically that transcription [0] may (probably) be PD.

(E.g. My handwritten copy of a PD short story is probably PD. But my reading of it isn't.)

BUT if some famous piano player performed for the MIDI recording, then you've got tempo changes, timing, attack velocities, all that data in the file as well. Probably not PD (best to ask). (As with piano-player rolls.)

[0] https://en.wikipedia.org/wiki/Transcription_(music)

999900000999 · on April 3, 2022

Sounds like someone needs to be awesome and release midi files of classic compositions.

You have the rare hip hop song which samples classical music directly, but I really want to learn music theory, like understanding how the notes work.

8bitsrule · on April 7, 2022

The classics are out there, but often they've been performed by someone. Easy to strip down to the notes with a good MIDI editor.

It's been a long while for me, but a search on 'classic midi files' turned this up. (Hard to tell if it's crap without a listen.) https://www.midiworld.com/classic.htm

gvurrdon · on April 3, 2022

This is also important for jazz. Should I wish, for example, to find albums where a particular bassist is playing then this would be rather difficult unless they happen to be the leader for that particular recording. Also, it would be particularly helpful to have subgenres, e.g. jazz -> post bop, jazz -> free, etc.

fivre · on April 3, 2022

Maybe you can explain one of my long-burning questions: why are non-English tracks handled so inconsistently? I'll sometimes see an English translation, sometimes a transliteration (for non-Latin scripts), and sometimes the original language. It's not consistent which gets used even for a single artist, and sometimes changes to a different one.

The albums Ангедония (https://open.spotify.com/album/1HxmgR8wpc1ySplYCTNwaW or https://i.imgur.com/ndOXsgL.png) and Продано! (https://open.spotify.com/album/5kp7j9B4TDA3VfhaXz9XcJ or https://i.imgur.com/QJNjXc1.png), for example, both contain a version of the track "Рижская". The former uses the original Russian while the latter uses Rigas', which is a sort-of translation (it's an adjectival version of "Riga", which doesn't really exist in English).

https://help.apple.com/itc/musicstyleguide/en.lproj/static.h... indicates that there are separate fields for each of the original, the translation, and the transliteration, but it doesn't seem like anyone actually _uses_ these, and instead just picks one arbitrarily to stuff into the main/native field. I'm not sure if Spotify even has a way to display the alternates--I can't recall ever seeing a toggle for it, though idk if I've ever actually encountered a track that includes the variants, since you can't see raw metadata on Spotify.

Is it just that music labels do a lackluster job of handling metadata? I expect this is probably the case, since classical music is similarly messy--it's a crapshoot whether the composer or performer gets used as the artist. For that I've at least seen the composer metadata field populated sometimes, but internationalization fields don't seem to be used ever.

stavros · on April 3, 2022

It's really hard finding Greek songs on Spotify for this reason. There are many ways to transliterate (eg "ευχή" can be "efhi", "euxi", and all the combinations thereof, and that's just with two ambiguous letters) and no names are in Greek, so it's basically a game of trying to guess the transliteration for the track you want.

notimetorelax · on April 3, 2022

I’d say software should handle this issue - you type in Greek and it should match all possible transliterations.

stavros · on April 3, 2022

That's very hard, there are too many combinations. A better solution would be to just use the original Greek name.

YawningAngel · on April 3, 2022

That's a good solution for Greek speakers, but a terrible solution for non-Greek speakers, who _can_ learn a "canonical" transliteration but can't write Greek

stavros · on April 3, 2022

There is no canonical transliteration, though. Also it doesn't make sense to learn Greek like that, and nobody does in practice.

YawningAngel · on April 4, 2022

All of this is true, but I can learn that Spotify romanizes Τα παιδιά του Πειραιά as Ta paidiá tou Piraiá (or whatever) and search for that, whereas I can't type the actual name for love nor money.

I'm not saying this is necessarily a good design decision, just pointing out what might motivate this design

wolverine876 · on April 3, 2022

While I don't doubt your knowledge, note that many aspects of European language, including English, and including the European alphabets, are in a sense transliterations of Greek.

stavros · on April 3, 2022

Certainly, and there isn't a canonical transliteration there either. There is some consistency, but the primary goal was that the words look good in the target language, which isn't a concern for me when I want to type my language in a keyboard that doesn't have Greek letters. For that case, multiple transliterations per letter exist, including some that use numbers (e.g. θέλω = thelo, thelw, 8elv, and any combination of those).
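To make the combinatorics concrete, here is a rough Python sketch that expands a Greek word into its candidate keyboard spellings. The `VARIANTS` table is an illustrative assumption that covers only the letters of θέλω, using the spellings mentioned above; it is not a standard romanization and a real table would be much larger.

```python
import itertools
import unicodedata

# Illustrative and incomplete: only the letters needed for the example word.
VARIANTS = {
    "θ": ["th", "8"],
    "ε": ["e"],
    "λ": ["l"],
    "ω": ["o", "w", "v"],
}


def strip_accents(text: str) -> str:
    """Drop combining marks such as the Greek tonos (έ becomes ε)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterations(word: str) -> set[str]:
    """Return every spelling obtainable by picking one variant per letter."""
    letters = strip_accents(word.lower())
    # Letters missing from the table are passed through unchanged.
    options = [VARIANTS.get(ch, [ch]) for ch in letters]
    return {"".join(combo) for combo in itertools.product(*options)}


print(sorted(transliterations("θέλω")))
```

Two ambiguous letters already give 2 x 3 = 6 spellings for a four-letter word, and the count multiplies with every further ambiguous letter, which is why guessing the one spelling a catalog happened to use gets painful quickly.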

wolverine876 · on April 3, 2022

Oh, I absolutely agree that they should list the original Greek titles. I was just addressing an interesting (to me) linguistic side issue.

stavros · on April 3, 2022

Ah, yeah, I agree that the way that Greek words got transliterated is actually very interesting (doubly so when you speak Greek and can see the intent behind it).

maxbendick · on April 3, 2022

This is an area that many parties handle differently. Streaming services, artists, labels, and distributors don't all handle translation metadata consistently. There are specs in DDEX for this, but it's a matter of support and doing the translation work AFAIK.

bobuk · on April 3, 2022

Quick upvote for the examples from Yanka. I thought I was the only one still listening to her.

Thorrez · on April 3, 2022

>Sometimes restrictions on sound content

Is there some sort of rule against sirens?

The song "Car Alarm" prominently features a car alarm on Youtube, but not on Spotify. This could just be a choice by the artists though, rather than a platform restriction.

https://www.youtube.com/watch?v=xV7nHX2RLjQ

https://open.spotify.com/album/2u4HDb57v96iiJZUC7PqOx?highli...

ratww · on April 3, 2022

That's a music video. They traditionally have sounds and dialog that aren't part of the music itself.

Here's the regular version, same as Spotify, also on Youtube: https://www.youtube.com/watch?v=N4inWiKCt-E

I don't know if Spotify has music videos, but Apple Music for example does, so you can have both versions.

withinboredom · on April 3, 2022

Probably not a rule, but maybe more like a liability thing? If people listen to your music while driving, you don’t want them to freak out from the sound effects and then sue you because they got in a car accident.

moogly · on April 3, 2022

I guess don't put on Steve Reich's "City Life" when driving in NYC.

spicyjpeg · on April 3, 2022

I have noticed that it's fairly common for many songs to feature random car sounds in their YouTube mixes, but not in the final masters that end up on iTunes and streaming services. It is also relatively common to have "watermarks" (e.g. additional short vocal clips at the beginning and end) added to the YT/Soundcloud mix, presumably to make it easier to figure out if a DJ just ripped the song off YT.

hnlmorg · on April 3, 2022

Music videos will often have different versions of the song with added sound effects, a more cinematic intro and outro, and even breaks in the song for a dramatic sequence. These flourishes make sense when you have a video but wouldn’t make any sense in purely audio context.

infofarmer · on April 3, 2022

Pleasant to see Apple has made some progress on their metadata guidelines (it used to be abhorrent for a company saved by the iPod). Looks like a useful one-pager to get a quick overview of the challenges without getting into any of them.

For a proper attempt to preserve metadata, I'd refer to the MusicBrainz Style Guideline — https://musicbrainz.org/doc/Style

seba_dos1 · on April 3, 2022

> Also worth knowing that not all content on streaming services is music! Some of it is spoken word, ASMR, non-music field recordings, etc. The difference between sound and music is subjective of course though.

Even some albums that are clearly about music can contain non-music, take Tenacious D's skits as an example.

infofarmer · on April 3, 2022

There's a reason why MusicBrainz SQL schema looks like what it looks like, after years of careful simplification to achieve an optimum approximation of reality.

https://musicbrainz.org/doc/MusicBrainz_Database/Schema

10+ years ago at Zvooq, during the golden age of music metadata and recsys startups, we tried to really solve entity resolution in the domain. I've never seen another streaming service make an honest attempt at this, even at Spotify with their 2014 acquihire of the Echo Nest. You still get Utada and Hikaru Utada as ~completely separate entities

Helmut10001 · on April 3, 2022

I think MusicBrainz did a great job of finding a commonly agreed denominator between all these styles and formats for music metadata. After setting up Funkwhale [1], I went through all my 16000 MP3s and synced those with the MusicBrainz Library, also adding new entries. It went pretty smoothly and my library looks much better now. It is also really fun to listen again, when everything is structured and labeled.

(btw., yes, I bought a large percentage and, nowadays, mostly buy on bandcamp [2]).

[1]: https://funkwhale.audio/ [2]: https://bandcamp.com/

andybak · on April 3, 2022

Ha! I contributed the initial classical style guide many, many moons ago.

I'm still technically an automod although I haven't participated for a long time.

willhinsa · on April 3, 2022

what.cd had a really solid model as a user, but I've certainly been pleasantly surprised by musicbrainz.org

dewey · on April 3, 2022

What.cd had a very simple data model and it had the same issue as many music sites trying to fit it into a simple schema (and not a more sophisticated one like MB). The first issue that comes to mind was artists with the same name: these ended up being the same artist (just like on Last.fm) if not split by hand by changing the artist name.

The schema is here in case you are curious: https://github.com/WhatCD/Gazelle/blob/master/gazelle.sql

throwaway48375 · on April 3, 2022

I really miss what.cd

isoprophlex · on April 3, 2022

How do you feel about redacted? It seems to operate in the same spirit (if not backend code)

throwaway48375 · on April 3, 2022

Got nothing against them, but it's still not what just like what wasn't oink. Every time there is something lost you just can't get back.

sriku · on April 3, 2022

I understand "horrible" in this case in a somewhat lighthearted sense. But it points to a tendency to simplify cultural complexity in the interest of "legibility" from the system perspective. The same diversity of cases needs to be accommodated in much design done for, say, a diverse country like India, where though the urban populace may be English savvy, the rural populace isn't necessarily so ... And even in urban areas local language and culture dominate.

A place this diversity crops up most commonly is the so called "standard" firstname middlename lastname fields. This just doesn't work in India where people may have even 7 components to their names ... and who are you, dear system developer, to dictate what their name should be like.

The name structure diversity I understand is a problem for Europe too with hyphenated names. This brilliant sketch by Fry and Laurie says it all - https://youtu.be/1LopIroSjsU

wincy · on April 3, 2022

I mean, we get to dictate it because we’re writing the code and don’t know? I guess enterprising Indian developers should help it get fixed. As the whole world gets online this stuff gets sorted out, eventually. It’s not as if it’s malice or something. There are Americans with longer names though, I had a friend whose full name was five words, and all of his sisters and brothers also had “three middle names”.

sriku · on April 3, 2022

Not malice ... rather incompetence. If we can't handle this much accessibility, what hope do we have of making "ethical AI"?

DonaldFisk · on April 3, 2022

I've noticed some of these for years. When designing a music database, you do need to be aware of them. You need to use unique identifiers instead of names. last.fm doesn't do this.

There were four bands called Kaleidoscope, all active around the same time: one from England, one from Mexico, one from the USA, and one from Canada. To disambiguate them, generally the country is specified, e.g. Kaleidoscope (UK).

There were two bands called Bulldog Breed, both of which recorded albums called Made in England. They're decades apart, but as albums can be re-released, they're normally disambiguated by record label, e.g. Bulldog Breed: Made in England (Deram).

There are two Nirvanas, the original UK-based band, and the US one. The UK Nirvana recorded a cover of Lithium.

You'll be aware that the Beatles recorded an album called Revolver. A Russian band, called Revolver, recorded an album called Beatles.

Do you include the definite article in the band's name? If so, what about The The?

Tintern Abbey released a single single, the A-side of which was called Beeside.
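The "unique identifiers instead of names" point can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the `Artist` and `ArtistCatalog` classes and the random UUIDs are assumptions, not how last.fm or any other service actually stores artists.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Artist:
    name: str
    disambiguation: str = ""  # human-readable hint, e.g. a country or label
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def label(self) -> str:
        """Name plus disambiguation, in the "Kaleidoscope (UK)" style."""
        if self.disambiguation:
            return f"{self.name} ({self.disambiguation})"
        return self.name


class ArtistCatalog:
    def __init__(self) -> None:
        self._by_id: dict[str, Artist] = {}

    def add(self, artist: Artist) -> str:
        self._by_id[artist.id] = artist
        return artist.id

    def get(self, artist_id: str) -> Artist:
        return self._by_id[artist_id]

    def find_by_name(self, name: str) -> list[Artist]:
        # A name lookup yields candidates, never a single guaranteed answer.
        return [a for a in self._by_id.values() if a.name == name]


catalog = ArtistCatalog()
uk_id = catalog.add(Artist("Kaleidoscope", "UK"))
us_id = catalog.add(Artist("Kaleidoscope", "USA"))

print([a.label() for a in catalog.find_by_name("Kaleidoscope")])
```

Everything else (albums, plays, credits) would then point at the ID, and the name becomes just another attribute that is allowed to collide.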

cpeterso · on April 3, 2022

> Do you include the definite article in the band's name? If so, what about The The?

My SiriusXM radio omits leading articles like “The”, so “The Beatles” is displayed as “Beatles”. I long wondered how it would display “The The”. Would it be “The”, an empty string “”, “The The”, or something else entirely? I eventually learned the answer: “The The”. Which is good, but then it frustrated me that it wouldn’t allow “The Beatles” to be “The Beatles”.
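Purely as a guess at what such a rule might look like, here is a small Python sketch that reproduces the two observed outputs. The `display_name` function and its "don't strip if only an article would remain" rule are my assumptions; the radio's real logic is unknown.

```python
ARTICLES = ("the", "a", "an")


def display_name(name: str) -> str:
    """Drop one leading article unless that would leave nothing or only an article."""
    first, _, rest = name.partition(" ")
    if first.lower() in ARTICLES and rest and rest.lower() not in ARTICLES:
        return rest
    return name


print(display_name("The Beatles"))  # Beatles
print(display_name("The The"))      # The The
```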

mauvehaus · on April 3, 2022

I've always thought it'd be a bit cheeky to start a band and name the first three albums "Self-Titled", "Eponymous"[0], and give the third album the name of the band.

Fortunately, I lack any amount of musical talent, so this will never happen.

[0] I know, REM beat me to it.

porcoda · on April 3, 2022

In addition to all of these edge cases, it doesn’t help that the metadata services have a lot of inconsistency as well. Say I have a song X that was performed live on DATE at some VENUE. It may show up as titled “X”, “X [live]”, “X (Live DATE)”, “X (DATE VENUE)”, and so on. Stuff that should be in metadata ends up in strings embedded in song and album titles, which makes for a big mess. I curate a huge collection and trying to keep my metadata consistent is a massive headache. I think I’ve tried just about every tool out there and still haven’t found one that I’m happy with.

logbiscuitswave · on April 4, 2022

Or the many ways to have “featuring” — “feat.”, “w/“, “ft.”, “(feat.)”, “[feat.]”, etc. MusicBrainz does a laudable job trying to standardize all this crap and as someone who is really particular about how their music metadata looks it’s highly appreciated.

seaish · on April 3, 2022

Related, songs that are meant to be played by accident: https://gimletmedia.com/shows/reply-all/j4he7lv/183-the-veno...

Example album: https://drumkoon.bandcamp.com/album/they-tried-to-ban-this

He has hits like "Hey Gugle Play Music" and "Hey Siri Play Space Music".

mherdeg · on April 3, 2022

I enjoyed his track "Sorry Katie I Spammed Your Smart Speaker".

Lammy · on April 3, 2022

> Most albums have several versions/releases

And sometimes they're altered without any obviously-visible indication like a differing number of tracks or "edition" naming, e.g.:

- MM..FOOD? (2004) [RSE0051-2] where "Kookies" samples Sesame Street: https://www.youtube.com/watch?v=4RYCLfGE-_Q

- MM..FOOD? (2007) [RSE0084-2] where "Kookies" is totally re-recorded with a different and non-infringing beat: https://www.youtube.com/watch?v=8iYSwvdEfeY

If you use a streaming service you're hearing the latter and might not ever realize the former existed. The re-record is a fine song, but the original fits much better with the rest of the album.

dotancohen · on April 3, 2022

This is my problem with Rust in Peace. It was a terrific album, but Dave didn't want to continue paying royalties to people he no longer likes. So in 2004 or so almost the whole album was rerecorded with studio musicians. It is horrible, mechanical, flat. Even some of the vocals were redone, poorly.

I'll sign up for the streaming service that provides the original 1990 Rust in Peace.

scns · on April 3, 2022

Is that the album where they blew the advance on drugs and had none left for good mixing and mastering?

dotancohen · on April 3, 2022

That was Killing is my Business I think. Dave remastered that one too, which is actually not bad, but some things were changed and due to copyright he had to leave off the Nancy Sinatra song.

I'd love a version of Killing is my Business with the remastered content where it is identical to the original, but with none of the content or artistic changes. Something like the despecialized Star Wars videos.

lanfeust6 · on April 3, 2022

Yeah that was particularly obnoxious. According to discogs.com I think there are more recent reissues/remasters of the original version.

dotancohen · on April 3, 2022

The discogs javascript is not playing nice with the current Firefox. I'm specifically referring to the "Show more versions..." control.

OK, found it! It seems that the desirable keyword is "Reissue" and specifically to avoid the keyword "Remaster". There was at least a Reissued but not Remastered LP in 2013. Maybe more, but that javascript is horrible!

lanfeust6 · on April 3, 2022

I had the same problem earlier. Definitely a bug they recently introduced. It was easier sifting through sorting by year descending.

kzrdude · on April 3, 2022

This annoys me in Spotify so much, that we often only get one version of the album, and usually a new one instead of the original.

And maybe I should feel like it's a luxury to have extra bonus tracks with more takes at the end - but I lean over to the album purity side more - the album is one artwork by itself, I want the original track listing and nothing more, as "the album".

iggldiggl · on April 3, 2022

> If you use a streaming service you're hearing the latter and might not ever realize the former existed.

On a number of live Bob Dylan albums, for inexplicable reasons all online versions (not just streaming, but the download releases, too) are now missing the between-tracks content that on the physical CDs is part of the pregap, which means you're missing out on the funny stage banter on the 1964 Halloween concert, or the infamous heckling on the electric half of the 1966 Manchester concert.

petsounds · on April 3, 2022

There is a band from Pasadena, California called Ozma. Way back in the day (1999), they released an album called “Songs of Audible Trucks and Cars” [0] via MP3.com. I later found out that they really wanted the title to be “Songs of Inaudible Trucks and Cars” [1], but MP3.com had a character limit on album titles, so they had to shorten it

[0] https://www.discogs.com/release/8369169-Ozma-Songs-Of-Audibl...

[1] https://www.discogs.com/release/15809071-Ozma-Songs-Of-Inaud...

revolvingocelot · on April 3, 2022

If, like me, you immediately thought "why not just drop the 'and' and replace it with an ampersand?", Ozma had to do that, too, and parent's quoted the album title incorrectly, alas.

tashi · on April 3, 2022

"Songs of Inaudible Automobiles" would have fit and been fun to say aloud.

mehrzad · on April 3, 2022

Here's a funny edge case I've found in one of the most popular music services (Last.fm). There is an instrumental post-rock-ish band called Collections of Colonies of Bees and they have an album where almost all the tracks have the same name; they are all titled "Fun" except for the last track titled "Funeral". https://open.spotify.com/album/0LP8DTy4amlLIVXGOjvCo2?si=HJ3...

This has caused last.fm to think "Fun" is one of my most played songs of the year even though they were all separate tracks.
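The effect is easy to reproduce with a toy play log in Python. The track IDs and the number of plays below are made up for illustration, and keying on (artist, title) is only my assumption about what the service is doing.

```python
from collections import Counter

# Hypothetical play log: (track_id, artist, title), one play per track.
plays = [
    ("trk-1", "Collections of Colonies of Bees", "Fun"),
    ("trk-2", "Collections of Colonies of Bees", "Fun"),
    ("trk-3", "Collections of Colonies of Bees", "Fun"),
    ("trk-4", "Collections of Colonies of Bees", "Funeral"),
]

# Keyed on (artist, title), distinct tracks with the same name collapse into one.
by_name = Counter((artist, title) for _, artist, title in plays)

# Keyed on a per-track ID, they stay separate.
by_id = Counter(track_id for track_id, _, _ in plays)

print(by_name.most_common(1))
print(by_id.most_common(1))
```

With name keys "Fun" shows up with three plays; with ID keys no track has more than one.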

aasasd · on April 3, 2022

I bet the most popular track name in techno is “Untitled”.

zimmund · on April 5, 2022

Last.fm doesn't allow "untitled", "various artists", "track 01" and other titles that may suggest the song is incorrectly tagged. This leads, of course, to edge cases where an artist has a track named "untitled" and it can't be scrobbled.

Moreover, last.fm also has an issue with unicode characters and some artists/albums may have two or more different profiles. It's a mess!

aasasd · on April 5, 2022

I actually was more traumatized when I discovered that artists with the same name are filed under the same ‘profile’, and just have a textual explanation that they are actually different. You'd think that a service purporting to do ‘big data’ processing could grab discographies off Discogs or something, and compare the album titles.

Anyway, personally I long since abandoned Last.fm in favor of more hands-on discovery.

mehrzad · on April 4, 2022

I would guess that the problem only appears if it's the same artist and same album ID. Not sure if techno artists use "Untitled" multiple times on the same album, haha.

aasasd · on April 4, 2022

Oh, easily—whole careers are probably made of ‘untitled’ tracks. Afaik, in edm a more common identifier for such unnamed records and tracks is the catalog number and track number or the side (for a single).

Here's one of the best albums by Hype Williams the duo: https://www.discogs.com/master/337242-Hype-Williams-Untitled (not edm but still).

xenadu02 · on April 3, 2022

Another famous one is Prince's "Love Symbol no 2".

Fun fact: he adopted it as his stage name to spite Warner Brothers Music, due to the unfavorable contract terms and the creative meddling they kept doing (among other things). By terms of the contract they had to go along with whatever stage name he picked. They had to spend a bunch of money having a custom font created, then mailed on floppy disk to every music reviewer, newspaper, magazine, radio station, etc as part of album promotion. No one was on the internet in those days so it ended up causing them a lot of trouble. I guess he found his own loophole in the contract.

SAI_Peregrinus · on April 3, 2022

One of the simplest data model mistakes that seems nearly universal in music software is pretending that some things are single values when they're really lists.

Take "artist". An album can have a compilation of tracks, each from a different artist. A given track can have multiple artists, eg a guest singer. There can be multiple composers of a track, there can be multiple writers of lyrics, and there are usually several performers (even if they all are part of the same band).

An album can have multiple names, or no name at all (which often means people refer to it in different ways, giving it multiple effective names). Likewise with tracks.

Albums and tracks get re-released, sometimes re-mastered, sometimes re-recorded entirely, without the names changing.

The same recording of a given track might be released on multiple albums, eg as a single in a pre-release and then within the main album.

Tracks might be mastered differently for different media (Radio vs CD vs background use in cinema, etc).

Names of any of these things don't have to be representable in any existing text encoding, or even any combination of encodings.

Artist name should be a list (or other multiple element structure).

Track name should be a list.

Album name should be a list.

Etc. Assume a malicious artist will find a way to make the assumption that any field has a unique value incorrect.

wolverine876 · on April 6, 2022

> or other multiple element structure

Right. It's a relational database with lots of many-to-many relationships. We need to stop trying to make it a flat file. A FOSS SQLite schema that everyone can embed, or even a turnkey little db, would be wonderful.
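As a starting point, a tiny version of that idea might look like the following, using Python's built-in `sqlite3` module. The table and column names are my own invention for illustration (this is not the MusicBrainz schema or any existing project), and the rows are placeholder data.

```python
import sqlite3

SCHEMA = """
CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE track  (id INTEGER PRIMARY KEY);
-- A track may carry several titles, so titles live in their own table.
CREATE TABLE track_title (
    track_id INTEGER NOT NULL REFERENCES track(id),
    title    TEXT NOT NULL
);
-- Many-to-many: one track has many artists, one artist has many tracks.
CREATE TABLE track_artist (
    track_id  INTEGER NOT NULL REFERENCES track(id),
    artist_id INTEGER NOT NULL REFERENCES artist(id),
    role      TEXT NOT NULL,
    PRIMARY KEY (track_id, artist_id, role)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder rows: one track, two titles, two credited artists.
conn.executemany("INSERT INTO artist (id, name) VALUES (?, ?)",
                 [(1, "Band A"), (2, "Guest Singer B")])
conn.execute("INSERT INTO track (id) VALUES (1)")
conn.executemany("INSERT INTO track_title (track_id, title) VALUES (?, ?)",
                 [(1, "Original Title"), (1, "Translated Title")])
conn.executemany(
    "INSERT INTO track_artist (track_id, artist_id, role) VALUES (?, ?, ?)",
    [(1, 1, "performer"), (1, 2, "featured")],
)

credits = conn.execute(
    """
    SELECT a.name, ta.role
    FROM track_artist AS ta
    JOIN artist AS a ON a.id = ta.artist_id
    WHERE ta.track_id = ?
    ORDER BY a.id
    """,
    (1,),
).fetchall()
titles = [row[0] for row in conn.execute(
    "SELECT title FROM track_title WHERE track_id = ? ORDER BY rowid", (1,))]

print(credits)
print(titles)
```

Albums, releases, and recordings would need the same treatment, so a real schema grows quickly, but every "it's really a list" case above becomes just another join table.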

tzs · on April 3, 2022

It's not so much an edge case in the sense of those, which mostly concern software authors making incorrect assumptions about the data in fields, but rather a "not even realizing we need a field" case: music library managers are often quite annoying when dealing with classical music.

They really want to fit everything into the artist/album/song scheme, and that doesn't really work well for much classical music. In classical music most of us would want something like composer/piece/orchestra/conductor/performers (the "performers" field would be for performers in addition to the normal orchestra, such as soloists or choruses), and it should be possible to change the sort key order among those fields.
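A minimal Python sketch of that wish, with the field list taken from the paragraph above and a user-chosen sort order. The `ClassicalRecording` class, the `sort_recordings` helper, and the library entries are hypothetical placeholders.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class ClassicalRecording:
    composer: str
    piece: str
    orchestra: str
    conductor: str
    performers: tuple[str, ...] = ()  # soloists, choruses, ...


def sort_recordings(recordings, order):
    """Sort by the given field names, most significant field first."""
    return sorted(recordings, key=attrgetter(*order))


# Placeholder library entries.
library = [
    ClassicalRecording("Composer B", "Piece 1", "Orchestra Y", "Conductor M"),
    ClassicalRecording("Composer A", "Piece 2", "Orchestra Z", "Conductor N"),
    ClassicalRecording("Composer A", "Piece 1", "Orchestra X", "Conductor O"),
]

by_composer = sort_recordings(library, ["composer", "piece"])
by_conductor = sort_recordings(library, ["conductor", "composer"])

print([(r.composer, r.piece) for r in by_composer])
print([r.conductor for r in by_conductor])
```

The sort order is just data here, so switching from composer-first to conductor-first is a settings change and needs no different schema.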

CharlesW · on April 3, 2022

> …music library managers are often quite annoying when dealing with classical music.

Apple will apparently be solving this problem this year with a dedicated "Apple Classical" app, the origin of which is their Primephonic (https://www.primephonic.com/) acquihire.

"Why are you shutting down Primephonic?"

"To focus on creating an even better experience for customers around the world. Apple Music plans to launch a dedicated classical music app next year combining Primephonic’s classical user interface that fans have grown to love with more added features."

jay-anderson · on April 3, 2022

Definitely. Pieces end up being listed something like "Symphony No.3, Mvt. II", but they only show the performer which is the orchestra playing rather than the composer so I have no idea which symphony it is. I haven't seen any service get classical music right.

I think your list is pretty good (composer/piece/orchestra/conductor/performers). It'd be nice to also have a field for the movement. Also I totally agree with custom sort order being a must. E.g. for choir pieces swap out 'orchestra' for 'choir', for piano pieces only include 'performer' field, concertos should have the 'soloist' before the orchestra, etc.
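
A small Python sketch of the "custom sort order" idea, assuming the fields from the list above plus a movement field. The Recording class and sort_recordings function are hypothetical names, and the library entries are only sample data:

```python
from dataclasses import dataclass, field

@dataclass
class Recording:
    composer: str = ""
    piece: str = ""
    movement: str = ""
    orchestra: str = ""
    conductor: str = ""
    performers: list = field(default_factory=list)

def sort_recordings(recordings, key_order):
    """Sort by the given field names, in the order the user chose."""
    def key(rec):
        values = []
        for name in key_order:
            value = getattr(rec, name)
            # Lists (e.g. performers) compare as tuples of casefolded names.
            if isinstance(value, list):
                values.append(tuple(v.casefold() for v in value))
            else:
                values.append(value.casefold())
        return tuple(values)
    return sorted(recordings, key=key)

# Sample entries for illustration only.
library = [
    Recording(composer="Mahler", piece="Symphony No. 3", conductor="Abbado"),
    Recording(composer="Beethoven", piece="Symphony No. 3", conductor="Karajan"),
    Recording(composer="Beethoven", piece="Symphony No. 3", conductor="Bernstein"),
]
by_composer = sort_recordings(library, ["composer", "piece", "conductor"])
by_conductor = sort_recordings(library, ["conductor", "composer"])
```

Swapping 'orchestra' for 'choir', or putting a soloist first, is then just a different key_order list rather than a different schema.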

TheOtherHobbes · on April 3, 2022

It's a sleeve notes problem. Ideally you'd want a complete copy of all the info in the sleeve/inlay. This includes some combination of headline writer/composer, producer, conductor, headline performer, co-writers, arrangers, session players, engineer, studio, lyrics, and random sleeve notes/essay, possibly in multiple languages.

Per track.

Plus artwork. With credits.

Plus random thank-yous and special thank-yous to sponsors and other contributors.

The importance of each field varies with each project. For classical conductor showcase projects the conductor is the headliner. For classical composers, obviously it's the composer. For compilations it can be a concept like "Classical Chill" or "Early Music from the 15th Century." For pop projects it's usually the band/artist name. For movie tie-ins, it's the movie, or perhaps the franchise.

And so on.

You get all of this for free on CD/vinyl labels, and the fact that it's freeform doesn't matter because the sleeve artwork and visual placing of the elements make it clear who/what matters.

Trying to fit this into a single fixed metadata schema is madness. It can't be done.

This is one of those situations where trying to data-fy the information leads to huge loss of detail. You get some nominal (poor) searchability, but you give up a huge amount of interesting supporting information.

What the industry needs is a digital sleeve note standard which can handle all of this and more.

But metadata is designed for sales and stream tracking, not for user satisfaction. Which is why it's so bad at capturing what users want.

nerdponx · on April 3, 2022

A lot of music library managers (including good old iTunes) let you add a lot of various and esoteric song tag data as columns in the track list. The problem was always that the column arrangement that makes sense for one style of music doesn't make sense for another.

One missing feature that I've never seen in a music library application is the ability to have "presets" for these column headers that you can easily switch between. So you can turn on the "classical" preset which shows Composer, Performer, etc., and you can turn on the "contemporary" preset which shows Artist and Album.

Semaphor · on April 3, 2022

That’s actually something MediaMonkey [0] (Windows-only) had for a long time (I assume the current electron version as well, I use the last native version). Classical is one of the defaults, but you can add whatever.

It’s one of my favorite pieces of software, a true music library manager and not a music player with some management features tacked on.

[0]: https://www.mediamonkey.com/

Aidevah · on April 3, 2022

Agreed, I've tested many self-hosted music streaming servers and decided to settle with Navidrome in the end, but not without major reservations. I personally don't mind sorting music via album/song, but the lack of a composer field is really painful. Unfortunately the omission of composer information occurs in the Subsonic API, which has a large number of compatible players, so adding the field is not as straightforward if one wishes to do it in an interoperable way.

I used to be quite happy with a local music library and playing it with mpd, but at some point I got sick of maintaining over 100 GB of music library across multiple devices. The streaming solutions for mpd are quite awkward to use, otherwise I would have stuck with it just to get the composer field.

thaumasiotes · on April 3, 2022

There's an analogous problem going on on musescore.com, where you're supposed to list the work your score is based on. Viewing a score on the site will then include links to other scores based on the same work.

I uploaded several pieces from Swan Lake, and listed each of them as the particular piece that they were. For example, https://musescore.com/user/36584999/scores/6642212 is Swan Lake, No. 4 (Pas de trois), part IV.

But every time I list a piece like this, it gets "corrected" to the canonical reference, which is "Swan Lake". And the "other scores based on the same work" are totally useless, because they can be any piece from anywhere in the ballet. It's more than 600 pages long! Where's the value add from removing my indication of exactly which piece of Swan Lake I just scored?

throwaway684936 · on April 3, 2022

Presenting my new album: SQL Injection'); DROP TABLE Artists;--
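
For what it's worth, that title is harmless as long as it is passed as a bound parameter instead of being spliced into the SQL text. A quick sketch with Python's sqlite3 (the table names are just borrowed from the joke):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artists (name TEXT)")
conn.execute("CREATE TABLE Albums (title TEXT)")

title = "SQL Injection'); DROP TABLE Artists;--"

# The ? placeholder sends the title as data, never as SQL text.
conn.execute("INSERT INTO Albums (title) VALUES (?)", (title,))

stored = conn.execute("SELECT title FROM Albums").fetchone()[0]
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
```

The title round-trips unchanged and the Artists table is still there afterwards.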

Tokkemon · on April 3, 2022

Amazing!

aaaaaaaaata · on April 3, 2022

Provocative, you could say.

f7ebc20c97 · on April 3, 2022

I'm gonna name my band an 18 TB string of random characters, interspersed with an "idea" that you can't even put into text! Take THAT, Unicode!

javajosh · on April 3, 2022

Ooh, that gives me the idea to name my album the base64 encoding of its tracks. The album itself, of course, would be empty.

akira2501 · on April 3, 2022

I believe you would no longer be able to copyright the track, then. So, you should encrypt the encoded titles, but record an audio track that is a voice recording of you reading out the encryption key for that track. You know, for DRM.

seba_dos1 · on April 3, 2022

> I believe you would no longer be able to copyright the track

Why? I'm not sure about US law, but here in Poland it would definitely still be copyrighted (even if you'd be effectively granting an automatic implicit license to use that name for the purpose of referencing the work).

akira2501 · on April 3, 2022

My logic is that you cannot copyright a fact. It is a fact you released an album, and it is a fact you titled it such. I'm free to reproduce that information, say in a catalog, without violating copyright.

seba_dos1 · on April 4, 2022

Being able to reproduce something in a particular context without violating copyright does not mean it's not copyrighted.

labster · on April 3, 2022

It doesn’t matter if he can copyright it, he can still sell NFTs of each album track.

eyelidlessness · on April 3, 2022

Hi are you Fiona Apple and if so can we be friends?

scns · on April 3, 2022

Is that the limit on Postgres' TEXT column?

FabHK · on April 3, 2022

Meta Comment:

I much prefer this informative post (with a clear title and plenty of examples and explanations) to the uncommented "Falsehoods programmers believe about ..." listicles.

MereInterest · on April 3, 2022

Absolutely. Stating that an assumption is false with no further information doesn't help me update my mental model of the world. Giving an example of how an assumption is incorrect shows the type of errors that can occur, when they can occur, and lets the reader figure out a better way to work around the false assumption.

thristian · on April 3, 2022

Peter Gabriel's first four solo albums were untitled. Well, the fourth was titled in America, but untitled everywhere else. Fans tend to distinguish them by the cover art: there's "Car", "Scratch", "Melt", and "Security".

sharph · on April 3, 2022

It's worse than that, they were self-titled. So if there were some mechanism for numbering untitled albums, it can break down.

kingcharles · on April 3, 2022

LOL. I used to work for PG. I created a schema for storing music metadata for him, so it's somewhat ironic.

eashman · on April 5, 2022

Oh please tell this old Genesis fan some more. Gabriel was a very forward thinking musician from a technology perspective. See this clip for his demonstration of an early sampler [0]

[0] https://youtu.be/7Xfj5n1kYXY

kingcharles · on April 5, 2022

Ha, not many stories. He's a nice guy. I worked for him like 1999-2004? Through the 'Up' album, which I think is really underrated. Mostly Real World stuff though, as that has been his primary focus. Some cool parties at his studio, which is beautiful. I remember one when Martha from Martha and the Muffins had just finished producing an album and was moving on.. there was a party and bands.. and then PG got up on stage and took over the keyboard and started playing the beginning of "Echo Beach" and Martha was laughing in the crowd and saying "Nooo!", but you can't say No to PG, so she had to get up on stage and her and PG gave us an impromptu performance of "Echo Beach" which was awesome.

p.s. that video was awesome. Were those 8" floppies it used?

eashman · on April 5, 2022

Yeah I was starting to use PCs during the Apple II era so we didn’t have 8 inch disks anymore by then in my house. But I do remember seeing them in labs with VAX systems.

But those specialized music workstations all were custom designed so the parts etc. used were based on older standards I’m sure. Pre-ISA bus too I would guess.

Nowadays, even a lot of embedded devices like IOT things are running effectively a Linux OS on them so the economies of scale are enormous compared to the early 80s music gear.

I imagine Roland and other similar companies had their own OS or base platform they used on those earlier workstations that eventually had screens you could attach.

dehrmann · on April 3, 2022

Weezer's self-titled albums are distinguished by background color of the cover art.

can16358p · on April 3, 2022

What is the deal with Japanese exclusive bonus tracks/promo thing?

Interested about why a significant number of albums have Japanese promo versions with extra tracks. Is that a cultural thing about Japan or is there another specific reason? (I know it's not only Japan but mostly Japanese editions that I see throughout the web)

It's almost as if even brouillard has a Japanese bonus track named brouillard on their album brouillard.

Hamuko · on April 3, 2022

I heard somewhere that it was an incentive for the Japanese market, which still today is a large physical CD market, to buy the locally distributed releases of albums instead of just buying the original foreign release, which may have been available earlier and for cheaper, or if they already bought the original release, buy it again.

dotancohen · on April 3, 2022

Is this why the super-high quality vinyls were also available only in Japan? I cannot remember the name of the company, but there is (was) some company selling increased dynamic range recordings of popular albums, on very durable (as in, regularly replace the needle) vinyl. I had heard an imported (from Japan) Dark Side of the Moon recording on that once, and it just blew me away.

Hamuko · on April 3, 2022

I imagine the super-high quality vinyls are because of a very special breed of Japanese audiophiles who don't mind spending on their hobby.

https://www.wsj.com/articles/a-gift-for-music-lovers-who-hav...

dotancohen · on April 3, 2022

I was likely wrong, it wasn't a Japanese company. This is most likely the company that produced the high-quality recording that I heard:

https://en.wikipedia.org/wiki/Mobile_Fidelity_Sound_Lab

scns · on April 3, 2022

I once read somewhere that CDs are expensive when new, can't confirm though. When I was there I bought them used for cheap. Slayer's Hell Awaits, colored like egg yolk ~6€ IIRC, thank you very much.

nailer · on April 3, 2022

The odd one for me is Pendulum - there’s two different Australian acts, both electronic, with the name Pendulum. For some reason the newer Pendulum really liked the name and weren’t bothered / ethically concerned that the Pendulum that made ‘Coma’ had already existed. As of today Spotify has songs from both artists under the same artist name.

lukeh · on April 3, 2022

Plenty of duplicate artist names, Spotify artist support will split them into separate artists if you contact them, and now aggregators like DistroKid let you specify the actual artist ID when uploading (at least for some DSPs). It’s just another instance of there are only two hard problems :)

lukaslalinsky · on April 3, 2022

There are much worse things than Pendulum. John Williams seems a very popular name for artists. :) Back when I was designing the "new" MusicBrainz database, that's what we have been using as the test case for this.

tomcam · on April 3, 2022

Yeah my favorite is definitely Pendulum.

photon-torpedo · on April 3, 2022

To add to the list: Song titles that cannot be fully represented in Unicode. Example (also to plug my favorite band): Magma, a French progressive rock [*] band that's been (on and off) active for more than 50 years, mostly uses their own invented language "Kobaïan" where some of the characters use diacritics that are (to the best of my knowledge) not present in Unicode. E.g. there's an S with a kind of crown on top.

[*] Actually their style of music is unique enough that it's given rise to its own genre, "Zeuhl".

gorgoiler · on April 3, 2022

Catalogue numbers!

You wouldn’t index a collection of records by the shape of its sonic wave forms, would you?

No more would you index by the artistic representations on the cover which, as we have seen, includes everything including the words purporting to be the artist name and the title of the release.

No, of course not! You use the barcode on the back!

https://en.m.wikipedia.org/wiki/Catalog_number_(music)

For white labels, I guess you have to hide them in a pile until they get a release.

throwaway384813 · on April 3, 2022

Bookmarking this for future reference!

I recently started a fun/side project which aims to improve my personal tedious process of (a) managing a queue of things I want to listen to, (b) keeping track of which of them I listened/liked and (c) easily sharing my "discovery activity" with my friends who share the same passion.

The idea is that you can add items to a "Queue" list, using Discogs Master URLs (e.g. https://www.discogs.com/master/1994809-Alt%C4%B1n-G%C3%BCn-Y...) or Spotify album links, because this is what I use in my day to day.

So I've started by parsing Discogs' data dumps[1] to backfill my database with all "master" releases (aka "release groups" in MusicBrainz), so that I avoid hitting the Discogs API with every user query. But then I realized that (a) they don't include cover images, (b) not every release has a corresponding "master" release and (c) they're not updated frequently enough (maybe monthly).

Then I thought of using MusicBrainz data dumps[2] to augment the entries by Discogs in my database, which do provide cover art images, but then how do I correlate them with the already-inserted Discogs releases? Fortunately there's a Discogs URL for many release groups from MusicBrainz; unfortunately only 50% of them do have it.

Then I could perhaps use solely MusicBrainz data and not Discogs at all, but then what if people are mostly using Discogs links? This will result in two identical records on my database, pointing to the same underlying release but in different sites (Discogs & MusicBrainz). Perhaps this can only be solved by human moderation and providing the ability to "merge two items".

Then I thought of actually contributing to MusicBrainz and add the Discogs URL for every release group.

Another interesting assumption I made, which turned out to be incorrect, was that every [artist, release title] pair was unique. But that's not the case - an artist may have multiple "master" releases (i.e. completely different tracks inside) with the same title.

[1] http://data.discogs.com/

[2] https://musicbrainz.org/doc/MusicBrainz_Database

egypturnash · on April 3, 2022

I am amused that despite mentioning both Keygen Church and Master Boot Record, this list refrains from mentioning that the former is a side project of the latter.

And, yeah, if I was still collecting my music as CDs? Keygen Church would be filed next to the MBR because of that, not over in the Ks.

rcthompson · on April 3, 2022

AC/DC is another edge case I've seen, mainly because in some contexts slashes are used as separators between multiple artists, so "AC/DC" could be misinterpreted as a list of 2 artists named "AC" and "DC".
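
The real fix is to store artists as a list and never parse them out of a string. If you are stuck with slash-delimited legacy data, a sketch like the following Python can at least keep known names intact. The split_artists function and the known_artists set are hypothetical; a real system would need a much larger lookup:

```python
def split_artists(raw, known_artists, sep="/"):
    """Split a legacy delimited artist string, keeping known names intact."""
    if raw in known_artists:
        return [raw]
    parts = [p.strip() for p in raw.split(sep) if p.strip()]
    result = []
    i = 0
    while i < len(parts):
        # Greedily try the longest run of parts that forms a known name;
        # a single part is always accepted as a fallback.
        for j in range(len(parts), i, -1):
            candidate = sep.join(parts[i:j])
            if j - i == 1 or candidate in known_artists:
                result.append(candidate)
                i = j
                break
    return result

known = {"AC/DC"}
single = split_artists("AC/DC", known)
mixed = split_artists("AC/DC / Moby", known)
plain = split_artists("Moby/B-Motion", known)
```

This only helps for names already in the lookup, which is exactly why the delimiter approach is fragile in the first place.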

amelius · on April 3, 2022

Doesn't work well in filenames either.

SAI_Peregrinus · on April 3, 2022

"/Τĥιs ñåmè įß ą váĺîδ POSIX paτĥ" is my display name on some services at my job. It helps to remind people to consider things like spaces, homoglyph attacks, and non-ASCII characters.

eschneider · on April 3, 2022

And for new bands: Maybe try not to name yourself after a Siri/Alexa/Whatever reserved word. Just try asking Siri to play a song by "Them." Ugh.

rzzzt · on April 3, 2022

"The The" is my go-to example!

tinus_hn · on April 3, 2022

It’s a very ingenious and effective anti-piracy measure!

moogly · on April 3, 2022

Adding to the list:

There are two well-known Avishai Cohens, both jazz musicians (so you can't disambiguate on the broader genre), both from Israel (so you can't disambiguate on country of origin), but luckily they do play different instruments (bass and trumpet).

---

The Australian post-rock/shoegazeish band The Dead Sea https://www.discogs.com/artist/1450625-The-Dead-Sea released a self-titled EP with 3 tracks in 2006.

2009 they released a full self-titled album (only 1 song from the EP featured on it).

I can't even find the "second" release on discogs.

Both were self-released, so no way to disambiguate on ISMN or record label/publisher.

So you'd have to disambiguate only on the year of release, or define what demarcates an EP from a full album and mark them as such (there are some competing definitions out there, since it became less clear after the introduction of the CD, and now digital, but even back then you had complications like maxi singles and mini-LPs, IIRC).

---

Many of John Zorn's releases tend to be headaches:

"Tap"

Composed and produced by: John Zorn

Performed by: Pat Metheny

Subtitle "Book of Angels, vol. 20"

Overarching series "Masada Book 2"

Released on Tzadik (Zorn's personal label) as a John Zorn/Pat Metheny output.

Licensed to Nonesuch (Metheny's label) as a Pat Metheny release with different cover.

adrianmonk · on April 3, 2022

> There are two well-known Avishai Cohens, both jazz musicians (so you can't disambiguate on the broader genre), both from Israel (so you can't disambiguate on country of origin), but luckily they do play different instruments (bass and trumpet).

Along very similar lines, there are also famously two musicians named Bill Evans. Both play jazz (genre) and both are from the US (country). As a bonus, both have been members of bands led by Miles Davis. And again you can differentiate them by instrument (piano and saxophone).

Which is sort of what Wikipedia article titles do to differentiate them:

https://en.wikipedia.org/wiki/Bill_Evans

https://en.wikipedia.org/wiki/Bill_Evans_%28saxophonist%29

I guess Wikipedia also differentiates by level of fame, since the pianist article title doesn't have a qualifier.

lanfeust6 · on April 3, 2022

You also can't find Zorn's music streaming anywhere. I've heard of one store that offered digital purchases, but not for the whole discography.

You can fill a large closet collecting just his stuff so I'm not sure why digital is off the table.

thrdbndndn · on April 3, 2022

> Weird album names:

> † from Justice

> from David Bowie

> ' from Frank Zappa

> () from Sigur Rós

How are these albums names "edge cases"? I feel like ANY software would deal with them just fine. Hell, the last two don't even use non-ascii characters (though you definitely should have unicode support regardless).

Edit: I just realized HN won't display the star symbol (U+2605). Which is weird since it does support characters like Chinese/Japanese glyphs あいうえお你好

Beltalowda · on April 3, 2022

There are actually some challenges with stuff like that; how do you sort "'" for example? Most people call the album "Apostrophe"; should it be sorted under "A"? What about the "†"? "D" for dagger? Somewhere else? How do you find things like that in the search box? Should searching for "star" return that Bowie album?

> Edit: I just realize HN can't display [star]

It doesn't do emojis and various graphic characters. This is supposedly a "feature" shrug-emoji-here

Freak_NL · on April 3, 2022

Sorting is usually solved by allowing for optional separate sorting-key tags. So ' :

    album='
    albumsort=Apostrophe

These can act as search input as well, so searching for 'apostrophe' could yield it depending on the software used.

† can either just sort wherever Unicode puts it, or if the album has a known pronunciation like 'star' above, have a sorting key too.
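
A minimal Python sketch of that fallback, assuming tags arrive as a plain dict with optional "albumsort" keys as in the example above. The helper names and the two "Example" albums are made up:

```python
def sort_key(tags):
    """Prefer the explicit sort tag; fall back to the display name."""
    return (tags.get("albumsort") or tags.get("album", "")).casefold()

def matches(tags, query):
    """Search both the display name and the sort key."""
    q = query.casefold()
    return any(q in tags.get(k, "").casefold() for k in ("album", "albumsort"))

albums = [
    {"album": "Example Z"},                      # placeholder title
    {"album": "'", "albumsort": "Apostrophe"},   # from the example above
    {"album": "Example B"},                      # placeholder title
]
ordered = [a["album"] for a in sorted(albums, key=sort_key)]
found = [a["album"] for a in albums if matches(a, "apostrophe")]
```

Here ' sorts under A and is found by searching for "apostrophe", while albums without a sort tag behave as before.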

rzzzt · on April 3, 2022

Wikipedia lists it as "Cross": https://en.wikipedia.org/wiki/Cross_(Justice_album)

Beltalowda · on April 3, 2022

> † can either just sort wherever Unicode puts it, or if the album has a known pronunciation like 'star' above, have a sorting key too.

Sure, but it doesn't happen automatically, so it's not "just do [..]". You need to think about some of these edge cases.

I wrote a simple music library/player a bunch of years ago and ran into some of these issues. I never did figure out how to deal with the band named "" in a sane way.

thrdbndndn · on April 3, 2022

You can "just do" the default string sorting. The system (or programming language you're using, etc.) always has a way to sort them by default.

And it doesn't have to be in A to Z. It could be (and IMO, should be) in a "special" section, which such initial-index catalogs typically already have.

kzrdude · on April 3, 2022

The aversion to emojis is strange. I've come around to thinking of them as a net positive. Just text is sterile and the tone often comes off wrong! Smiles lighten up conversation, we need that to not lose track of each other's humanity when arguing in comment threads. :)

kingcharles · on April 3, 2022

This is more of an edge case:

https://cdn.shopify.com/s/files/1/0022/8193/0805/products/im...

When music downloads were first starting up in the late 90s I was responsible for building a schema to store all the world's music. We had to rip millions of CDs. When I got that Aphex Twin one above I gave up. I think we just called it [Track 2] or [Maths symbols] or something.

kingcharles · on April 3, 2022

HN strips emojis. I can't even paste my web site domain on here.

joshspankit · on April 4, 2022

It's worth noting as well, that some of these edge cases are directly because of artists intentionally "messing with" whatever they perceive as hard rules.

The reason I mention this is because you cannot simply look at a list of all edge cases, identify what schema you can use to encapsulate them all, and then build your service around the idea that the schema will never change.

As a very simple (I hope contrived) example: You could decide to use a string delimiter so that you can capture "all the strings" and still use string interpolation. You then look through all the edge cases and see "No one has named anything with U+1F7F6, so we're safe to use that". 100% guarantee that as soon as someone sees that they will name a track or an album in the exact way they need to to break your system.
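
To make the contrived example concrete, here is a Python sketch comparing the "safe delimiter" approach with a structured encoding (JSON is just one option; the function names are hypothetical):

```python
import json

DELIM = "\U0001F7F6"  # the supposedly unused code point from the example

def join_fragile(fields):
    return DELIM.join(fields)

def split_fragile(blob):
    return blob.split(DELIM)

def join_safe(fields):
    # JSON quotes and escapes each string, so no value can collide
    # with the structure, whatever characters it contains.
    return json.dumps(fields, ensure_ascii=False)

def split_safe(blob):
    return json.loads(blob)

# A track name that deliberately contains the delimiter.
hostile = ["Some Artist", "Track " + DELIM + " Name"]

fragile_round_trip = split_fragile(join_fragile(hostile))
safe_round_trip = split_safe(join_safe(hostile))
```

The delimiter version turns two fields into three, while the structured version returns the original list.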

quickthrower2 · on April 3, 2022

No mention of Prince?

nocman · on April 3, 2022

You mean "The artist formerly known as the artist formerly known as Prince"?

:-D

joshspankit · on April 4, 2022

Another great one for the history books. His symbol broke so many music company systems and the story earned the respect of many.

hamstergene · on April 3, 2022

Electronic music is also full of remixes, for example:

Artist: Moby

Title: Simple Love (B-Motion Remix)

The actual artist is B-Motion and any kind of browser/catalog must group this track with other B-Motion music, or in worst case both, because remixes are in entirely different genres than the original.

So "Artist" is not necessarily the artist.
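
If all you have is the title string, a heuristic like this Python sketch can at least file the track under both names. It assumes the "Title (Name Remix)" convention at the end of the title, which many releases do not follow, and browse_artists is a made-up helper:

```python
import re

# Assumed convention: the title ends with "(<remixer> Remix)".
REMIX_RE = re.compile(r"\((?P<remixer>[^()]+?)\s+Remix\)\s*$", re.IGNORECASE)

def browse_artists(artist, title):
    """Return every artist name a browser might file this track under."""
    names = [artist]
    m = REMIX_RE.search(title)
    if m and m.group("remixer") not in names:
        names.append(m.group("remixer"))
    return names

remix = browse_artists("Moby", "Simple Love (B-Motion Remix)")
original = browse_artists("Moby", "Simple Love")
```

A proper remixer credit in the metadata would be far more reliable than guessing from the title.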

creshal · on April 3, 2022

- Artist names can become rather long too, especially for collaborations. Some anime OSTs easily break 256 characters.

- And for song titles that need careful escaping, QuelI->EX[cez]->{kranz}; and EXEC_SOL=FAGE/. have tripped up more than one CLI music player in my experience.

morelisp · on April 3, 2022

A simpler explanation is that your computer has an unstable grathnode installed.

kazinator · on April 4, 2022

I don't think it's meaningful to talk about edge cases without a set of requirements connecting inputs to desired outputs: what you want to do with the data, or hope to get out of it.

Features like long album names, empty names, or names with international characters from Unicode, are hardly edge cases. Edge cases have to have some surprise to them.

Coming from a blank slate, you'd never approach a problem with the assumption that all strings are 32 characters or less, confined to 7 bit ASCII, and are never empty, and then paint your face surprised and call "edge case!" when they aren't.

whyoh · on April 3, 2022

Tagging challenges are one reason why I sort and browse music by folders/filesystem.

klez · on April 4, 2022

Aren't you just moving the complexity from tags to the filesystem that way? How does that solve the problem in a way that tags don't?

whyoh · on April 4, 2022

Yeah I should have made it clear that I mean specific problems, like musicians who release an album under a different name... With a folder based approach it's easier to put all of those in one place. The filesystem also has some drawbacks, like a more limited character set. But all in all, I like it.

philjohn · on April 3, 2022

Similar problems when dealing with books.

J.K. Rowling has written works under other names, e.g. Newt Scamander.

And then you have the case of the (actually recorded this way by the Library of Congress) author "Mark Twain (ghost, through mediumship)"

inetsee · on April 3, 2022

How about the horrible edge case when trying to read this pale blue text against a white background. It's been a while since I criticized the readability of a web page, but I think this page deserves a critique.

lacrosse_tannin · on April 3, 2022

There's also different releases of the same album... 2020 remaster, vinyl rip, web flac, japan edition...

Sometimes the music app just smooshes all together so the "album" looks like it has 4 of each track

Bud · on April 3, 2022

Pretty disappointing to see an entire article like this go by without a single mention of classical music. Also known as basically all music prior to 1940 plus a lot of the most enduring music since.

zokier · on April 3, 2022

> Also known as basically all music prior to 1940

Sure, if you ignore blues&jazz, traditional folk, and all non-western music.

Karuma · on April 3, 2022

They do mention Tchaikovsky there... Or what do you mean, exactly?

tomcam · on April 3, 2022

Not sure I understand? There are accepted names/sequencing for all major Baroque, Classical, Romantic etc. artists: Mozart K numbers, Bach BWV numbers, etc. And usually there’s a close to unique name like Adagio in D Minor, Piano Sonata in Bb Major, etc.

repsilat · on April 3, 2022

I guess they mean that there are no original recordings for most of it, lots of cover bands etc.

parenthesis · on April 3, 2022

I was searching for the artist Urban Soul on Tidal to discover there's another artist called Urban Soul (I wanted the Roland Clark project) with albums by both mixed up together in the search result.

'Twas all in vain, since I wanted a particular remix of `Alright' — which I have on a vinyl promo less than a metre from where I am sitting (I can remember clear as day buying it more than 25 years ago) — which Tidal don't have. So I ended up listening to it on YouTube.

bobbyi · on April 3, 2022

Amongst major artists, Tupac is good at causing confusion. Is he Tupac, 2Pac or Tupac Shakur? Does it depend on the album? Is his alter ego Makaveli a different artist?

geenew · on April 3, 2022

Richard D James == "Aphex Twin" || "AFX" || "Caustic Window" || "Bradley Strider" || "GAK" || "user48736353001" || "Polygon Window" || "Universal Indicator" || "The Tuss" || "Powerpill" || "Mike & Rich"

(for me, 'artist' is the alias, 'album artist' is 'Richard D James'...)

webmaven · on April 3, 2022

Don't forget Garth Brooks|Chris Gaines and Eminem|Slim Shady

emptybits · on April 3, 2022

Yup. ARTIST_ID does not have a 1:1 relationship with ARTIST_NAME over time.

Another example... from 1976 to present day: Johnny Cougar → John Cougar → John Cougar Mellencamp → John Mellencamp.

Close enough for library software to store/recall them all as "John Mellencamp"? Maybe. But perhaps not to the liking of fans, historians, or the artist.
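
A sketch of one way to model that in Python: the ID is the identity, and names are just a many-to-many lookup on top of it. ArtistIndex and the numeric IDs are hypothetical, not any service's real data model:

```python
from dataclasses import dataclass, field

@dataclass
class Artist:
    artist_id: int
    # Every name the artist has released under, in the order it was added.
    names: list = field(default_factory=list)

class ArtistIndex:
    def __init__(self):
        self._by_id = {}
        self._by_name = {}  # casefolded name -> set of artist ids

    def add(self, artist_id, *names):
        artist = self._by_id.setdefault(artist_id, Artist(artist_id))
        for name in names:
            if name not in artist.names:
                artist.names.append(name)
            self._by_name.setdefault(name.casefold(), set()).add(artist_id)
        return artist

    def lookup(self, name):
        """A name can map to several artists, and an artist to several names."""
        return sorted(self._by_name.get(name.casefold(), set()))

index = ArtistIndex()
mellencamp = index.add(1, "Johnny Cougar", "John Cougar",
                       "John Cougar Mellencamp", "John Mellencamp")
index.add(2, "Pendulum")  # two unrelated acts sharing one name
index.add(3, "Pendulum")
```

Each release can then store the name as credited alongside the artist ID, so nothing gets rewritten to "John Mellencamp" after the fact.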

Klaster_1 · on April 3, 2022

Imagine documenting something like this https://www.discogs.com/artist/661099-Filipe-Santos (check out the edit history) where the artist actively sabotages the documentation effort and makes their music unavailable on a whim.

ux · on April 3, 2022

I wrote something similar a while back: http://blog.pkh.me/p/15-the-music-classifying-nightmare.html

buro9 · on April 3, 2022

I've recently wondered whether it's worth periodically reprocessing the metadata on music files with musicbrainz Picard.

It was all done when added to the catalog, but presumably the metadata could be updated and a periodic refresh would be a benefit.

defensem3ch · on April 3, 2022

There's also Goto80's Files in Space https://www.goto80.com/goto80-files-in-space-mcmp3-data-airl...

vermooten · on April 3, 2022

Don’t forget how much streaming services struggle with classical music metadata.

jojojaf · on April 3, 2022

Does anyone recognise the equation from the Aphex Twin track?

∆Mᵢ⁻¹=−α ∑ Dᵢ[η][ ∑ Fjᵢ[η−1]+Fextᵢ [η⁻¹]]

With the ∆M and the sum over Fs, I wondered if it might be something to do with rocket dynamics derived from Newton's second law?

trelliscoded · on April 3, 2022

It's pseudomathematical nonsense. The Fji term might be a hint about the image hidden in the track when plotted on a spectrogram, but the rest of it doesn't make much sense. You can see the original title here:

https://i.discogs.com/YC8CLuli-NW3VB5z7eA9mALjnL2Qef4aGyDYH4...

As an example, why would the index of the last term be inverted like that? Maybe it's supposed to mean something phonetically, but I'm not a music person so I can't figure it out.

jojojaf · on April 3, 2022

Oh I didn't realise the eta was the index, I was reading the sum as over i on the right hand side

shiomiru · on April 3, 2022

Unicode support is one thing. I frequently have to re-tag Japanese music because its tags are encoded in shift-jis which in general doesn't work on Linux.
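
For anyone scripting the re-tagging, a rough Python sketch of the decoding side. It assumes you can get at the raw tag bytes (or the already-garbled string) through whatever tagging library you use; decode_tag and fix_mojibake are made-up helper names, and trying UTF-8 before Shift-JIS is a heuristic rather than a guarantee:

```python
def decode_tag(raw: bytes, encodings=("utf-8", "shift_jis")) -> str:
    """Try each encoding in order and return the first clean decode."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what is readable and mark undecodable bytes.
    return raw.decode("utf-8", errors="replace")

def fix_mojibake(text: str, wrong="latin-1", right="shift_jis") -> str:
    """Undo a tag that was decoded with the wrong codec."""
    return text.encode(wrong).decode(right)

raw = "こんにちは".encode("shift_jis")   # what a Shift-JIS tag holds on disk
garbled = raw.decode("latin-1")          # what a Latin-1-assuming reader shows
```

decode_tag(raw) and fix_mojibake(garbled) both recover the original text here; ambiguous byte sequences in real files may still need a manual check.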

nescioquid · on April 3, 2022

I wonder what the "You suck, Flying Circus" sounds like.

wdr1 · on April 3, 2022

I feel like musicians were doing Little Bobby Drop Tables long before the XKCD comic.

joshspankit · on April 4, 2022

eyelidlessness · on April 3, 2022

I was all geared up for a “misconceptions software developers have about _” post but this seems mostly like … Unicode exists and data modeling should at least comprehend the domain?

jandrese · on April 3, 2022

Or really: Natural Language Processing still sucks. Trying to fit anything human-produced into a reasonable data model is doomed to failure.

cosmotic · on April 3, 2022

Very "creative"

tomcam · on April 3, 2022

They missed a few common ones: Led Zeppelin’s fourth album with Stairway to Heaven on it was named only with Prince-like symbols, and of course the group named The The (which Google indexes well).

deadbyte · on April 3, 2022

The Inertia Variations is fantastically uninspiring.

cgh · on April 3, 2022

Both of these are mentioned in the article, or were you referring to something else?

tomcam · on April 3, 2022

Well I managed to miss them both! Bad headache while I was reading it. Sorry and thank you

recursive · on April 3, 2022

I was hoping this was going to be about the musical part of music rather than the bookkeeping.

cratermoon · on April 3, 2022

I was hoping it was going to be about music engraving. See e.g. https://twitter.com/ThreatNotation

Tokkemon · on April 3, 2022

Most of this is great, but I'm calling BS on the Tchaikovsky example. Yes, there have been multiple historical spellings of his name, and to be absolutely accurate you would use the Cyrillic. However, the vast, vast majority of use cases should use the modern accepted spelling in English when referring to Tchaikovsky, which is what Wikipedia uses. There's a line you have to draw somewhere between accuracy and comprehension by the user, and "Pyotr Ilyich Tchaikovsky" is a pretty damn good one. The only other one I see commonly these days is the slightly more Anglicized "Peter Ilyich Tchaikovsky", but that's it. Most other variants are historical or just reflect different editorial decisions, but they all refer back to the same guy, not a living artist who is changing his stage name all the time.

blahedo · on April 3, 2022

> ...in English...

Even in English-language contexts I've seen variants of the German-style transliteration on occasion. But more importantly, the author is not restricting himself to English, and indeed the MusicBrainz link gets to such a high count precisely by including other languages' transliterations. Only three are listed as English, and they're the two you mention (with/without Anglo first name) plus one that omits the middle name.

Were you saying that all other languages should use "the modern accepted spelling in English", or were you just ignoring the fact that speakers of other languages (or owners of foreign-published albums) might use music databases?

kzrdude · on April 3, 2022

The database might implicitly force a monolingual perspective on the dataset if it can't handle multiple translations of several fields. It's not an easy problem to manage, I imagine.
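
To make that concrete, here is a rough Python sketch of one way a record could carry several transliterations of the same name. It is only an illustration: the `Artist` class, its field names, and the fallback order are all made up for this example and are not how MusicBrainz or any streaming service actually models it, and the German spelling is just sample data.

```python
from dataclasses import dataclass, field


@dataclass
class Artist:
    # Name in the artist's own script, used when no alias matches.
    canonical_name: str
    # Hypothetical mapping from a locale or language code to a display name.
    aliases: dict[str, str] = field(default_factory=dict)

    def display_name(self, locale: str) -> str:
        """Pick the most specific name available for the given locale."""
        if locale in self.aliases:
            return self.aliases[locale]
        # Fall back from e.g. "en-GB" to "en", then to the canonical name.
        language = locale.split("-")[0]
        return self.aliases.get(language, self.canonical_name)


tchaikovsky = Artist(
    canonical_name="Пётр Ильич Чайковский",
    aliases={
        "en": "Pyotr Ilyich Tchaikovsky",
        "de": "Pjotr Iljitsch Tschaikowski",  # illustrative sample data
    },
)

print(tchaikovsky.display_name("en-GB"))
print(tchaikovsky.display_name("ja"))
```

The hard part is everything this leaves out: which alias is primary within a language, sort names, and who gets to decide.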

nescioquid · on April 3, 2022

Here "edge-case" indicates that music is not mere noise, while somehow remaining an instance of "marketing" or "acculturation".

This is a good example of what happens when the insensate take power and pander to those who just want to read the "rule book".

No art. No expression. No bother.

Let's just agree on some fucking fiction that gets record labels to bother with classical music metadata in a way that meets the real world. Can we start there, at least?

edit: click the inverted triangle if you are a jack-ass

thewakalix · on April 3, 2022

You didn’t say “only if”, so it’s still possible to click the inverted triangle without being a jack-ass.

nescioquid · on April 3, 2022

[flagged]

cgh · on April 3, 2022

This is pretty classic r/iamverysmart material right here.