DDEX is the set of metadata standards used across the music industry: https://ddex.net/
Also worth knowing that not all content on streaming services is music! Some of it is spoken word, ASMR, non-music field recordings, etc. The difference between sound and music is subjective of course though.
On an artistic note, music can of course be presented in an infinite number of ways. Not all of these can be represented (i.e., re-presented) on streaming platforms: installations and generative music, for example. That's ok! To be able to represent all this music (and non-music) requires restrictions, otherwise we wouldn't be able to create programs for it in the first place.
Sometimes restrictions on sound content are to make the streaming services friendlier to use, like the formatting of featured artists in titles. Royalty laws put restrictions on the handling of different roles, like songwriters and composers.
BUT nothing is stopping you from building your own way of representing music digitally, so long as you follow relevant laws.
I still think it's pretty bad for classical music. An important question for various pieces is who plays a particular instrument, but there really isn't a nice way to encode this in the metadata (encoding the relation of both person and role where role could be anything).
I also usually want to know who the conductor was and what the orchestra was.
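For what it's worth, the person-plus-role relation described above can be sketched as a list of free-form credits per recording. This is only an illustration: the `Credit` and `Recording` classes and all the names in the sample data are made up here, not taken from DDEX or any real schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credit:
    person: str
    role: str  # free-form on purpose: "conductor", "orchestra", "oboe", ...


@dataclass
class Recording:
    work: str
    credits: list[Credit] = field(default_factory=list)

    def who(self, role: str) -> list[str]:
        """Return everyone credited with the given role (case-insensitive)."""
        wanted = role.casefold()
        return [c.person for c in self.credits if c.role.casefold() == wanted]


# Placeholder names, not a real recording.
rec = Recording(
    work="Example Concerto",
    credits=[
        Credit("Composer A", "composer"),
        Credit("Conductor B", "conductor"),
        Credit("Orchestra C", "orchestra"),
        Credit("Soloist D", "oboe"),
        Credit("Soloist E", "oboe"),
    ],
)

print(rec.who("Conductor"))
print(rec.who("oboe"))
```

Because the role is just a string attached to a person, "who plays the oboe" and "who conducted" are the same kind of query, and a role nobody anticipated needs no schema change.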
I'd expect publishers of classical music to know the difference between a composer, a conductor, a soloist, and an orchestra, but no, apparently not. They freely mash them up across fields. The difference between a symphony and an album is often mangled too, especially on releases containing multiple works. And for fuck's sake, no, a movement is not a song. I don't play, "like", flag, list, cross-fade, or reshuffle them independently.
Music metadata has been optimized to suit the marketing preferences of contemporary record labels. But more general classes of music won't shift units on Spotify so the message is loud and clear, a resounding "get fucked" from the assholes that run this industry.
(One side of my family is in the recording business. I can confirm the whole thing is run by assholes.)
I like being able to address and like/favorite individual movements. Beethoven Piano Sonata #14 (“Moonlight”) is a masterpiece, but the second movement is <expletive>. I don’t know if I could listen to it hundreds of times like I have the first and third movements.
Many classical works have more than one 'track'. Musicians, critics, etc. usually refer to them by the tempo marking text. A lot of music software just doesn't get this. These tempo markings (by the composer; often in Italian) are the track names. So Moonlight has 3 movements:
This is perfectly highlighting that we've got no way to express our actual preferences, and are stuck treating something as a song when it isn't, but it's the only paradigm offered to us.
Most consider the composition to be a masterpiece.
Artists/conductors 'perform' 'interpretations' of works. Whether any interpretation is 'masterful' or 'virtuosic', etc., is up to the critical listener.
Three recordings by Horowitz may have been released ... which one is 'best'? Depends on the phase of the moon and your playback equipment!
The 'quality' of a particular performance may depend on the instrument provided to the artist. The year of recording can be very important (reflects technology used). The acoustics of the recording venue are also inherently part of the performance ... e.g. too much reverb.
This, many times over. ID3 tags are one thing but all decent classical record labels (most of whom have always been drm-free) like Hyperion or Naxos give you a pdf cd jewel case booklet. I've got recordings of many friends and it's always a pleasure to see their names in this. I don't use streaming services and one of the unexpected joys of iTunes is that the album art can contain a pdf with a cover image. The rest of the UI mostly assumes that you're listening to non-classical music though. I'd love to be able to search for e.g. "Victoria's Requiem" and get albums with sensible names, even if they're in Latin, seamlessly grouped by performance.
A few years ago, we started working on a platform for classical music [1] and took great pride in our metadata and our underlying schema which basically covered 99% of all the weird edge cases classical music has.
Unfortunately, we couldn't make it work economically, even though we tried several approaches to monetization, and had to shut down a while ago.
Might you release the schema into the public domain? I know that's much more easily said than done, and it's your hard work and you deserve the benefits, so no argument from me if you don't.
Or maybe share some tips on what works with the corpus of classical metadata?
While I'd love to see some of the quite spectacular internal tooling + data to be released under an open license, put to use for enhancing MusicBrainz' existing data etc., I doubt it's going to happen.
All team members moved on to new opportunities quite a while ago, and untangling all this, filtering out potentially copyrighted/licensed material etc., would take quite some effort.
I primarily make hip hop music, but I would absolutely love a platform like this where I can listen to a classical piece, and download the score and MIDI files.
Are there any legal reasons preventing this? I don't have the ability to play classical music myself (my weighted 88-key sits taunting me), but I'd love to be able to take these compositions and integrate them into my own music.
IANAL, but in the US, copyright on classical pieces tends to apply to the performances, not the music. For example, a Bach piece is in the public domain (the score and probably the MIDI derivative[a]) just from being old enough, but a recording of that public domain piece may be copyrighted itself.
As always, this isn’t universal. Check your local copyright laws to be sure.
IANAL either, but. For MIDI, a public-domain work might be just mechanically converted into notes of a 'MIDI score'. Probably very flat-sounding, but technically that transcription [0] may (probably) be PD.
(E.g. My handwritten copy of a PD short story is probably PD. But my reading of it isn't.)
BUT if some famous piano player performed for the MIDI recording, then you've got tempo changes, timing, attack velocities, all that data in the file as well. Probably not PD (best to ask). (As with piano-player rolls.)
The classics are out there, but often they've been performed by someone. Easy to strip down to the notes with a good MIDI editor.
It's been a long while for me, but a search on 'classic midi files' turned this up. (Hard to tell if it's crap without a listen.) https://www.midiworld.com/classic.htm
This is also important for jazz.
Should I wish, for example, to find albums where a particular bassist is playing then this would be rather difficult unless they happen to be the leader for that particular recording.
Also, it would be particularly helpful to have subgenres, e.g. jazz -> post bop, jazz -> free, etc.
Maybe you can explain one of my long-burning questions: why are non-English tracks handled so inconsistently? I'll sometimes see an English translation, sometimes a transliteration (for non-Latin scripts), and sometimes the original language. It's not consistent which gets used even for a single artist, and sometimes changes to a different one.
https://help.apple.com/itc/musicstyleguide/en.lproj/static.h... indicates that there are separate fields for each of the original, the translation, and the transliteration, but it doesn't seem like anyone actually _uses_ these, and instead just picks one arbitrarily to stuff into the main/native field. I'm not sure if Spotify even has a way to display the alternates--I can't recall ever seeing a toggle for it, though idk if I've ever actually encountered a track that includes the variants, since you can't see raw metadata on Spotify.
Is it just that music labels do a lackluster job of handling metadata? I expect this is probably the case, since classical music is similarly messy--it's a crapshoot whether the composer or performer gets used as the artist. For that I've at least seen the composer metadata field populated sometimes, but internationalization fields don't seem to be used ever.
It's really hard finding Greek songs on Spotify for this reason. There are many ways to transliterate (eg "ευχή" can be "efhi", "euxi", and all the combinations thereof, and that's just with two ambiguous letters) and no names are in Greek, so it's basically a game of trying to guess the transliteration for the track you want.
That's a good solution for Greek speakers, but a terrible solution for non-Greek speakers, who _can_ learn a "canonical" transliteration but can't write Greek
All of this is true, but I can learn that Spotify romanizes Τα παιδιά του Πειραιά as Ta paidiá tou Piraiá (or whatever) and search for that, whereas I can't type the actual name for love nor money.
I'm not saying this is necessarily a good design decision, just pointing out what might motivate this design
While I don't doubt your knowledge, note that many aspects of European languages, including English, and including the European alphabets, are in a sense transliterations of Greek.
Certainly, and there isn't a canonical transliteration there either. There is some consistency, but the primary goal was that the words look good in the target language, which isn't a concern for me when I want to type my language on a keyboard that doesn't have Greek letters. For that case, multiple transliterations per letter exist, including some that use numbers (e.g. θέλω = thelo, thelw, 8elv, and any combination of those).
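To make the "any combination of those" point concrete, here is a rough sketch that expands a Greek word into every candidate Latin spelling, which is roughly what a search index would need to match. The `VARIANTS` table is a tiny hand-picked assumption covering only the letters in the examples above; it is not a real transliteration standard.

```python
import itertools
import unicodedata

# Hypothetical, deliberately incomplete table: each Greek letter maps to the
# Latin spellings people might type for it.
VARIANTS = {
    "ε": ["e"],
    "υ": ["u", "f", "y"],
    "χ": ["h", "x", "ch"],
    "η": ["i", "h"],
    "θ": ["th", "8"],
    "λ": ["l"],
    "ω": ["o", "w", "v"],
}


def strip_accents(text):
    """Drop combining marks, so that 'ή' becomes 'η'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterations(word):
    """Return the set of all spellings produced by the VARIANTS table."""
    letters = strip_accents(word.lower())
    options = [VARIANTS.get(ch, [ch]) for ch in letters]
    return {"".join(combo) for combo in itertools.product(*options)}


candidates = transliterations("ευχή")
print("efhi" in candidates, "euxi" in candidates)  # True True
print(len(candidates))
```

Even with this small table a four-letter word expands to well over a dozen spellings, which is why guessing the one a service happened to pick is so painful.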
Ah, yeah, I agree that the way that Greek words got transliterated is actually very interesting (doubly so when you speak Greek and can see the intent behind it).
This is an area that many parties handle differently. Streaming services, artists, labels, and distributors don't all handle translation metadata consistently. There are specs in DDEX for this, but it's a matter of support and doing the translation work AFAIK.
The song "Car Alarm" prominently features a car alarm on Youtube, but not on Spotify. This could just be a choice by the artists though, rather than a platform restriction.
Probably not a rule, but maybe more like a liability thing? If people listen to your music while driving, you don’t want them to freak out from the sound effects and then sue you because they got in a car accident.
I have noticed that it's fairly common for many songs to feature random car sounds in their YouTube mixes, but not in the final masters that end up on iTunes and streaming services. It is also relatively common to have "watermarks" (e.g. additional short vocal clips at the beginning and end) added to the YT/Soundcloud mix, presumably to make it easier to figure out if a DJ just ripped the song off YT.
Music videos will often have different versions of the song with added sound effects, a more cinematic intro and outro, and even breaks in the song for a dramatic sequence. These flourishes make sense when you have a video but wouldn’t make any sense in purely audio context.
Pleasant to see Apple has made some progress on their metadata guidelines (it used to be abhorrent for a company saved by the iPod). Looks like a useful one-pager to get a quick overview of the challenges without getting into any of them.
> Also worth knowing that not all content on streaming services is music! Some of it is spoken word, ASMR, non-music field recordings, etc. The difference between sound and music is subjective of course though.
Even some albums that are clearly about music can contain non-music, take Tenacious D's skits as an example.
There's a reason why MusicBrainz SQL schema looks like what it looks like, after years of careful simplification to achieve an optimum approximation of reality.
10+ years ago at Zvooq, during the golden age of music metadata and recsys startups, we tried to really solve entity resolution in the domain. I've never seen another streaming service make an honest attempt at this, even at Spotify with their 2014 acquihire of the Echo Nest. You still get Utada and Hikaru Utada as ~completely separate entities
I think MusicBrainz did a great job to find some common agreed denominator between all these styles and formats for music metadata. After setting up Funkwhale [1], I went through all my 16000 MP3s and synced those with the MusicBrainz Library, also adding new entries. It went pretty smooth and my library looks much better now. It is also really fun to listen again, when everything is structured and labeled.
(btw., yes, I bought a large percentage and, nowadays, mostly buy on bandcamp [2]).
What.cd had a very simple data model and it had the same issue as many music sites trying to fit it into a simple schema (and not a more sophisticated one like MB). The first issue that comes to mind was artists with the same name: these ended up being the same artist (just like on Last.fm) if not split by hand by changing the artist name.
I understand "horrible" in this case in a somewhat lighthearted sense. But it points to a tendency to simplify cultural complexity in the interest of "legibility" from the system perspective. The same diversity of cases needs to be accommodated in much design done for, say, a diverse country like India where though the urban populace may be English savvy, the rural populace isn't necessarily so ... And even in urban areas local language and culture dominate.
A place this diversity crops up most commonly is the so called "standard" firstname middlename lastname fields. This just doesn't work in India where people may have even 7 components to their names ... and who are you, dear system developer, to dictate what their name should be like.
The name structure diversity I understand is a problem for Europe too with hyphenated names. This brilliant sketch by Fry and Laurie says it all - https://youtu.be/1LopIroSjsU
I mean, we get to dictate it because we’re writing the code and don’t know? I guess enterprising Indian developers should help it get fixed. As the whole world gets online this stuff gets sorted out, eventually. It’s not as if it’s malice or something. There are Americans with longer names though, I had a friend whose full name was five words, and all of his sisters and brothers also had “three middle names”.
I've noticed some of these for years. When designing a music database, you do need to be aware of them. You need to use unique identifiers instead of names. last.fm doesn't do this.
There were four bands called Kaleidoscope, all active around the same time: one from England, one from Mexico, one from the USA, and one from Canada. To disambiguate them, generally the country is specified, e.g. Kaleidoscope (UK).
There were two bands called Bulldog Breed, both of which recorded albums called Made in England. They're decades apart, but as albums can be re-released, they're normally disambiguated by record label, e.g. Bulldog Breed: Made in England (Deram).
There are two Nirvanas, the original UK-based band, and the US one. The UK Nirvana recorded a cover of Lithium.
You'll be aware that the Beatles recorded an album called Revolver. A Russian band, called Revolver, recorded an album called Beatles.
Do you include the definite article in the band's name? If so, what about The The?
Tintern Abbey released a single single, the A-side of which was called Beeside.
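A minimal sketch of the "unique identifiers instead of names" advice, using the four Kaleidoscopes mentioned above. The `Artist` class and the use of random UUIDs are assumptions for illustration; a real catalogue would assign stable identifiers rather than generate them at load time.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artist:
    name: str
    disambiguation: str = ""  # shown to humans only, never used as a key
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def display(self):
        if self.disambiguation:
            return f"{self.name} ({self.disambiguation})"
        return self.name


artists = [
    Artist("Kaleidoscope", "UK"),
    Artist("Kaleidoscope", "Mexico"),
    Artist("Kaleidoscope", "US"),
    Artist("Kaleidoscope", "Canada"),
]

by_name = {a.name: a for a in artists}  # collapses four bands into one
by_id = {a.id: a for a in artists}      # keeps them apart

print(len(by_name), len(by_id))  # 1 4
print(artists[0].display())      # Kaleidoscope (UK)
```

Keying on the name silently merges the bands, which is the failure described for last.fm; keying on the identifier leaves the name free to be duplicated, changed, or empty.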
> Do you include the definite article in the band's name? If so, what about The The?
My SiriusXM radio omits leading articles like “The”, so “The Beatles” is displayed as “Beatles”. I long wondered how it would display “The The”. Would it be “The”, the empty string “”, “The The”, or something else entirely? I eventually learned the answer: “The The”. Which is good, but it then made me frustrated that it wouldn’t allow “The Beatles” to be “The Beatles”.
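One way around this, sketched under my own assumptions (I have no idea how SiriusXM actually does it): never modify the display name, and derive a separate sort name that moves a leading article to the end. The `sort_name` helper and the article list are hypothetical and English-only.

```python
ARTICLES = ("the", "a", "an")


def sort_name(name):
    """Derive a sort key; the display name itself is left untouched."""
    first, _, rest = name.partition(" ")
    if first.casefold() in ARTICLES and rest:
        return f"{rest}, {first}"
    return name


for band in ["The Beatles", "The The", "The", "Kaleidoscope"]:
    print(band, "->", sort_name(band))
```

With this, “The Beatles” still displays as “The Beatles” but sorts under B, and “The The” sorts as “The, The” without ever being reduced to an empty string.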
I've always thought it'd be a bit cheeky to start a band and name the first three albums "Self-Titled", "Eponymous"[0], and give the third album the name of the band.
Fortunately, I lack any amount of musical talent, so this will never happen.
In addition to all of these edge cases, it doesn’t help that the metadata services have a lot of inconsistency as well. Say I have a song X that was performed live on DATE at some VENUE. It may show up as titled “X”, “X [live]”, “X (Live DATE)”, “X (DATE VENUE)”, and so on. Stuff that should be in metadata ends up in strings embedded in song and album titles, which makes for a big mess. I curate a huge collection and trying to keep my metadata consistent is a massive headache. I think I’ve tried just about every tool out there and still haven’t found one that I’m happy with.
Or the many ways to have “featuring” — “feat.”, “w/“, “ft.”, “(feat.)”, “[feat.]”, etc. MusicBrainz does a laudable job trying to standardize all this crap and as someone who is really particular about how their music metadata looks it’s highly appreciated.
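As a rough illustration of what a tagger has to do to pull "featuring" credits back out of title strings, here is a regex-based sketch. The pattern and the `split_featuring` helper are my own guesses at a reasonable heuristic, not how MusicBrainz or any particular tool does it, and the sample titles are placeholders.

```python
import re

# Matches a trailing "feat."/"ft."/"featuring"/"w/" clause, optionally wrapped
# in () or []. Heuristic only: it ignores clauses followed by more text.
FEAT = re.compile(
    r"\s+[\(\[]?(?:featuring\b|feat\b\.?|ft\b\.?|w/)\s*"
    r"(?P<guests>[^\)\]]+?)[\)\]]?\s*$",
    re.IGNORECASE,
)


def split_featuring(title):
    """Return (clean_title, [guest, ...]); guests is empty if none were found."""
    match = FEAT.search(title)
    if not match:
        return title, []
    base = title[: match.start()].rstrip()
    # Splitting on "&" will wrongly break up acts whose name contains "&".
    guests = [g.strip() for g in re.split(r"\s*(?:,|&)\s*", match.group("guests"))]
    return base, [g for g in guests if g]


for raw in ["X (feat. Guest A)", "X ft. Guest A & Guest B", "X w/ Guest A", "Defeat"]:
    print(split_featuring(raw))
```

The fact that this needs word-boundary checks just to avoid mangling a title like "Defeat" is a good argument for keeping guest artists in a proper field in the first place.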
If you use a streaming service you're hearing the latter and might not ever realize the former existed. The re-record is a fine song, but the original fits much better with the rest of the album.
This is my problem with Rust in Peace. It was a terrific album, but Dave didn't want to continue paying royalties to people he no longer likes. So in 2004 or so almost the whole album was rerecorded with studio musicians. It is horrible, mechanical, flat. Even some of the vocals were redone, poorly.
I'll sign up for the streaming service that provides the original 1990 Rust in Peace.
That was Killing is my Business I think. Dave remastered that one too, which is actually not bad, but some things were changed and due to copyright he had to leave off the Nancy Sinatra song.
I'd love a version of Killing is my Business with the remastered content where it is identical to the original, but with none of the content or artistic changes. Something like the despecialized Star Wars videos.
The discogs javascript is not playing nice with the current Firefox. I'm specifically referring to the "Show more versions..." control.
OK, found it! It seems that the desirable keyword is "Reissue" and specifically to avoid the keyword "Remaster". There was at least a Reissued but not Remastered LP in 2013. Maybe more, but that javascript is horrible!
This annoys me in Spotify so much, that we often only get one version of the album, and usually a new one instead of the original.
And maybe I should feel like it's a luxury to have extra bonus tracks with more takes at the end - but I lean over to the album purity side more - the album is one artwork by itself, I want the original track listing and nothing more, as "the album".
> If you use a streaming service you're hearing the latter and might not ever realize the former existed.
On a number of live Bob Dylan albums, for inexplicable reasons all online versions (not just streaming, but the download releases, too) are now missing the between-tracks content that on the physical CDs is part of the pregap, which means you're missing out on the funny stage banter on the 1964 Halloween concert, or the infamous heckling on the electric half of the 1966 Manchester concert.
There is a band from Pasadena, California called Ozma. Way back in the day (1999), they released an album called “Songs of Audible Trucks and Cars” [0] via MP3.com. I later found out that they really wanted the title to be “Songs of Inaudible Trucks and Cars” [1], but MP3.com had a character limit on album titles, so they had to shorten it
If, like me, you immediately thought "why not just drop the 'and' and replace it with an ampersand?", Ozma had to do that, too, and parent's quoted the album title incorrectly, alas.
A funny edge case I've found in one of the most popular music services (Last.fm). There is an instrumental post-rock ish band called Collections of Colonies of Bees and they have an album where almost all the tracks have the same name; they are all titled "Fun" except for the last track titled "Funeral". https://open.spotify.com/album/0LP8DTy4amlLIVXGOjvCo2?si=HJ3...
This has caused last.fm to think "Fun" is one of my most played songs of the year even though they were all separate tracks.
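A toy reproduction of that effect, assuming (I don't know last.fm's internals) that plays are aggregated by artist and title rather than by a per-track identifier. The `track_id` values and the play log are invented for the example.

```python
from collections import Counter

ARTIST = "Collections of Colonies of Bees"

# Hypothetical play log: three different tracks that share the title "Fun".
plays = [
    {"track_id": "t1", "artist": ARTIST, "title": "Fun"},
    {"track_id": "t2", "artist": ARTIST, "title": "Fun"},
    {"track_id": "t3", "artist": ARTIST, "title": "Fun"},
    {"track_id": "t9", "artist": ARTIST, "title": "Funeral"},
]

by_name = Counter((p["artist"], p["title"]) for p in plays)
by_id = Counter(p["track_id"] for p in plays)

print(by_name.most_common(1))  # one "song" with three plays
print(max(by_id.values()))     # each real track was played once
```

Counting by name inflates "Fun" into a single heavily played song, while counting by identifier shows each track was played exactly once.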
Last.fm doesn't allow "untitled", "various artists", "track 01" and other titles that may suggest the song is incorrectly tagged. This leads, of course, to edge cases where an artist has a track named "untitled" and it can't be scrobbled.
Moreover, last.fm also has an issue with unicode characters and some artists/albums may have two or more different profiles. It's a mess!
I actually was more traumatized when I discovered that artists with the same name are filed under the same ‘profile’, and just have a textual explanation that they are actually different. You'd think that a service purporting to do ‘big data’ processing could grab discographies off Discogs or something, and compare the album titles.
Anyway, personally I long since abandoned Last.fm in favor of more hands-on discovery.
I would guess that the problem only appears if it's the same artist and same album ID. Not sure if techno artists use "Untitled" multiple times on the same album, haha.
Oh, easily—whole careers are probably made of ‘untitled’ tracks. Afaik, in edm a more common identifier for such unnamed records and tracks is the catalog number and track number or the side (for a single).
Another famous one is Prince's "Love Symbol no 2".
Fun fact: he adopted it as his stage name to spite Warner Brothers Music, due to the unfavorable contract terms and the creative meddling they kept doing (among other things). By terms of the contract they had to go along with whatever stage name he picked. They had to spend a bunch of money having a custom font created, then mailed on floppy disk to every music reviewer, newspaper, magazine, radio station, etc as part of album promotion. No one was on the internet in those days so it ended up causing them a lot of trouble. I guess he found his own loophole in the contract.
One of the simplest data model mistakes that seems nearly universal in music software is pretending that some things are single values when they're really lists.
Take "artist". An album can have a compilation of tracks, each from a different artist. A given track can have multiple artists, eg a guest singer. There can be multiple composers of a track, there can be multiple writers of lyrics, and there are usually several performers (even if they all are part of the same band).
An album can have multiple names, or no name at all (which often means people refer to it in different ways, giving it multiple effective names). Likewise with tracks.
Albums and tracks get re-released, sometimes re-mastered, sometimes re-recorded entirely, without the names changing.
The same recording of a given track might be released on multiple albums, eg as a single in a pre-release and then within the main album.
Tracks might be mastered differently for different media (Radio vs CD vs background use in cinema, etc).
Names of any of these things don't have to be representable in any existing text encoding, or even any combination of encodings.
Artist name should be a list (or other multiple element structure).
Track name should be a list.
Album name should be a list.
Etc. Assume a malicious artist will find a way to make the assumption that any field has a unique value incorrect.
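A small sketch of the "names are lists" idea: each entity carries zero or more names, and a name looks up a set of entities rather than a single one. The `Entity` class, the `build_index` helper and the identifiers are hypothetical; the sample names come from cases mentioned elsewhere in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    names: list[str] = field(default_factory=list)  # may be empty (untitled)


def build_index(entities):
    """Map each case-folded name to the set of entity ids that use it."""
    index = {}
    for entity in entities:
        for name in entity.names:
            index.setdefault(name.casefold(), set()).add(entity.id)
    return index


entities = [
    Entity("artist-1", ["Utada", "Hikaru Utada"]),  # one artist, two names
    Entity("artist-2", ["Nirvana"]),                # UK band
    Entity("artist-3", ["Nirvana"]),                # US band
    Entity("album-1", []),                          # an untitled album
]

index = build_index(entities)
print(index["utada"], index["hikaru utada"])
print(sorted(index["nirvana"]))
```

Both directions are many-valued: one artist resolves from either spelling, one spelling resolves to two artists, and the untitled album simply has no entry in the name index.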
Right. It's a relational database with lots of many-to-many relationships. We need to stop trying to make it a flat file. A FOSS SQLite schema that everyone can embed, or even a turnkey little db, would be wonderful.
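Here is roughly what such an embeddable SQLite schema could look like, using Python's built-in `sqlite3` module. The table and column names are my own invention (this is not the MusicBrainz schema), and the rows are placeholders. It shows the two many-to-many relationships discussed above: artists credited on recordings with a role, and one recording appearing on several albums.

```python
import sqlite3

SCHEMA = """
CREATE TABLE artist    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE recording (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE album     (id INTEGER PRIMARY KEY, title TEXT);
-- many-to-many: who did what on which recording
CREATE TABLE credit (
    recording_id INTEGER NOT NULL REFERENCES recording(id),
    artist_id    INTEGER NOT NULL REFERENCES artist(id),
    role         TEXT NOT NULL,
    PRIMARY KEY (recording_id, artist_id, role)
);
-- many-to-many: the same recording can sit on several albums
CREATE TABLE album_track (
    album_id     INTEGER NOT NULL REFERENCES album(id),
    recording_id INTEGER NOT NULL REFERENCES recording(id),
    position     INTEGER NOT NULL,
    PRIMARY KEY (album_id, position)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO artist VALUES (?, ?)",
                 [(1, "Bandleader A"), (2, "Bassist B")])
conn.execute("INSERT INTO recording VALUES (1, 'Tune X')")
conn.executemany("INSERT INTO album VALUES (?, ?)",
                 [(1, "Pre-release Single"), (2, "Main Album")])
conn.executemany("INSERT INTO credit VALUES (?, ?, ?)",
                 [(1, 1, "leader"), (1, 2, "bass")])
conn.executemany("INSERT INTO album_track VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 1, 3)])

# "Find albums where a particular bassist is playing", leader or not.
rows = conn.execute(
    """
    SELECT DISTINCT album.title
    FROM credit
    JOIN artist      ON artist.id = credit.artist_id
    JOIN album_track ON album_track.recording_id = credit.recording_id
    JOIN album       ON album.id = album_track.album_id
    WHERE artist.name = ? AND credit.role = ?
    ORDER BY album.title
    """,
    ("Bassist B", "bass"),
).fetchall()
print(rows)
```

The sideman query that is "rather difficult" with flat artist/album/track tags becomes a plain join once credits live in their own table.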
It's not so much an edge case in the sense of those, which mostly concern software authors making incorrect assumptions about the data in fields, but rather a "not even realizing we need a field" case: music library managers are often quite annoying when dealing with classical music.
They really want to fit everything into the artist/album/song scheme, and that doesn't really work well for much classical music. In classical music most of us would want something like composer/piece/orchestra/conductor/performers (the "performers" field would be for performers in addition to the normal orchestra, such as soloists or choruses), and it should be possible to change the sort key order among those fields.
> …music library managers are often quite annoying when dealing with classical music.
Apple will apparently be solving this problem this year with a dedicated "Apple Classical" app, the origin of which is their Primephonic (https://www.primephonic.com/) acquihire.
"Why are you shutting down Primephonic?"
"To focus on creating an even better experience for customers around the world. Apple Music plans to launch a dedicated classical music app next year combining Primephonic’s classical user interface that fans have grown to love with more added features."
Definitely. Pieces end up being listed something like "Symphony No.3, Mvt. II", but they only show the performer which is the orchestra playing rather than the composer so I have no idea which symphony it is. I haven't seen any service get classical music right.
I think your list is pretty good (composer/piece/orchestra/conductor/performers). It'd be nice to also have a field for the movement. Also I totally agree with custom sort order being a must. E.g. for choir pieces swap out 'orchestra' for 'choir', for piano pieces only include 'performer' field, concertos should have the 'soloist' before the orchestra, etc.
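The "change the sort key order" request is cheap to support if the fields exist at all. A minimal sketch, assuming every track carries the listed fields (the field names, the integer `movement` field and the sample values are all placeholders):

```python
def sort_key(track, order):
    """Build a comparison key from the chosen fields, in the chosen order."""
    return tuple(
        value.casefold() if isinstance(value, str) else value
        for value in (track[name] for name in order)
    )


tracks = [
    {"composer": "Composer B", "piece": "Symphony No. 3", "movement": 2,
     "conductor": "Conductor X", "orchestra": "Orchestra P"},
    {"composer": "Composer A", "piece": "Concerto", "movement": 1,
     "conductor": "Conductor Y", "orchestra": "Orchestra Q"},
    {"composer": "Composer B", "piece": "Symphony No. 3", "movement": 1,
     "conductor": "Conductor X", "orchestra": "Orchestra P"},
]

by_composer = sorted(
    tracks, key=lambda t: sort_key(t, ("composer", "piece", "movement")))
by_conductor = sorted(
    tracks, key=lambda t: sort_key(t, ("conductor", "composer", "piece", "movement")))

print([(t["composer"], t["movement"]) for t in by_composer])
print([(t["conductor"], t["movement"]) for t in by_conductor])
```

Keeping the movement as a number rather than text means the movements of a piece stay in playing order whichever field leads the sort.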
It's a sleeve notes problem. Ideally you'd want a complete copy of all the info in the sleeve/inlay. This includes some combination of headline writer/composer, producer, conductor, headline performer, co-writers, arrangers, session players, engineer, studio, lyrics, and random sleeve notes/essay, possibly in multiple languages.
Per track.
Plus artwork. With credits.
Plus random thank-yous and special thank-yous to sponsors and other contributors.
The importance of each field varies with each project. For classical conductor showcase projects the conductor is the headliner. For classical composers, obviously it's the composer. For compilations it can be a concept like "Classical Chill" or "Early Music from the 15th Century." For pop projects it's usually the band/artist name. For movie tie-ins, it's the movie, or perhaps the franchise.
And so on.
You get all of this for free on CD/vinyl labels, and the fact that it's freeform doesn't matter because the sleeve artwork and visual placing of the elements make it clear who/what matters.
Trying to fit this into a single fixed metadata schema is madness. It can't be done.
This is one of those situations where trying to data-fy the information leads to huge loss of detail. You get some nominal (poor) searchability, but you give up a huge amount of interesting supporting information.
What the industry needs is a digital sleeve note standard which can handle all of this and more.
But metadata is designed for sales and stream tracking, not for user satisfaction. Which is why it's so bad at capturing what users want.
A lot of music library managers (including good old iTunes) let you add a lot of various and esoteric song tag data as columns in the track list. The problem was always that the column arrangement that makes sense for one style of music doesn't make sense for another.
One missing feature that I've never seen in a music library application is the ability to have "presets" for these column headers that you can easily switch between. So you can turn on the "classical" preset which shows Composer, Performer, etc., and you can turn on the "contemporary" preset which shows Artist and Album.
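The preset idea is small enough to sketch in a few lines: a named list of columns, applied to whatever tags a track happens to have. The preset names, column names and sample track are made up for illustration.

```python
PRESETS = {
    "classical": ["composer", "piece", "conductor", "orchestra", "performers"],
    "contemporary": ["artist", "album", "title"],
}


def row(track, preset):
    """Project a track's tags onto the columns of the chosen preset."""
    return [track.get(column, "") for column in PRESETS[preset]]


track = {
    "title": "Symphony No. 3: II",
    "artist": "Orchestra P",
    "album": "Example Album",
    "composer": "Composer B",
    "piece": "Symphony No. 3",
    "conductor": "Conductor X",
    "orchestra": "Orchestra P",
}

print(row(track, "classical"))
print(row(track, "contemporary"))
```

Missing tags come back as empty strings, so switching presets never fails on a track that was tagged with a different style of music in mind.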
That’s actually something MediaMonkey [0] (Windows-only) had for a long time (I assume the current electron version as well, I use the last native version). Classical is one of the defaults, but you can add whatever.
It’s one of my favorite pieces of software, a true music library manager and not a music player with some management features tacked on.
Agreed, I've tested many self-hosted music streaming servers and decided to settle with Navidrome in the end, but not without major reservations. I personally don't mind sorting music via album/song, but the lack of a composer field is really painful. Unfortunately the omission of composer information occurs in the Subsonic API, which has a large number of compatible players, so adding the field is not as straightforward if one wishes to do it in an interoperable way.
I used to be quite happy with a local music library and playing it with mpd, but at some point I got sick of maintaining over 100 GB of music library across multiple devices. The streaming solutions for mpd are quite awkward to use, otherwise I would have stuck with it just to get the composer field.
There's an analogous problem going on on musescore.com, where you're supposed to list the work your score is based on. Viewing a score on the site will then include links to other scores based on the same work.
I uploaded several pieces from Swan Lake, and listed each of them as the particular piece that they were. For example, https://musescore.com/user/36584999/scores/6642212 is Swan Lake, No. 4 (Pas de trois), part IV.
But every time I list a piece like this, it gets "corrected" to the canonical reference, which is "Swan Lake". And the "other scores based on the same work" are totally useless, because they can be any piece from anywhere in the ballet. It's more than 600 pages long! Where's the value add from removing my indication of exactly which piece of Swan Lake I just scored?
I believe you would no longer be able to copyright the track, then. So, you should encrypt the encoded titles, but record an audio track that is a voice recording of you reading out the encryption key for that track. You know, for DRM.
> I believe you would no longer be able to copyright the track
Why? I'm not sure about US law, but here in Poland it would definitely still be copyrighted (even if you'd be effectively granting an automatic implicit license to use that name for the purpose of referencing the work).
My logic is that you cannot copyright a fact. It is a fact you released an album, and it is a fact you titled it such. I'm free to reproduce that information, say in a catalog, without violating copyright.
I much prefer this informative post (with a clear title and plenty of examples and explanations) to the uncommented "Falsehoods programmers believe about ..." listicles.
Absolutely. Stating that an assumption is false with no further information doesn't help me update my mental model of the world. Giving an example of how an assumption is incorrect shows the type of errors that can occur, when they can occur, and lets the reader figure out a better way to work around the false assumption.
Peter Gabriel's first four solo albums were untitled. Well, the fourth was titled in America, but untitled everywhere else. Fans tend to distinguish them by the cover art: there's "Car", "Scratch", "Melt", and "Security".
Oh please tell this old Genesis fan some more. Gabriel was a very forward thinking musician from a technology perspective. See this clip for his demonstration of an early sampler [0]
Ha, not many stories. He's a nice guy. I worked for him like 1999-2004? Through the 'Up' album, which I think is really underrated. Mostly Real World stuff though, as that has been his primary focus. Some cool parties at his studio, which is beautiful. I remember one when Martha from Martha and the Muffins had just finished producing an album and was moving on.. there was a party and bands.. and then PG got up on stage and took over the keyboard and started playing the beginning of "Echo Beach" and Martha was laughing in the crowd and saying "Nooo!", but you can't say No to PG, so she had to get up on stage and her and PG gave us an impromptu performance of "Echo Beach" which was awesome.
p.s. that video was awesome. Were those 8" floppies it used?
Yeah I was starting to use PCs during the Apple II era so we didn’t have 8 inch disks anymore by then in my house. But I do remember seeing them in labs with VAX systems.
But those specialized music workstations all were custom designed so the parts etc. used were based on older standards I’m sure. Pre-ISA bus too I would guess.
Nowadays, even a lot of embedded devices like IoT things are effectively running a Linux OS, so the economies of scale are enormous compared to the early 80s music gear.
I imagine Roland and other similar companies had their own OS or base platform they used on those earlier workstations that eventually had screens you could attach.
What is the deal with Japanese exclusive bonus tracks/promo thing?
I'm curious why a significant number of albums have Japanese promo versions with extra tracks. Is that a cultural thing about Japan or is there another specific reason? (I know it's not only Japan, but it's mostly Japanese editions that I see throughout the web.)
As if, almost even brouillard has a Japanese bonus track named brouillard in their album brouillard.
I heard somewhere that it was an incentive for the Japanese market, which still today is a large physical CD market, to buy the locally distributed releases of albums instead of just buying the original foreign release, which may have been available earlier and for cheaper, or if they already bought the original release, buy it again.
Is this why the super-high quality vinyls were also available only in Japan? I cannot remember the name of the company, but there is (was) some company selling increased dynamic range recordings of popular albums, on very durable (as in, regularly replace the needle) vinyl. I had heard an imported (from Japan) Dark Side of the Moon recording on that once, and it just blew me away.
I once read somewhere that CDs are expensive when new, can't confirm though. When I was there I bought them used for cheap. Slayer's Hell Awaits, colored like egg yolk, ~6€ IIRC, thank you very much.
The odd one for me is Pendulum - there’s two different Australian acts, both electronic, with the name Pendulum. For some reason the newer Pendulum really liked the name and weren’t bothered / ethically concerned that the Pendulum that made ‘Coma’ had already existed. As of today Spotify has songs from both artists under the same artist name.
Plenty of duplicate artist names, Spotify artist support will split them into separate artists if you contact them, and now aggregators like DistroKid let you specify the actual artist ID when uploading (at least for some DSPs). It’s just another instance of there are only two hard problems :)
There are much worse things than Pendulum. John Williams seems a very popular name for artists. :) Back when I was designing the "new" MusicBrainz database, that's what we used as the test case for this.
To add to the list: Song titles that cannot be fully represented in Unicode. Example (also to plug my favorite band): Magma, a French progressive rock [*] band that's been (on and off) active for more than 50 years, mostly uses their own invented language "Kobaïan" where some of the characters use diacritics that are (to the best of my knowledge) not present in Unicode. E.g. there's an S with a kind of crown on top.
[*] Actually their style of music is unique enough that it's given rise to its own genre, "Zeuhl".
You wouldn’t index a collection of records by the shape of its sonic wave forms, would you?
No more would you index by the artistic representations on the cover which, as we have seen, includes everything including the words purporting to be the artist name and the title of the release.
No, of course not! You use the barcode on the back!
I recently started a fun/side project which aims to improve my personal tedious process of (a) managing a queue of things I want to listen to, (b) keeping track of which of them I listened/liked and (c) easily sharing my "discovery activity" with my friends who share the same passion.
So I started by parsing Discogs' data dumps[1] to backfill my database with all "master" releases (aka "release groups" in MusicBrainz), so that I avoid hitting the Discogs API with every user query. But then I realized that (a) they don't include cover images, (b) not every release has a corresponding "master" release, and (c) they're not updated frequently enough (maybe monthly).
Then I thought of using MusicBrainz data dumps[2] to augment the entries by Discogs in my database, which do provide cover art images, but then how do I correlate them with the already-inserted Discogs releases? Fortunately there's a Discogs URL for many release groups from MusicBrainz; unfortunately only 50% of them do have it.
Then I could perhaps use solely MusicBrainz data and not Discogs at all, but then what if people are mostly using Discogs links? This will result in two identical records on my database, pointing to the same underlying release but in different sites (Discogs & MusicBrainz). Perhaps this can only be solved by human moderation and providing the ability to "merge two items".
Then I thought of actually contributing to MusicBrainz by adding the Discogs URL for every release group.
Another interesting assumption I made, which turned out to be incorrect, was that every [artist, release title] pair was unique. But that's not the case: an artist may have multiple "master" releases (i.e. completely different tracks inside) with the same title.
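A minimal sketch of one way around that assumption: key each release on an opaque ID and treat [artist, title] as a lookup that may return several rows. The `Release` and `Catalog` names, and the sample data, are made up for illustration and are not how Discogs or MusicBrainz model this.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Release:
    artist: str
    title: str
    year: int
    # Opaque surrogate key; never derived from the artist or the title.
    release_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class Catalog:
    def __init__(self):
        self.by_id = {}
        # (artist, title) maps to a list because the pair is not unique.
        self.by_name = defaultdict(list)

    def add(self, release):
        self.by_id[release.release_id] = release
        self.by_name[(release.artist, release.title)].append(release)
        return release.release_id

    def find(self, artist, title):
        # .get() avoids creating empty entries for unknown pairs.
        return list(self.by_name.get((artist, title), []))


catalog = Catalog()
# Hypothetical band with a self-titled EP and a later self-titled album.
ep_id = catalog.add(Release("Some Band", "Some Band", 2008))
album_id = catalog.add(Release("Some Band", "Some Band", 2009))
same_name = catalog.find("Some Band", "Some Band")
```

Here `same_name` holds both releases, and it is up to the UI to show enough extra detail (year, format, track list) for a human to pick one.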
I am amused that despite mentioning both Keygen Church and Master Boot Record, this list refrains from mentioning that the former is a side project of the latter.
And, yeah, if I was still collecting my music as CDs? Keygen Church would be filed next to the MBR because of that, not over in the Ks.
AC/DC is another edge case I've seen, mainly because in some contexts slashes are used as separators between multiple artists, so "AC/DC" could be misinterpreted as a list of 2 artists named "AC" and "DC".
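For what it's worth, the cleanest fix is to never store multiple artists in one delimited string. When you are handed such a string anyway, a rough sketch is to check it against names you already know before splitting. `KNOWN_ARTISTS` and `split_artists` are hypothetical; a real system would look names up by artist ID in its own database.

```python
KNOWN_ARTISTS = {"AC/DC", "Moby", "B-Motion"}  # hypothetical lookup table


def split_artists(credit, known=KNOWN_ARTISTS, sep="/"):
    """Split a delimited credit string, keeping known names intact."""
    parts = [p.strip() for p in credit.split(sep)]
    result = []
    i = 0
    while i < len(parts):
        # Try the longest run of adjacent parts that forms a known name.
        for j in range(len(parts), i + 1, -1):
            candidate = sep.join(parts[i:j])
            if candidate in known:
                result.append(candidate)
                i = j
                break
        else:
            # No multi-part name starts here; take the single part.
            result.append(parts[i])
            i += 1
    return [p for p in result if p]


single = split_artists("AC/DC")
mixed = split_artists("AC/DC / Moby")
plain = split_artists("Moby/B-Motion")
```

This is only a heuristic: it cannot tell an unknown band with a slash in its name from two unknown artists, which is the argument for storing credits as a list in the first place.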
"/Τĥιs ñåmè įß ą váĺîδ POSIX paτĥ" is my display name on some services at my job. It helps to remind people to consider things like spaces, homoglyph attacks, and non-ASCII characters.
There are two well-known Avishai Cohens, both jazz musicians (so you can't disambiguate on the broader genre), both from Israel (so you can't disambiguate on country of origin), but luckily they do play different instruments (bass and trumpet).
In 2009 they released a full self-titled album (only 1 song from the EP featured on it).
I can't even find the "second" release on discogs.
Both were self-released, so no way to disambiguate on ISMN or record label/publisher.
So you'd have to disambiguate only on the year of release, or define what demarcates an EP from a full album and mark them as such (there are some competing definitions out there, since it became less clear after the introduction of the CD, and now digital, but even back then you had complications like maxi singles and mini-LPs, IIRC).
---
Many of John Zorn's releases tend to be headaches:
"Tap"
Composed and produced by: John Zorn
Performed by: Pat Metheny
Subtitle "Book of Angels, vol. 20"
Overarching series "Masada Book 2"
Released on Tzadik (Zorn's personal label) as a John Zorn/Pat Metheny output.
Licensed to Nonesuch (Metheny's label) as a Pat Metheny release with different cover.
> There are two well-known Avishai Cohens, both jazz musicians (so you can't disambiguate on the broader genre), both from Israel (so you can't disambiguate on country of origin), but luckily they do play different instruments (bass and trumpet).
Along very similar lines, there are also famously two musicians named Bill Evans. Both play jazz (genre) and both are from the US (country). As a bonus, both have been members of bands led by Miles Davis. And again you can differentiate them by instrument (piano and saxophone).
Which is sort of what Wikipedia article titles do to differentiate them:
How are these albums names "edge cases"? I feel like ANY software would deal with them just fine. Hell, the last two don't even use non-ascii characters (though you definitely should have unicode support regardless).
Edit: I just realized HN won't display the star symbol (U+2605). Which is weird, since it does support characters like Chinese/Japanese glyphs あいうえお你好
There are actually some challenges with stuff like that; how do you sort "'" for example? Most people call the album "Apostrophe"; should it be sorted under "A"? What about the "†"? "D" for dagger? Somewhere else? How do you find things like that in the search box? Should searching for "star" return that Bowie album?
> Edit: I just realize HN can't display [star]
It doesn't do emojis and various graphic characters. This is supposedly a "feature" shrug-emoji-here
> † can either just sort wherever Unicode puts it, or if the album has a known pronunciation like 'star' above, have a sorting key too.
Sure, but it doesn't happen automatically, so it's not "just do [..]". You need to think about some of these edge cases.
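To make the sorting-key idea concrete, here is a small sketch assuming a hand-maintained table of "spoken" names. `SORT_NAMES`, `sort_key` and `matches` are invented for illustration, and the accent stripping is a crude stand-in for proper locale-aware collation.

```python
import unicodedata

# Hypothetical overrides, maintained by hand: title -> what people call it.
SORT_NAMES = {
    "'": "Apostrophe",
    "\u2605": "Blackstar",  # U+2605, the star symbol
}


def sort_key(title, overrides=SORT_NAMES):
    spoken = overrides.get(title, title)
    # Case-fold and drop combining marks so accented letters sort
    # next to their unaccented forms.
    decomposed = unicodedata.normalize("NFKD", spoken.casefold())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # The original title is a tie-breaker to keep the order deterministic.
    return (stripped, title)


def matches(query, title, overrides=SORT_NAMES):
    q = query.casefold()
    return q in title.casefold() or q in overrides.get(title, "").casefold()


titles = ["Zebra", "\u2605", "'", "Abbey"]
ordered = sorted(titles, key=sort_key)
```

With this table the symbol-only titles sort under A and B, and a search for "star" finds the U+2605 album. Titles without an override still sort wherever their code points land, so someone has to keep adding entries.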
I wrote a simple music library/player a bunch of years ago and ran into some of these issues. I never did figure out how to deal with the band named "" in a sane way.
The aversion to emojis is strange. I've come around to thinking of them as a net positive. Just text is sterile and the tone often comes off wrong! Smiles lighten up conversation, we need that to not lose track of each other's humanity when arguing in comment threads. :)
When music downloads were first starting up in the late 90s I was responsible for building a schema to store all the world's music. We had to rip millions of CDs. When I got that Aphex Twin one above I gave up. I think we just called it [Track 2] or [Maths symbols] or something.
It's worth noting as well, that some of these edge cases are directly because of artists intentionally "messing with" whatever they perceive as hard rules.
The reason I mention this is because you cannot simply look at a list of all edge cases, identify what schema you can use to encapsulate them all, and then build your service around the idea that the schema will never change.
As a very simple (I hope contrived) example: You could decide to use a string delimiter so that you can capture "all the strings" and still use string interpolation. You then look through all the edge cases and see "No one has named anything with U+1F7F6, so we're safe to use that". 100% guarantee that as soon as someone sees that they will name a track or an album in the exact way they need to to break your system.
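A quick sketch of why the delimiter bet fails and one way to avoid making it, assuming you control the storage format. The function names are made up; JSON is just one of several encodings that escape their own syntax.

```python
import json

DELIM = "\U0001F7F6"  # the "nobody will ever use this" character


def pack_naive(fields):
    return DELIM.join(fields)


def unpack_naive(blob):
    return blob.split(DELIM)


def pack(fields):
    # JSON escapes its own syntax, so no field content can collide with it.
    return json.dumps(list(fields), ensure_ascii=False)


def unpack(blob):
    return json.loads(blob)


# The second field contains the delimiter, as some artist eventually will.
fields = ["Some Artist", "A title with " + DELIM + " in it", ""]
naive_roundtrip = unpack_naive(pack_naive(fields))
safe_roundtrip = unpack(pack(fields))
```

`naive_roundtrip` comes back with four fields instead of three, while `safe_roundtrip` equals the input. Separate database columns or length-prefixed fields would give the same guarantee.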
Electronic music is also full of remixes, for example:
Artist: Moby
Title: Simple Love (B-Motion Remix)
The actual artist is B-Motion and any kind of browser/catalog must group this track with other B-Motion music, or in worst case both, because remixes are in entirely different genres than the original.
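As a rough sketch of the "or both" option, you can pull the remixer out of the title and index the track under every credited name. The regex is a heuristic I made up, not an industry rule: it would wrongly treat "(Extended Remix)" as a remixer called "Extended", so a real catalog should carry the remixer as an explicit field.

```python
import re
from collections import defaultdict

# Heuristic: a trailing "(<name> Remix)" names the remixer.
REMIX_RE = re.compile(r"\((?P<remixer>[^()]+?)\s+Remix\)\s*$", re.IGNORECASE)


def credited_artists(artist, title):
    credits = [artist]
    m = REMIX_RE.search(title)
    if m and m.group("remixer") not in credits:
        credits.append(m.group("remixer"))
    return credits


def build_index(tracks):
    """Map each credited artist to the titles they should be listed under."""
    index = defaultdict(list)
    for artist, title in tracks:
        for name in credited_artists(artist, title):
            index[name].append(title)
    return dict(index)


tracks = [
    ("Moby", "Simple Love (B-Motion Remix)"),
    ("Moby", "Some Other Track"),  # hypothetical non-remix entry
]
index = build_index(tracks)
```

Browsing "B-Motion" now surfaces the remix, and "Moby" still lists both tracks.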
- Artist names can become rather long too, especially for collaborations. Some anime OSTs easily break 256 characters.
- And for song titles that need careful escaping, QuelI->EX[cez]->{kranz}; and EXEC_SOL=FAGE/. have tripped up more than one CLI music player in my experience.
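For titles like those, a small sketch using Python's standard `shlex` module shows the quoting a shell-based player would need. The titles are the ones mentioned above; everything else is illustrative.

```python
import shlex

titles = ["QuelI->EX[cez]->{kranz};", "EXEC_SOL=FAGE/."]

# shlex.quote wraps a string so a POSIX shell treats it as a single word.
quoted = [shlex.quote(t) for t in titles]

# Round trip: parsing the quoted form like a shell would gives the title back.
roundtrip = [shlex.split(q) for q in quoted]

# A command line built from quoted pieces parses into exactly two arguments.
command = "play " + quoted[0]
argv = shlex.split(command)
```

Better still is to skip the shell and pass arguments as a list (for example to `subprocess.run`), so no quoting is involved at all. Note that the `/` in the second title is still a problem if titles are used as file names.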
I don't think it's meaningful to talk about edge cases without a set of requirements connecting inputs to desired outputs: what you want to do with the data, or hope to get out of it.
Features like long album names, empty names, or names with international characters from Unicode are hardly edge cases. Edge cases have to have some surprise to them.
Coming from a blank slate, you'd never approach a problem with the assumption that all strings are 32 characters or less, confined to 7-bit ASCII, and never empty, and then act surprised and cry "edge case!" when they aren't.
Yeah I should have made it clear that I mean specific problems, like musicians who release an album under a different name... With a folder based approach it's easier to put all of those in one place. The filesystem also has some drawbacks, like a more limited character set. But all in all, I like it.
How about the horrible edge case when trying to read this pale blue text against a white background. It's been a while since I criticized the readability of a web page, but I think this page deserves a critique.
Pretty disappointing to see an entire article like this go by without a single mention of classical music. Also known as basically all music prior to 1940 plus a lot of the most enduring music since.
Not sure I understand? There are accepted names/sequencing for all major Baroque, Classical, Romantic etc. artists: Mozart K numbers, Bach BWV numbers, etc. And usually there’s a close to unique name like Adagio in D Minor, Piano Sonata in Bb Major, etc.
I was searching for the artist Urban Soul on Tidal to discover there's another artist called Urban Soul (I wanted the Roland Clark project) with albums by both mixed up together in the search result.
'Twas all in vain, since I wanted a particular remix of `Alright', which I have on a vinyl promo less than a metre from where I am sitting (I can remember clear as day buying it more than 25 years ago), and which Tidal don't have. So I ended up listening to it on YouTube.
Amongst major artists, Tupac is good at causing confusion. Is he Tupac, 2Pac or Tupac Shakur? Does it depend on the album? Is his alter ego Makaveli a different artist?
Imagine documenting something like this https://www.discogs.com/artist/661099-Filipe-Santos (check out the edit history) where the artist actively sabotages the documentation effort and makes their music unavailable on a whim.
It's pseudomathematical nonsense. The Fji term might be a hint about the image hidden in the track when plotted on a spectrogram, but the rest of it doesn't make much sense. You can see the original title here:
As an example, why would the index of the last term be inverted like that? Maybe it's supposed to mean something phonetically, but I'm not a music person so I can't figure it out.
Unicode support is one thing. I frequently have to re-tag Japanese music because its tags are encoded in Shift-JIS, which in general doesn't work on Linux.
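A hedged sketch of the usual repair, assuming the tag bytes were Shift-JIS but got decoded as Latin-1 somewhere along the way (a common failure mode, though not the only one). `fix_mojibake` is a made-up helper, and reading or writing the actual tags is left out.

```python
def fix_mojibake(text, wrong="latin-1", right="shift_jis"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly."""
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in the wrong encoding, or not valid in the
        # right one: assume the text was fine and leave it alone.
        return text


original = "あいうえお"
# Simulate what a Latin-1-assuming tag reader shows for Shift-JIS bytes.
garbled = original.encode("shift_jis").decode("latin-1")
repaired = fix_mojibake(garbled)
```

Text that is already proper Unicode Japanese passes through unchanged, and plain ASCII survives too. Genuine Latin-1 text with accented letters can occasionally decode as valid Shift-JIS, so this should only be run on files you know are affected.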
I was all geared up for a “misconceptions software developers have about _” post but this seems mostly like … Unicode exists and data modeling should at least comprehend the domain?
They missed a few common ones; Led Zeppelin’s fourth album with Stairway to Heaven on it was named only with Prince-like symbols, and of course the group named The The (which Google indexes well).
Most of this is great, but I'm calling BS on the Tchaikovsky example. Yes, there have been multiple historical spellings of his name, and to be absolutely accurate you would use the Cyrillic. However, the vast, vast majority of use cases should use the modern accepted spelling in English when referring to Tchaikovsky, which is what Wikipedia uses. There's a line you have to draw somewhere between accuracy and comprehension by the user, and "Pyotr Ilyich Tchaikovsky" is a pretty damn good one. The only other one I see commonly these days is the slightly more Anglicized "Peter Ilyich Tchaikovsky", but that's it. Most other variants are historical or just reflect different editorial decisions, but they all refer back to the same guy, not a living artist who is changing his stage name all the time.
Even in English-language contexts I've seen variants of the German-style transliteration on occasion. But more importantly, the author is not restricting himself to English, and indeed the musicbrainz link gets to such a high count precisely by including other languages' transliterations. Only three are listed as English, and they're the two you mention (with/without Anglo first name) plus one that omits the middle name.
Were you saying that all other languages should use "the modern accepted spelling in English", or were you just ignoring the fact that speakers of other languages (or owners of foreign-published albums) might use music databases?
The database might implicitly force a monolingual perspective on the dataset if they can't handle multiple translations of several fields. It's not an easy problem to manage, I imagine.
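One way to avoid baking in a single language, sketched under the assumption that every artist has a stable ID and names are just per-locale aliases hanging off it. The `Artist` class and the ID value are hypothetical; the spellings are the common Russian, English and German ones.

```python
from dataclasses import dataclass, field


@dataclass
class Artist:
    artist_id: str                     # stable key; names are only labels
    canonical: str                     # name in the artist's own script
    aliases: dict = field(default_factory=dict)  # locale -> list of names

    def display_name(self, locale):
        # Try the exact locale, then its base language, then the canonical name.
        names = self.aliases.get(locale) or self.aliases.get(locale.split("-")[0])
        return names[0] if names else self.canonical

    def all_names(self):
        found = {self.canonical}
        for names in self.aliases.values():
            found.update(names)
        return found


tchaikovsky = Artist(
    artist_id="artist-0001",  # hypothetical ID
    canonical="Пётр Ильич Чайковский",
    aliases={
        "en": ["Pyotr Ilyich Tchaikovsky", "Peter Ilyich Tchaikovsky"],
        "de": ["Pjotr Iljitsch Tschaikowski"],
    },
)
english = tchaikovsky.display_name("en")
austrian = tchaikovsky.display_name("de-AT")
fallback = tchaikovsky.display_name("fr")
```

Search can match against `all_names()` while the UI shows whichever spelling fits the user's locale, so "which spelling is right" becomes a display preference rather than a schema decision.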
Here "edge-case" indicates that music is not mere noise, while somehow remaining an instance of "marketing" or "acculturation".
This is a good example of what happens when the insensate take power and pander to those who just want to read the "rule book".
No art. No expression. No bother.
Let's just agree on some fucking fiction that gets record labels to bother with classical music metadata in a way that meets the real world. Can we start there, at least?
edit: click the inverted triangle if you are a jack-ass
The Apple Music style guide is a great and accessible place to look if you're interested in music metadata: https://help.apple.com/itc/musicstyleguide/en.lproj/static.h...
DDEX is the set of metadata standards used across the music industry: https://ddex.net/
Also worth knowing that not all content on streaming services is music! Some of it is spoken word, ASMR, non-music field recordings, etc. The difference between sound and music is subjective of course though.
On an artistic note, music can of course be presented in an infinite amount of ways. Not all these can be represented (i.e., re-presented) on streaming platforms. Installations and generative music for example. That's ok! To be able to represent all this music (and non-music) requires restrictions, otherwise we wouldn't be able to create programs for it in the first place.
Sometimes restrictions on sound content are to make the streaming services friendlier to use, like the formatting of featured artists in titles. Royalty laws put restrictions on the handling of different roles, like songwriters and composers.
BUT nothing is stopping you from building your own way of representing music digitally, so long as you follow relevant laws.