If I understand correctly, you want to make playlists portable by associating each item with enough metadata to allow identification even when some metadata, such as filepath, changes.
Playlist and metadata are separate things, and coupling them is not good from an architectural perspective (redundancy/denormalization). The two can still be related, even in a single file: relational database. (plug: I'm working on http://jstimpfle.de/projects/wsl/main.html)
This is more normalized, less redundant. Of course other schemes are thinkable, for example storing each playlist in its own file, but this way metadata and playlists will more quickly get out of sync.
The library I'm working on allows automated conversions of such hierarchical data to/from relational databases, so data can be viewed, and even stored, as JSON, while integrity is checked based on a relational schema - something like
% TABLE Track trackid
% TABLE Filepath trackid filepath
% TABLE SHA512 trackid sha512
% TABLE Title trackid title
% TABLE Composer trackid composer
% [Declare also domains, keys, and foreign keys here...]
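To make the normalization argument concrete, here is a minimal sketch using Python's built-in sqlite3 instead of the library mentioned above. The `PlaylistEntry` table, the `t1` track id and the helper names are assumptions made for illustration; only the Track/Filepath/Title tables come from the schema lines above.

```python
import sqlite3

# Hypothetical schema: Track/Filepath/Title mirror the tables sketched above,
# PlaylistEntry is an assumed table that links playlists to track ids.
SCHEMA = """
CREATE TABLE Track    (trackid TEXT PRIMARY KEY);
CREATE TABLE Filepath (trackid TEXT NOT NULL REFERENCES Track(trackid),
                       filepath TEXT NOT NULL);
CREATE TABLE Title    (trackid TEXT NOT NULL REFERENCES Track(trackid),
                       title TEXT NOT NULL);
CREATE TABLE PlaylistEntry (
    playlist TEXT NOT NULL,
    position INTEGER NOT NULL,
    trackid  TEXT NOT NULL REFERENCES Track(trackid),
    PRIMARY KEY (playlist, position)
);
"""


def build_demo_db():
    """Create an in-memory database holding one track and one playlist."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Track VALUES ('t1')")
    conn.execute(
        "INSERT INTO Filepath VALUES ('t1', 'Anciients/Following the Voice.mp3')"
    )
    conn.execute("INSERT INTO Title VALUES ('t1', 'Following the Voice')")
    conn.execute("INSERT INTO PlaylistEntry VALUES ('Favorites', 1, 't1')")
    return conn


def playlist_tracks(conn, playlist):
    """Return (title, filepath) pairs for a playlist, in playlist order."""
    rows = conn.execute(
        """
        SELECT Title.title, Filepath.filepath
        FROM PlaylistEntry
        JOIN Title    ON Title.trackid = PlaylistEntry.trackid
        JOIN Filepath ON Filepath.trackid = PlaylistEntry.trackid
        WHERE PlaylistEntry.playlist = ?
        ORDER BY PlaylistEntry.position
        """,
        (playlist,),
    )
    return rows.fetchall()
```

Because the playlist only stores the track id, moving the file means updating one row in `Filepath`; every playlist that references the track picks up the new path without being rewritten.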
Hmm, yes, this is clearly a better approach, architecturally. However, I don't know whether it's worth the tradeoffs, given that it will make the file harder to parse, and playlists tend to have each song only once, so the added complexity seems to come at no benefit...
It's definitely a better approach for a library, though, and it's how one would model the data in a database as well.
A side note: at least in Python, I benchmarked MsgPack as 3 times faster than JSON, and a whopping 850 times faster to read than YAML. It seems unlikely that people will ever be editing this file by hand, so for unstructured data, especially where playlists can get very large, I suggest MsgPack over YAML.
BUT... you have a defined schema, so it's probably to your advantage to use a storage format with a defined schema: ProtoBuf or Thrift. That would mean somebody trying to use your code would already have generated objects in their language.
As for the hashing algorithm, this is a good use for MD5 -- cheap and fast. You're unlikely to be concerned about somebody actively trying to generate the same checksum for two music files. For non-security (integrity verification) purposes, MD5 is still very appropriate.
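For illustration, a minimal sketch of a whole-file checksum with Python's standard hashlib. The function name and the chunk size are assumptions; the algorithm is a parameter so the same code covers MD5 or a SHA variant, whichever the format ends up listing.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large audio files are never loaded whole."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Usage would be something like `file_digest("song.mp3")` for MD5 or `file_digest("song.mp3", "sha256")` if collisions are a concern.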
Thank you, I agree. If I'm going to use JSON, I might as well use ProtoBuf.
About MD5, I was worried about a case where a service that serves user-submitted files would be exploited by MD5 collisions, leading users to open files that might exploit decoder bugs to execute code. Far-fetched, I know, but the tradeoff didn't seem worth it. I'm not married to that decision, though.
The question of hash usage made me think of an alternative approach -- what about some kind of audio perceptual hash? P-hash has support for audio hashes [1] (at least it claims to, but I've never used it). The metadata is useful, sure, but coupling it with the playlist seems like a bit of a strange design choice, if it could be avoided. In my mind, an ideal world would have two databases (or equivalent): one for metadata -> perceptual hash, and one for playlist -> List[perceptual hashes].
The downside of course is this requires pre-calculation of the p-hash for every track to use. But I can't think of a music application that doesn't require some kind of "library loading" step, so perhaps this could be accomplished then?
Of course none of this mitigates your concern with decoder bugs resulting in RCE, but I think that's probably best handled elsewhere (for example, sandboxed upload validation in your hypothetical user-uploaded-files service).
The most annoying thing is not dedicating the time to cultivate a playlist of your favorite tracks, but LOSING ALL of them when you move or rename a file. And that is something that happens in 99.99% (I'd say) of all cases with any music library over time.
Example: I just moved all mp3, m4a files to the microSD card on my android phone, keeping names and folders identical, but my playlists are all empty now. Thank you Samsung! grr..
Making a path-independent playlist format that doesn't depend on a p-hash (but uses one if available or requested) is what I'd really want. The metadata should always be saved inside the file, because metadata gets lost when you change the program. Filling up an sqlite db with all the metadata saved in your audio files would only speed up metadata management and sync, but take control of it away from you.
Features I think make sense to expect from a perfect music player (without vendor-lock-in), be it run on mobile, web, desktop, cli or as a daemon:
• Playlist export options for e.g. Samsung Music Player, iTunes or whatever crap we're locked into currently
• Save metadata incl. rating always inside audio files, because metadata gets lost when you change the program
• Extract metadata from audio files into a database for speed, management and easier sync into the files
• Audio fingerprint may allow detection of duplicates (hash independent) and similar tracks, genre classification and mood maps
• Batch-convert between flac, mp3, ogg, mp4, m4a if the user wishes, without stupid dialogues. 128 kbit/s LAME-encoded MP3s don't sound any better converted to "best quality ogg"..
• Create P-Hashes (or another perceptual hash) initially, when idle, periodically or when requested
The AcoustID fingerprint is exactly what you describe, and is already supported by the format. I agree that my crypto concerns are probably handled elsewhere, and, since I'm not actually doing crypto, I might as well add MD5/SHA1, which are more ubiquitous.
Haha, the bikeshedding argument is fair, but not terribly so. It's good to receive feedback of all sorts, and then the designer (me) thinks about it, distills it and makes something that's (hopefully) better than what was there before.
It's only bikeshedding when the decision has to be made by community!
Nothing's wrong per se (and it has roughly the same goals as mine), it's just that it's pretty much the XML equivalent of M3U, and doesn't have any provisions for stronger identification (it relies on titles), at least as far as I know.
I would really like the MBID to be the main way of identifying songs throughout the industry, and possibly the AcoustID fingerprint, as that's more specific. I think it would be fantastic if each song had its own UUID that computers could refer to, and this spec tries to incorporate that.
Since I don't really gain anything from reusing its format (most players won't support the strong IDs), I thought I might as well create a new format that was better suited to this task.
XSPF is a format to enable sharing, AKA universality. It does this by defining a list of metadata fields to be used for resolving each track in the local context of the listener.
On a surface level you can use XSPF like any other playlist format. Drop a bunch of filenames into an XSPF document, prepend "file://" to each, and you're ready to go. Under the surface there is much more.
The guiding design principle was to separate the functionality of a catalog of files from the functionality of a list of songs. Most software music players on the PC have some sort of cache for file information. This cache stores a list, or catalog, of available files and metadata from ID3 tags and other sources. XSPF is not a catalog format. XSPF exists only to say which songs to play. Almost everything in XSPF is for the purpose of answering the question which resource, rather than the question what is this resource.
If XSPF is not a catalog format, what is it? XSPF is an intermediate format. We expected a new kind of software called a content resolver to do the job of converting XSPF to a plain old list of files or URIs. A content resolver would be smart enough to keep your playlists from breaking when you move your media from /oggs to /music/ogg. It would be able to figure out that a playlist entry by the artist "Hank Williams" with the title "Your Cheating Heart" could be satisfied by the file /vorbis/hankwilliams/yourcheatingheart.ogg. It might even know how to query the iTunes music store or another online provider to locate and download a missing song.
The content resolver maintains the catalog of your songs in whatever format it prefers. It might use a flatfile, a file in the Berkeley DB format, or a SQL database. It might use only ID3 metadata, but it might also know how to query MusicBrainz or another metadata service.
All XSPF user agents are content resolvers, in that they have complete leeway to turn the contents of a track element into a specific set of bytes.
3.5 Fuzzy names
Any given track can be identified in a number of ways. We provided means for absolute identifiers like URIs, filesystem paths and secure hashes, but also for query-based identifiers — free text fields like artist and work title and numeric fields for song length, all of which together should be enough for a good content resolver to turn into files.
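A rough sketch of what such a content resolver could look like: try an absolute identifier first, then fall back to the free-text fields. The `Catalog` class, its method names and the normalization rule are assumptions for illustration and are not taken from the XSPF spec.

```python
import re


def normalize(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


class Catalog:
    """Hypothetical local catalog a content resolver could consult."""

    def __init__(self):
        self.by_hash = {}  # content hash -> path
        self.by_name = {}  # (normalized artist, normalized title) -> path

    def add(self, path, artist, title, sha256=None):
        if sha256 is not None:
            self.by_hash[sha256] = path
        self.by_name[(normalize(artist), normalize(title))] = path

    def resolve(self, entry):
        """Return a local path for a playlist entry, or None if unresolved."""
        # 1. Absolute identifier: an exact hash match wins.
        path = self.by_hash.get(entry.get("sha256"))
        if path is not None:
            return path
        # 2. Query-based identifier: fuzzy match on artist and title.
        key = (normalize(entry.get("artist", "")), normalize(entry.get("title", "")))
        return self.by_name.get(key)
```

With `/vorbis/hankwilliams/yourcheatingheart.ogg` added under artist "Hank Williams" and title "Your Cheating Heart", an entry that carries only those two text fields resolves to that file no matter which directory it lives in.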
That sounds like a noble goal, but in a "modern" format designed for the same thing, I'd expect the role of the "identification key" for the files to be played by a format-standardized-algorithm audio fingerprint. Audio fingerprints are the only thing you can really expect to be "portable" between music libraries, when people can put arbitrary things in the ID3, and both combined tracks and compilation albums exist. And then, as long as you have such a key, the format doesn't need to consist of much else. It's just a list of keys.
MD5 fails many common use cases and sees differences where there are none. The result will be frequent failures to identify potential matches between different files for the same song.
Ah, yeah, if ID3 tags change after the initial MD5 ran, then the file would not be recognized. Good point.
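A small sketch of that failure mode and one partial workaround: hash only the bytes between the ID3 tags. It assumes an MP3-style layout with an optional ID3v2 tag at the start and an optional 128-byte ID3v1 tag at the end; other tag types and containers are not handled, and a re-encoded file would still hash differently, which is the case an audio fingerprint is meant to cover. The function names are hypothetical.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Remove a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4 synchsafe size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)  # 7 significant bits per byte
        footer = 10 if data[5] & 0x10 else 0  # ID3v2.4 footer flag
        data = data[10 + size + footer:]
    # ID3v1 tag: the last 128 bytes, starting with "TAG".
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]
    return data


def audio_digest(data: bytes) -> str:
    """SHA-256 over the file contents with ID3 tags stripped."""
    return hashlib.sha256(strip_id3(data)).hexdigest()
```

Two copies of the same encoding with different tags give different whole-file MD5s but the same `audio_digest`.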
The trade-off would seem to be that if the audio analysis hash needs to scrape through a huge playlist and it's running on a slow ARM processor, then it would be impractical and no one would want to use it.
MD5 might not be ideal, but how often do people edit their files after making one of these playlists?
This design assumes a bit of standing infrastructure around it: namely, that you have a "music library manager" program, all your music is in it, and it fingerprints tracks on import and keeps a fingerprint-keyed index.
If that's true, then nothing has to happen on export. You just dump the hashes you already know.
And then, when you try to load someone else's playlist, all you're doing is a bunch of hash-table lookups against the index you already have, to see if you have tracks with matching fingerprints.
(And the initial generation can also be made cheaper, if online music stores also adopt the fingerprint format, such that tracks you buy come with their fingerprints pre-calculated and embedded into the ID3 metadata. Then you can just dump those straight into your index on import.)
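The import side of that workflow can be sketched in a few lines, assuming the library manager already holds `(fingerprint, path)` pairs. The function names and the plain-dict index are assumptions; a real manager would more likely keep the index in a database.

```python
def build_index(library):
    """Map fingerprint -> local path from an iterable of (fingerprint, path)."""
    index = {}
    for fingerprint, path in library:
        # Keep the first path seen for a fingerprint; duplicates are ignored.
        index.setdefault(fingerprint, path)
    return index


def resolve_playlist(fingerprints, index):
    """Split a shared playlist into tracks we own and tracks we are missing."""
    found, missing = [], []
    for fp in fingerprints:
        path = index.get(fp)
        if path is None:
            missing.append(fp)
        else:
            found.append(path)
    return found, missing
```

Each lookup is a single dictionary access, so loading someone else's playlist costs time proportional to its length and no audio analysis at all.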
Spot on on everything, except the "assumes" part. I'm writing a utility that will convert to/from PLS/UPL without the infrastructure (granted, it will do it on the fly, and it will require some tags to be in the files):
We have enough playlist formats to support already (and most of them are half-baked/half-broken already). There is no good reason to not reuse and extend xspf.
(Last time I counted 16 major playlist formats in VLC...)
Isn't one good reason the fact that you avoid the whole confusion of "I imported my XSPF playlist with all my IDs but my player didn't find any songs!" "Oh, that's because your player only supports XSPF 1, not 1.1"?
There is a version 0 and a version 1. They are identical except for minutiae related to date formats.
In version 0 of XSPF, dates were specified as an ISO 8601 date. In version 1 dates were specified as xsd:dateTime. This is the same thing (with better documentation) for almost every date in history, and as there are no playlist creation dates that might be different, there are no real world playlists that would be incompatible.
One of the key things I've looked for in a playlist format for many years is the ability to handle files that contain multiple tracks. My main example of this is continuous DJ mixes that are a single file. You might also see this in live recordings of performances where there is interaction with the crowd between songs.
These single file media items generally have 1 or more Album Artists (e.g. the DJ) with each track in the performance blending into the next. Each track may have an associated Artist and Track name. Conventionally the best approach to solving these is to use a Cue sheet and a plugin for whatever media player that can load these simultaneously. Alternatively (though rarely) a container format that supports chapters / sub streams (e.g. mka) can be used.
I don't see a secondary (CUE sheet) format as being the right solution to this problem. To me a playlist and a list of tracks played in a performance are kind of the same thing and should be handled by the playlist.
My proposed extension to the UPL format is to allow each playlist entry to specify a part of the identified file. Now there are two possible ways of doing this:
1. Sub items under the entry (e.g. chapters)
2. Each entry specifies a start and end time (or duration)
You mention at [1] that this format is about display, I feel this fits within that idea by being something that is otherwise difficult to display using any other available info (ID3 tags etc.)
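A sketch of the second option (each entry carries a start and an end time), assuming times in seconds. The `Segment` class and its field names are hypothetical and not part of the UPL spec.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One playlist entry that points at part of a longer file."""
    uri: str
    artist: str
    title: str
    start: float                 # offset into the file, in seconds
    end: Optional[float] = None  # None means "until the end of the file"


def segment_at(segments: List[Segment], position: float) -> Optional[Segment]:
    """Return the segment covering a playback position, for display purposes."""
    for seg in segments:
        if seg.start <= position and (seg.end is None or position < seg.end):
            return seg
    return None
```

A player showing a single-file DJ mix could call `segment_at(mix, current_position)` to display the artist and title of whatever is playing right now, which is the display use case described above.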
+1, I was going to suggest the same thing. Or at least the ability to note multiple files which are supposed to be chained together in order. It pisses me off when Shuffle doesn't handle these properly:
- Pink Floyd, Brain Damage followed by Eclipse
- Queen, We Will Rock You followed by We Are the Champions
- Yes, Long Distance Runaround followed by The Fish (Schindleria Praematurus)
- the different movements of Beethoven's Fifth Symphony
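One way a player could honor such chains is to shuffle blocks instead of single tracks. The sketch below assumes the playlist marks a chain as a nested list; that representation and the function name are assumptions, not something the format defines.

```python
import random


def shuffle_keeping_groups(entries, rng=None):
    """Shuffle a playlist where each item is a track or an ordered list of tracks.

    Lists are treated as indivisible blocks, so chained tracks stay adjacent
    and in their original order.
    """
    if rng is None:
        rng = random.Random()
    blocks = [e if isinstance(e, list) else [e] for e in entries]
    rng.shuffle(blocks)
    # Flatten the shuffled blocks back into a single play order.
    return [track for block in blocks for track in block]
```

For example, `shuffle_keeping_groups(["Long Distance Runaround", ["Brain Damage", "Eclipse"], ["We Will Rock You", "We Are the Champions"]])` can put the three blocks in any order, but "Eclipse" always directly follows "Brain Damage".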
This is great, thanks for creating and sharing it. I've been working on my own music player of sorts [1] that uses YAML to store playlists, and I'm going to look into switching to this.
Oh, fantastic! Please let me know if you need help with anything or have any feedback. An issue in the project repo is great, or just email me directly (email in profile). Thanks!
Copying the actual songs into a directory that becomes "the playlist".
It's very inefficient and inelegant but after the 100th car stereo I encountered that couldn't make sense of the most basic m3u file, I just gave up ... my "playlists" are just more copies of the actual songs.
I think rather than having a single URI attribute and other identifiers, you ought to allow multiple URIs, and make all other identifiers just be URIs (since, after all, a URI is a Uniform Resource Identifier, and music files are resources).
You can represent hashes using the named-information URI scheme (https://tools.ietf.org/html/rfc6920), e.g. ni:///sha-256;u88lYWn4xAlto-6Bs79KHHYDAu28US71ui5Be6C-ZVw.
Your filepath could be file:///Anciients/Following%20the%20Voice.mp3; your mbtrackid could be musicbrainz:b00a2b97-53f1-485a-9121-1fe76b55e651 (since I don't think an authority makes sense in this case).
This would also permit multiple entries for certain types of identifier, which doesn't make so much sense for hashes (although it's possible for the same recording to have two different MP3 encodings, so … maybe it's not crazy), but would be nice when e.g. a song is found in multiple places in the filesystem, or can be retrieved from multiple locations.
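As a sketch of what this would mean in code: building an RFC 6920 `ni` URI from file contents (SHA-256, base64url without padding) and grouping a track's identifier URIs by scheme. The function names are hypothetical, and the `musicbrainz:` scheme is the ad-hoc one suggested above, not a registered scheme.

```python
import base64
import hashlib


def ni_uri(data: bytes) -> str:
    """Build an ni:///sha-256;... URI (RFC 6920) for the given bytes."""
    digest = hashlib.sha256(data).digest()
    # RFC 6920 uses the URL-safe base64 alphabet with the "=" padding removed.
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{b64}"


def group_by_scheme(uris):
    """Group identifier URIs by scheme, keeping their order within a scheme."""
    groups = {}
    for uri in uris:
        scheme = uri.partition(":")[0].lower()
        groups.setdefault(scheme, []).append(uri)
    return groups
```

A player could then walk the groups in its own order of preference, for instance trying `file` URIs first and falling back to `ni` or `musicbrainz` lookups against its library.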
This isn't too different from the current format, except the URI method is pushing the type into the URI, rather than into the name of the key.
I do wonder about what you said, though: Would it be useful to be able to specify two different file paths or URIs for the same item? I'm leaning towards no, because, when clicking on an entry, you want it to start playing, so the choice is made when adding the file, rather than when playing it. The problem with multiple file paths is that both have the same weight, so how do you choose?
I guess you'd choose the same way you choose between the formats already, based on whatever is available/found.
You're right, I have updated the spec to make each item a list. Since the URIs can be anything (including files), this is pretty much equivalent to your proposal.
Is the lack of portability really due to the unavailability of a (or yet another) standard format? Seems it's far more related to the music player's decisions about what its playlist format should be...
I think a few people have open-sourced various audio fingerprinting algorithms. Perhaps we should standardize around one and enable real UUID for audio files? That seems to be the missing piece for a lot of the novel functionality you're looking to hit. Otherwise you're stuck doing fuzzy pattern matching on metadata, which seems tricky especially considering possible variation in live recording etc.
I don't think the two are different goals. I would very much like standardization around the MBID and AcoustID fingerprint, and my playlist format is just a playlist that uses that. There's no way to use those with a PLS file, for example, even if they were standard and ubiquitous.
Not yet, but great idea, I've added it, thank you!
YAML is a superset of JSON, so the file can look like JSON if you want, although that's not mandatory. I don't think there are many languages that don't support YAML these days, and one of my goals was to make the format not-entirely-unreadable by humans.
Hm, yeah, that's true, and I see your point. Right now I think the benefits outweigh the costs, but I'll definitely think about this (for embedded players, etc).
Another consideration is that yaml's sheer complexity creates more possibilities for interoperability issues. No two parsers for JSON behave identically, let alone for something as large as yaml.
Yeah, I looked at pyyaml's documentation last night while writing pls2upl, and the format is so complex that this looks like it's going to be a clusterfuck. I'm thinking of switching to JSON in the end...
In practice there aren't many languages that support anything more than yaml 1.0, from my last look a few years ago.
I.e. yaml isn't nearly as well supported as it looks once you actually try to use it between languages with anything more than 1.0, as most of the purported library support seemed to be roughly 2009 abandonware.
Yaml also precludes easy https://www.sqlite.org/json1.html and postgresql jsonb etc. Not to mention trivial js consumption of the file.
Superset or not, Yaml automagically introduces bugs to and from json through the magic powers of more lines of code involved.
Most projects would have a hard time not having a json library these days? I've not done a survey of music apps though.
In practice, as someone who has used vim/emacs for a few decades, I'd put a low priority on yaml vs json text editability. They are about the same for changing a few characters, and I'd probably use orgmode + python for anything non-trivial that isn't a full program (and probably drive the transform via json data from somewhere...).
In 2017 yaml doesn't seem tasteful for this format.
Really? For me, editing JSON is always a chore, because you have to match the multiple nested brackets, otherwise your whole file is invalid and you get to have fun finding out what you did wrong. YAML, in comparison, can be visually inspected quickly.
However, with all the apprehension in the thread, I'm rethinking this decision. Maybe TOML would be better?
Pretty printed json isn't too bad, but I intentionally never bulk edit it. This format shouldn't end up heavily nested anyway?
Toml is nice for simple configs/Cargo.toml but I think everything I said basically applies to it as well. Toml has recent work on it in 2017 for some (but not broad) library support but what does it look like in 2024?
> I think everything I said basically applies to it as well.
TOML is a much simpler format than YAML, and is a dialect/formalisation of INI files which have yet to go out of style. Direct database storage seems like a red herring (why would you want to store a playlist as a json blob?), so does "trivial JS consumption of the file".
[0] the entire spec is 10 pages, and that's with lots of examples, YAML 1.2 is ~80 pages
I, on the other hand, loathe editing YAML configuration files because the nesting is so obtuse and difficult to figure out based on limited examples in open source projects. JSON is much easier.
I finally changed my mind and switched to JSON after I tried to write pls2upl and realized that YAML is much larger than I thought, and with so many features that incompatibilities between clients would be a real problem.
> Probably in this day and age it would have better chance at adoption if it would be a JSON file. It's just network effects.
And here I was feeling badly for thinking that it'd be better as S-expressions! I wasn't going to post that, but since you already did, here's the same example in the original YAML:
---
format: UPL1
name: Favorites
id: 2b43009f-d6a6-4f00-8533-09a9a73d8b54
entries:
  - artist: Anciients
    title: Following the Voice
    duration: 408.764081632
    ids:
      sha2: e577cce68a69735acccd5d8603b3e663f6aa5bc9
      sha3: e577cce68a69735acccd5d8603b3e663f6aa5bc9
      mbtrackid: b00a2b97-53f1-485a-9121-1fe76b55e651
      filepath: Anciients/Following the Voice.mp3
      uri: nfs://example.com/music/ftv.mp3
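For comparison with the JSON direction discussed in this thread, here is a sketch that holds the same example as a plain Python structure and writes it out with the standard json module. It assumes that `ids` holds the five identifier keys listed after it and that `entries` is a list of such items; the `to_json` helper is hypothetical.

```python
import json

# The example playlist above as nested dicts and lists (nesting is an assumption).
playlist = {
    "format": "UPL1",
    "name": "Favorites",
    "id": "2b43009f-d6a6-4f00-8533-09a9a73d8b54",
    "entries": [
        {
            "artist": "Anciients",
            "title": "Following the Voice",
            "duration": 408.764081632,
            "ids": {
                "sha2": "e577cce68a69735acccd5d8603b3e663f6aa5bc9",
                "sha3": "e577cce68a69735acccd5d8603b3e663f6aa5bc9",
                "mbtrackid": "b00a2b97-53f1-485a-9121-1fe76b55e651",
                "filepath": "Anciients/Following the Voice.mp3",
                "uri": "nfs://example.com/music/ftv.mp3",
            },
        }
    ],
}


def to_json(pl):
    """Serialize a playlist dict to indented JSON text."""
    return json.dumps(pl, indent=2)
```

Nothing in the example needs more than strings, numbers, objects and one array, so it fits in JSON without losing anything, and `json.loads(to_json(playlist))` gives back an equal structure.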
This is such a good idea. If nothing else, third party tools that people make to transfer playlists between platforms could standardize on this and, in doing so, would reduce duplicated efforts. After trying (and being dissatisfied with) EVERY music service, I'm currently trying to consolidate back to Apple Music. Manually. It's awful.
Since this is coming along rather okay, I think I will write a converter this weekend. I aim to have it accept an M3U/PLS playlist and the corresponding files and write out a UPL.
The opposite step would be a bit harder, I'd have to index all music files in a library and convert the UPL to PLS, which might be too much work for a simple tool, but I'll see if it can be done quickly and dirtily. That would be a nice idea, `upl2pls -i favorites.upl -o favorites.pls` and your favorite player is ready to play the files.
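The output half of such a `upl2pls` tool is small once the entries have been matched to local files. The sketch below assumes that matching has already happened and that each entry is a dict with `path`, `title` and `duration` keys (hypothetical names); indexing the library, the hard part described above, is not shown.

```python
def to_pls(entries):
    """Render already-resolved entries as the text of a PLS playlist."""
    lines = ["[playlist]"]
    for n, entry in enumerate(entries, start=1):  # PLS numbering starts at 1
        lines.append(f"File{n}={entry['path']}")
        lines.append(f"Title{n}={entry['title']}")
        # PLS lengths are whole seconds; -1 means unknown.
        duration = entry.get("duration")
        length = -1 if duration is None else round(duration)
        lines.append(f"Length{n}={length}")
    lines.append(f"NumberOfEntries={len(entries)}")
    lines.append("Version=2")
    return "\n".join(lines) + "\n"
```

Writing the returned string to `favorites.pls` would give a file that a player without UPL support can open.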
I'm currently writing a small converter utility to convert PLS/M3U/etc playlists to UPL and back, and to serve as a reference implementation/proof of concept. If you want to follow development, feel free to star this repo:
I'd love something like this, but I don't see any upside for iTunes or Spotify to support it. Anything that makes it easier for a customer to switch to a competitor is something they're going to resist.
They have plenty of upside to allow import through this format, as you can just upload your Winamp (yes, I'm old) playlist to Spotify and immediately migrate to their service. I'm hoping there'll be enough user demand that they'll choose not to appear evil and support export as well.
This is not in the least bit surprising. I download financial information from my accounts in the form of .csv files, but have never found any accounting software that will import those files reliably. I finally had to write my own csv parser and then tweak it for each wretched csv data provider to get things to work right.
For example, download credit card transactions as csv. Import into Quicken. Quicken decides that skookumchuck is the payee on all the transactions. Arggh. This is 2017 fer cryin out loud.
It's more suited to applications that require a URI for a single file. I don't see how it can be applied here, or what benefit it has over a simple list.
Yes, right now I'm collecting ideas (there have been some very good ones in this thread), and I plan to write a few small converter programs to convert from M3U/PLS, as well as some plugins (I'm thinking about beets, currently).
Removal performance of large directories is very bad. Reading large directories can result in lots of seeks. Directories are not suited to this at scale, and symlink directories definitely aren't portable.
The introduction sounds much more ambitious than your comment:
> The Universal Playlist Format (UPF) is a data format for music playlists. It aims to be more flexible and complete than existing formats. A Universal Playlist (UPL) will allow better description of the tracks it contains, as well as enable added functionality in music players.
I am really curious about moving from Google Play to a local library, though - can you say more about that? GP has been losing tracks, slowly but steadily, or mislabeling them horribly.
> The introduction sounds much more ambitious than your comment
It aims to be more flexible, but it doesn't specifically aim to replace them. I don't mind players using M3U, PLS, or whatever else, I just want the extra features that UPL would provide. I guess it's a fine distinction.
> I am really curious about moving from Google Play to a local library, though - can you say more about that?
Well, if Google Play allowed me to export my playlists as UPL, my local foobar2000 could easily find the tracks in my local library (I already have them all) and correlate which tracks on Google Play are which tracks locally. As it stands, I have to go into my Google Play playlist, find the thousands of songs in my playlists one by one locally, and add them to my local playlists as I find them.
Good catch, I had them there initially but decided to play it safe and avoid the broken hashes, so I removed them from the list but forgot the example, thank you.