One less-secretive way I've seen pregaps used is for live recordings.
The crowd noise betwixt songs can be contained in a pregap, so that it is only ever heard when listening to the album straight-through (instead of in shuffle or track-program mode).
---
Another fun feature of audio CDs is indexes.
A disc can have 99 tracks, and each track can have some pregap (including track 1, as the article discusses). And each of these 99 tracks can be further subdivided with 99 index markers.
This gives a CD the theoretical ability to have 9,801 selectable audio segments.
Although realistically, I've only owned a couple of CD players that even displayed index numbers and exactly one CD player (a Carver TL-3300) that allowed a person to seek to a given index number within a track.
(And I've only known one CD to actually make use of indexes in any useful manner, which was a sound effects CD from the early 1980s that had a lot more than 99 sounds on it -- all organized by tracks, and sub-organized by index marks. I just can't think of the name right now.)
My personal CD ripping script is configured to leave all pregaps after track one at the end of the preceding track when splitting them out as individual files. It gets ripped in one DAO pass for guaranteed preservation of all samples when using gapless playback on live recordings. Track navigation then works just like a real CD without having to listen to an incongruous section of audio meant to link the previous track on sequential play or, even worse, missing it altogether.
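That splitting rule can be sketched roughly as follows. This is a minimal illustration only, assuming the index positions are already known as CD frame offsets (75 per second); `Track` and `split_points` are hypothetical names, not part of any actual ripping script.

```python
from typing import List, NamedTuple, Tuple


class Track(NamedTuple):
    number: int
    index00: int  # frame where the pregap starts (equals index01 if no pregap)
    index01: int  # frame where the track proper starts


def split_points(tracks: List[Track], disc_end: int) -> List[Tuple[int, int, int]]:
    """Return (track number, start frame, end frame) for each output file.

    Each file starts at its own INDEX 01 and runs up to the next track's
    INDEX 01, so the next track's pregap lands at the end of this file.
    Track 1's own pregap is not included in any file here.
    """
    files = []
    for i, track in enumerate(tracks):
        end = tracks[i + 1].index01 if i + 1 < len(tracks) else disc_end
        files.append((track.number, track.index01, end))
    return files


# Hypothetical three-track disc, positions in frames.
tracks = [
    Track(1, 0, 150),
    Track(2, 15000, 15150),  # 2-second pregap before track 2
    Track(3, 30000, 30000),  # no pregap
]
for number, start, end in split_points(tracks, disc_end=45000):
    print(f"track {number:02d}: frames {start}..{end}")
```

Because every cut falls on an INDEX 01 position, skipping to a track starts at the music, while the linking audio in the pregap is still heard at the tail of the previous file during sequential play.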
I have a classical CD from the 80's with index marks for different movements within the individual compositions represented by a handful of tracks. My understanding is that DG was the only publisher routinely using them. That required some manual intervention to convert the indices to separate tracks. Sony was pretty good about providing index nav. on their full size stereo players. At least until their perpetually cruddy remotes eventually failed.
That's probably the best way to do it, given common toolsets and players. I also rip pregaps as lead-outs (rather than the lead-ins that the structure may appear to suggest).
It's things like this that make me wish that we'd landed on a good, popular way to store albums (with metadata!) instead of individual tracks -- or to at least reassemble individual tracks' files properly into whole albums without glitches and weirdness. (FLAC/cue can do some of this, but hardware player support is nearly nonexistent.)
I've been told that this is a stupid thing to want, and I want it anyway.
I'm old enough to remember listening to albums the whole way through by default since anything else would take extra steps, and perhaps fortunate-enough to have generally preferred listening to albums where that is a thing that is also worth doing intentionally.
(And yet, I am young enough to still be bitter about Lars killing Napster. My dissatisfaction is multifaceted.)
In addition to lossy compressed track files I also generate a FLAC with embedded cue as a master copy of the original. It's useful for recreating the whole recording for mass editing. I have a few discs mastered with preemphasis that needed correction. I too hope there will be a day when all FLAC players support track navigation. The reality is the music album has had its day in the sun and will largely be a forgotten curiosity like the typewriter or rotary phone.
You're not wrong. New music isn't frequently recorded with the intent for it to be heard in an album-oriented way.
But the albums I like to listen to as albums will remain cohesive albums for an eternity.
Lots of stuff from Roger Waters is cohesive in that way, which is perhaps something a person might expect me to say.
But also lots of stuff from Maynard James Keenan, Trent Reznor, and even Marilyn Manson is also this way, which is perhaps less expected.
(And sure, I can rip an album as an album and convert that to a singular MP3 that I can play as an album almost anywhere, and it needs to be a single file since MP3s can't be perfectly concatenated. But then, I can't easily skip around on that singular album when it behooves me to do so.
I could do both things when it was still in CD format.)
Billie Eilish's latest is intended to be listened to as a complete album (but of course the fact that this is known as an exception proves the general rule...)
Her recordings are excellent. They generally sound simply fantastic. When turned up on the big stereo, they tickle every auditory input I have -- including the usually-strictly-tactile ones.
I've heard that her brother, who is probably (and perhaps obviously) her biggest fan, generally has a huge part in producing and mixing her music. It is apparent that they work well together.
Anyhow, thanks. That album is on the list for the next time the neighbors have left for the weekend.
Ideally, I think I'd want a singular container (of whatever sort) that has the album's audio, the music-related timing metadata (as applicable), and whatever other metadata may be appropriate (lyrics? liner note graphics? music videos? sure!).
The audio should be able to be FLAC. But it should also be able to be anything else, like Vorbis or MP3 or AAC or IDK. It needs to be able to be played continuously without aberration (which can't actually be done with a group of MP3 streams).
The audio needs to be able to be seekable, like a CD is also seekable. By track. By index. (With pregaps, where appropriate -- because CDs also have pregaps.)
Other potential metadata must be able to include whatever subcodes are involved in things like CD+G[0] and HDCD and CD Text, since all of those are supersets of the regular datastream and playback is compatible with any CD player.
And it needs to be a singular container file because...well, that's just easier to keep track of as the years go by and data migrates.
Only then, will we have the beginning of a valid archive format for audio CDs as they actually still exist on [some] store shelves today.
(Some stuff can be optional, just as lots of things are optional inside of an MKV container for a film.)
[0]: Almost nobody ever used this outside of the 1990s karaoke world, but Information Society's self-titled album includes an illustrated sequence, with lyrics, that is completely implemented in CD+G and that runs for the entire length of the album. And I should be able to render that locally here in 2024 from a container on my pocket supercomputer instead of watching a bad rip from a Sega Genesis: https://www.youtube.com/watch?v=b89sSa8QlLg
You mentioned MKV - Matroska (MKA for audio, MKV for video) could honestly work quite well for this situation with just a little extra standardization.
Audio codecs: use a single stream of whatever codec you'd like. FLAC/Vorbis/MP3/AAC/Opus/etc. can all go into Matroska.
Seekable: Use chapters for tracks, and nested chapters for indices. Matroska documentation even gives an example of using ChapterPhysicalEquiv 20 for CD tracks and ChapterPhysicalEquiv 10 for CD indices.
Other metadata can be muxed into the stream as well.
Lyrics can be included as text in metadata (lyric tag) or as a subtitle stream.
Liner note graphics (and basically anything else) can be included as embedded files.
Music videos can be video streams in the Matroska file.
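The chapter layout described above (a chapter per track, nested chapters per index) might be sketched like this. The element names follow the Matroska chapter elements mentioned in the comment; the exact XML schema a given muxer accepts is an assumption here, and `build_chapters`, `chapter_atom`, and `timestamp` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def timestamp(frames: int) -> str:
    """Convert CD frames (75 per second) to HH:MM:SS.mmm."""
    total_ms = frames * 1000 // 75
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def chapter_atom(parent, start_frames, name, physical_equiv):
    """Append one ChapterAtom carrying a start time and a physical-equivalent level."""
    atom = ET.SubElement(parent, "ChapterAtom")
    ET.SubElement(atom, "ChapterTimeStart").text = timestamp(start_frames)
    ET.SubElement(atom, "ChapterPhysicalEquiv").text = str(physical_equiv)
    display = ET.SubElement(atom, "ChapterDisplay")
    ET.SubElement(display, "ChapterString").text = name
    return atom


def build_chapters(tracks):
    """tracks: list of (title, [frame offsets]); the first offset is index 01."""
    root = ET.Element("Chapters")
    edition = ET.SubElement(root, "EditionEntry")
    for number, (title, indices) in enumerate(tracks, start=1):
        track_atom = chapter_atom(edition, indices[0], title, 20)  # 20 = CD track
        for idx_number, frames in enumerate(indices, start=1):
            chapter_atom(track_atom, frames,
                         f"Track {number} index {idx_number}", 10)  # 10 = CD index
    return root


# Hypothetical two-track disc; the first track has a second index mark.
root = build_chapters([("Intro", [150, 4650]), ("Song", [15150])])
print(ET.tostring(root, encoding="unicode"))
```

Pregaps are not modeled in this sketch; they would need either their own chapter or an agreed convention layered on top.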
I'm glad to see this mentioned. This was the first thought I had as I progressed through this thread. I'm surprised this isn't a popular, supported standard already.
Nested chapters can work for index markers, especially if a player supports them right.
I mean: As mentioned, these have almost never been usable with real CD players in the wild. Maybe not much is lost there. (But the format must still accept these things, and allow them to be usable! An archival format must respect all aspects of the item being archived, including those that are unpopular or disused. I am willing to die on this hill.)
What of things like CD+G? Here in 2024, they're very simple graphics using 35+-year-old tech, and they should be archived neatly, precisely, and without interpretation, to be rendered client-side at a later point. I think I've mentioned it, but we literally have pocket supercomputers in common use today. If we can make the complexities of MAME work for the past couple of decades, and do it with direct ROM dumps, we can do this for CD+G.
But the CD+G must be rendered synchronously with CD audio data on playback. This applies whether it is my Goldilocks example of an Information Society album, or whether it is a CD+G karaoke disk with Garbage's I'm only happy when it rains (and twelve other crowd pleasers from that month of 1995).
How will that work with MKA?
And how will pregaps work?
(Maybe MKA isn't an ideal container if it does not already include avenues that lead to this kind of functionality in ways that are compatible with the original article.)
Interesting point about CD+G. I think whatever format was used needs to take this into account.
There were also a ton of Audio CDs that were not CD+G but had a data track with the music video etc on them.
I worked on a horrible one for Sony, one of those ones with all the anti-rip protection on it, where I was tasked to build a binary blob for a web site that detected if the specific audio CD was in your drive and let you into the web site. What were those things called, ActiveX?
Sony had plenty of awful stuff at different times for audio CDs, despite being a co-developer of this wildly-successful and long-lasting format.
I think you're referring to ActiveX, yes. It's the only thing I can think of where "web" stuff and "hardware" stuff commingled back then in a semi-transparent way.
Anyway, I'll just assume that you aren't the rootkit guy -- or even if you are, that your heart is in the right place.
---
And yes, CD+G is important. As are the mixed-mode releases with video. All things CD audio are important if we are to talk about an ideal archival (and playback!) format for audio CDs, and archiving an audio CD is not always quite as simple as ripping a folder of FLACs -- there's a ton of diversity here that FLAC (and cue) can't accurately embody.
We're fortunate that we still have so many CDs right now, and that they're still being sold today. This will change. (It must change. It can't not change.)
The good folks working on the Domesday Duplicator have a relatively uphill battle for the often-older (and often rotting) LaserDisc media that they're working on tools to properly preserve.
It would be good to get ahead of the curve and get something with a practical workflow working sooner instead of later.
You are almost describing the MAME CHD format. They have the problem that the object (hard drive, CD, DVD, etc.) must be in one file, with the ability to store differences (writable in some cases) and with compression (compressed hunks of data). They also need that sub track data, as some systems do interesting things with it; some even hide their encryption format in the SBI fields. The CHD format is more like a container that acts like whatever media it was, depending on what system it is hooked up to. The downside is there is no concept of 'metadata' to find different things in a CHD. It is up to the system it is hooked up to to interpret what that data stream is.
This could be a good avenue. It might be possible the CHD format could be extended and be backward-compatible, or even just as simple as bodging all the extra data onto the end of the file and hope it is ignored by other readers. This is an avenue worth exploring, thank you.
There is a way to extend the format, as it has a version number. It does have some metadata fields (drive geometry, compression formats, version of MAME it was created with, etc.). The trick would be getting the MAME team to accept the changes. I would guess they would not be too happy with just dumping it on the end of the file.
There are a number of requests out there to extend the chd format and fix a few things. They are currently tracking some of this info in XML files (called hash files). They would be down with more accurate information though. So you would need a proposal that adds more accurate information and gives them something to work with. More like getting all the info that something like the redump project tracks in there would make them very happy.
There is a separate project that some other emus use, libchdr, which is a soft fork of MAME. I think they are trying to track closely to what the MAME group is doing but let other emus use it.
Can’t an MP4 container do most or all of that already? (Pregaps would probably need to become a full-fledged chapter in their own right, with the current spec.)
Cue is a bodge that should never have become a de facto standard. Joerg Schilling's cdrdao tool has its own TOC format that faithfully captures everything including index marks, various flags, and multilingual CD text but it was ignored by everything else in the heyday of the ripping era. Nowadays we'd be better off with a standard yaml/json format that duplicates what cdrdao provides.
This is an issue for movie discs too. Some mkv rips will preserve chapter data (though player support is spotty), but in the end it's still a big linear file— menus, intros, trailers, optional features, etc are all gone once it's ripped unless you rip that stuff to separate video files.
Which I get on the one hand, but it's a bummer that in all these cases (CD, DVD, Blu-Ray) the metadata for the larger structure of the production got inextricably tied to the specific physical media implementing it, such that the only real way to preserve that data was to rip a full disc copy.
They were physically robust but the carbon button contacts always became dodgy for me. I tried to avoid Sony products for this reason because I encountered it so often in other people's gear. I have a remote from the late 00's that saw virtually zero use and it conked out with age alone.
I had a set of CDs that went with an intro to music theory textbook around 2009. It also made heavy use of indices in tracks to do exactly the same thing. My car stereo listed each index as a track.
I wish I could remember the name of the textbook because I really liked a lot of the baroque music on the CDs and can't remember who they were by or the titles of the songs...
Do you remember if the textbook was orange (possibly with a two-tone cover design)? I had a really good textbook in college that had a... 4? CD set (with the big jewel case) that had a bunch of tracks and like you I really enjoyed it.
It was a reddish (could be orange, could have been maroon) color lightly mottled in black with a picture of a violin (or cello, idk) set in the lower 2/3 of the cover. I'm somewhat certain it had 6 cds because it filled my disc changer in my stereo, although that detail is fuzzy too.
I mastered a CD in 2000 for a band that wanted a secret track at the end. I came up with a novel way to do it.
There were a dozen regular tracks. A bunch of empty ones. And the final track over about a dozen tracks of varying length with no gap. Used all 99 tracks.
I could only pull it off with this CD burning software that didn’t have a UI. It took a text file as input at the command line. But it could do everything from almost every color of spec (Red Book, Blue Book, etc) for CDs.
I've had visions of putting a CD together that was that way, but with pregaps and indexes utilized as well.
"WTF? The time counter keeps going forward, and then sometimes it goes backwards! And using the track seek buttons completely eliminates some parts that I can hear if I don't touch anything!
It's a whole different song entirely when you program tracks 39, 40, and 52 in a loop, and IDFK what it is with this Index number that only always showed "1" before.
Oh wait. Srsly? From tracks 71-93, it's using the index to count beats...and the track number to count measures? No, that can't be it. Except...."
I thought I'd really (ab)used the CD specs, but don't recall ever trying indexes. Curious how most CD players, which only had a two-digit track indicator, handled indexes. I would have used that if I had known about it.
I wasn't aware these existed either. I suspect the answer is incredibly boring: most CD players simply won't seek to an index; pressing skip track will just skip to the next track, ignoring any indexes present.
> Broken was re-released as one CD in October 1992, having the bonus songs heard on tracks 98 and 99 respectively, without any visual notice except for the credits, and tracks 7–97 each containing one second of silence.
Broken was absolutely perfect to put into a multi-disc player along with TMBG's Apollo 18, which contains "Fingertips", a suite of 21 very short songs. Set it to shuffle songs from everything in the player, and enjoy your sonic whiplash
Was this better with the crazily fast-loading chonka-chonk slam-slam nature of a Pioneer 6-disc cartridge changer, or with something slower and perhaps more-civilized like a period-correct Technics 5-disc changer, with its nearly-silent and relatively exquisite, seemingly-careful demeanor?
(Both have their merits, but I unfortunately have neither at hand. And I only have one of these 2 albums. And one of those albums is the original Broken, which only has 6+2 tracks across two discs instead of 99 tracks on one disc.
And how do the 91 silent tracks on a more-common release of Broken affect things compared to the 26 musical tracks that the original 6+2+18 track-count ensemble may entail, in terms of inter-song delay or any other such thing on a real multi-disc changer?
I know TMBG fairly well, and NIN very well, and I enjoy the fuck out of gear, but I have so many questions.)
(I vaguely jest above, but Spotify only shows me 18 tracks on Apollo 18. And only one of them is Fingertips. Am I looking at this wrong?)
Presumably Spotify has glommed all the Fingertips into one file. On the original CD release it was twenty-one separate tracks; there was a bit in the liner notes that explicitly encouraged you to put it on shuffle. https://tmbw.net/wiki/Fingertips
I could not tell you what brand the all-in-one turntable/radio/tape deck/cd player I had at the time was. There was a big tray with room for five CDs and I have absolutely no memory of how much noise it made when switching from one disc to another, and every physical object involved in this affair is long gone in a hurricane.
I suspect both CDs should be easy to find used copies of, if you have the appropriate hardware and want to experience the tension of not knowing if the next song you hear is going to be Trent bitching, a brief moment of silence, a Fingertip, or whatever else you put in the player. Given my tastes at the time this would have probably been Skinny Puppy, Ozric Tentacles, and Björk, but do whatever feels like the most interesting possible choice; I have a disc lying around now that’s nothing but forty iterations of Satie’s Vexations and that would have certainly been a prime choice for this little game.
There wasn't really much effort involved. Pick a few discs off the CD rack that I thought will clash interestingly, load up the cd tray of the cheap all-in-one turntable/tape deck/radio/cd unit I had in my room, hit the 'shuffle' button until it tells me it's gonna shuffle everything together, hit play.
Looking Fantomas up on Wikipedia makes it sound like they'd go pretty well with "twenty tracks that sound like the choruses of twenty different songs" and "ninety-something 1s blank tracks plus a few industrial songs", though.
I'm curious if you have a specific example of an album with the crowd noise between tracks like that? I collect and rip hundreds of CDs and am always on the look out for edge case discs to further hone my tools.
On your pregap + 99 indexes remark, the "pregap" is the space between index 00 and 01 which continues on up to index 99. Players seek to index 01 as the start point of the track. There is no separate pregap designation. I've paid special attention to this because it is a difficult problem to solve as many discs have space between tracks stored in index 00-01 but rarely is there anything audible in there after the first track. The only example I have of this is a specialty music sample disc, Rarefaction's A Poke In The Ear With A Sharp Stick, that has over 500 samples on the disc accessed by track + index positions.
As a sidebar based on the later comments in the thread, I've made it a habit to rip and store every audio CD as BIN/CUE+TOC using cdrdao. This allows me to go back and re-process discs I may have missed something on. But even that is imprecise, because it usually breaks Blue Book discs with multiple sessions to store data, due to absolute LBA addressing. Also, the way different CD/DVD drives handle reading data between index 00-01 on track 1 is maddening. Some will read it, some will error, and the worst are those that output fake blank data.
>I'm curious if you have a specific example of an album with the crowd noise between tracks like that? I collect and rip hundreds of CDs and am always on the look out for edge case discs to further hone my tools.
E.g. the Japanese version of Flying Lotus' album "Until The Quiet Comes" has a pregap of 5 seconds before the 19th track, to separate it from the rest of the album, as it's a Japanese-exclusive bonus track.
Not sure why I didn't think to mention this before: One tool you might consider is redumper, it's designed in particular to handle multisession CDs, and it attempts to over-read into the disc lead-in and lead-out to catch data outside of the range covered in the TOC (particularly common in older CDs). It only outputs a final split bin+cue, but everything read, including scrambled data, toc, and subchannel/subcode, is saved for future processing. The bin+cue can be used with ISO Buster (and probably other tools) to access Enhanced CD filesystems. Feel free to reach out if you need some tips, this is what I use for my collection.
Caveat: It is mostly intended for use with the low-level features of Plextor drives, so CD support on other drives is relatively limited; in particular it doesn't have any overlapping read paranoia-style features. The recommendation is to dump twice to confirm; it's running straight through without seeking so that's usually still quite fast.
Seven minute pregap on disc 1 track 4 of https://vgmdb.net/album/5549 , it's a whole long discussion between songs, with some audience cheering. VGMdb follows the "append pregap to previous track" convention, that's why track 3 looks so long. There's similar but shorter gaps with banter on tracks 2 and 7.
Cuesheet looks like:
TRACK 04 AUDIO
INDEX 00 00:00:00
INDEX 01 07:34:43
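For anyone checking the arithmetic, here is a small Python sketch that turns those `MM:SS:FF` positions into frame counts (FF counts frames at 75 per second) and reports the pregap length. `msf_to_frames` and `pregap_frames` are hypothetical helper names, and the parser only handles the simple `INDEX` lines shown above, not a full cue sheet.

```python
FRAMES_PER_SECOND = 75


def msf_to_frames(msf: str) -> int:
    """Convert a cue-sheet MM:SS:FF position to a frame count."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def pregap_frames(cue_lines) -> int:
    """Return INDEX 01 minus INDEX 00 for one TRACK block, or 0 if either is missing."""
    positions = {}
    for line in cue_lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "INDEX":
            positions[int(parts[1])] = msf_to_frames(parts[2])
    if 0 not in positions or 1 not in positions:
        return 0
    return positions[1] - positions[0]


cue = ["TRACK 04 AUDIO", "INDEX 00 00:00:00", "INDEX 01 07:34:43"]
gap = pregap_frames(cue)
print(f"{gap} frames, about {gap / FRAMES_PER_SECOND:.1f} seconds")
```

For the cue sheet above this works out to 34093 frames, a little over seven and a half minutes of pregap.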
Semi-related: "Minidisc" is an album by Gescom (who are really Autechre in disguise) released, as the name suggests, only on Minidisc, containing 88 tracks which are designed to be played on shuffle, because Minidisc, unlike CD or any other physical format, can be shuffled with no audible gap between tracks.
Each track is designed to segue into any other so the album is different every time you play it.
I was responsible for some of the first digital content ingestion for the world's record labels back in the late 90s, which was all based around trucks filled with retail CDs being fed into CD-ROM drives and an army of young folks grinding hundreds of track names into a database. (what happens when a truck full of East Asian CDs turns up? what about all those albums by Aphex Twin and Sigur Ros with untypeable names? https://www.treblezine.com/wp-content/uploads/2014/08/aphex-... )
I love these hidden tracks to death, especially the two hidden pregap tracks on Ash's first album, but they caused me unending pain and suffering.
Not only are they an absolute nightmare to rip, often with more than one song per track (so the WAVs have to be edited), the names of the songs are often totally unknown, even to the record labels. What do you even number the things in the metadata?
Added to that, you nearly always didn't even know they were there, so the negative numbered tracks would fail to get ripped and all the other ones in between or at the end would get ripped in weird ways and confuse all the data folk.
One memorable album using this was Queens of the Stone Age’s Songs for the Deaf. If you rewound from the start of the first track you got 90 seconds of strange sounding (but tuneful) rumbles and bleeps and bloops.
When I looked it up online I found out it was called “The Real Song for the Deaf”. It was literally a song for deaf people, the idea was that if they turned it up enough they’d be able to hear the vibrations forming a song.
Same for 311’s Transistor. I remember accidentally doing it and wondering if I’d somehow distorted the fabric of spacetime. Took me a while to figure it out.
I read this whole thing twice and I now know what pregaps are and the history but still have no idea why people would put them on a CD or why they’re useful for hidden tracks.
An audio CD is mostly arranged like a single continuous recording. Tracks are added on top of this via the Q subcode channel that gives information about the current location and the ToC stored in the lead-in area (also using the Q subcode channel). In the ToC, each track will have one or more indexes that point at a specific location on the disc by minute, second and "frame" (representing 1/75 of a second, basically a sector).
If a CD is properly following the Red Book standard, index 0 will point to a 2 second pre-gap of silence and index 1 will point to the actual start of audio of the track (additional indices are allowed, but not common). The purpose of the pregap is to make life easier for less sophisticated players that aren't able to seek to a precise frame on the disc. They just have to be able to hit a 150 frame region. However, just because the standard says the pregap is supposed to be 2 seconds and silent doesn't mean it actually has to be. Players generally don't care and by the time the format was popular, even inexpensive players could seek precisely. This allows you to stick audio data before a track that will be skipped by the player when it's trying to seek to that track. If you stick it before track 01, it will be skipped even when just playing the disc through unless you rewind.
The key for a hidden track at the beginning is that players usually start playing track 1 from index mark 1 (1.1) rather than index 0 as with continuous play through all subsequent tracks. The lead-in area for 1.0 is a holdover from grooved phonorecordings never meant to be played. It's a way for the primitive hardware of early CD players to acquire the start of the data stream in a safe area that doesn't have to be faithfully reproduced.
Some players permitted you to skip back from 1.1 to 1.0 to hear the lead-in as a hidden pseudo-track. Typically this was only possible with hardware index nav. buttons rather than the track nav. buttons, further obfuscating the presence of the hidden track.
The other means of "hiding" tracks is to have a bunch of short silent tracks until you get to track 99 (inconvenient to reach on a player without numeric track entry) or to have a long section of silence starting on the last track from index 1.
>Typically this was only possible with hardware index nav
Holding the previous track button would "rewind" playback and get you into the pregap on all the CD players I remember using, but these would have been late 80s models onwards.
Basically, CD audio tracks have a base sector and a start specified. That allows sectors representing audio before timestamp 0:00 to be part of the track. The reason for this originally was probably to allow the drive to get synchronized before the track started. Enterprising CD masterers put actual hidden audio data in that area, which would allow you on some CD players to rewind past 0:00 and then play the hidden audio at the negative timestamp.
Incredible! Songs in the Key of X was the only album I ever knew to do this, and it wasn't even the first. I had no idea so many others did the same thing.
Edit: Son of a *, I've had a copy of Sister Machine Gun's Burn for almost 30 years and never knew there was a hidden track!
Classic X-Files album is the one I think of too. And how they hinted to everyone that there even was a hidden track on the sleeve: "'0' is also a number". (and the technical fine print about the disc possibly not being Red Book compliant)
The writing is not good; I gave up part way through. It's weirdly elliptic and almost autistic in its focus on details and an almost complete absence of a big picture. It could do with some kind of proper context-setting introduction, at the very least.
A CD audio frame is defined as exactly 1/75th of a second (588 samples per channel). I don't know why the article waffles around with these poor wordings (emphasis mine):
> These albums all had a pregap of either 32 or 33 “frames,” with a frame representing a length of about 1/75th of a second, per Hydrogen Audio’s Wiki.
> To offer a small correction to the original question now that we know we’re talking about 74 frames per second rather than 60 or 100
It's needlessly confusing and undermines my confidence in the entire article.
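The frame arithmetic is easy to verify. A quick Python sketch, assuming standard CD audio parameters (44.1 kHz, 16-bit, stereo); the constant names and `frames_to_ms` are just illustrative:

```python
SAMPLE_RATE = 44_100      # samples per second, per channel
FRAMES_PER_SECOND = 75    # CD frames (sectors) per second
CHANNELS = 2
BYTES_PER_SAMPLE = 2      # 16-bit PCM

samples_per_frame = SAMPLE_RATE // FRAMES_PER_SECOND
bytes_per_frame = samples_per_frame * CHANNELS * BYTES_PER_SAMPLE


def frames_to_ms(frames: int) -> float:
    """Duration of a number of CD frames in milliseconds."""
    return frames * 1000 / FRAMES_PER_SECOND


print(samples_per_frame)   # samples per channel in one frame
print(bytes_per_frame)     # audio bytes in one frame
for frames in (32, 33, 150):
    print(frames, "frames =", round(frames_to_ms(frames), 1), "ms")
```

So 44100 / 75 divides evenly into 588 samples per channel, which is why the frame length is exact and not "about" anything, and the 32 or 33 frame pregaps mentioned in the quote are a bit under half a second each.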
This mirrors my experience. It's good content but I'd barely land on one splainer before being segued into the next one. I kept thinking I missed the part where they delved into hidden tracks.
As the article says, it's like an easter egg, putting a hidden song before the first track of a CD. If the song wasn't in the pregap it wouldn't be hidden. It's just for fun.
(Sometimes songs can also be hidden in tracks at the end of the CD like 99, but that feels less mysterious.)
Sometimes CDs would have a long piece of silence at the end of the last song and then another song on the same track.
Other CDs really experimented with the shuffle feature. They Might Be Giants’ Apollo 18 had a bunch of very short tracks that would usually play between songs when shuffle was used.
Broken was first released as a 2-disc set. It was still in a many-fold Digipak case, but also included was a 3" mini-CD that had Suck and Physical (You're So).
The regular-sized CD looked about identical to the 99-track version, but had only 6 tracks.
(It was expensive to do this, and was never intended for long-term production. Later versions were generally as you describe.)
I remember my friend accidentally found the negative track on a CD and called me up out of breath like aliens just landed. I think it was one of the early AFI albums. We spent the whole weekend checking for negative tracks on every CD we could find.
The negatives between songs were also pretty cool sometimes, Mediocre Generica by Leftover Crack makes very good use of them. Listening to it over streaming or even mp3s ruins the effect, unless someone captured the entire album as one file.
> The negatives between songs were also pretty cool sometimes
And one of these, the interlude at the end of “High Roller” on TCM's Vegas which is part of the pregap for “Comin' Back” https://i.imgur.com/G5PSCy3.jpeg
I wouldn't call it an “early” album but I found one of these (untitled 18-second intro) on AFI - DECEMBERUNDERGROUND
I think it was Answer That and Stay Fashionable. We all stopped listening to them when Black Sails came out, so if I'm right about it being AFI, it was one of the first 3 albums, or the All Hallows EP.
Those names are a rabbit hole event horizon. Album released 9/11, working title of Shoot The Kids At School was rej by label. Follow up was F WTC. Band lives in C-Squat...
I've been using https://github.com/whipper-team/whipper to digitize CD's, and it supports identifying Hidden Track One Audio (HTOA) when it exists and is not blank.
Add in MusicBrainz Picard and Navidrome and you have a really nice solution.
Whipper user here also. If you've not yet encountered it, as it's not as prevalent in repos as Whipper, Cyanrip is always very much worth a look and has come on in leaps and bounds, with recent updates adding (non compliant) .cue sheet support.
This specification anomaly sounds like the polycarbonate equivalent of vinyl's multiple-groove capability. [0]
I'd first heard of this for a Monty Python record (wikipedia notes this is in fact the most famous use case) but checked to see if people went for >2 grooves, and seemingly they did. I expect the casting for the pressing was horrendously expensive, which is why it didn't happen an awful lot.
I suppose both mediums shared the less-well-hidden feature where a long silence separated the penultimate from the ultimate track.
When I was very young, my parents had a game called 'wacky races' that was based on a multi-groove vinyl. It was a horse-racing game - I can't recall exactly how the gameplay worked, but the vinyl contained racing calls where the races would start the same way but the outcome would be somewhat random depending on which groove the needle ended up following.
This is supremely cool, thanks for sharing. I'm probably missing something obvious but why would the casting be any more expensive than any other pressing?
Not terribly informed about the pressing process, but as I understand it, it is (or was) effectively a player-process in reverse.
A needle creates the groove(s), replete with bumps for sound, in a not-quite-set (slightly soft) master disc - and I _speculate_ those follow a specific path defined by the mastering tool.
In comparison, playback just drops the needle in the track, and it necessarily follows the extant spiral form.
Making the master of a multi-groove record I'm assuming would require recalibration of the groove-defining mechanism (doubtless carefully designed for conventional layout), once for each of the grooves you want to make, ensuring they each stay within the boundaries defined by the previous grooves.
Also on the topic of trying to push the compact disc to its limits: a grindcore group had a bonus track where "All efforts were made to exceed typical limitations of 16 bit linear digital technology compression, limiting, and equalization curves have been created to deliver maximum gain structure".
I had a period of bad luck in my youth where I believed all these new enhanced CDs and shaped CDs were damaging the tracking of the lens on my CD player so I gave Exit-13 a swerve and started to listen to safer music ;)
Things like CDs with their large number of partly-compatible extensions shoehorned in remind me that whenever one is writing a specification, one should make sure that every combination of bits/bytes is either valid with defined behaviour, or invalid.
The one exception is a field for "extensions", which should have some bits for 'compatible' extensions (ie. there will be extra data ignored by readers which don't understand them), and other bits for 'incompatible' extensions (ie. you have put a DVD into a CD player).
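A minimal sketch of that compatible/incompatible split, in Python. The flag names and bit layout here are hypothetical, made up purely for illustration; they do not come from any CD specification.

```python
from enum import IntFlag


class Compat(IntFlag):
    """Hypothetical 'compatible' extensions: old readers may ignore these."""
    LYRICS = 0x01
    COVER_ART = 0x02


class Incompat(IntFlag):
    """Hypothetical 'incompatible' extensions: readers must understand these."""
    NEW_SECTOR_LAYOUT = 0x01
    NEW_MODULATION = 0x02


# What this imaginary reader implements.
KNOWN_COMPAT = Compat.LYRICS
KNOWN_INCOMPAT = Incompat.NEW_SECTOR_LAYOUT


def can_read(compat_bits: int, incompat_bits: int):
    """Return (readable, ignored_compat_bits) for a disc's extension fields.

    Unknown compatible bits are skipped and reported; any unknown
    incompatible bit means the reader refuses instead of guessing.
    """
    unknown_incompat = incompat_bits & ~int(KNOWN_INCOMPAT)
    ignored_compat = compat_bits & ~int(KNOWN_COMPAT)
    return unknown_incompat == 0, ignored_compat


# Known incompatible feature plus an unknown compatible one: still readable.
print(can_read(0x03, 0x01))
# Unknown incompatible feature: refuse.
print(can_read(0x00, 0x02))
```

The point of the two fields is that a reader never has to guess: anything it doesn't recognise is either explicitly safe to skip or explicitly a reason to stop.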
I had a Rammstein album, that if you rewound before track 1 there was a black box audio recording of a plane crash where everyone died. It was pretty macabre. I think the CD cover was like a plane's black box if I remember correctly.
No, that's what they were named after. They released an eponymous song on their first album (Herzeleid) that played with the visual imagery of that disaster ("Rammstein - a human burns / Rammstein - the smell of meat in the air / Rammstein - a child dies / Rammstein - the sun is shining"). They apparently initially wanted to name the band Rammstein Flugschau ("Rammstein flight show") before shortening it to Rammstein. The difference in spelling was accidental but Rammstein is evocative (literally "ramming stone") so it stuck.
The album the parent mentions is Reise, Reise, which is travel themed (in a broad sense of the word), the cover being styled after a black box (being bright orange of course). The flight recording in the pregap literally comes from a black box of a plane crash so that fits.
One compact disc extension I remember well is CD+G. It was pretty wild plugging an Information Society CD into a CDTV and watching the (admittedly crappy) graphics while you listened to music and samples of Leonard Nimoy and DeForest Kelley...
I remember discovering “hidden tracks” on the Beastie Boys intergalactic album with my cousins… we were like what on earth is happening as the CD player display glitched out and played this stuff we hadn’t heard.
Yes, there's been a serious issue recently reported. Apparently, it has triggered bureaucrats on the internet who can't acknowledge something innocuous that's never caused any problem for decades.
This inspired me to read up on the low-level details of CD structure. I'm curious if anybody scanned an entire CD and shared the results, so that we could work with a raw image of disc that contains all its quirks, as opposed to the typical .iso format?
It's really difficult. Unlike floppy disks, where you tell the drive to seek and get back raw magnetic pulses (so you can produce raw flux images), or hard disks where you tell the drive to read an arbitrary sector and get a blob of data (so you can produce sector-level images), the protocol for talking to a CD ROM involves asking for track/sector addresses, which means you have to trust the drive to interpret all the track metadata and error-correction for you - you generally can't just dump the "raw" data and do the interpretation yourself.
That's why the most robust CD image format is the BIN/CUE format. The BIN file contains all the sectors the drive allows us to read, the CUE file contains the disc metadata as interpreted for us by the drive firmware.
There are some drives which support extra "raw read" commands, but they're incredibly rare and consequently in great demand by CD preservation projects like redump.org.
Some people have used the contents of BIN/CUE data to reconstruct what should actually be on the disk, but that's not quite the same thing. Here's a great explanation of the CD structure in all its complexity:
Even BIN/CUE is not enough. It cannot store subchannel data like CD+G and is only able to hold a single session which breaks bluebook CDs with audio and data.
We do not currently have a widely supported CD standard for storing data from a CD that can properly hold all data. Aaru [0] is close, but still has to output back to other formats like BIN/CUE to use the contents of the disc.
Apparently makemkv forum members created some patched firmware that lets you raw read BRs for the sake of extracting metadata that’s intentionally hidden for DRM. Though I’ll have to recheck my understanding since you’re saying you can’t actually raw read disks anyway
Audio CDs were never ripped/transferred as ISO files. ISO-9660 is a filesystem that came years later, and Redbook audio CDs simply do not contain files.
If you want to look at the structure of a whole audio CD, then one way is to rip it with a decent tool (perhaps cdrdao or EAC) and generate a bin/cue file pair as an output.
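For what it's worth, once you have a cue sheet the pregaps discussed in this thread are easy to spot: they are the distance between a track's INDEX 00 and INDEX 01. A rough Python sketch, assuming a single-file cue sheet with MM:SS:FF timestamps at 75 frames per second; the file name and timings in the example are invented.

```python
import re

FRAMES_PER_SECOND = 75  # Red Book audio: 75 frames (sectors) per second


def msf_to_frames(msf: str) -> int:
    """Convert a cue-sheet 'MM:SS:FF' timestamp to a frame count."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def pregaps(cue_text: str) -> dict:
    """Map track number -> pregap length in frames (INDEX 01 minus INDEX 00)."""
    result = {}
    track = None
    index00 = None
    for line in cue_text.splitlines():
        line = line.strip()
        track_match = re.match(r"TRACK (\d+) AUDIO", line)
        index_match = re.match(r"INDEX (\d+) (\d+:\d+:\d+)", line)
        if track_match:
            # New track: forget any INDEX 00 seen for the previous one.
            track, index00 = int(track_match.group(1)), None
        elif index_match:
            number = int(index_match.group(1))
            position = msf_to_frames(index_match.group(2))
            if number == 0:
                index00 = position
            elif number == 1 and index00 is not None and track is not None:
                result[track] = position - index00
    return result


# Made-up example: 18 seconds hidden before track 1, 2 seconds before track 2.
EXAMPLE_CUE = """\
FILE "album.bin" BINARY
  TRACK 01 AUDIO
    INDEX 00 00:00:00
    INDEX 01 00:18:00
  TRACK 02 AUDIO
    INDEX 00 03:40:10
    INDEX 01 03:42:10
"""

for number, frames in pregaps(EXAMPLE_CUE).items():
    print(f"track {number}: pregap of {frames / FRAMES_PER_SECOND:.2f} s")
```

Tracks without an INDEX 00 line are simply left out of the result, which is how most cue sheets express "no pregap audio here".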
But that's not my goal. I'd like to be able to observe every groove, the physical encoding of data, and see if I could implement decoding from scratch. First problem is though that I don't know how to get a microscopic image of the disc.
You don't need a microscopic image of a disc to do that; a two-dimensional photograph is of essentially no advantage here.
All you need is the unmolested data from that disc. The data is arranged on a singular spiral groove starting from the center and slowly winding its way towards the outside.
The data is completely linear: It begins at the beginning, and continues to the very end without interruption. This is all akin to (although opposite of) how a single-track vinyl record is physically laid out. The entire CD -- whatever it contains -- is just a continuous string of pits and lands.
And to observe that string as it appears on a real disc, all you need to get started is a regular old-school CD player and some appropriate data acquisition gear, and maybe an oscilloscope to help figure out what you're looking at.
The optics and basic motor controls are already solved problems, and it doesn't even have to be particularly fast data acquisition gear by today's standards to record what is happening.
Look into the Domesday Duplicator project for Laserdiscs as an example of how what ssl-3 is talking about can be done using a high sample rate input. That exact process is possible and with enough storage and processing power can be used to get the most "low level" access to the data. It is not for the faint of heart though, and can take around 1TB of storage and hours of CPU time to process full movies in this way, I know because I've done it.
I believe I've seen there is work being done to attempt this on CDs but it would have still been in the exploratory phases and not yet ready to start archiving with. It might seem like overkill to do this to something meant to be digitally addressed but I've experienced enough quirks with discs and drives when ripping that I would 100% be willing to switch over to a known complete capture system to not have to worry about it anymore. Post process decoding also allows for re-decoding data later if better methods are found.
The "unmolested data" would still have undergone error correction though, wouldn't it? I don't think a bin/cue rip would contain the redundant stuff, which GP seems interested in, nor the subcodes (of which some are represented in the cue file, while the bin file is PCM audio).
Ah, I see. So what kind of capture hardware could read from that point? I assume it's a digital signal taking the form of two voltages, flipping on the order of 3.6 MHz (16 billion pits to read over 74*60 seconds). With Red Book audio at 1.4 Mbps, more than half of the raw data must be devoted to things like redundancy and other non-PCM stuff, if my interpretation that pits==bits isn't far off.
Aside: is your username inspired by Secure Socket Layer or Solid State Logic?
I'm getting off into the weeds of what I know here, so take this all with a grain of salt. (I probably used to know more about all of this than I do right now.)
The difference between a pit and a land is an optical phase change. The pits and lands vary in length, and there are 9 valid variations in their lengths. This combined phase/temporal situation eventually (thanks, science folks from 1970-something!) turns into a serial binary electrical signal inside of a CD player.
This binary electrical signal can be recorded.
Recorded with what, you asked?
CDs have a lot more going on than just audio data: Remember, there's forward error correction at play and (by spec, IIRC) a player is supposed to be able to completely recover data even if there is a gap of 1mm due to a scratch or other interruption. (There's also room for tricks like CD+G to live in the background, and certainly what may seem like an inordinate amount of data used just for clocking: CDs are CLV, so playing them happens at a continuously-varying rotational speed in a tightly closed loop because buffer RAM was expensive to buy, and expensive to manage, and tight speed control was cheaper to implement. Remember, this was a finished digital product that was released in 1982.)
I find old references[0] that suggest that the raw data rate of a CD (it does not matter what kind) is 4.3218 Mbps.
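That figure can be sanity-checked with a little arithmetic. This is a back-of-envelope Python sketch, assuming the usual frame layout of 588 channel bits carrying 6 stereo samples (24 audio bytes), and assuming a capture stores exactly one bit per channel-bit period, which ignores any oversampling a real capture would need for clock recovery.

```python
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Plain PCM audio rate: the "1.4 Mbps" figure.
pcm_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

# Each frame on disc carries 6 stereo samples in 588 channel bits.
SAMPLES_PER_FRAME = 6
CHANNEL_BITS_PER_FRAME = 588
frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME
channel_bps = frames_per_second * CHANNEL_BITS_PER_FRAME

# Share of the raw stream that ends up as audio samples.
audio_fraction = pcm_bps / channel_bps

# Size of a full 74-minute capture at one stored bit per channel bit.
disc_seconds = 74 * 60
total_channel_bits = channel_bps * disc_seconds
capture_gb = total_channel_bits / 8 / 1e9

print(f"PCM rate:        {pcm_bps / 1e6:.4f} Mbps")
print(f"Channel rate:    {channel_bps / 1e6:.4f} Mbps")
print(f"Audio share:     {audio_fraction:.1%}")
print(f"74-min capture:  {total_channel_bits / 1e9:.1f} Gbit, about {capture_gb:.1f} GB")
```

So roughly a third of the raw stream is audio samples, and the rest goes to modulation overhead, sync, error correction and subcode. Note that these are channel bits, not pits: a single pit or land spans several channel bits, so the pit count is lower than the bit count.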
So, to posit some example hardware: With careful loops and decent wiring, accurately capturing this seems like it would be well within the purview of an RP2040's PIO's DMA modes to get that data into RAM, and also well within one of its 133MHz 32-bit ARM core's ability to package up and deliver that data over USB 2 to a host machine that can store it for later analysis -- plus or minus a transistor or two, or maybe a pullup resistor in just the right spot.
(But that's just my opinion as a home hacker who has dabbled in RP2040 PIO assembler, and who is at or a bit beyond their knowledge of compact discs. I may wake up tomorrow and decide that the above is all bullshit and wish I could erase all of it. If in doubt, Philips datasheets for CD player chipsets from the first half of the 1980s can probably help a lot more than I can.)
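To make the "9 valid lengths" bit above concrete, here is a small Python sketch of the first step a from-scratch decoder would take once it has measured pit and land lengths. It assumes, as I understand the modulation scheme, that every pit/land edge is a channel-bit 1 and everything between edges is 0, with valid lengths of 3 to 11 channel-bit periods. The function name is my own, and everything after this step (sync detection, the 14-to-8 lookup, error correction) is left out.

```python
MIN_RUN, MAX_RUN = 3, 11  # valid pit/land lengths, in channel-bit periods


def runs_to_channel_bits(run_lengths):
    """Turn a sequence of pit/land lengths into a channel-bit string.

    Each run starts at an edge, which is written as '1'; the remainder
    of that pit or land is written as '0's.
    """
    bits = []
    for run in run_lengths:
        if not MIN_RUN <= run <= MAX_RUN:
            raise ValueError(f"invalid run length: {run}")
        bits.append("1" + "0" * (run - 1))
    return "".join(bits)


# Three consecutive runs: a short one, the longest allowed, then a 4-period one.
print(runs_to_channel_bits([3, 11, 4]))
```

A run outside the 3 to 11 range is a sign of a read error (or a bad clock estimate), which is why the sketch raises instead of silently passing it along.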
---
As to the username: It's old. It predates Secure Socket Layer, but it's way newer than Solid State Logic. I was just a young kid with a new modem when I dialed into a Telegard BBS and started to sign up for an account, and got stuck at the prompt to enter a "Handle". I didn't know what a handle was in this new-to-me context.
The sysop saw that I was stuck and dragged me into chat, as good sysops (hi Shawn!) tended to do upon seeing such a thing. We chatted for a bit, and I wasn't feeling creative, so he suggested that maybe I could look around for inspiration since most people used a made-up handle on his particular BBS.
I found a 5.25" floppy disk on the desk that I'd borrowed from the local public library. It was labeled "Selective Shareware Library, Volume 3." (It was also almost certainly infected with the Stoned virus[1]).
Anyhow, that was sufficiently inspiring, so ssl-3 it was.
You might do well enough with https://en.wikipedia.org/wiki/Cdparanoia without needing to use different hardware to scan the disc. Instead it relies on the CD drive's ability to report on inaccuracies in keeping in sync with the grooves.
I wonder if you could just tear the controller out of a CD/DVD drive and build a new one from scratch, kind of like the new floppy controllers being used now to read the raw magnetic data. You could just command the head to move to the center, find the beginning of the data and just keep reading until you hit the buffers.
Floppies (most of them, anyway) have fixed track widths, and these tracks are arranged cylindrically, and these cylinders align with the steps of the stepper motor that is used to actuate the head assembly.
It's relatively easy, with the right ratio betwixt step advancement and track width, to get the head moving properly on a new implementation of a floppy controller. Want to read track 1? Step the head N times to reach track 1 from wherever it started, and read it. Next, want to read track 33? Step the head N times to track 33, and read that.
But tracking the spiral groove of a CD is a very different problem to solve. Steps tend to lose their meaning. Instead of electromagnetic steps, it involves 3 different laser beams: Two to continuously keep the head centered where it needs to be on the ever-changing groove using a servo feedback loop, and a third to read the data from the pits and lands from the middle of that groove.
Is it do-able? Sure! People with far less advanced tech than we on HN might have laying around did it 40+ years ago.
It's just a very different nut to crack than reading a floppy is, even if the mechanical and optical bits are recycled.
(And that's just head positioning. The pits and lands still need to be read, and those reflect back from the disc as optical phase shifts, not as changes in magnetic polarity and/or amplitude.)
Coming back to this, having read some of the (great!) replies, I'm going to go out on a limb and say that in theory, this sounds possible, and fun, but highly impractical. I'll assume that by "scan" you mean a high end "flatbed scanner" optical scan which would return a 2D bitmap.
It's impractical because the resolution required to retrieve the data "flatbed scanner style" is comically high, perhaps 50k dpi, far beyond the capability of any commercial unit and well into scanning microscope territory. Sure, from my understanding, it looks technically possible. But it would be a very significant and costly project just to assemble the image in the first place. Even if you had that, the resulting file would be hilariously huge (something like 122GB), extremely difficult to work with, and you would be starting from scratch implementing some kind of visual pathfinding helical decoder to painstakingly unravel the linear coil of data the scan just sort of blatted into two dimensions.
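A quick back-of-envelope check of those numbers, as a Python sketch. The assumptions are mine: a standard 120 mm disc scanned as a square image, 16-bit grayscale, no compression, and the nominal 1.6 micrometre track pitch.

```python
DISC_DIAMETER_MM = 120.0
MM_PER_INCH = 25.4
SCAN_DPI = 50_000
BYTES_PER_PIXEL = 2       # assumption: 16-bit grayscale, uncompressed
TRACK_PITCH_UM = 1.6      # nominal spacing between turns of the spiral

# Pixels along one side of a square scan that just covers the disc.
pixels_per_side = round(DISC_DIAMETER_MM / MM_PER_INCH * SCAN_DPI)
total_pixels = pixels_per_side ** 2
size_gb = total_pixels * BYTES_PER_PIXEL / 1e9

# How many pixels span the gap between two neighbouring turns.
pixel_size_um = MM_PER_INCH * 1000 / SCAN_DPI
pixels_per_track_pitch = TRACK_PITCH_UM / pixel_size_um

print(f"{pixels_per_side:,} x {pixels_per_side:,} pixels")
print(f"about {size_gb:.0f} GB uncompressed")
print(f"about {pixels_per_track_pitch:.1f} pixels per track pitch")
```

Under those assumptions the image lands in the same ballpark as the "something like 122GB" figure, and even at 50k dpi there are only about three pixels between neighbouring turns of the spiral, so it is arguably still too coarse to resolve individual pits cleanly.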
It's a cool idea. But it's comically, exponentially harder than just using the equipment as intended to just read the laser returns off the disk directly, into a far, far more easily dealt with format.
I'm adding that CD scan to my list of things I'd like to do if I ever get really rich.