
Why I Ripped the Same CD 300 Times - jmillikin
https://john-millikin.com/%F0%9F%A4%94/why-i-ripped-the-same-cd-300-times
======
sjwright
In case anyone was wondering why a "red book" audio compact disc isn't
designed to be resilient in a way that is familiar to a modern technologist
(or even compared to a CD-ROM disc) the basic reason is that the former is
designed to stream with no buffering directly into a DAC chip built with
technology from the late 1970s. It had to handle data rates that were
incomprehensibly massive for the time, so only the most insanely basic error
handling was feasible.

Whereas the "yellow book" CD-ROM standard got to take advantage of over four
years of technology advancement. Plus it doesn't have a real time requirement,
so there's plenty of time to calculate more advanced checksums and error
correction codes.

In many respects, an audio CD is more like a vinyl record than a data storage
format.

~~~
pwg
> In many respects, an audio CD is more like a vinyl record than a data
> storage format.

The design is nearly identical to what one would get if one simply literally
(almost) digitized a vinyl record. A single long spiral of data on a disk. The
most significant shift is that vinyl records read from outside in, while audio
CDs read from inside out. Even the method of manufacture is essentially a
clone (audio CDs are "pressed" just like vinyl records are "pressed", just in
different pressing machinery).

> built with technology from the late 1970s. It had to handle data rates that
> were incomprehensibly massive for the time, so only the most insanely basic
> error handling was feasible.

Yes, the limits of late 1970's technology had a huge impact on the overall
design, but at the same time the design also took advantage of the fact that
audio when being played back in real time is reasonably tolerant of a small
level of error resulting from the read-off of the disk. Most listeners would
never hear the random errors because they either are below the noise floor of
the amplifier tech. at the time, or because the resulting analog wave was
close enough to not be noticeably different.

~~~
sjwright
> The most significant shift is that vinyl records read from outside in

Equally significant (and equally interesting) is how they spin. Whereas vinyl
has constant _angular_ velocity—they spin at the same rate from beginning to
end, resulting in lower information density as you get towards the inner
grooves—audio CDs have constant _linear_ velocity, meaning that the rotational
speed gradually changes as the disc is played so that the information density
is equal from beginning to end.
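
As a rough illustration of the constant-linear-velocity point, the sketch below computes the spin rate needed at the inner and outer edge of the program area. The 1.3 m/s scanning speed and the 25 mm and 58 mm radii are nominal figures assumed for the example, not values from the comment, and `clv_rpm` is a hypothetical helper name.

```python
import math


def clv_rpm(radius_m: float, linear_velocity_m_s: float = 1.3) -> float:
    """Rotation rate (rev/min) that keeps the track moving past the
    pickup at a fixed linear velocity at the given radius."""
    circumference_m = 2 * math.pi * radius_m
    return linear_velocity_m_s / circumference_m * 60


# Assumed nominal radii: audio starts near the hub and ends near the rim.
inner_rpm = clv_rpm(0.025)
outer_rpm = clv_rpm(0.058)
print(f"inner: {inner_rpm:.0f} rpm, outer: {outer_rpm:.0f} rpm")
```

Under those assumptions the disc slows from roughly 500 rpm near the hub to a little over 200 rpm at the rim, whereas a record holds the same speed throughout.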

For the era, this is very cool tech. (I'm also awestruck about how they
managed to get the laser head alignment to work perfectly back then, it's
practically magical.)

~~~
theandrewbailey
Techmoan did a video about a record that is recorded "backwards" to take
advantage of the increasing fidelity in the outer grooves.

[https://www.youtube.com/watch?v=5Afikv6k1-c](https://www.youtube.com/watch?v=5Afikv6k1-c)

~~~
arthurfm
Techmoan's recent video on extended play (2.5 hour) CDs is also quite
interesting.

[https://youtu.be/5fG1crhGqI0](https://youtu.be/5fG1crhGqI0)

------
gwern
This is a little awkward. The Touhou album in question is already in the
Touhou Lossless Music Collection (at least the last release:
[http://www.tlmc.eu/2018/01/tlmc-v19.html](http://www.tlmc.eu/2018/01/tlmc-v19.html)
since it's from 2005, it's probably been in most of them), and has track 3
("The End of Theocratic Era" by "弘世"). I just checked the TTA and track 3
sounds fine.

If anyone is terribly curious what it sounds like, I've put up an OGG copy
here:
[https://www.dropbox.com/s/u88u1xpdmdxbqal/03%20%E5%BC%98%E4%...](https://www.dropbox.com/s/u88u1xpdmdxbqal/03%20%E5%BC%98%E4%B8%96%20-%20The%20End%20of%20Theocratic%20Era.ogg?dl=0)

Oh well. I'm sure it was a great learning experience anyway. :)

~~~
jmillikin
Most of the TLMC rips are range rips with unclear CUE data, which makes it
difficult or impossible to verify they were correctly ripped.

For this album in particular, the audio CRCs of the other tracks don't match
up between TLMC and a fresh rip:

TLMC:

    
    
      01.wav: 9A5E3226
      02.wav: 577A675B
      03.wav: 3548C299
      04.wav: E5DEC006
      05.wav: D33AC4AE
      06.wav: 7427AC6F
      07.wav: 83A58517
      08.wav: 0DCA8419
      09.wav: 703BAEAC
    

My rip:

    
    
      01.wav: 565F2B5A
      02.wav: E916B3A2
      03.wav: A595BC09
      04.wav: D989F0D6
      05.wav: C6A2DD2B
      06.wav: C8403284
      07.wav: D01D6BC4
      08.wav: 8786C1AA
      09.wav: 1E3641DD

~~~
scrollaway
Is there such a thing as an audiodiff to tell how different they are to yours?

~~~
jmillikin
Yes: you can use Audacity
([https://www.audacityteam.org/](https://www.audacityteam.org/)) to "diff" two
waveforms by subtracting one from the other.

For comparing .wav files you can also treat them as plain binary data, and use
standard diff tools. For example I used `cmp -l` to count differences between
rips, and `vbindiff` to view them.
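
A rough Python equivalent of the byte-counting approach, for anyone without `cmp` at hand. It compares only the PCM frames, so differing WAV headers are ignored; the helper names and the handling of unequal lengths are choices made for this sketch, not a description of what `cmp -l` prints.

```python
import wave


def read_pcm(path: str) -> bytes:
    """Return the raw PCM frames of a WAV file, skipping the header."""
    with wave.open(path, "rb") as wav:
        return wav.readframes(wav.getnframes())


def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Count byte positions that differ; extra trailing bytes in the
    longer input are counted as differences too (a sketch-level choice)."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))
```

With two rips saved as, say, `rip_a.wav` and `rip_b.wav` (hypothetical names), `count_differing_bytes(read_pcm("rip_a.wav"), read_pcm("rip_b.wav"))` returns 0 only when the audio data is byte-identical.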

~~~
thr0w__4w4y
Thanks for mentioning vbindiff. I examine/compare binary files a lot (reverse
engineering for security) and I've used a bunch of tools, but somehow I've
never come across vbindiff.

------
tambourine_man
Tangentially related: I think this is the first time I've seen an emoji URL “in
the wild”, not as a proof of concept. I don't know how to feel about it.

~~~
Qwertie
Really not a fan. Makes the url impossible to type on desktop.

~~~
pervycreeper
Will go one further: the use of emoji at all defeats the purpose of having a
language and an alphabet.

~~~
Cthulhu_
You say that, and yet Egypt became an empire whose history has been saved 4000
years later thanks to a pictogram based language.

~~~
pbhjpbhj
Egyptian hieroglyphics are phonograms though, not [very loosely defined]
ideograms -- I thought that was the point about their translation, that people
wrongly assumed they were pictorial ideograms.

------
ivan_ah
Awesome read! It's great to see someone faced with a tech challenge and not
backing down. We need more of that.

For anyone interested, here are some links to the details of error-correcting
codes (ECC) used on CD-ROM:

* [http://www.usna.edu/Users/math/wdj/_files/documents/reed-sol...](http://www.usna.edu/Users/math/wdj/_files/documents/reed-sol.htm)
* [http://www.multimediadirector.com/help/technology/cd-rom/cdr...](http://www.multimediadirector.com/help/technology/cd-rom/cdrom_spec.htm)
* [http://bat8.inria.fr/~lang/hotlist/cdrom/Documents/tech-summ...](http://bat8.inria.fr/~lang/hotlist/cdrom/Documents/tech-summary.html)

But I think CDDA possibly uses different (less) ECC... so info in links might
not be 100% relevant.

It's a shame this cool tech is not being used anymore. As an information
theorist, I used to be able to point to optical disks as an application of my
field, and be like "see all that math is useful for something," but now I
don't have anything shiny to point to anymore :(

~~~
zkms
> As an information theorist, I used to be able to point to optical disks as
> an application of my field, and be like "see all that math is useful for
> something," but now I don't have anything shiny to point to anymore

I strongly disagree, there's many shiny things you can point towards.

Point to a smartphone (and also the relevant sections of the
LTE/HSPA/UMTS/WCDMA/GSM / 802.11 standards documents); there's a literal
panoply of coding/error-correction maths that's crucial to every single one of
those standards. You can actually go to the ETSI website and find the LTE
standard and get free PDFs with the exact parameters of all the codes they
use, in enough detail to reimplement them yourself.

In those standards, there's all sorts of lovely little state machines for
coping with errors, like LTE's "HARQ loop" where your phone will tell the cell
base station that it didn't receive a chunk of data correctly; at which point
the tower will resend a _differently punctured_ version of the chunk of data,
and your phone's modem will try to soft-decode the data with _both_ versions
to work with. Oh, and that exchange (including processing on both ends and
radio transmission latency) takes under 10 milliseconds to _complete_ -- the
standard places an exigent and strict deadline on how long your phone has to
respond with its acknowledgement/request.

Also hard-drive platters are extremely shiny (more so than optical disks) and
those also use error-correcting codes, as do SSDs. Did you know that bit cells
in modern SSDs are so small that the number of electrons it takes to effect a
measurable voltage difference for the sense amplifiers is _less than 100_
([https://twitter.com/whitequark/status/684018629256605696](https://twitter.com/whitequark/status/684018629256605696))?
All your data lives in those differences of tens of electrons per bit cell; no
wonder there's ECC machinery hard at work in SSDs!

~~~
exikyut
> _...like LTE's "HARQ loop" where your phone will tell the cell base station
> that it didn't receive a chunk of data correctly; at which point the tower
> will resend a_ differently _punctured version of the chunk of data, and your
> phone's modem will try to soft-decode the data with both versions to work
> with._

Cooool.

I googled "HARQ loop" and because I visit HN so much the first result was
[https://news.ycombinator.com/item?id=11151232](https://news.ycombinator.com/item?id=11151232),
so I learned that the data-processing portion (where the decode attempt is
retried) must complete within 3ms!

I've been wondering for some time about good (bandwidth-efficient) ways to do
error correction/recovery in the area of general-purpose high-efficiency byte
transports, and wanted to put this here in case anyone is interested. How good
would it be to throw TCP's "flood of ACKs" out the window and instead compare
frame checksums (is CRC32 good?) of every, say, 32 frames, and send a bitmask
(in this case 4 bytes, one bit per frame) every _n_ frames noting which were
correct and which weren't? This would send ~32 times less data (and I just
learned a TCP ACK is approx. 74 bytes!
[https://stackoverflow.com/questions/5543326/what-is-the-tota...](https://stackoverflow.com/questions/5543326/what-is-the-total-length-of-pure-tcp-ack-sent-over-ethernet))
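
A minimal sketch of the bitmask idea, assuming each frame arrives together with the CRC32 its sender computed. The function names and the way the expected checksums are obtained are hypothetical, and nothing here is part of TCP.

```python
import zlib


def ack_bitmask(frames: list[bytes], expected_crcs: list[int]) -> int:
    """Build an integer mask where bit i is set when frame i matches
    the CRC32 that was sent alongside it."""
    mask = 0
    for i, (frame, crc) in enumerate(zip(frames, expected_crcs)):
        if zlib.crc32(frame) == crc:
            mask |= 1 << i
    return mask


def frames_to_resend(mask: int, count: int) -> list[int]:
    """List the indices whose bit is clear, i.e. frames that failed."""
    return [i for i in range(count) if not mask & (1 << i)]
```

With 32 frames per window the mask fits in 4 bytes (`mask.to_bytes(4, "big")`), so one small report can stand in for 32 separate acknowledgements.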

(My interest/focus is within the domain of good ideas lacking widespread
implementation/uptake. I've found that these always seem to be hiding in the
woodwork.)

~~~
tialaramex
Your TCP ACK approach is confusing. If you mean to check all the frames, but
only say if they're OK every 32 frames that only makes sense if you somehow
have a medium with high bandwidth AND high error rates. If your error rates
are low conventional TCP will just send occasional ACKs saying e.g. "Yup, got
the last 500 packets, keep it up".

But if your error rates are high and your bandwidth is high you should already
know how to fix this and get very slightly less bandwidth without errors, so
why is this suddenly TCP's problem and not your transmission medium?

~~~
namibj
Because I for one can't change the DOCSIS parameters my ISP set for the
segment I am on, and neither can I affect radio communications reliability
without impaired movement. While the error detection is rather weird, and a
simple erasure code might well work better, this is a good idea.
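
For anyone curious what a simple erasure code looks like, here is a sketch of the most basic one: a single XOR parity block over a group of equal-length blocks. It can rebuild exactly one lost block per group. The function names are hypothetical, and stronger schemes such as Reed-Solomon tolerate more losses.

```python
from typing import Optional


def xor_parity(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild at most one missing block (marked None) from the parity."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one erasure")
    present = [block for block in received if block is not None]
    repaired = list(received)
    if missing:
        # XOR of all surviving blocks and the parity yields the lost block.
        repaired[missing[0]] = xor_parity(present + [parity])
    return repaired
```

The sender transmits the data blocks plus `xor_parity(blocks)`; a receiver that knows which block went missing can rebuild it without asking for a resend.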

~~~
exikyut
Thanks for the suggestion. I've been meaning to take the time to stare at how
erasure coding works and keep staring until my eyes stop glazing over and I
get it :)

I replied to the parent comment, FWIW.

------
nayuki
Xiph.Org has a command line program called cdparanoia, which is built with the
same philosophy as Exact Audio Copy (EAC). The FAQ page explains a lot about
why the program is necessary.

[https://www.xiph.org/paranoia/](https://www.xiph.org/paranoia/)

[https://www.xiph.org/paranoia/faq.html](https://www.xiph.org/paranoia/faq.html)

~~~
AdmiralAsshat
Thanks for the name drop. I went Linux-only on my primary laptop, but I still
have a Windows 7 laptop with EAC installed that I dig out every time I buy a
new CD and need to do a rip. Maybe I'll finally be able to put it to rest,
soon.

~~~
KozmoNau7
I've found that EAC works really well in Wine, and it's still my favorite
ripper, even on Linux. Similarly, Foobar2000 in Wine is the best Replaygain
scanner/tagger.

------
phyzome
Whoa, this update is weird:

« EDIT: After further investigation, I no longer believe it’s a factory
defect. If I write the beginning or end of the affected track to a blank CD-R
and rip it, the rip fails with the same error! Give it a try yourself with
minimal.flac. »

[https://github.com/jmillikin/john-millikin.com/blob/master/%...](https://github.com/jmillikin/john-millikin.com/blob/master/%F0%9F%A4%94/why-i-ripped-the-same-cd-300-times/minimal.flac)

Uh... so what's going on there? Something that breaks the error correction
algorithm, or what? Can anyone with a burner repro this?

~~~
userbinator
_Uh... so what's going on there? Something that breaks the error correction
algorithm, or what?_

It is likely "weak sectors", the bane of copy protection decades ago and of
which plenty of detailed articles used to exist on the 'net, but now I can
find only a few:

[http://ixbtlabs.com/articles2/magia-chisel/index.html](http://ixbtlabs.com/articles2/magia-chisel/index.html)

[https://hydrogenaud.io/index.php/topic,50365.0.html](https://hydrogenaud.io/index.php/topic,50365.0.html)

[http://archive.li/rLugY](http://archive.li/rLugY)

~~~
phyzome
OK, wow! So basically CDs are not capable of recording arbitrary data, or at
least most burners will fail on certain inputs, and the song in question
causes trouble. Fascinating!

------
zkms
> It turns out you can’t fix noisy data by intersecting it with a different-
> but-related noise source.

No you actually can; it's called "soft combining":
[https://en.wikipedia.org/wiki/Hybrid_automatic_repeat_reques...](https://en.wikipedia.org/wiki/Hybrid_automatic_repeat_request#Hybrid_ARQ_with_soft_combining)
and it's a crucially important feature in high-performance air interfaces
(like LTE).
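
A toy sketch of the combining idea, in its simplest "Chase" form: add up per-bit soft values from several receptions of the same data, then decide. It assumes the receiver has a confidence value for each bit (positive leans towards 1, negative towards 0), which a ripper working from already-decoded bytes may not have. The names and the sample values are hypothetical.

```python
def hard_decide(reception: list[float]) -> list[int]:
    """Decide each bit from a single reception's soft values."""
    return [1 if value > 0 else 0 for value in reception]


def soft_combine(receptions: list[list[float]]) -> list[int]:
    """Sum the soft values for each bit across receptions, then decide."""
    totals = [sum(values) for values in zip(*receptions)]
    return [1 if total > 0 else 0 for total in totals]


# Two noisy receptions of the bits 1, 0, 1, 1 (made-up soft values).
first = [0.9, -0.7, -0.2, 0.8]   # third bit leans the wrong way, weakly
second = [0.6, 0.1, 0.7, 0.5]    # second bit leans the wrong way, weakly
print(hard_decide(first), hard_decide(second), soft_combine([first, second]))
```

Each reception alone gets one bit wrong, but the weak wrong guess in one is outvoted by the confident right guess in the other. The incremental-redundancy variant used with differently punctured retransmissions is more involved than this.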

~~~
valine
That’s kind of like how you can reduce noise in low light images by taking a
series of photos and averaging them together.

~~~
kqr
I really don't think that's the same thing. Taking multiple shots of the same
subject is just taking a shot with a really long shutter time, except in
slices, a little bit at a time.

What OP was talking about was something akin to using the noise in a photo of
one subject to reduce noise in another photo of a completely different subject
but with the same camera.

That works, to a point, but not great.

~~~
dylan604
In astro-photography, this is quite common. You take a series of normal
images. You then cover the lens, and take the exact same exposure. After a
series of processing, the noise pattern from the dark frames is subtracted
from the normal images. It works quite well.
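
Both ideas in this subthread, averaging repeated exposures and subtracting dark frames, fit in a short sketch. Images are flat lists of pixel values here, and the helper names are hypothetical; real stacking software also deals with alignment and other calibration steps.

```python
def average_frames(frames: list[list[float]]) -> list[float]:
    """Pixel-wise mean of several exposures of equal size."""
    return [sum(pixels) / len(frames) for pixels in zip(*frames)]


def subtract_dark(light: list[float], dark: list[float]) -> list[float]:
    """Remove the sensor's fixed pattern from an image, clamping at zero."""
    return [max(signal - offset, 0.0) for signal, offset in zip(light, dark)]


# Two light frames and two dark frames of a 2-pixel "sensor" (made-up values).
lights = [[10.0, 20.0], [12.0, 22.0]]
darks = [[1.0, 5.0], [1.0, 5.0]]
calibrated = subtract_dark(average_frames(lights), average_frames(darks))
```

Averaging reduces the random noise that differs between shots, while the dark frame captures the pattern that stays the same from shot to shot, which is why the two steps complement each other.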

------
jackvalentine
This kind of obsessiveness was used to create the best values of Oink's pink
palace - until that modern library of alexandria was burned down.

~~~
Puer
Followed by the loss of what.cd. It was truly one of the most impressive
catalogs of information on the internet.

~~~
Cthulhu_
I really hope there are a few people still out there that have the full
catalog of what.cd archived and either keep it alive by sharing it in the
darker recesses of the internet, or are able to secure it somewhere.

There really needs to be a media museum, an international library of all that
has ever been produced, from a hundred different recordings of the old
composers to whatever crap your 15 year old neighbour is uploading to
soundcloud.

~~~
Fnoord
If it's anything like the scene and DCPP used to be then there are archivists who
have the extremely rare content, including for example MP3.com rarities. A
full archive is very convenient, but if you're a collector who's after certain
rarities all it requires is a good social network (or networking skills) and
some bandwidth.

~~~
voltagex_
If you know where those MP3.com archivists hang out - a friend of mine is
after a particular disc.

------
userbinator
This is very similar to the technique used by
[https://en.wikipedia.org/wiki/SpinRite](https://en.wikipedia.org/wiki/SpinRite)

 _This didn’t seem to be an issue of wear or damage – the CD itself was
probably defective from the factory._

It would be very interesting to look at the surface of the disc under a
microscope, to see if you can find the defective area --- knowing which track
it's in, you can determine the approximate diameter at which it occurs, and
then rotate around that diameter looking for abnormalities. The pits and lands
are invisible to the naked eye but easily viewable with a light microscope:

[https://en.wikipedia.org/wiki/File:CD_Pits_at_6.25x_Magnific...](https://en.wikipedia.org/wiki/File:CD_Pits_at_6.25x_Magnification.jpg)

At 1200 bits/mm linear density of a CD, the 5KB (40Kbit) defective section
will correspond to ~33mm of track --- probably quite obvious if it is a
physical defect. (A "logical" defect, where the bits are OK but the ECC is
wrong, will not be apparent at the physical level.)
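
The estimate above can be reproduced in a few lines. The 1200 bits/mm figure is the comment's; the cross-check derives a similar density from the audio bit rate and an assumed scanning speed of 1.2 m/s, counting only audio payload bits and ignoring modulation and error-correction overhead. The function names are hypothetical.

```python
AUDIO_BITS_PER_SECOND = 44_100 * 16 * 2  # sample rate x bit depth x channels


def audio_bits_per_mm(linear_velocity_mm_s: float = 1200.0) -> float:
    """Audio payload bits laid down per millimetre of track."""
    return AUDIO_BITS_PER_SECOND / linear_velocity_mm_s


def track_length_mm(size_bytes: int, bits_per_mm: float) -> float:
    """Length of track occupied by a run of data at a given density."""
    return size_bytes * 8 / bits_per_mm


print(f"density: {audio_bits_per_mm():.0f} bits/mm")
print(f"5 KB defect: {track_length_mm(5000, 1200.0):.1f} mm")
```

The derived density comes out close to the quoted 1200 bits/mm, so a 5 KB run of bad data would span a few centimetres of track if it were laid down contiguously.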

~~~
jmillikin
I don't have the equipment for this, but it might not be needed. The disc has
writing that goes "through" the top cover, such that you can read the text by
holding the disc up to a normal lightbulb.

The text is across the entire diameter and I don't know why it would cause
problems for only this track. Possibly the shape of the text?

edit: photo [https://i.imgur.com/fdtAAPG.jpg](https://i.imgur.com/fdtAAPG.jpg)

~~~
userbinator
That could be an area with insufficient contrast, a similar problem that early
drives had when CD-Rs started appearing. In that case I would've tried a
mirror or black pad (about the only thing those audiophile "disc mats" are
good for...) on the top, to shift the offset level.

Note that CD uses 780nm infrared so what the human eye sees is not necessarily
what the drive sees --- this is why "transparent" or "black" CDs work.

------
jugg1es
Fry's Electronics should pay this man for the plug at the end. He made a good
case to go there if you need to conduct an experiment requiring an
unreasonable number of unique disk drives right now.

------
rrauenza
This reminds me of trying to rip Paul Simon's Concert in the Park some number
of years ago. Every time I ripped it, there was a minor glitch in The Obvious
Child at 2:02-2:03. I also heard it on the CD. So I tried different drives,
different machines, Linux, Windows, iTunes, Exact Audio Copy, etc., to no
avail.

Finally I reassessed my assumptions and found a copy of the track in the wild
-- hey! Same glitch. Then I emailed a friend who I knew also had a copy of the
album -- same glitch!

Eventually found an obscure forum where someone also complained about the same
glitch.

I always wonder how that made it out of the studio -- was it a glitch in the
recording equipment of the concert? Surely they used multiple recorders?

~~~
rrauenza
Found a VHS video --
[https://www.youtube.com/watch?v=obNpTfAga6M](https://www.youtube.com/watch?v=obNpTfAga6M)
\-- the artifact isn't there, so it must have been in post production?

------
fusiongyro
What a beautiful story of devotion! Imagine someone going to this much trouble
over your work, it's quite impressive.

I used to worry about this kind of thing, using cdparanoia and so forth on
Linux to manage my music. When I went Apple I eventually went iTunes, which is
a mixed blessing. I feel like it has made me able to appreciate more music
more easily by lowering the cost barrier, but it has reduced the amount I
appreciate an album substantially. And it's hugely inconvenient when they
disable my account for strange reasons or my internet connection vanishes. So
I may return to manual management someday.

~~~
andai
I recently found a backup of an old portable mp3 player and realized that the
manual curation and limited storage really made me appreciate my collection
more. Gave me a wonderful warm feeling going through that collection.

------
nottorp
Out of pure curiosity, what's the prison sentence in the US for ripping a CD
300 times? A few years multiplied by 300?

~~~
Mokou
Zero, because it isn’t illegal to rip CDs.

~~~
peatmoss
Fun times. Once upon a time I was working at a university in New Zealand. At
that time Apple had created a way to legally stream your iTunes catalog to
other LAN users—mostly as a way of enabling the college campus music sharing
experience without the illegalities of OG Napster, Limewire, and the ilk.

I got a nastygram from the IT department for sharing my iTunes catalog. I
replied that it was my understanding that the iTunes streaming was legal. They
concurred, but said that format-shifting my albums had been illegal. I
countered that I format shifted my CDs in the US, where it was legal.

In the end, the IT security people agreed that I was probably totally legal,
but they asked me to please refrain from leaving the iTunes sharing turned on
because it made their lives hard. That was a compelling argument, and so I
turned off iTunes sharing. :-)

EDIT: I should note that format shifting was later made legal in NZ last I
knew. Hopefully it still is.

~~~
pmarreck
What is "format shifting"?

------
pmoriarty
If you want to avoid having to do this you might want to look in to embedding
error correction data on to your CD/DVD (or even on to separate media) at burn
time, using dvdisaster:

[https://en.wikipedia.org/wiki/Dvdisaster](https://en.wikipedia.org/wiki/Dvdisaster)

------
geargrinder
"This sort of thing is common when buying used CDs, especially if they need to
transit a USPS international shipping center. "

As a used CD seller who often ships internationally with no complaints, I am
interested in learning more about what is behind this statement.

~~~
jmillikin
In my experience: as the box size gets bigger, the treatment gets rougher.
Single CDs in bubble wrap envelopes come through fine, as do small hardwall
boxes. CDs in medium boxes (10-20 CDs) often arrive with box damage and maybe
some torn internal packaging. I once used a shipping aggregator that put 50
CDs in one box, they were so banged up that I had to buy replacement jewel
cases.

If you ship CDs, I have only one request: please please do _not_ put bubble
wrap _inside_ the jewel case! Some sellers do this and it just destroys those
little plastic retainer teeth.

~~~
geargrinder
Thanks for the feedback. Bubble envelopes don't really protect CD jewel cases
very well, which is why I create my own packaging by wrapping the cases
individually in recycled cardboard. Simple and cheap. CDs are amazingly
resilient and with a proper polishing machine scratches can be completely
removed without changing the data on the disc. It is damage on the non-playing
side of CDs which is impossible to recover from.

------
anfilt
Touhou! I don't see that every day on hackernews.

~~~
UlisesAC4
You do not really see it lately.

~~~
inawarminister
The time of utter popularity has passed, but it is still plenty popular both
in the world and Japan specifically. There are a few dozen conventions and
Touhou music concerts/DJ-ing per year, the biggest of which is the Reitaisai
and its accompanying events. And a new spin-off game is coming out this
August!

------
jaclaz
As a side note and not necessarily related, one of the devices I keep using in
case I need to read a (data, not music) CD is a very, very old CD-ROM drive,
of the original type that uses a caddy cartridge, 1x in speed, with a SCSI
connection.

Till now it has never failed on discs that current, modern and newish devices had
issues with.

------
elitistphoenix
It kind of makes you want to hear this track he was so dedicated to getting
off the CD.

~~~
jmillikin
I don't know how much piracy discussion is permitted on HN, so I'll be brief:
this album is contained in the lossless music collection.

~~~
angersock
Touhou is, iirc, basically completely public domain. It's a gigantic, bizarre
diaspora fueled entirely by dedicated fans.

Everybody who thinks that copyright is required should see the success of 2hu.

~~~
gwern
> Touhou is, iirc, basically completely public domain.

Zun actually does have a semi-formal license/terms of use (it primarily bans
commercial use but also quite a bit of other stuff, much of which I think goes
ignored):
[http://touhou.wikia.com/wiki/Touhou_Wiki:Copyrights](http://touhou.wikia.com/wiki/Touhou_Wiki:Copyrights)
[https://www.animenewsnetwork.com/news/2011-02-14/tohou-proje...](https://www.animenewsnetwork.com/news/2011-02-14/tohou-project-creator-restricts-commercial-works-anime)

~~~
anilakar
AFAIK the whole doujin scene is all about "go on, violate our copyrights but
don't complain when people violate yours".

ZUN's blanket ban on commercial distribution makes it really difficult and
expensive to legally obtain anything when the only distribution channels are
specialized doujin stores -- it's still commercial, but the ones getting the
profits are various middlemen and not the arrangers and performers.

------
Havoc
>It turns out you can’t fix noisy data by intersecting it with a different-
but-related noise source.

This doesn't make sense to me. Why didn't this work?

------
tambourine_man
I have about 200 DVDs, most 10 years old. Probably 1/4 is unreadable by now:
MacOS does not recognise the disk when inserted (oddly enough, much older CDs
still are).

The data is not terribly valuable and probably already mostly stored
elsewhere, but I'd like to give it a try. Is there hope for ddrescue? I've had
mixed results in the past. Are there better alternatives?

~~~
topspin
If you take away one thing from the article it is this: try different readers.
Some readers perform far better than others. I have one 'magic' DVD reader
that will overcome damage that makes most other drives choke with errors, and
the result is usually a flawless rip.

~~~
voltagex_
AVSForum, Doom9, Redump and Hydrogenaudio are all good places to find out
about these drives, too.

------
voltagex_
I wonder if what.cd had this album before it was razed.

A lot of value was lost when that site went down.

~~~
crtasm
[https://interviewfor.red/](https://interviewfor.red/)

~~~
voltagex_
Oh jesus. These guys thought the what.cd interview was too easy, didn't they?

~~~
cuckcuckspruce
It's substantially the same interview.

------
dsnuh
Thrift stores are a cheaper and very reliable alternative to Fry's, if you are
ever looking for one. I've never not seen a CD/DVD player at a thrift store.

~~~
Bluecobra
Good point, I’m curious if older drives produce better results. I remember
some discussion 10-15 years ago saying that modern floppy drives were junk
compared to older drives. The argument was that the prices dropped so much
that build quality also suffered. I wonder if the same applies to CD/DVD-ROM
drives.

~~~
dsnuh
My experience is that the more off-brand the player, the more lenient it is
with what it will play. I had a crap Lite-On that would play anything.

------
euyyn
The jump between the 53rd and 54th rips is very suspicious! Didn't you look
more into it to try and explain it?

------
pmarreck
It has always struck me as odd that the powers that be have yet to figure out
how to harness the love (obvious since they toil to preserve these media works
at the best possible quality, albeit illegally) of data hoarders.

------
zuzzurro
I see that no one has yet mentioned cuetools and the cuetools database in this
thread. That seems the perfect tool for this kind of repairing (provided
someone else has ripped an accurate version in advance, of course).

------
vasili111
Take a look at: [http://ixbtlabs.com/articles2/magia-chisel/index.html](http://ixbtlabs.com/articles2/magia-chisel/index.html)

------
singingfish
doesn't abcde[1] do this?

[1][https://abcde.einval.com/wiki/](https://abcde.einval.com/wiki/)

------
arc2
Cool URL

------
dingo_bat
I wonder if the HDD the author used to store those bits is also having read
errors, which the drive is correcting on the fly.

~~~
jmillikin
It's very likely! There's a couple approaches one can take to detect this:

* The rip log generated by EAC includes a CRC of the audio data. I can calculate this independently to ensure the data is the same as what EAC wrote.

* The rip log itself has a checksum that can validate the CRCs are intact.

* I archive my rips in Google Cloud Storage along with a sha256. If my NAS's copy goes bad I can fetch the backup, and validate that it's the original data.
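
The first and third checks can be sketched with the standard library. Whether EAC's logged CRC covers exactly the same bytes as a plain CRC-32 of the PCM data is an assumption here, and the function names are hypothetical.

```python
import hashlib
import zlib


def crc32_hex(data: bytes) -> str:
    """CRC-32 as eight upper-case hex digits, the style used in rip logs."""
    return f"{zlib.crc32(data):08X}"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large rips need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recomputing either value after a copy or a restore and comparing it with the stored one shows whether the bytes changed in between; neither can repair anything on its own.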

~~~
chungy
Alternatives or additions:

* Store par2 parity information alongside your cloud storage and/or NAS: [https://github.com/Parchive/par2cmdline](https://github.com/Parchive/par2cmdline)

* Use a file system with strong data checks and repairability, basically meaning, ZFS with a sufficient redundant set up. "zpool scrub" can do wonders, and you can guarantee backups to a different pool are identical to the source.

~~~
Fnoord
In the scene they've used SFV (CRC32) for ages. It works very well and quickly
except it isn't useful against tampering. PAR2 provides additional parity, and
it is also useful for Usenet. You can also use RAID. For example, Synology NAS
units easily allow RAID1 plus btrfs.

------
skookumchuck
What I'd do is the same thing I do for scratchy records. Look for the scratch
in a waveform editor. Interpolate the data across the scratch using cubic
splines. You can also paste in data from the other channel if it is similar
enough.
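
A minimal sketch of the single-sample case. Rather than a true piecewise cubic spline, it fits one cubic polynomial through the two good samples on each side of the click and evaluates it at the bad position; for evenly spaced samples that reduces to fixed weights. The function name is hypothetical.

```python
def repair_click(samples: list[float], bad_index: int) -> list[float]:
    """Return a copy of samples with one bad value replaced by the cubic
    through its two neighbours on each side, evaluated at bad_index."""
    if bad_index < 2 or bad_index > len(samples) - 3:
        raise ValueError("need two good samples on each side")
    # Lagrange weights for points at offsets -2, -1, +1, +2, evaluated at 0.
    estimate = (
        -samples[bad_index - 2]
        + 4 * samples[bad_index - 1]
        + 4 * samples[bad_index + 1]
        - samples[bad_index + 2]
    ) / 6
    repaired = list(samples)
    repaired[bad_index] = estimate
    return repaired
```

This yields a plausible value rather than the original one, so it suits restoration better than verification. Longer scratches would need more context and a proper spline or a dedicated declicker.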

~~~
ygra
I thought the point was to get the _actual_ data back (not that you could with
records).

~~~
skookumchuck
True, but he was using statistical techniques, which are not guaranteed to
produce the actual data.

My technique is akin to what an artist would do in "restoring" a damaged
photograph. Often the glitch is only one data value being off, and it does
produce an audible click.

------
rocky1138
Not including a sample of the audio he/she ripped in an article such as this
should be considered a type of crime.

