This is such a huge problem for retro gamers. Any games on optical media, even those that aren't particularly old, are at risk. As the article mentions, it's doubly infuriating given how optical media was marketed as more durable than cartridges. Meanwhile, a bunch of my Dreamcast and PS2 games have disc rot, while all my carts are still 100% playable.
Besides disc rot, optical drives are also at high risk of the laser dying - this has happened to a PS1, PS2 and Dreamcast of mine.
For these reasons, the most serious retro gamers I know are modding their optical drive systems to replace the CD drive with an SD-card-based optical drive emulator.
Anyway, thank heavens for emulation and the homebrew fan community. Without them, so much of video game history would be lost.
I have not come across a Dreamcast game with rot. I have some early games which were just pressed badly and the top layer is completely peeling off, but that was not rot.
Saturn discs, on the other hand: I have come across more than a few with rot.
The loss of what.cd didn't hurt because it made piracy harder; it hurt because the users had created one of the best fully verified lossless CD audio archives on the internet.
Much of this content has popped up (or even been extended) on replacement trackers, but it's only a matter of time before the heroic operators of these trackers get forced to shut down too.
I remember the early days of piracy: Napster, Kazaa, eDonkey. People were simply sharing all their files. There was pride in having a hard-to-obtain item, such as TV shows from the 60s, movies never released on DVD, etc...
P2P could have made archivists' jobs so much easier. It's a technology that should have been the next iteration of the internet, and we killed it for no good reason. Not even greed, just a refusal to evolve a profitable but obsolete system.
I used to agree totally, but now I have a more nuanced view. While artificial scarcity offends my hacker sensibility, I also have to admit that this creates a society I like, with talented wealthy celebrities, and a large population of people who make their living in and around the industry. I'm not sure I would like a society where creative acts only happen in home studios and the work is given away for free, and all artists have to have day jobs.
The only thing that gives me pause is that we transitioned to a system where individual music tracks are nearly free-on-demand* a decade ago and the sky never fell.
There were transition costs, and artists get way more from live performances now, but there's still an industry.
More than anything I'd just like to experiment with shorter copyright terms. I don't think the founders were crazy when they set them at around a generation. I agree that it might be risky to blow everything up, but I think there's a lot of bargaining we could do over term lengths that would be pretty harmless.
* To a first order approximation. Sure, many artists technically can get something when I pull up a track on YouTube. But if I'm not playing a whole playlist, I might not even see any ads.
I think if copyright were 20, heck even 40 years, this wouldn't be such a problem, because that would mean everything earlier than 1980 would probably be available online for free. Long copyrights just prop up companies and their investors, not people.
> I also have to admit that this creates a society I like, with talented wealthy celebrities, and a large population of people who make their living in and around the industry.
What should we value more: hacker principles or phony outdated business models that aren't compatible with the realities of the digital age?
Copyright was created back when industrial capacity was required in order to make copies at scale. People had to own stuff like printing presses in order to infringe copyright at a scale that caused any damage. This is no longer the case. Copyright infringement is trivial now. It's as easy as copy paste. To solve this, the copyright industry intends to lock us out of computing. They think unlimited computers are too powerful for mere citizens like us and we should have access to nothing but limited versions that can execute nothing but predefined functions approved by them.
The mere existence of copyright is an existential threat to hackers and their values. If the preservation of free computing and internet requires sacrificing the entertainment industry, then so be it. The alternative is to sacrifice free computing.
Due to the existence of subversive technologies such as encryption, governments have been increasingly adopting the same perspective as the copyright industry.
I never said the new model was "give away everything for free". I just said that "consider that every copy needs to be paid for even though making a copy is basically free of cost, and that this forces us to put a terrible censorship mechanism into place" was not the right approach.
Now that there are crowdfunding campaigns and tons of tools for microtransactions, that is even less excusable than when you still had to make the effort to imagine these obvious solutions appearing.
I feel like we're screwed until we get over the "ownership" model.
It means we spend more effort on metadata (What are we allowed to do with it? Who owns it?) than on the actual data.
It has stood directly in the way of new forms of creativity (sampling in music, for example) and locked away content where the metadata is faulty/lost (the orphan works problem).
It falls apart badly at scale (if you wanted to get distribution rights for everything on an OSX/Windows installation image, imagine the number of third parties, plus the original developers, you'd need to get to sign off).
And then it only really incentivizes work that's commercially appealing, not necessarily the technically best.
I dream of a day where people who want to create will be able to get a stipend-- it might cover a modest lifestyle, but ensures they can live the dream of being an artist/musician/programmer without having to clutch tight over every scrap of their output to keep their income streams alive. If we did it through a tax-funded endowment rather than market forces, we could probably finance a lot more creativity that way (think how many touring singers we could finance at 50k per year for the price of one Taylor Swift, or how many programmers we could hire to work on their passion projects in database theory for what is currently being spent on Oracle licenses.)
Hardly. Top acts make 10s of millions per month today. If people didn't have to spend money on the music itself they would probably spend even more for tickets and merchandise.
In addition to what the other commenter said about crowdfunding, celebrities now get most of their money from sponsorship deals, not their actual creative activity. Yes, you need to be a real superstar to get these deals, but the world of mainstream, "celebrity" creators is driven by superstar dynamics.
While early-millennium filesharing was useful for finding obscure content, the stuff available wasn’t up to the standards of professional archivists. In the library world, there are strict standards for how information is to be ripped, encoded, and tagged, but the content on filesharing networks seldom lived up to solid archival standards. It wasn’t until Oink and What.cd that we got a rigorous archive of audio material.
For me it had more to do with the bootleg sharing programs and the inability to download anything without 1500 different versions of malware also making it onto your computer.
For some reason the torrent community doesn’t have quite the same problem.
Torrent is still very centralized. The average user just downloads and seeds torrents made by a few torrent curators on each site. Becoming an uploader on a torrent site is a big privilege/responsibility that few people can take on.
Other older platforms like Napster or DC++ (which actually still exists) let people share their own collection at the click of a button. This gave access to a really really long tail of rarities that perhaps only a handful of people were interested in, while torrent sites are mostly mainstream content.
What do you mean by that? You can upload your torrents to any of the big public torrent sites, and if it's a private site, just being a member already allows you to upload. Public trackers also all use DHT, which makes this not centralized.
On private sites I know (in Hungary), being an uploader is a privilege one must earn, by being around for some time, earning the trust of the admins, having good uplink, learning all the rules and formats so all torrents are consistent etc. I guess perhaps about 0.1% of members upload stuff to these trackers. The situation may be different elsewhere.
Context: torrenting is hugely popular in Hungary; basically everyone under ~40 who uses the internet and is even remotely technical is a member, which means more than a million people (with the single most popular site having over 700k members, with account sharing explicitly allowed). Perhaps not so much for the newest teenager generation (I don't know). Until very recently (and still now) streaming services have been limited content-wise, and people just cannot afford to buy all the movies, songs, games etc. with the same ease as in Western Europe / North America.
I think two things happened which effectively destroyed p2p - for the music-sharing case, for me specifically, at least. Streaming and greater official release of bootlegs, demos, etc. For me, there is so much available through these channels, that I don’t even consider p2p as an option.
People have learned from what.cd. Lots of people have the same sense of inevitable doom you do, so there are more people who have put thought and effort into being capable of rebuilding quickly if that happens again. Just varroa musica alone has created a giant decentralized metadata archive (and that's not even its main purpose, just a neat side effect) that would make the restarting process much easier.
The downside is that the post-What community is so much smaller. Spotify et al. really put a huge damper on the music tracker community, and that makes it less resilient.
> The downside is that the post-What community is so much smaller.
People were saying the same thing in 2007 when OiNK shut down and What was starting up. The community is not smaller; it's more fragmented, and that's a good thing. Centralization is no better in our societal constructs than it is in our technological ones. I'll trade some immediacy and convenience for longevity any day.
I disagree; the community is both more fragmented and smaller. What had 144k users on its last day vs. the 35k of its spiritual successor today. The few music trackers that have sprung up post-what.cd, after said spiritual successor, are all pretty small. The niche genre trackers all have about the same user numbers since then, so they cancel out. Then factor in that waffles has been down for some time now, and the difference is pretty big.
I can only estimate because exact numbers are hard to find, but I'd say the peak-What-era music tracker community was in the 350k total user region, and we're probably somewhere around 100k now.
Does Spotify really cut that deep into the demographic of hardcore music tracker users? I'd presume those people are chasing rare and lost releases in high quality and Spotify and other streaming services aren't exactly touting rarities. Preserving underground music will always be a job for the fans, not streaming services.
Yes. Music trackers have always been a ton of effort, but in their prime trackers had the advantage of having basically a monopoly on music discovery. If you wanted to find out about citypop in 2011, what.cd was the place; now YouTube will recommend it to you after a Joe Rogan video.
People didn't go to what.cd and jump through all those hoops to find rare music, they went to find good music, and now Spotify and YouTube do a good enough job of that with a lot less effort.
As a funny example of that, I discovered Billie Eilish through a private tracker when Ocean Eyes first came out, and now she's about as big as anyone. So it isn't just about discovering rare Czech folk singers, but any music that you might like.
I think the hardcore demographic you mentioned is spot on and does exist, but they were always a minority and the evaporation of the less hardcore users explains why the scene is so much smaller now.
> Does Spotify really cut that deep into the demographic of hardcore music tracker users?
It does because private trackers use buffer as currency. The music trackers have had to adapt to use points systems now to encourage downloading activity. But even still, activity is way down from the what.cd days. It's just easier to stream the easily accessible stuff.
Also, if you're not an uploader, building ratio to use the site is hard. Going from ten years of what.cd buffer built up through freeleech to nothing, and facing the prospect of building it up again, can be daunting.
What.cd most certainly wasn't the best. They were really stuck up and got what they deserved. I was amazed at the process just to get access to what they didn't even own. Waffles and oink were so much better.
> I was amazed at the process just to get access to what they didn't even own.
it was that process that helped to curate one of the highest quality audio libraries to have ever existed.
it's not all that important for most folks to listen to and compare 20 different releases of Bowie's 'Rise and Fall of Ziggy Stardust and the Spiders from Mars', but to those of us that enjoy music cataloguing as a hobby, the loss of what.cd was pretty terrible.
By what metric were Waffles and Oink better? What.cd was run incredibly professionally, as you can see by the fact that it was active for almost 9 years and nothing happened to any of the people involved.
Then their criticism doesn't make much sense as both Waffles and Oink had an invite system. At least What.cd was more approachable as there was an interview system which made it possible for everyone to join as long as they learned some basics about audio formats / transcoding / site rules and had an hour to spend on an interview.
I'm working on something that should be the best place to find any media... unfortunately the VidAngel vs Disney lawsuit has made things a little tougher. Fair use took a big blow there.
Theoretically, optical media should be easier to recover than magnetic, because the technology to view their surfaces in detail is readily available --- a light microscope is enough to humanly view the actual recorded data on a CD and even a DVD:
It would be interesting to see what "disc rot" looks like at that level of detail; and if better reading hardware may be able to read it. In particular, pressed discs actually store data in the varying height of the polycarbonate disc, so as long as those pits and lands haven't been damaged, even if the metallisation is gone, the data remains intact; and if there was a way to strip and re-coat that layer, pressed discs could be recovered. For CD-Rs, perhaps the dye has faded to the point where the typical drive (which can't really spin the disc slower than 1x) won't be able to discern the pits from the lands, but if that data can still be seen somehow, it's recoverable.
I am the author of these photos... If you have a sample of a rotted disc, you can send it to me and I'll take a look. Contacts are on the site.
Now my hardware is better; CD and DVD are easy to observe without physical damage. I can also see data on a BD disc, but it is at the border of what is possible.
I also tried to decode data from the image, but it will require more collaboration.
For archivists there are now writable M-Disc Blu-ray discs which claim a longevity of 1000 years. It might be an exaggeration, but they do use a different technology to record the information: a non-organic layer which is chemically much less active and will inevitably decay more slowly than a regular CD/DVD. There have even been some accelerated aging tests that were able to somewhat confirm that. If you're into archiving you might want to look that up.
> For archivists there are now writable M-Disc Blu-ray discs which claim a longevity of 1000 years.
The ISO cert states a 1000-year mean lifetime (a coin-flip chance) but only 530 years at 95% confidence. And that's at a constant 22C and 50% humidity.
And that's solely for the original 4.7GB DVD, not blu-rays.
> There have even been some accelerated aging tests that were able to somewhat confirm that.
OTOH others (the French National Laboratory of Metrology and Testing) did not and found m-discs to be no better (and possibly worse) than other high-quality discs with an accelerated aging setup of 90C 85% humidity.
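For anyone curious how those accelerated-aging chamber numbers get turned into century-scale lifetime claims in the first place, here's a minimal sketch of the Arrhenius-style extrapolation that is typically behind them. The activation energy and chamber time are made-up illustration values, not figures from the ISO cert or the French lab's test, and real studies add a humidity (Peck) term that this ignores.

    import math

    K_B = 8.617e-5  # Boltzmann constant in eV/K

    def arrhenius_acceleration(t_use_c, t_stress_c, ea_ev):
        # Acceleration factor between a stress temperature and a use temperature:
        # AF = exp((Ea / k) * (1/T_use - 1/T_stress)), temperatures in kelvin.
        t_use = t_use_c + 273.15
        t_stress = t_stress_c + 273.15
        return math.exp((ea_ev / K_B) * (1.0 / t_use - 1.0 / t_stress))

    # Assumed values for illustration only: Ea = 0.8 eV is a generic ballpark
    # for recording-layer degradation, not a number from either study.
    af = arrhenius_acceleration(t_use_c=22.0, t_stress_c=90.0, ea_ev=0.8)
    hours_survived_in_chamber = 2000  # hypothetical chamber result
    years_at_use = hours_survived_in_chamber * af / (24 * 365)
    print(f"acceleration factor: {af:,.0f}x")
    print(f"extrapolated lifetime at 22C: {years_at_use:,.0f} years")

The point is that the extrapolation is exponentially sensitive to the assumed activation energy, which is part of why different labs can reach such different conclusions about the same media.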
After looking that up I've decided it's far cheaper to upload my data to S3 with maximum data center redundancy, fund my AWS account with a big enough balance to be able to pay for a decade of service, and give my keys to people I trust.
You should not put all your eggs into one basket. Amazon might ban your account any day and you'll end up with no data. Or your data might be lost because of some bug. There's Google Coldline, there's online.net C13; they have similar prices. Use at least 2 different providers.
Backblaze B2 as well, they advertise B2 at 0.5c/GB/mo, coldline is advertised at 0.4 (though the official docs quote 0.7c[0]), Glacier is 0.4 to 0.5 depending on the zone.
Azure apparently has an "archive" tier for 0.1c, which you can lower even further with long-term high-volume reservation: if you reserve 1PB for 3 years it's 0.081c/GB/mo (using only local redundancy) in the cheapest DCs where it's available (some DCs either don't have that feature at all, or it's way more expensive e.g. 0.1636 in asian DCs where it's available).
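To make those per-GB-month figures easier to compare, here's a quick back-of-the-envelope conversion to dollars per TB per year, using only the advertised prices quoted above (real bills add retrieval, early-deletion and request fees, which matter a lot for archive tiers):

    # Advertised cold-storage prices from the comments above, in US cents per GB per month.
    prices_cents_per_gb_month = {
        "Backblaze B2": 0.5,
        "GCP Coldline (advertised)": 0.4,
        "GCP Coldline (per docs)": 0.7,
        "AWS Glacier": 0.45,  # midpoint of the 0.4-0.5 range, varies by region
        "Azure Archive": 0.1,
        "Azure Archive (1PB / 3yr reserved, LRS)": 0.081,
    }

    for name, cents in prices_cents_per_gb_month.items():
        dollars_per_tb_year = cents / 100 * 1000 * 12  # 1000 GB per TB, 12 months
        print(f"{name:40s} ${dollars_per_tb_year:7.2f} / TB / year")

So a 1TB archive runs roughly $60/year on B2 versus about $12/year on Azure Archive at list price, before any retrieval costs.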
This is going to be one of the reasons why CD collecting (whether it's games, music or otherwise) will likely stall. Even games which have been kept in perfect condition can suffer from rot as a result of poor quality control.
Cartridges are generally more hardened unless they contain some battery-based saving system and even then a dead battery can be replaced.
I considered all my old CDs dead (music and PC games - RIP Phantasmagoria, plus MP3 collections) over a decade ago.
The majority of them developed white spots or detachment of the disc layer. This has been known since the early 2000s, I’m surprised anyone would collect them without being aware of the risks.
They claim to, but it's hard to determine that, and whether other archival-targeted discs (of which there are several, e.g. Northern Star DataTresorDisc, MPO Gold, Verbatim Archival) are also resistant to disc rot, because it's unclear whether accelerated aging triggers disc rot in general or only some very specific instances thereof.
Syylex's (apparently defunct) GMD would most likely be structurally immune to disc rot, but you had / have to really, really want to make your data rot-proof: a single-disc (4.7GB) order was 160€ (>$200 at the time), not including VAT/GST.
I have the original install disks for BoeingCalc, the precursor to the more famous VisiCalc (Excel), but they're unreadable using consumer-grade tech. Probably the only public copy of the software that I'm aware of.
What's interesting about Boeing Calc is that it allowed storage of its documents "online". But it was surely only for a specific local setup, so it was probably "on the local server."
Ah, I think what I was remembering was that you could have multiple worksheets/"tabs" and run calcs between them, like modern excel. I don't think visicalc supported that level of complexity until later.
What prevents, or tries to prevent, bit rot in standard PC hard drives/flash drives? Is there some built-in forward error correction at some layer? If not, how could one add some?
Most SSDs only guarantee your data for 90 days when off. Even high grade, SLC, professional SSDs. Hard drives fare better, but your best bet for decade-long conservation remains tape.
That 90 days figure applies only to SSDs that have worn out their entire rated write endurance, and only to enterprise/datacenter SSDs. For client/consumer SSDs, the standard for unpowered data retention at the end of life is one year (at 30°C, compared to 40°C for enterprise drives). That longer retention requirement is one contributing factor to consumer SSDs being rated with lower write endurance.
In practice, hardly anyone uses up the entire write endurance of their SSD, especially not if they're using it to store backups, archives, or other relatively static data. Flash memory that isn't worn out has much longer retention times.
That's where, hopefully, Microsoft's Project Silica will come in and make digital data storage that is permanent for all practical purposes available.
It apparently uses a similar basic process as those mall kiosks that take a volumetric picture of your face and then etch it into a cube of glass?
But in this case, they just write the data to the side of a small piece of glass over the course of days and it's basically permanent for all eternity unless something breaks it.
Yes, all modern disks use error correction. Wrong bit reads happen frequently in HDDs and are corrected all the time. Bits can't rot in an HDD or SSD outside of firmware bugs or transmission errors; you'll get a sector read error instead of wrong data.
> Bits can't rot in an HDD or SSD outside of firmware bugs or transmission errors
The entire photography community begs to differ.
Gradual degradation of stored image data across HDDs of all vendors is very well documented, mainly because the bit rot is very easy to spot when working with visual data.
Bit flips are persistent and they occur on the media itself, not in transmission. There are plenty of theories, including one that argues that bits are flipped by stray neutrinos, but nobody knows for sure.
> The entire photography community begs to differ.
Also ZFS. Part of its purpose and design goals was countering bit rot (as well as device hostility to keeping data safe), as Sun had customers affected (even with ECC). Hence the end-to-end checksumming amongst other features.
scrub exists pretty much just for bit rot: you run scrub regularly, it goes over the disk, checks that every block checksums properly, and if they don't [0], it repairs the data using the non-corrupted copy (assuming you have one).
[0] And your dataset is replicated; note that even if you don't use a RAID configuration, you can set copies=2 on an important dataset for this purpose (this is not equivalent to device redundancy, it's a feature which exists solely for bit rot / corruption protection).
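For anyone not on ZFS, a crude userspace stand-in for a scrub is just a checksum manifest that you build once and re-verify periodically. A minimal sketch (the manifest file name and layout are arbitrary choices for the example, and unlike ZFS this only detects rot; it can't repair anything without a second copy):

    import hashlib, json, os, sys

    def sha256(path, bufsize=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while chunk := f.read(bufsize):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest(root):
        # Record a checksum for every file under root.
        return {
            os.path.relpath(os.path.join(d, name), root): sha256(os.path.join(d, name))
            for d, _, names in os.walk(root)
            for name in names
        }

    def scrub(root, manifest):
        # Re-hash everything and report files whose contents changed or vanished.
        for relpath, expected in manifest.items():
            full = os.path.join(root, relpath)
            if not os.path.exists(full):
                print("MISSING", relpath)
            elif sha256(full) != expected:
                print("CORRUPT", relpath)

    if __name__ == "__main__":
        # usage: scrub.py init <dir>  |  scrub.py check <dir>
        mode, root = sys.argv[1], sys.argv[2]
        manifest_path = os.path.join(root, ".manifest.json")
        if mode == "init":
            with open(manifest_path, "w") as f:
                json.dump(build_manifest(root), f, indent=2)
        else:
            with open(manifest_path) as f:
                scrub(root, json.load(f))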
Ars Technica did a really good article [1] on bit rot. Although it's angled towards filesystems, it also provides good discussion on bit rot in general.
Can you link to any source giving details on this? I don't know of any mechanism that can protect individual bits from flipping without dedicating a serious amount of space to redundancy. And it doesn't align with my personal experience either: I had a bit flip in a Jpeg image this year. Fortunately I was able to restore it from backup.
It's pretty common knowledge. You don't need a serious amount of space, for example ECC RAM uses one parity bit for 8 bits to fix one bit flip or report 2 bit flips. Also disks use checksums which require even less storage.
Your situation likely happened because of bad RAM.
> Your situation likely happened because of bad RAM.
I don't see how that could have happened, since the image wasn't ever written to. It should have been the same file on disk as the backup, since I hadn't touched it since backing up the file, but it wasn't.
> You don't need a serious amount of space, for example ECC RAM uses one parity bit for 8 bits to fix one bit flip or report 2 bit flips
The thing about parity is that for it to be useful in a data recovery scheme, you have to know what bit got flipped. That works for RAID of course, because you usually know which disk is bad and so when you replace it you can work out the missing bits using the parity disk, but I don't think it could work for an individual hard drive, since in general you don't know which bit flipped.
And the thing about checksums is that they can tell you if your data has become corrupt, but they can't be used to fix things behind the scenes.
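To make the distinction concrete, here's a toy Hamming(7,4) sketch in Python: three check bits per four data bits are enough not just to notice a single flip but to locate it, which is exactly what a bare checksum or single parity bit can't do. (This only illustrates the principle; it says nothing about how any particular drive's firmware actually lays out its ECC.)

    def hamming74_encode(d1, d2, d3, d4):
        # Bit positions 1..7; positions 1, 2 and 4 hold parity, the rest hold data.
        p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
        p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
        p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
        return [p1, p2, d1, p3, d2, d3, d4]

    def hamming74_correct(c):
        # Recompute the three parity checks; together they spell out the position
        # of a single flipped bit (0 means the codeword is clean).
        c = list(c)
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3
        if syndrome:
            c[syndrome - 1] ^= 1
        return [c[2], c[4], c[5], c[6]]  # the four data bits

    codeword = hamming74_encode(1, 0, 1, 1)
    codeword[5] ^= 1  # simulate one bit rotting on the media
    assert hamming74_correct(codeword) == [1, 0, 1, 1]
    print("single flipped bit located and repaired")

Real drive ECC uses much stronger codes over whole sectors, but the idea is the same: the redundancy is structured so the error's location falls out of the check, instead of just a yes/no "something changed".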
Turbo codes don't add a lot of overhead if you just want to protect against 1 bit flip per byte or so via FEC (forward error correction); they're close to the Shannon information entropy limit in efficiency. But I don't know of any open source implementations, and implementation details such as "I want to protect against at most 2 bits per byte" or "I want to protect at most 5 bytes per kilobyte" directly matter in the implementation.
I've been wanting to try to understand them to see if I could implement something myself for a while now
Turbo codes are typically soft decision codes, which don't make a ton of sense at the filesystem level, since there has already been a hard decision made. They are useful in storage at the read channel level, as in processing the analog stream from the head on the hard drive.
Reed-Solomon is often used in storage because it is an optimal erasure code - e.g., I know this block is missing or corrupt, correct it.
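The key property of an erasure code is that you already know which block is gone (a failed checksum or a dead disk told you), so even plain XOR parity, RAID-5 style, can rebuild a single missing block. Reed-Solomon generalizes this to multiple simultaneous losses; the sketch below is just the single-erasure XOR case, not RS itself:

    from functools import reduce

    def xor_blocks(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    # Four equal-sized data blocks plus one parity block (the RAID-5 idea).
    data_blocks = [b"hell", b"o wo", b"rld!", b"\x00\x00\x00\x00"]
    parity = reduce(xor_blocks, data_blocks)

    # An erasure: we *know* block 2 is unreadable, we just can't see its contents.
    lost = 2
    survivors = [blk for i, blk in enumerate(data_blocks) if i != lost]

    # XOR of all surviving data blocks plus the parity block is the lost block.
    recovered = reduce(xor_blocks, survivors + [parity])
    assert recovered == data_blocks[lost]
    print("recovered:", recovered)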
> Reed-Solomon is often used in storage because it is an optimal erasure code
I was initially thinking OP might be talking about something like this, but I think I would have heard if all new hard drives had it built into their firmware to store error correcting codes alongside the real data and automatically fix bit flips, since I suspect that would have pretty serious performance impacts in certain cases and people would be complaining.
This is not correct. All forms of flash memory lose data (charge in the cells) over time. Even glass-window EPROMs are only rated at tens of years. Every new generation of SSDs uses smaller cells that lose charge faster. Wear and tear accelerates this. A Samsung 840 with 300TB of wear will lose data when unplugged for a week. https://techreport.com/review/25681/the-ssd-endurance-experi...
This is not true. I have encountered, several times on systems with ECC memory, cases where MD5 values for large archives change on magnetic disks. No read errors or SMART errors. I have an HDD (1TB Western Digital Green) in my desk drawer that does this after a couple weeks of cold storage.
Curious - what are your complaints on these programs? My main one is that they are slow to create archives and rsbep only does error decoding, not erasure. I'm working on an alternative tool that is much faster, but curious if you have any other feedback.
I was tasked with transferring a large collection of CD-R media to hard disk. About 10% was unreadable or corrupted. I have not had a shelved HD go bad yet. I had an Amiga 3000 from '92 boot after 25 years of inactivity. Also booted up a classic 8088 that was stored in a basement for about 30 years. No problems with either. Looks like a shelved HD is a cheap and reliable archival option. On a long enough timeline I would expect all CD-R media to be unreadable.
Afaik pressed-disc rot is reversible: the data is stored in the plastic part (as the name suggests, it's pressed in by the die), so repair would involve sputtering a new reflective surface. The plastic itself will probably hold for tens of thousands of years. Writable discs are another matter altogether.
Makes me wonder if there is some probability-of-error vs. information-density tradeoff, irrespective of media. I assume eventually you'll hit some quantum limit where so much of the data is error correction that it's not worthwhile. Maybe instead of going denser, it just needs to go larger.
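There's actually a clean information-theoretic floor for that tradeoff, well before any quantum limits: over a medium that flips each bit independently with probability p, the fraction of stored bits that must be redundancy is at least the binary entropy H(p), no matter how clever the code. A quick sketch of the numbers:

    import math

    def binary_entropy(p):
        # H(p) in bits: the Shannon floor on redundancy per stored bit.
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    for p in (1e-6, 1e-4, 1e-2, 0.1):
        print(f"raw bit error rate {p:g}: at least {binary_entropy(p) * 100:.3f}% overhead")

So at a raw error rate of 1 in 100 the ideal code only needs about 8% of the capacity for correction, but by 1 in 10 you're approaching half the medium; density keeps paying off only until the raw error rate of the denser cells climbs steeply.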
I've got a pretty significant collection of PlayStation 1 and 2 games, and every time I see an article on disc rot I look to see when I need to be worried.
Has anyone seen a list of when disc-based systems might start showing signs? Something like game consoles and capacitors.
Not exactly what you are asking, but I recently sold off my physical CD collection of about 500, after they'd been in my closet for the past 20 years with no temperature/climate extremes. A few of them had significant value on eBay, but when I tried to use Exact Audio Copy to verify they could still be read, about half no longer matched the online checksums or the original CD rip I had done before putting them in storage. There were three from Denon Japan that had physically deteriorated inside the disc; I could see there was a problem from a visual inspection.
Aha! This must be why them audiophiles prefer to listen to their music using $$$ cd transport from original media! You can hear the difference!
On a serious note, I'm in a similar boat: we're in the middle of a move and it appears that my CD collection is in the same predicament. So I'm debating whether to throw them in the trash, or keep them around (mostly for the cover artwork, and to keep my Exact Audio Copy -> FLAC transfers just a bit more legit (I care about such things)).
I dislike CD. My equipment is capable of 192kHz/32-bit (some only 24-bit though), so CDs are pretty much trash at 44.1kHz. Better than MP3, solely due to the info MP3 throws away, but not great.
Tidal goes higher, to 96kHz on Master quality where available (the vast majority of tracks are not Master quality, sadly).
I have a whole bunch of CDs and DVDs (music and data), many way older than 10 years. Some I burned myself and some were bought. Nearly all are doing nice and dandy.
More than that, I have an old box of 3" floppies dated 92-95. I recently decided to check some old code, bought a USB floppy drive, and I was actually able to read them. Well, not all of them, but still...
Who says that I did not? All that really matters was backed up years ago in a few places. But I still have the originals and was just curious. There is something about fishing out an old CD and putting it into the player.
As for those floppies, I was just curious. I couldn't have cared less if I did not find that old piece of code I mentioned. I was just curious to look at it and see how my way of thinking has changed.