
GitHub Archive Program - tosh
https://archiveprogram.github.com/
======
CameronBanga
10,000 years from now, society has been destroyed. And people come across a
glass plate with laser etchings. How bummed will they be when they go through
all of the effort of decoding it, only to learn it's like some python library
for managing drivers on a 2014 Dell laptop running linux?

~~~
sebazzz
Only to find a package.json file without a package-lock.json file checked
in, making it impossible to restore the node_modules directory.

~~~
fiddlerwoaroof
Don't worry, npm and cockroaches will be the only things to survive a nuclear
war.

~~~
JimmyRuska
Pretty sure the universe is running on some advanced version of Erlang. The
Erlang VM would survive until the heat death of the universe; when it finally
crashes, the heartbeat program will kick in and we'll have a new big bang.

~~~
Jd
Obligatory XKCD citing: [https://xkcd.com/224/](https://xkcd.com/224/)

------
soheil
GitHub has blocked access to my account, which has tens of popular projects,
because one day they randomly sent me an email with a link to enable 2FA. I
was coerced into enabling it. A while later I lost access to the phone where
the 2FA app was installed. Since I was pushed to enable 2FA in a rush, I had
no backup codes, and now I'm completely locked out. I contacted support no
fewer than 15 times, and each time they said I need to create a new account
since I did not link my cell phone and have no backup codes. I had that
account for over a decade, and now I can neither control the projects I was
working on nor access any of my private repos. I have been communicating with
them using the email I have on my account, but this is not sufficient for
them to restore access to my account.

~~~
beart
I'm sorry you lost your account (really, that sucks) but I don't understand
framing this like Github is doing something wrong. It isn't like they banned
you for spamming emojis.

You enabled 2fa, you lost your 2fa, and you did not have any recovery codes.
Now you are asking for them to bypass the 2fa, and they are refusing.

Again, that sucks, but when I compare this to what the cell phone companies
are doing with sim swapping, it increases my respect for Github.

~~~
beshrkayali
There has to be a mechanism where you can get your account back if you provide
some form of identification. I know that you get into other issues that way,
but 2fa without a mechanism to restore account in case of phone/code loss (or
inability to access, which is somewhat likely if you don't keep multiple
copies of your codes) is pretty stupid.

~~~
xondono
The one I've had to use (and find reasonable) is to have "trusted devices".
That's how it works, at least with 1Password:

- You log in with your 2FA on a device.
- You set this device as trusted.
- In case you lose your phone or 2FA device, log onto the trusted device and
  disable 2FA.
- Then set up 2FA again with a new phone or device.

It's not perfect, but it's workable. For instance, while I won't enable
trusted device on my laptop, having my desktop stolen is a way rarer occasion,
so I enable the "trust this device" options. It's just a matter of thinking
about the threat model and where you can place spots for recovery while losing
as little security as possible.
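
A minimal sketch of the trusted-device recovery flow described in this
comment, in Python. The `Account` class, its method names, and the device
names are all hypothetical; this is not 1Password's or GitHub's API, only an
illustration of the rule that a device trusted after a full 2FA login can
later be used to switch 2FA off.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account model; not any real service's API."""

    two_factor_enabled: bool = False
    trusted_devices: set = field(default_factory=set)

    def login(self, device_id: str, has_second_factor: bool) -> bool:
        # Without 2FA, the password alone (assumed correct here) is enough.
        if not self.two_factor_enabled:
            return True
        # A trusted device stands in for the second factor.
        return has_second_factor or device_id in self.trusted_devices

    def trust_device(self, device_id: str, has_second_factor: bool) -> None:
        # Only a session that passed the login check may mark a device trusted.
        if not self.login(device_id, has_second_factor):
            raise PermissionError("2FA required to trust a device")
        self.trusted_devices.add(device_id)

    def disable_two_factor(self, device_id: str) -> None:
        # Recovery path: allowed only from a previously trusted device.
        if device_id not in self.trusted_devices:
            raise PermissionError("device is not trusted")
        self.two_factor_enabled = False


account = Account(two_factor_enabled=True)
account.trust_device("desktop", has_second_factor=True)

# The phone is lost: the untrusted laptop is locked out...
print(account.login("laptop", has_second_factor=False))

# ...but the trusted desktop can turn 2FA off so it can be set up again.
account.disable_two_factor("desktop")
print(account.login("laptop", has_second_factor=False))
```

The trade-off is the one the comment names: every trusted device is one more
place an attacker can recover the account from, so it only makes sense on
hardware that is unlikely to be stolen.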

~~~
ProZsolt
But that's fine because the trusted device is the second factor (what you have).

You can have multiple second factors with Github. I currently have two
Yubikeys and one authenticator app enabled. If I lose one I can still log in
with another.

------
dmix
> On February 2, 2020, GitHub will capture a snapshot of every active public
> repository, to be preserved in the GitHub Arctic Code Vault. This data will
> be stored on 3,500-foot film reels, provided and encoded by Piql, a
> Norwegian company that specializes in very-long-term data storage. The film
> technology relies on silver halides on polyester. This medium has a lifespan
> of 500 years as measured by the ISO; simulated aging tests indicate Piql’s
> film will last twice as long.

That just sounds like a fun project.

~~~
breck
Right? Sounds so fun.

But also a great engineering exercise. I wouldn't be surprised if this
exercise leads to lots of valuable improvements for GitHub users in the here
and now. Trying to solve such a grand challenge forces you to develop a
vocabulary and understanding of your current systems that can lead to more
immediate improvements. I think it unlikely any of these archives will
actually be accessed, but simply building them could lead to great side
effects.

It's also great marketing, as I now believe that Microsoft/GitHub takes the
job of not losing user data extremely seriously, more so than if they had
spent an equivalent sum of money buying an ad that says "We take not losing
data seriously".

------
tosh
> The GitHub Arctic Code Vault is a data repository preserved in the Arctic
> World Archive (AWA), a very-long-term archival facility 250 meters deep in
> the permafrost of an Arctic mountain. The archive is located in a
> decommissioned coal mine in the Svalbard archipelago, closer to the North
> Pole than the Arctic Circle. GitHub will capture a snapshot of every active
> public repository on 02/02/2020 and preserve that data in the Arctic Code
> Vault.

~~~
Already__Taken
Hope it's water tight

~~~
Zitrax
Unlike the Svalbard seed vault:
[https://www.theguardian.com/environment/2017/may/19/arctic-s...](https://www.theguardian.com/environment/2017/may/19/arctic-stronghold-of-worlds-seeds-flooded-after-permafrost-melts)

------
t0astbread
In a thousand years someone will inherit the ultimate legacy codebase.

~~~
falcor84
At least they wouldn't be expected to maintain it running in production on
some old mainframe

~~~
NetOpWibby
You’ve just jinxed someone.

------
Nuzzerino
Nice to see the Long Now Foundation involved with this. A friend of mine is a
member. The work that they do is a pretty big deal given that not much of it
is being done in today's culture of short term thinking.

www.longnow.org

~~~
julianmetcalf
We are thrilled to be partnering with the Long Now Foundation on this effort.
They were a huge inspiration to us!

------
vortico
An interesting thought: The license of your open-source software will become
irrelevant if this is cracked open in 1000 years because everything will be
public domain, assuming copyright laws aren't changed drastically, but it's
interesting to wonder about what future humans will think about the open-
source license movement of the 1980's-to-present.

~~~
jobigoud
This has me wondering what is the first free or open-source software that will
"fall" into the public domain, and when that will be?

~~~
vortico
Free (libre) software sort of faded into existence, simply because the notion
to copyright software faded into existence, and free software sort of relies
on people to consider that software can be copyrighted in the first place.
"A-2 System"
[https://en.wikipedia.org/wiki/A-0_System](https://en.wikipedia.org/wiki/A-0_System)
is roughly the first open-source software, although I'm not sure an explicit
license text even existed.

The second question is what will enter public domain first, business-owned
software or individually-written software. If the latter, which open-source
developer has died the earliest?

~~~
beefhash
> If the latter, which open-source developer has died the earliest?

Can we just take a second to appreciate how downright inhumane copyright
expiry is? It basically encourages _cheering for people to die_ because of
copyright expiry.

~~~
vortico
I agree that it's dumb. The only clear alternative, N years after the work was
created, has its own problems though, such as trying to figure out when each
work was created to determine whether it's in the public domain or not.

~~~
account42
Usually getting an upper bound on when something was first published is easier
than finding out when and if some random person has died.

------
chx
How awfully convenient that the news of GitHub literally putting an archive on
ice breaks now, swallowing search traffic about how GitHub workers are
resigning because they want the company to break its contract with ICE...

~~~
dwoozle
Oh please, if you want to change border policy use the political process, not
harangue a version control software company.

~~~
princekolt
lol

------
sigwinch28
> The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault will sweep
> up every active public GitHub repository, in addition to significant dormant
> repos as determined by stars, dependencies, and an advisory panel.

[https://github.com/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee/eeeeeeee...](https://github.com/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee)

The archive team should be able to get a very good compression ratio on that
repo.
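
A quick way to see why, sketched in Python with the standard `zlib` module.
The 100,000-byte input is an assumed stand-in for the repository's contents,
and nothing here reflects how the archive team actually encodes data.

```python
import zlib

# Assumed stand-in for a repository that is one character repeated.
data = b"e" * 100_000

# Level 9 is zlib's highest compression setting.
compressed = zlib.compress(data, 9)
ratio = len(data) / len(compressed)

print(f"{len(data)} bytes -> {len(compressed)} bytes (about {ratio:.0f}x smaller)")
```

Input this repetitive is close to the best case for DEFLATE, which replaces
repeated runs with short back-references.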

------
beokop
Is there a way to opt out of this? I’m not sure I want my code to be stored
forever in a glacier with no way of deleting it.

~~~
tjr
I'm thinking I might want to opt in. What should I add to GitHub before
February, to make sure it gets in the archive?

~~~
celeritascelery
Nothing. So long as it’s active, it will get archived.

------
darkwater
Almost totally unrelated but:

> As today’s vital code becomes yesterday’s historical curiosity

shouldn't it be "tomorrow's historical curiosity"?

~~~
ksec
Same question. If it is not wrong, could someone please explain?

~~~
Arnavion
Both are correct. The article's phrasing is that the code of today is
"yesterday's historical curiosity" from the point of view of someone who looks
at it tomorrow.

~~~
vemv
The article's phrasing doesn't seem that correct: yesterday's historical
curiosity is not necessarily _today's_ historical curiosity (_today_ being
3000 AD).

------
DCRichards
Wait, stop, there's been a mistake! This means PHP will live on for another
1000 years!

------
tw2019111327667
How do I submit a patch against archiveprogram.github.com? The copy contains a
typo, and unlike GitLab—which has a footer at the bottom of most public-facing
pages with a link back to the markdown source—there's no indicator of how to
get this corrected.

------
jaytaylor
This effort actually touches me and brought tears to my eyes. Beautiful.
Sometimes people do more amazing work than I could've imagined.

------
wewake
Software/code rots if not touched for months. This is such a fruitless effort
but does get Github some good PR. They should instead preserve ideas that make
all software happen.

~~~
dgl
I find the assumption that software rots after months a sad thing. Most of
that is poorly versioned libraries that change quickly. Something like the Go
1 compatibility promise means code written just against core Go should be fine
if someone has the latest (maybe last) version of Go 1.

Also archeologists study rot.

------
cayblood
The Arctic World Archive description says: "The film technology relies on
silver halides on polyester. This medium has a lifespan of 500 years as
measured by the ISO; simulated aging tests indicate Piql’s film will last
twice as long."

Why not use a much more accessible medium like M-DISC, which claims a lifespan
of at least 1,000 years?
[https://en.wikipedia.org/wiki/M-DISC](https://en.wikipedia.org/wiki/M-DISC)

~~~
jerrysievert
> Why not use a much more accessible medium like M-DISC

just wondering how an encoded digital format is more accessible than something
printed on a transparent film that you can see by looking through it.

~~~
Avamander
If it's QR codes as mentioned by people in this thread then those probably
aren't easier to decode than an M-DISC would be.

------
stereo
It would be interesting to include the great open databases of our time -
Wikipedia, OpenStreetMap, AllTrials, various national open data portals...

------
madamelic
Cool.

I bet people in 2200 can't wait to figure out what version of node they need
to run a specific gulp version from an obscure error, caused by an underlying
dependency's reliance on a deprecated built-in (true story).

After figuring out that issue, they will need to debug why a specific
package's version is broken, only to find out the maintainers 200 years ago
hard-coded the century.
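
For anyone who hasn't met this bug, here is a minimal Python sketch of what a
hard-coded century looks like, next to one common workaround. Both function
names are hypothetical and not taken from any real package.

```python
from datetime import date


def expand_year_hardcoded(two_digit_year: int) -> int:
    # The bug: the century is baked in, so this is only right for 2000-2099.
    return 2000 + two_digit_year


def expand_year_windowed(two_digit_year: int, today: date) -> int:
    # Workaround: try the previous, current, and next century, then keep the
    # candidate year closest to "today".
    century = today.year - today.year % 100
    candidates = [
        century - 100 + two_digit_year,
        century + two_digit_year,
        century + 100 + two_digit_year,
    ]
    return min(candidates, key=lambda year: abs(year - today.year))


# A package stamped "19", read by someone in the year 2219.
print(expand_year_hardcoded(19))
print(expand_year_windowed(19, date(2219, 1, 1)))
```

The windowed version is still a guess. Storing the full four-digit year in the
first place is the only real fix.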

------
euske
I'm skeptical of the idea of static archives, especially when it comes to
code. I think software is pretty much a living creature - with all of its
environments and contexts and baggage. The only feasible way for it to survive
is to keep changing it. When it's no longer updated, we should pay our
respects and bury it.

Plus, people are going to reinvent the wheel no matter what.

~~~
reikonomusha
Common Lisp code begs to differ. It’s not perfect but code that’s half a
century old (Lisp that’s not even Common Lisp!) can run nearly untouched. It’s
a huge reason I write it.

~~~
gen220
I get where you’re coming from, but the interpreter that you’re using to run
that code has been rewritten to suit the underlying architecture of your
machine dozens of times in the intervening time.

I think that’s what GP means, that if you don’t include the “context”
(interpreter, cpu architecture, power supply), you’re only burying the most
malleable tip of the iceberg.

It reminds me of the Zork source code that was published here some months ago.
It was written in a language for which the compiler appears to have been lost.
People have tried to develop compilers given the source code provided, but we
can’t get it to work _just_ right, because some crucial bit of context appears
to be missing.

There will probably be people who maintain LISP and C for decades more, but
imagine trying to write a Java or Haskell compiler with _only_ the source code
for Kafka or Pandoc as your guide?

Not impossible of course, but hey, maybe you should check the source code for
a JVM or ghc version (and maybe add a Linux distro and gcc for good measure)
in with your source in time for the cut off date. :)

Point is, once you enumerate all of the dependencies, you realize the only
solution is what GP said. Constant maintenance at _all_ levels is the only
thing that keeps the ship running.

------
michaelmior
> For greater data density and integrity, most of the data will be stored QR-
> encoded

At face value, this seems like a really odd choice. I don't understand why you
would choose QR encoding unless this was being printed. I feel like I'm
missing something here.

~~~
333c
It's being stored on film, according to the page.

~~~
michaelmior
Thanks! I somehow missed that.

------
krschultz
I was really hoping this would simply mean enabling a "deprecated" flag for a
repo.

~~~
julianmetcalf
Hey there, we have functionality in GitHub where you can archive a repo. Go
here to learn how:
[https://help.github.com/en/github/creating-cloning-and-archi...](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/archiving-repositories)

------
app4soft
It reminds me about _Google Code Archive_ [0]

[0] [https://code.google.com/archive/](https://code.google.com/archive/)

------
benburleson
Oh great, now my coding sins can live on forever!

------
explodingcamera
1000 years from now, someone will find the python 3 source code and the whole
world will finally upgrade their python 2 programs...

------
rusini
Wow, are we preparing for a global catastrophe?

~~~
oscargrouch
I think this is more in the line of

"Si vis pacem, para bellum"

It's very important to protect the civilizing process from fallout, as severe
or black-swan conditions might wipe out what we have accomplished so far.
History is full of examples of advanced civilizations that vanished, after
which we had to dig ourselves out of the mud slowly, without a chance to learn
from them, starting all over again from scratch.

The civilizing process is fragile and is menaced from all sides, all the time.
It's good to be always vigilant and prepared for anything.

------
natmaka
[https://www.softwareheritage.org/](https://www.softwareheritage.org/)

------
jedieaston
will there be a computer in there that can survive 10,000 years so that you
can compile the code?

~~~
julianmetcalf
I'm the PM at GitHub for the Archive Program. We aren't making the assumption
that you will have a computer. The archive will include a Tech Tree that
explains in plain language the fundamentals of computer programming and how to
use the material in the Arctic Code Vault. We will also include technical
guides to QR decoding, file formats, character encodings, and other critical
metadata so that the raw data can be converted back into source code for use
by others in the future.

~~~
mihaifm
Will you publish these guides? Would make a good read.

~~~
julianmetcalf
That’s the plan!

------
deegles
Another great reason to store your blog on github!

------
ngcc_hk
OK. The program is interesting. No more sidetracking into 2FA and so on.

What is the issue with archiving like this, may I ask?

------
w1nst0nsm1th
That's where the macOS open source community, made irrelevant by Apple
notarization, comes to die.

It's like those strange places full of dead elephants in Africa.

------
stblack
Conceptual acid test: Go to
[https://archiveprogram.github.com](https://archiveprogram.github.com), then
disable all CSS.

Observe: resultant web page is unusable.

~~~
hollerith
How would I disable CSS in Chrome?

~~~
arthurcolle
In Chrome DevTools you could delete the link element that includes the
stylesheet.

~~~
hollerith
I was able to do that (so, thank you) but don't understand what great
grandparent means by unusable. I prefer it to the version with CSS (which I
gave up on after about 2 seconds before getting intrigued by great
grandparent). The list of partners is not rendered; is that what qualifies it
as unusable, great grandparent?

(I am participating here solely out of curiosity about the web; am not trying
to make any point or argue any position.)

~~~
stblack
The question is, will CSS and any other present-day tech be around in the
medium-future?

We know plain text will be around.

I was surprised to find the website promoting this is tricked-out, which isn't
consistent with the very long-term vision of the project.

