Arctic Code Vault Contributor (elborai.me)
83 points by dgellow on July 17, 2020 | 71 comments



- We cracked the encoding on the artifact

- What did you find? What was in the vault?

- We think it's... code. Software code. Early 21st century.

- Code from the early web? This could be historic! What's in there? The google algorithm? The code for the iphone?

- Not that we've found

- Twitter? Does it have Twitter? Imagine if we found the code that contains the bug that caused world war three!

- No

- Then what is in there?

- Just... tools, mainly. Manual tools from when people used to write software by hand.

- Really? Why do you think they archived those?

- I don't know. Maybe they thought they were clever?

- Maybe it's not actually an archive. Maybe it's just a garbage dump.

- You think everything we dig up is a garbage dump.

- It usually is!

- True. But look, we did find one artifact that's interesting. It's called 'react'. It appears to be an object of veneration. A lot of other things reference it.

- Are you saying this may have served some sort of ritual purpose?

- It's possible....


> Maybe they thought they were clever? ... Maybe it's just a garbage dump.

This is a really nasty, snarky comment. Deriding people who try to build things and talking about their work as garbage.

There's a lot of software from the past that we wish we had access to now, but we don't because people had the same attitude as you.


It is a joke about the tendency of archaeologists to classify everything they find as being either part of a garbage dump (aka a midden[0]) or as some variety of religious artifact [1].

[0] https://en.wikipedia.org/wiki/Midden

[1] See discussion here: https://en.wikipedia.org/wiki/Archaeology_of_religion_and_ri...


Come on! Don't be so touchy-feely. Take it for what it is - a lighthearted joke.


In some ways this is a nice feature - it's a neat accomplishment for contributors to be aware of, and something that they can be proud of.

But I don't entirely understand why it has been enabled on public profiles by default.

As a regular user I hope that GitHub doesn't continue down the path of becoming a popularity contest at the expense of fostering co-operative work on equal terms.


I had to opt-in. So it is not something that is on-by-default, from what I know.


It is turned on for me by default.


I had to opt out -- it appeared on my profile without warning.


Where did you opt-in? I wasn't asked before it showed up, and then I disabled it through the "Settings > Profile > Profile Settings" page.


I cannot reproduce, unfortunately. But I half-consciously agreed to some modal/popup telling me I got some award and asking whether I wanted to show it. At least, that is what I remember, but it might have been a modal telling me they were going to show it, rather than asking.


DIDN'T opt in but I'm on beta flags, Idk


I think I'd rather see which of my own repos were included than which of the most popular repos I fixed typos in.


All public repositories[1] were archived.

_On February 2, 2020, we took a snapshot of all active public repositories on GitHub to archive in the vault._

[1] https://github.blog/2020-07-16-github-archive-program-the-jo...


Sure, but without remembering what repos existed when or had how many stars when, that'll be easy to forget or be hard to verify. Rather than a stupid profile badge, I'd rather see like, a repo badge, maybe linking to the latest commit that made it into the code vault.


Not all public repositories, all active public repositories. The homepage[0] clarifies that this means "every repo with any commits between the announcement at GitHub Universe on November 13th and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/03/2019 - 02/02/2020), and every repo with at least 250 stars"

[0] https://archiveprogram.github.com/


They only encoded 21TB worth.

Also, you can see which repos of yours were selected with the "Arctic Code Contributor" badge in the highlights section of your profile.

In my case they only grabbed one library that I was half-finished on... I'm pretty sure they grabbed everything they thought would be important and filled remaining space with random selections


> Also, you can see which repos of yours were selected with the "Arctic Code Contributor" badge in the highlights section of your profile.

Not really. It only shows the first three or four. As an example from mine, it says:

  avelino/awesome-go, awesome-selfhosted/awesome-selfhosted, ansible/ansible, and more!

For people who have contributed widely, it's not a useful representation. That being said, it doesn't appear to be actively hurting anyone. ;)


They go into their methodology on the website, https://archiveprogram.github.com/


Yeah, the badge highlight seems to just list the three biggest projects you've contributed to, and the rest are summarized as "and more!"

So 2/3 of mine are Linux and Golang; great, but I don't use either anymore, so I'd like to see other stuff. If it isn't just "all of Github," which maybe it is.


Well my dotfile repo was archived, which to my knowledge has never had any viewers other than me. I think it may well be all of GitHub, or at least all repos that pass some measure of uniqueness


1 Commit + 1 Star or 250 Stars.


Popularity, or recent contributions.


Agreed. All I saw were various non-forks of "torvalds/linux" (e.g. raspberrypi/linux; why is this not a fork?). I removed it from my profile.


Is anyone here active on GitHub and _not_ a vault contributor?

I was pretty happy to find that my company's free football data made it in, although I'm not sure how much help that's going to be to rebuild society after the collapse:

https://github.com/statsbomb/open-data


I open a lot of issues but don't really commit code and I didn't get the badge.


That feels like an oversight, what a shame not to acknowledge those contributions.


Yep, no doubt there's a huge amount of valuable contribution archive material in pull request and issue comments as well.

Some of these may be available soon via joint partner initiatives[1].

[1] - https://github.blog/2020-07-16-github-archive-program-the-jo...


Were issues also archived? This badge is specifically an indicator of whether code you committed made it into the vault, not whether you contributed to a project in some other way, or are a nice person for other reasons.

I'm still cynical enough to think that this is mostly a PR move. If society breaks down to the point where we need to open up code vaults, I think massively distributed local archives will be more useful than some remote arctic data center.


As far as I can tell, just about every active GitHub user has the same thing - just as long as you've ever committed to a public repo


Yeah, the inclusion criteria are pretty broad:

  The snapshot will include every repo with any commits
  between the announcement at GitHub Universe on November 13th
  and 02/02/2020, every repo with at least 1 star and any
  commits from the year before the snapshot (02/03/2019 -
  02/02/2020), and every repo with at least 250 stars.

from https://archiveprogram.github.com/#arctic-code-vault
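
For anyone who wants to check a repo against those rules, here is a minimal Python sketch of the quoted criteria. The `Repo` record and `in_snapshot` function are hypothetical names, not anything GitHub published, and I'm assuming the announcement year is 2019 and that `last_commit` means the newest commit as of the snapshot date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Dates from the quoted criteria. The announcement year (2019) is an assumption.
ANNOUNCEMENT = date(2019, 11, 13)
SNAPSHOT = date(2020, 2, 2)
YEAR_BEFORE = date(2019, 2, 3)


@dataclass
class Repo:
    """Hypothetical record: star count and newest commit date at snapshot time."""
    stars: int
    last_commit: Optional[date]


def in_snapshot(repo: Repo) -> bool:
    """Return True if the repo meets any of the three quoted rules."""
    # Rule 3: at least 250 stars, regardless of activity.
    if repo.stars >= 250:
        return True
    # The other rules need a commit on or before the snapshot date.
    if repo.last_commit is None or repo.last_commit > SNAPSHOT:
        return False
    # Rule 1: any commit between the announcement and the snapshot.
    if repo.last_commit >= ANNOUNCEMENT:
        return True
    # Rule 2: at least 1 star and a commit in the year before the snapshot.
    return repo.stars >= 1 and repo.last_commit >= YEAR_BEFORE
```

By this reading, a zero-star dotfiles repo with a commit in December 2019 would qualify, which seems consistent with what people in the thread are reporting.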


This would make sense, as it looks like money wasn't a huge factor. If money wasn't an issue, time and effort would be, and you can easily reduce effort by storing pretty much everything.


Oh, ha, you're right! I read this post and thought, "wow, so cool!". But after reading your comment I checked my own GitHub profile and sure enough, I have it, too!


I think it's still pretty cool :) Although like the author, I shudder to think what people might think of my code 1,000 years from now (let alone what people think of it now ;)


This badge will become increasingly scarce as more people register for GitHub accounts. I bet this badge will be a GitHub status symbol 10 years from now.


I'm not going to lie, the comment about "if they can use git" is the core of my problem with the arctic storage. I'm still wondering what the point is other than to make some meta point about how committed github is to preservation


As GitHub explains in their blog post[1]:

_It will also include works which explain the many layers of technical foundations that make software possible: microprocessors, networking, electronics, semiconductors, and even pre-industrial technologies. This will allow the archive’s inheritors to better understand today’s world and its technologies, and may even help them recreate computers to use the archived software._

[1] https://github.blog/2020-07-16-github-archive-program-the-jo...


There's no git usage, they're just storing a snapshot of each repo's current HEAD in a TAR file (and then compressed and QR-encoded).

> The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size—depending on available space, repos with more stars may retain binaries. Each repository will be packaged as a single TAR file.

The guide included on each reel about how to access the data is public, git isn't involved in the process: https://github.com/github/archive-program/blob/master/GUIDE....
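
To make the packaging step concrete, here is a rough Python sketch of "HEAD of the default branch as a single TAR, minus large binaries". This is only an illustration, not GitHub's actual pipeline: `package_checkout` and `looks_binary` are hypothetical helpers, the NUL-byte check is my own stand-in for however they detect binaries, and I'm assuming 100KB means 100 * 1024 bytes. It also ignores the "repos with more stars may retain binaries" exception.

```python
import tarfile
from pathlib import Path
from typing import List

MAX_BINARY_BYTES = 100 * 1024  # "100KB" from the quote; the exact unit is an assumption


def looks_binary(path: Path, probe: int = 8192) -> bool:
    """Heuristic (assumption): treat a file as binary if its first bytes contain NUL."""
    with path.open("rb") as fh:
        return b"\0" in fh.read(probe)


def package_checkout(checkout: Path, out_tar: Path) -> List[str]:
    """Write the working tree at `checkout` to one uncompressed TAR file.

    Skips the .git directory (only the checked-out files are kept, no history)
    and any binary larger than MAX_BINARY_BYTES. `out_tar` should live outside
    `checkout`. Returns the relative paths that were archived.
    """
    kept = []
    with tarfile.open(out_tar, "w") as tar:
        for path in sorted(checkout.rglob("*")):
            rel = path.relative_to(checkout)
            if ".git" in rel.parts or not path.is_file():
                continue
            if path.stat().st_size > MAX_BINARY_BYTES and looks_binary(path):
                continue
            tar.add(path, arcname=rel.as_posix())
            kept.append(rel.as_posix())
    return kept
```

Something like `package_checkout(Path("my-repo"), Path("my-repo.tar"))` would then produce an archive that plain `tar` can read, which is the point: nothing about reading it back depends on git.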


No, it is not QR-encoded.

This is the method:

https://earth.esa.int/documents/1656065/3222865/170922-Piql-...


In the link you posted, it states: "piqlWriter: data written as high-density QR codes. Encode binary data to 2D barcode (apply Forward Error Correction). Modulate light using a Digital Micromirror Device (DMD) to project the barcode on the film."


It's a lie. It's not a QR code. I don't know why they say it is. Look at the code. It's nothing like a QR.

They should have said "2D Barcode" but they might be assuming an ignorant audience.


Haha, yes, I have this image in mind of future scientists trying to understand the mess the git CLI is! I'm sure they will have a bunch of theories in which git is considered a religious relic from past humans, used to summon some gods by writing weird incantations. That's already how it feels to me sometimes :)


The git-cli is an inconsistent mess, agreed.

The data-model, however, is not. It is actually a very smart, mathematical (cryptographical) tree-structure[1].

I'm pretty sure some far future generation will be able to figure out the math behind a "git database". I'm actually more confident they will be able to do so, than your average "excel" or "winrar backup".

-- [1] I really started liking git and understanding what it is all about (and how to use the weird commandline --flags - or was it -flags or -f or -F or flags) after reading the Git From the Bottom Up book: https://jwiegley.github.io/git-from-the-bottom-up/
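
To illustrate the content-addressed part of the data model: a blob's ID is just the SHA-1 of a short header plus the file contents, so it can be recomputed without the git CLI. A small Python sketch (the function name is mine; trees and commits are hashed the same way, with a different type in the header and a different body):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a file with these contents.

    Git hashes "blob <size>\\0" followed by the raw bytes (SHA-1 repositories).
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should match what `git hash-object` prints for a file with the same bytes.
    print(git_blob_id(b"hello world\n"))
```

Because the ID depends only on the bytes, anyone who can work out SHA-1 and this header format can verify an object database by hand, which is the argument for it being decipherable later.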


My understanding is that the vault also includes instructions on how to access the stored projects. I would guess this has to at least explain the minimum ability to replicate git or understand git's format?


The git source repo is probably included in the archive itself. So meta.


Pretty sure the repos are stored as plaintext in tarballs


When I started coding my mentor told me: "it doesn't matter if it's 2 lines of code, you are still contributing". I'll always remember that, a contribution is a contribution :)


The right line in the right place can save the whole system.

Especially if the line is a consistency check.


The three projects shown in the badge seem to be the top three by star count. Of course star count often poorly reflects actual importance of project: in my case, python/cpython only squeezed in at third, another cornerstone project where I’m a member and a long-term contributor got relegated to “and more”, while two popular yet much less important developer tools where I submitted some drive-by patches took the top spots.

It would be nice if star count were weighted by amount of contributions, and cornerstone projects like python/cpython got a boost in ranking.
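
As a rough sketch of what "star count weighted by amount of contributions" could mean, here is the simplest reading in Python: score each repo as stars times my commit count and show the top three. The function name, the plain product as the score, and the commit counts in the example are all made up for illustration; this is not how GitHub picks the repos, and it does not attempt the "cornerstone boost".

```python
from typing import List, Tuple


def rank_badge_repos(contribs: List[Tuple[str, int, int]], top_n: int = 3) -> List[str]:
    """Rank repos by stars * commits, highest first.

    `contribs` holds (repo, stars, my_commits) tuples. Ties are broken by
    repo name so the ordering is deterministic.
    """
    scored = [(stars * commits, repo) for repo, stars, commits in contribs]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [repo for _, repo in scored[:top_n]]


if __name__ == "__main__":
    # Commit counts are hypothetical; "example/dev-tool" is a placeholder repo.
    mine = [
        ("sindresorhus/awesome", 137000, 1),
        ("python/cpython", 32400, 40),
        ("example/dev-tool", 50000, 2),
    ]
    print(rank_badge_repos(mine))
```

With numbers like these, a project with sustained contributions outranks a drive-by patch to a more-starred repo, which is the effect I'm after.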


I assume by "importance of project" you mean your importance to the project? I agree that there's probably a balance to be struck there. But also, I think that choosing what repos to show in the badge isn't really that important so any simple heuristic is fine.


No, I meant importance of project. For instance, sindresorhus/awesome has 137k stars, python/cpython only has 32.4k, but the latter is clearly orders of magnitude more important than the former. Which is why I called the latter a cornerstone project.


Good point. Although when it comes to deciding which repositories to show on the badge, I think I'd like a balance between the amount of my contribution and the importance of the repository.


Yeah, I did say star count should be weighted by amount of contributions.


Ah! Checked mine, I got one too. Not sure what to make of it!


In the incredibly competitive and arbitrary world of tech hiring and networking, any little thing that makes you stand out is worthwhile. I have a friend whose résumé says "600 lines of my Python are running on the International Space Station". I got my current job because an undergrad internship from 10 years ago caught the hiring manager's attention. Any little thing helps.


Perhaps it would be better if it showed a list of contributions to projects in the Code Vault, ordered by contribution size…


While it seems silly, I think it can also help highlight how even the smallest changes can be impactful. Collaboration like this requires lots of unsung heroes. A fix or small change goes a long way in helping people. I think of it as a reminder that even the smallest things can have real impact.


I got the badge. I didn’t understand why and couldn’t find which repos I’d contributed to had become part of this vault I’d not heard of before. I decided it was meaningless because I couldn’t immediately see any substance behind the award. I turned it off. Have I missed out on some greatness?


If you hover over the badge on your profile, it shows you which repos you contributed to that made it into the vault.


Well, three of them.


I got that for a bunch of repos I've archived. Not many stars, but they are ones that are part of a team effort. It may have to do with the number of contributors.

I'm not entirely convinced that it's a "special honor."

If so, GoMeGoMe. If not, meh.


Just wondering: I didn't get the badge since I mostly work on private repos, but I do have write access to a few public repos that got stored in the vault. If I commit to them now, will I get the badge?


I assume not since your code will not be in the vault.


Related thread from yesterday: https://news.ycombinator.com/item?id=23860659


Really don't like that they don't show contributions in an (open source) organisation. My private contributions are not that many in comparison to org contributions.


Apparently my single rule addition to HTTPS Everywhere got me the badge. Not entirely sure what I'm supposed to do with that.


I wonder if the small contributions I made to the WebGL implementation in Mozilla will go in there...


What was the aim of the Arctic Code Vault? Was it necessary?


Yeah, I just noticed mine... 3 went in.


Does anyone know how this interacts with GDPR? In particular, if someone has personal information stored in a GitHub repo and asks for it to be permanently deleted, do they need to dig it out of a bunker in the Arctic?


Woot I made it!


Thanks to Microsoft, the crapware arrives at GitHub. I'm looking forward to a long future of disabling ads* that have been automatically added to my profile page.

* Yes, this is an ad for a GitHub project that I didn't ask to be a part of and give zero farts about. Depicting me as some kind of deliberate contributor is false.



