- What did you find? What was in the vault?
- We think it's... code. Software code. Early 21st century.
- Code from the early web? This could be historic! What's in there? The Google algorithm? The code for the iPhone?
- Not that we've found
- Twitter? Does it have Twitter? Imagine if we found the code that contains the bug that caused world war three!
- No
- Then what is in there?
- Just... tools, mainly. Manual tools from when people used to write software by hand.
- Really? Why do you think they archived those?
- I don't know. Maybe they thought they were clever?
- Maybe it's not actually an archive. Maybe it's just a garbage dump.
- You think everything we dig up is a garbage dump.
- It usually is!
- True. But look, we did find one artifact that's interesting. It's called 'react'. It appears to be an object of veneration. A lot of other things reference it.
- Are you saying this may have served some sort of ritual purpose?
- It's possible....
It is a joke about the tendency of archaeologists to classify everything they find as being either part of a garbage dump (aka a midden[0]) or as some variety of religious artifact [1].
In some ways this is a nice feature - it's a neat accomplishment for contributors to be aware of, and something that they can be proud of.
But I don't entirely understand why it has been enabled on public profiles by default.
As a regular user I hope that GitHub doesn't continue down the path of becoming a popularity contest at the expense of fostering co-operative work on equal terms.
I cannot reproduce, unfortunately. But I half-consciously agreed to some modal/popup telling me I got some award and asking whether I wanted to show it. At least, that is what I remember, but it might have been a modal telling me they were going to show it, rather than asking.
Sure, but without remembering what repos existed when or had how many stars when, that'll be easy to forget or be hard to verify. Rather than a stupid profile badge, I'd rather see like, a repo badge, maybe linking to the latest commit that made it into the code vault.
Not all public repositories, all active public repositories. The homepage[0] clarifies that this means "every repo with any commits between the announcement at GitHub Universe on November 13th and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/03/2019 - 02/02/2020), and every repo with at least 250 stars"
Also, you can see which repos of yours were selected with the "Arctic Code Contributor" badge in the highlights section of your profile.
In my case they only grabbed one library that I was half-finished on... I'm pretty sure they grabbed everything they thought would be important and filled remaining space with random selections
Yeah, the badge highlight seems to just list the three biggest projects you've contributed to, and the rest are summarized as "and more!"
So 2/3 of mine are Linux and Golang; great, but I don't use either anymore, so I'd like to see other stuff. If it isn't just "all of GitHub," which maybe it is.
Well my dotfile repo was archived, which to my knowledge has never had any viewers other than me. I think it may well be all of GitHub, or at least all repos that pass some measure of uniqueness
Is anyone here active on GitHub and _not_ a vault contributor?
I was pretty happy to find that my company's free football data made it in, although I'm not sure how much help that's going to be to rebuild society after the collapse:
Were issues also archived? This badge is specifically an indicator of whether code you committed made it into the vault, not whether you contributed to a project in some other way, or are, for other reasons, a nice person.
I'm still cynical enough to think that this is mostly a PR move. If society breaks down to the point where we need to open up code vaults, I think massively distributed local archives will be more useful than some remote arctic data center.
The snapshot will include every repo with any commits
between the announcement at GitHub Universe on November 13th
and 02/02/2020, every repo with at least 1 star and any
commits from the year before the snapshot (02/03/2019 -
02/02/2020), and every repo with at least 250 stars.
This would make sense, as it looks like money wasn't a huge factor. If money wasn't an issue, time and effort would be, and you can easily reduce effort by storing pretty much everything.
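For anyone who wants to check a repo against the quoted rule, here is a minimal sketch of it as a predicate. The `Repo` record and its fields are hypothetical, the year 2019 for the GitHub Universe announcement is my assumption (the quote only says "November 13th"), and the dates are read as US-style month/day/year.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

ANNOUNCEMENT = date(2019, 11, 13)  # assumption: GitHub Universe 2019
SNAPSHOT = date(2020, 2, 2)        # "02/02/2020" read as month/day/year
YEAR_START = date(2019, 2, 3)      # "02/03/2019" read as month/day/year


@dataclass
class Repo:
    """Hypothetical record; not a real GitHub API type."""
    stars: int
    commit_dates: List[date] = field(default_factory=list)


def has_commit_between(repo: Repo, start: date, end: date) -> bool:
    """True if any commit falls inside the inclusive date window."""
    return any(start <= d <= end for d in repo.commit_dates)


def in_snapshot(repo: Repo) -> bool:
    """Apply the three quoted criteria; any one of them is enough."""
    if repo.stars >= 250:
        return True
    if has_commit_between(repo, ANNOUNCEMENT, SNAPSHOT):
        return True
    return repo.stars >= 1 and has_commit_between(repo, YEAR_START, SNAPSHOT)
```

Under this reading, a zero-star dotfiles repo with a single commit in December 2019 qualifies, which would explain why so many people ended up with the badge.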
Oh, ha, you're right! I read this post and thought, "wow, so cool!". But after reading your comment I checked my own GitHub profile and sure enough, I have it, too!
I think it's still pretty cool :) Although like the author, I shudder to think what people might think of my code 1,000 years from now (let alone what people think of it now ;)
This badge will become increasingly scarce as more people register for GitHub accounts. I bet this badge will be a GitHub status symbol 10 years from now.
I'm not going to lie, the comment about "if they can use git" is the core of my problem with the arctic storage. I'm still wondering what the point is, other than to make some meta point about how committed GitHub is to preservation.
_It will also include works which explain the many layers of technical foundations that make software possible: microprocessors, networking, electronics, semiconductors, and even pre-industrial technologies. This will allow the archive’s inheritors to better understand today’s world and its technologies, and may even help them recreate computers to use the archived software._
There's no git usage, they're just storing a snapshot of each repo's current HEAD in a TAR file (and then compressed and QR-encoded).
> The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size—depending on available space, repos with more stars may retain binaries. Each repository will be packaged as a single TAR file.
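A rough sketch of what that packaging step could look like, assuming the default branch is already checked out into a directory. This is not GitHub's actual pipeline: the function names are made up, "100KB" is taken to mean 102,400 bytes, and "binary" is guessed with a NUL-byte probe of the first few kilobytes, which is only a heuristic.

```python
import os
import tarfile

MAX_BINARY_BYTES = 100 * 1024  # assumption: "100KB" means 102,400 bytes


def looks_binary(path, probe=8000):
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe)


def snapshot(worktree, tar_path):
    """Pack a checked-out tree into one TAR file.

    Returns the relative paths of large binaries that were left out.
    tar_path should live outside worktree so the archive does not
    try to include itself.
    """
    skipped = []
    with tarfile.open(tar_path, "w") as tar:
        for root, dirs, files in os.walk(worktree):
            # Only the working tree is packed, never the .git history.
            dirs[:] = sorted(d for d in dirs if d != ".git")
            for name in sorted(files):
                full = os.path.join(root, name)
                if os.path.islink(full) or not os.path.isfile(full):
                    continue  # keep the sketch to regular files
                rel = os.path.relpath(full, worktree)
                if os.path.getsize(full) > MAX_BINARY_BYTES and looks_binary(full):
                    skipped.append(rel)
                    continue
                tar.add(full, arcname=rel)
    return skipped
```

Note that nothing git-specific ends up in the archive, so reading it back only requires understanding TAR, not git.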
The link you posted states: "piqlWriter: data written as high-density QR codes. Encode binary data to 2D barcode (apply Forward Error Correction). Modulate light using a Digital Micromirror Device (DMD) to project the barcode on the film."
Haha, yes, I have this image in mind of future scientists trying to understand the mess the git CLI is! I'm sure they will have a bunch of theories in which git is considered a religious relic from past humans, used to summon some gods by writing weird incantations. That's already how it feels to me sometimes :)
The data model, however, is not. It is actually a very smart, mathematical (cryptographic) tree structure[1].
I'm pretty sure some far future generation will be able to figure out the math behind a "git database". I'm actually more confident they will be able to do so, than your average "excel" or "winrar backup".
--
[1] I really started liking git and understanding what it is all about (and how to use the weird command-line --flags - or was it -flags or -f or -F or flags) after reading the Git From the Bottom Up book: https://jwiegley.github.io/git-from-the-bottom-up/
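To give a feel for how small the math is, here is a minimal sketch of how git names a file's contents (a "blob") in its classic SHA-1 object format: it hashes a short header plus the bytes. The function name is mine; newer git versions can also use SHA-256, which this sketch ignores.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object id of a blob: SHA-1 over 'blob <size>' + NUL + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty file has a well-known id in every SHA-1 git repository.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Trees and commits are hashed the same way over their own contents, which include the ids of the objects they point to, so a future reader only needs the hash function and the header convention to verify the whole structure.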
My understanding is that the vault also includes instructions on how to access the stored projects. I would guess this has to at least explain the minimum ability to replicate git or understand git's format?
when I started coding my mentor told me: "it doesn't matter if it's 2 lines of code, you are still contributing". I'll always remember that, a contribution is a contribution :)
The three projects shown in the badge seem to be the top three by star count. Of course, star count often reflects the actual importance of a project poorly: in my case, python/cpython only squeezed in at third, another cornerstone project where I’m a member and a long-term contributor got relegated to “and more”, while two popular yet much less important developer tools where I submitted some drive-by patches took the top spots.
It would be nice if star count were weighted by amount of contributions, and cornerstone projects like python/cpython got a boost in ranking.
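One way such a weighting could look, purely as a hypothetical sketch and not anything GitHub does: damp the star count with a logarithm and scale it by how much the user actually committed. The formula, the function names, and the commit counts in the test data are all invented for illustration.

```python
import math


def badge_score(stars: int, my_commits: int) -> float:
    """Hypothetical score: log-damped stars scaled by log of contributions."""
    return math.log10(stars + 1) * math.log2(my_commits + 1)


def top_three(repos):
    """repos is a list of (name, stars, my_commits) tuples; best three first."""
    ranked = sorted(repos, key=lambda r: badge_score(r[1], r[2]), reverse=True)
    return [name for name, _, _ in ranked[:3]]
```

With a formula like this, a repo with many stars and a single drive-by patch scores lower than a less-starred repo with hundreds of commits. It does nothing for the "cornerstone project" boost, which would need some separate notion of importance.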
I assume by "importance of project" you mean your importance to the project? I agree that there's probably a balance to be struck there. But also, I think that choosing what repos to show in the badge isn't really that important so any simple heuristic is fine.
No, I meant importance of project. For instance, sindresorhus/awesome has 137k stars, python/cpython only has 32.4k, but the latter is clearly orders of magnitude more important than the former. Which is why I called the latter a cornerstone project.
Good point. Although when it comes to deciding which repositories to show on the badge, I think I'd like a balance between the amount of my contribution and the importance of the repository.
In the incredibly competitive and arbitrary world of tech hiring and networking, any little thing that makes you stand out is worthwhile. I have a friend whose résumé says "600 lines of my Python are running on the International Space Station". I got my current job because an undergrad internship from 10 years ago caught the hiring manager's attention. Any little thing helps.
While it seems silly, I think it can also help highlight how even the smallest changes can be impactful. Collaboration like this requires lots of unsung heroes. A fix or small change goes a long way in helping people. I think of it as a reminder that even the smallest things can have real impact.
I got the badge. I didn’t understand why and couldn’t find which repos I’d contributed to had become part of this vault I’d not heard of before. I decided it was meaningless because I couldn’t immediately see any substance behind the award. I turned it off. Have I missed out on some greatness?
I got that for a bunch of repos I've archived. Not many stars, but they are ones that are part of a team effort. It may have to do with the number of contributors.
I'm not entirely convinced that it's a "special honor."
Just wondering: I didn't get the badge since I work mostly on private repos, but I do have write access to a few public repos that got stored in the vault. If I commit to them now, will I get the badge?
I really don't like that they don't show contributions in an (open source) organisation. My private contributions are not that many in comparison to org contributions.
Does anyone know how this interacts with GDPR? In particular, if someone has personal information stored in a GitHub repo and asks for it to be permanently deleted, do they need to dig it out of a bunker in the Arctic?
Thanks to Microsoft, the crapware arrives at GitHub. I'm looking forward to a long future of disabling ads* that have been automatically added to my profile page.
* Yes, this is an ad for a GitHub project that I didn't ask to be a part of and give zero farts about. Depicting me as some kind of deliberate contributor is false.