Guy commits his genome to Github, smartass forks and issues a pull request (github.com)
476 points by HectorRamos on Feb 13, 2011 | 72 comments

You know... what would be interesting is if he convinced his parents to submit their data and then he had his data as a merge commit.

git blame

And then we could have the usual talks all parents have with each other:

"That ugly nose is your father's"

"Your mother is the one that gave you X disease."

that's not a smartarse, that's quite genuinely funny (ymmv)

Here are some edits that are not humorous, but would actually improve the DNA:


These edits were proposed by Michael Cariaso, whom I met recently. Mike runs a cool site called http://snpedia.com.

SNPedia is like 23andme, except every night SNPedia reads PubMed abstracts to see if it can determine information about more SNPs. So, I believe it now gives information on 10x as many SNPs as 23andme, because it groks a lot more medical research.

SNPedia has a mode where you do not even need to upload your data to them; instead, you simply download the program and run it locally. But, regardless, you need to get your DNA, and 23andme is the cheapest option for doing so.

This issue is smartarse:

"No instructions on how to build"


Is it funny because it's funny, or is it funny because this will probably happen (in all seriousness) for real one day?

(rhetorical musing question)


It is going to happen.

And also we'll see more people asking for changes on appearance than health.

it's funny because I bet this is how the Eugenics Wars/World War III started in Star Trek.

wow; that's a blast from the past but that war was middle-asian, wasn't it?

Not funny at all. I have copied everything. I will now make his CLONE. muhaahaaaa!! (and malicious grin ;)

Sometimes I do not understand HN. There is the original post of the guy that posted his DNA on github.com (with the link of course) and a decent discussion on


(5 hours ago)

and still this post is the most upvoted

Well the original is the blog entry, this is the actual pull request. Plus, this title is a lot funnier. What do you expect late on a Saturday night?

I don't read HN much, but I thought that in theory that's something you would expect of reddit and not HN? Isn't HN supposed to be above that kind of stuff?

There exists a non-negligible rift between "supposed to" and "actually does".

You could make the argument that HN would be more interested in the code as opposed to the blog talking about it...

Which is odd because starting from the blog you get the author's take on it AND a link to the code. It's not like you just get one or the other. I would think HN would (should?) be more interested in the whole story and not just a link to a codebase whenever possible.

In my experience the stories upvoted on a weekend or holiday are more lighthearted than on a week day.

The original is interesting, the Pull is funny. Funny beats interesting most days of the week. I laughed at least, and I think most people did.

If only we could truly activate noprocrast mode in our genetic code by simply changing 3 base pairs...

Imagine how much money the people who discovered it would end up making!

> Imagine how much money the people who discovered it would end up making!

And they'd start making it right away ;)

I thought they found that the ability to delay gratification is largely genetic? That's still a long way from identifying the particular gene responsible (or even knowing that there is a particular gene, and it's not a combination of a large number of unrelated genes), but it does make it plausible that this will be possible one day.

I don't see why Github couldn't be used to version track actual genomes of engineered small organisms... It would be great, you could curate changes that are 'virtual', changes that have been made, tested, and validated.

"Eyelids now close in proper way. Fixes issue #42." I find humor in this.

Maybe "genome" should be replaced by "a part of his genome".

For more information about the raw format used by 23 and Me: http://www.snpedia.com/index.php/23andMe
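For anyone who hasn't opened one of these files, here is a minimal sketch of reading it in Python. It assumes the commonly described layout: comment lines starting with `#`, then tab-separated `rsid`, `chromosome`, `position`, `genotype` columns. The names `Snp` and `parse_raw` and the sample rows are made up for illustration, so check the layout against a real export before relying on it.

```python
from typing import Dict, Iterable, NamedTuple


class Snp(NamedTuple):
    chromosome: str
    position: int
    genotype: str


def parse_raw(lines: Iterable[str]) -> Dict[str, Snp]:
    """Map rsid -> Snp, skipping '#' comment lines and blank lines."""
    snps = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # Assumed column order: rsid, chromosome, position, genotype.
        rsid, chromosome, position, genotype = line.split("\t")
        snps[rsid] = Snp(chromosome, int(position), genotype)
    return snps


# Made-up sample rows, not real data.
SAMPLE = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs0000001\t1\t12345\tAG\n",
    "rs0000002\tX\t67890\tCC\n",
]

snps = parse_raw(SAMPLE)
print(snps["rs0000001"].genotype)
```

Chromosome is kept as a string because values like X, Y and MT are not numbers.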

shame, I did this a few weeks ago: http://github.com/orta/dna

In your README you state: "I find it fascinating that you can get a text file, roughly 25meg big that contains what I'd imagine to be terabytes worth of data"

I just thought I'd point out that you're really just getting a 25MB patch to a 3GB file (which is really duplicated to 6GB).

Ah, now that makes sense, thanks mbreese

I would say you dodged a bullet. This guy's repo is fast becoming a gag: https://github.com/msporny/dna/network/members

Also, he explicitly released it under the Creative Commons Public Domain License. Incidentally, I wonder if that has any patent implications? Does having genomes in the public domain prevent pharma companies from patenting their "discoveries"?

Heh. The GitHub default text on that page is accidentally funny in itself ;)

msporny created dna and everyone else forked it. This is the family tree.

Aren't you supposed to be unable to patent facts? Seems like your DNA structure should be classified as a fact.

see the BRCA1/2 (myriad) debacle. horrible mess.

The public domain shouldn't matter. Publication could.

Manu said [1] he was the first to do so, but looks like maybe you beat him.

[1] http://manu.sporny.org/2011/public-domain-genome/

So, where/how do you get a dump of your DNA, anyways? I suppose it must cost a pretty penny, too?

He used 23andme. And they only give you data for around a million SNPs (http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism).

The cost of sequencing per base pair decreases by a factor of 10 (!) a year. I will wait until it's cheap to sequence my whole genome before I shell out the money.

(Synthesizing also gets cheaper by around the same factor. But it started from a higher level of cost.)

I saw the XMas sales for 23andme and I thought the money at that time was well worth the knowledge.

If and when they offer the service for a complete sequence, I'll probably pay the money again, just to have a complete record.

Did you find anything interesting? Ailments or ancestry?

I found all of it interesting, but I'm a bio geek. I'm also planning to have kids with my fiancé so it's also cool because of that.

Some highlights:

Both my fiancé and I are carriers for a mild form of Hemochromatosis, so we have a 1/4 chance of having a kid with it.

I am a little bit Asian! That's probably the Jewish ancestry. My fiancé thought he had a black ancestor but it turns out he was 100% European. My best friend, whose father had always told her that her grandmother was Native American, found out he was lying to her (which did not surprise but did disappoint her.) Also the heat map of where your mitochondria are from is fun; it confirmed what I knew already but it was still really cool to see that the first migration of my maternal line out of Greece was with my grandmother.

But that's really just the tip of the iceberg. Part of what makes 23andme so fun is you get to learn a lot about biology from the framework of your own genes. Each SNP they report on has a bunch of information associated with it. It's only boring if you aren't interested in biology.
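The 1/4 figure for two carriers mentioned above can be checked by enumerating the possibilities. This is a rough sketch that assumes a simple single-gene recessive model (which is what "carrier" usually implies) and ignores penetrance; the allele labels `A` and `a` are illustrative.

```python
from fractions import Fraction
from itertools import product

# "A" = working allele, "a" = recessive variant (labels are illustrative).
parent_1 = ("A", "a")  # carrier
parent_2 = ("A", "a")  # carrier

# Each parent passes on one of their two alleles with equal probability,
# so the four combinations below are equally likely.
children = list(product(parent_1, parent_2))

affected = sum(1 for child in children if child == ("a", "a"))
carriers = sum(1 for child in children if sorted(child) == ["A", "a"])

p_affected = Fraction(affected, len(children))
p_carrier = Fraction(carriers, len(children))
print(p_affected, p_carrier)  # prints "1/4 1/2"
```

So under that model, a quarter of the outcomes inherit both copies and half are carriers like the parents.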

That sounds cool. Can they tell you from which European ethnic groups you come from too? How specific are they about this? Do they say French, German, etc...? Or do they say South of France Celtic, Germanic tribe from the North and so on?

Right now the only categories where they give you your percentage and paint your genome accordingly are European, Asian, and African.

The cool thing about 23andme though is that they're always rolling out new stuff. There is a separate, more detailed ancestry break-down in their "labs" (a part of the website for experimental stuff) but it's not very good. Part of the problem is that there isn't enough research pinning down large swaths of the genome to more specific areas. So what 23andme does is use user-reported data, which is problematic due to globalization and limited by people's knowledge.

The only other things that are good and can really pin down geography are your mitochondrial DNA, like I mentioned, and the Y chromosome.

Aaaah, not very useful for me then. I live in Europe and doubt that I have any Asian or African ancestry. Maybe some Middle-Eastern or Sephardic Jew but I guess that falls in the same race as Europeans so they won't be able to tell me. $260 was going to be a big investment for me, I will wait until they can tell the difference between early Europe's ethnic groups: Celts, Germanic tribes, Iberians, Basques, "Roman", etc... They need to study people living in Europe in kind of isolated communities, remote mountains or whatever, then they will be able to tell the difference.

And even then it may be hard. There was mixing even back then.

I didn't. I found the 23andme website pretty boring; in fact, it basically turned into a $200 git clone joke I made a few weeks ago.

Any predictions of when sequencing the whole genome might cost around $200?

Just take the current price, and extrapolate. Not more than a few years for researchers. I don't know how retail will develop, since there may be regulations and other overheads.
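As a back-of-the-envelope version of "take the current price and extrapolate": if the price really divides by a fixed factor every year (the 10x-a-year figure is a claim from this thread, not a measured constant), the time to reach a target price is a logarithm. The function name `years_until` and the starting price below are hypothetical, chosen only to show the arithmetic.

```python
import math


def years_until(current_price: float, target_price: float,
                yearly_factor: float = 10.0) -> float:
    """Years until the price reaches target_price, assuming it is
    divided by yearly_factor every year."""
    if current_price <= target_price:
        return 0.0
    return math.log(current_price / target_price) / math.log(yearly_factor)


# Hypothetical starting price, for illustration only.
estimate = years_until(current_price=20_000, target_price=200)
print(round(estimate, 2))
```

Every factor of 10 between the current and target price adds one year under this assumption; retail overheads and regulation are not modelled.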

For my friends in systems biology it's now almost cheaper to just send their microbes in for sequencing than doing their own gel electrophoresis.

(If you want to read around in Wikipedia, also have a look at Southern blots and Western blots, and DNA microarrays.)

This guy made real changes to the genome: https://github.com/cariaso/dna

ie: removed increased risk of coronary artery disease at rs1333049

Pretty awesome

ignorance is bliss, i had to look this up since i use mercurial.


makes sense now, pretty funny comment about the nipple.

GitHub and Mercurial aren't mutually exclusive; hg-git works pretty well.

This just blew my mind - mostly because it could totally happen some day.

I find those commits to be more fun (because they seem to be real thing): https://github.com/cariaso/dna/commits/

There's a huge difference between releasing his fully sequenced DNA and the data from a genotyping chip... I went to the GitHub site expecting to see several large FASTA files for each chromosome.

I'm sure he's not worried about Facebook privacy.

<insert joke about natural selection>

with proper TDD (tests representing the environment) it should be possible to generate random commits and only commit them when they pass the test suite.
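A toy sketch of that idea, for the curious: propose random single-base changes and keep one only when the "test suite" does not regress. Everything here is an assumption made for illustration: the `score` function stands in for the environment, and `GATTACAG` is an arbitrary target, not anything biological.

```python
import random

BASES = "ACGT"


def score(genome: str, environment: str) -> int:
    """Toy test suite: count positions that match what the environment 'wants'."""
    return sum(g == e for g, e in zip(genome, environment))


def random_commit(genome: str, rng: random.Random) -> str:
    """Propose a change: replace one randomly chosen base."""
    i = rng.randrange(len(genome))
    return genome[:i] + rng.choice(BASES) + genome[i + 1:]


def evolve(genome: str, environment: str, attempts: int, seed: int = 0) -> str:
    """Apply random commits, keeping only those that do not lower the score."""
    rng = random.Random(seed)
    for _ in range(attempts):
        candidate = random_commit(genome, rng)
        # Only "commit" when the candidate does not regress the test score.
        if score(candidate, environment) >= score(genome, environment):
            genome = candidate
    return genome


start = "AAAAAAAA"
environment = "GATTACAG"  # hypothetical target standing in for the environment
result = evolve(start, environment, attempts=2000)
print(result, score(result, environment))
```

This is plain hill climbing, so it only works because the toy score rewards every small step; a pass/fail suite would give random commits nothing to climb.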

What do you do when there are serious ethical considerations with running your test suite?

Run it in a Virtual Universe with the ethernet cable unplugged.

I would say it should make us realize we are no more than mere simulations in some huge computer about to have the power cable pulled by god's boss before he tells him to get back to work. (scifi book idea )

I'm pretty sure that was the plot of a fairly well-known scifi book cough hhgttg cough. Except it wasn't God's boss that pulls the cable - Earth was demolished to make way for a hyperspace bypass.

That's kind of a stretch of an interpretation, don't you think? I always thought that the Earth was still physical in the way that we usually think of it, and humans were just part of its physical operation rather than being simulations.

Yeah, us being the substrate rather than the software is a bit of an unusual take on it, even now.

This raises the possibility of us finding out that happiness/nirvana is NP-complete, leading to the mass suicide of the entire human race. That would suck.

Procedurally generated programming?

Given how that works out for the world we live in, there really would be more bugs than anything else....


Because a passing test suite is proof of correctness, right?

Haha, that breeds all kinds of crazy ideas. If one could simulate the world's biosphere and its organisms at the genetic or even cellular level, we might one day be able to predict the effects of evolution on humans and other animals.

MAYBE even with bacteria and viruses to predict and then cure diseases before they even present themselves!

I doubt our understanding of genetics/biology is extensive enough for this yet, not to mention the computing power needed, but it seems quite possible.

There is this professor here at Oxford, Nick Bostrom, who argues that we are currently living in a simulation:

1) Moore's law holds across worlds.

2) People/beings will, for various reasons (historic, sociological, experimenting, play) want to simulate another world (at a social/molecular/submolecular, etc. level).

3) If they simulate one, they'll likely do more.

4) Ergo, there will be more simulated worlds than real ones.

5) Therefore we are more likely to live in a simulated world/universe than in a first order one.


Think about this when you unplug something in the future...

Close; he doesn't argue that we are living in a simulation, he argues that at least one of the following propositions is true:

- The human species is likely to go extinct before reaching a stage where it can simulate its ancestors.

- Future humanity will have no interest in simulating its ancestors.

- We are almost certainly living in an ancestor simulation now.

I simplified it a bit, and explained the last, optimistic option... :)


<insert derogatory anti-religious comment>

Funny, now!

In the future, common practice.

Literally laughing out loud at this.
