
How to download the complete genome for an organism - danso
http://www.ncbi.nlm.nih.gov/guide/howto/dwn-genome/
======
dekhn
Amusingly, the reason that NIH uses FTP, rather than HTTP, for bulk genome
file transport is that in the old, old days, one PI (professor with a
grant) was running an old version of IE which couldn't download files larger
than 2GB. They had a lot of trouble with file corruption, and because the PI
was "important", NIH hasn't really moved to an HTTP content distribution
method. Mirroring the NIH FTP servers was always a huge pain.

~~~
fapjacks
Hey that's great folklore! Do you have a link with more of this story?

------
elijahz
Neat!
[http://www.ncbi.nlm.nih.gov/nuccore/KJ137266.1](http://www.ncbi.nlm.nih.gov/nuccore/KJ137266.1)

------
nickthemagicman
Any guidance on tutorials for using this, i.e. software to analyze this
data?

~~~
jldugger
It's called grad school ;)

There are tools like BLAST that can use this data to answer questions like "is
there a copy of influenza virus in pig DNA". It's basically a fast fuzzy search
engine.

If you grab a lot of these, you can concatenate pairs of genomes, estimate
ancestry distance based on compression ratio, and then produce a phylogenetic
tree.
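
A rough sketch of that compression trick, assuming zlib as the compressor and
the usual normalized compression distance formula (the function names here are
made up for illustration). zlib's window is only 32 KB, so for real genomes
you'd likely want to swap in something like lzma or bz2:

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two sequences.

    Near 0 when the sequences share most of their content,
    near 1 when concatenating them gives the compressor nothing to reuse.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenated pair
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(genomes: dict) -> dict:
    """Pairwise NCD for a {name: sequence_bytes} mapping (hypothetical helper)."""
    names = sorted(genomes)
    return {
        (a, b): ncd(genomes[a], genomes[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

The resulting distances could then be handed to a tree-building method such as
neighbor joining.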

Project Rosalind
([http://rosalind.info/problems/list-view/](http://rosalind.info/problems/list-view/))
has a series of simple programming problems related to bioinformatics.

------
Ultimatt
I don't really get why this has been posted? Yup, genetic data is a thing that
exists... That isn't all species that have been sequenced, just the NCBI
collection.

------
userbinator
In case anyone is wondering, there is a human here:
ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/
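
If you'd rather poke at that directory from a script, here is a minimal sketch
using Python's standard ftplib with an anonymous login. The helper names are
made up for illustration, and it assumes the server still accepts anonymous FTP
and that the path above still exists, which may not hold:

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url: str):
    """Split an ftp:// URL into (host, path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL, got: " + url)
    return parts.hostname, parts.path or "/"


def list_ftp_directory(url: str, timeout: float = 30):
    """Return the names in the directory the URL points at (needs network)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments means anonymous login
        ftp.cwd(path)  # move into the target directory
        return ftp.nlst()


def download_file(url: str, name: str, dest: str, timeout: float = 30):
    """Fetch one file from the directory in binary mode (needs network)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp, open(dest, "wb") as out:
        ftp.login()
        ftp.cwd(path)
        ftp.retrbinary("RETR " + name, out.write)
```

Calling `list_ftp_directory("ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/")` would
list the entries, assuming the path is still there.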

------
lyschoening
The NCBI has a database of annotated genome sequences. For simple organisms,
these might be only a couple of MB in size. You can use their search function
to look up a genome and then download it from the overview page. No need to
use their FTP server.

