
Severe acute respiratory syndrome coronavirus 2 Wuhan-Hu-1, complete genome - arethuza
https://www.ncbi.nlm.nih.gov/nuccore/MN908947.3
======
acqq
Once you're there, it's easy to see that the database also holds many other
sequences, isolated at different locations. E.g.

[https://www.ncbi.nlm.nih.gov/nuccore/MT370930](https://www.ncbi.nlm.nih.gov/nuccore/MT370930)

"collection_date="2020-03-22""

"Submitted (22-APR-2020) Center for Research on Influenza Pathogenesis (CRIP),
New York, NY 10029-6574, USA"

------
th3h4mm3r
One question (if someone could please explain): if coronavirus is an RNA
virus, why, in the genome at the end, do we see "t" (thymine) and not "u"
(uracil)?

~~~
acqq
[https://bioinformatics.stackexchange.com/questions/11353/why...](https://bioinformatics.stackexchange.com/questions/11353/why-does-the-fasta-sequence-for-coronavirus-look-like-dna-not-rna)

"Virtually all sequencing of RNA viruses is done through cDNA"
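
To make that concrete, here is a minimal Python sketch of turning the cDNA-style letters back into RNA notation by swapping "t" for "u". The function name `cdna_to_rna` and the short fragment are made up for illustration and are not taken from the record.

```python
def cdna_to_rna(seq: str) -> str:
    """Return the RNA spelling of a cDNA-style sequence (t -> u, T -> U)."""
    # Only thymine changes; a, c, g and any other symbols pass through untouched.
    return seq.translate(str.maketrans("tT", "uU"))


# Hypothetical fragment, used only to show the substitution.
fragment = "attaaaggtt"
rna_fragment = cdna_to_rna(fragment)
print(rna_fragment)
```

The information is the same either way; the database simply stores the sequence in the DNA alphabet.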

------
rudolph9
Could someone post some helpful references on how the data is structured?

~~~
zimpenfish
Going by the links at the top, it's GenBank[1] or FASTA[2] (although the
FASTA version doesn't really match the example in the FASTA
explanation).

[1] [http://scikit-bio.org/docs/0.5.2/generated/skbio.io.format.g...](http://scikit-bio.org/docs/0.5.2/generated/skbio.io.format.genbank.html)

[2] [https://zhanglab.ccmb.med.umich.edu/FASTA/](https://zhanglab.ccmb.med.umich.edu/FASTA/)
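
As a rough illustration of the FASTA layout (a header line starting with ">" followed by one or more lines of sequence), here is a minimal Python sketch of a parser. The function name `parse_fasta` and the example record are hypothetical, and a real project would more likely use an existing library than this hand-rolled version.

```python
from typing import Dict, Iterable, List, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its joined sequence."""
    chunks: Dict[str, List[str]] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # a new record starts here
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in chunks.items()}


# Made-up record, only to show the structure.
example = """>demo_record hypothetical example
ACGTACGT
TTGA
""".splitlines()

records = parse_fasta(example)
print(records)
```

The GenBank flat file carries the same sequence plus annotations (features, qualifiers such as collection_date, submitter details), so it needs a more involved parser than this.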

