"Computing resources for genome data soon exceeds those of Twitter and YouTube" (nature.com)
7 points by samuell on July 29, 2015 | 4 comments



Arvados.org to the rescue: https://twitter.com/peteramstutz/status/626395473704845315 :)

(Yes, it is one of the most promising solutions to the problem)


The Arvados project (https://arvados.org/) is an open-source, scale-out storage and compute platform designed to address the needs of very large datasets, such as genomics data.


Genomics data is only 'big data' until you have an alignment. After that, the raw data can be archived or even deleted. Most secondary data, such as variants and expression data, are not large at all. The only real problem in this field is that bench biologists tend to rush headfirst into sequencing without involving IT early in the planning process. The tools already exist; it's the communication that is lacking.


Are you sure about that? What happens when you get to a cohort of a million that you want to analyze?



