Start with downloading SRA toolkit: https://github.com/ncbi/sra-tools/wiki/02.-I...

mbreese · on Oct 11, 2022

If anyone is thinking of doing this, it is both this easy and way harder.

Yes, this is an approximate workflow. It doesn’t take that much specialized knowledge to get it running.

However, step 2/3 (find/download a dataset of interest) is harder. Finding whole genome sequencing data for a cancer that you can download without being part of a research institution is difficult. There are a lot of controls over who can access raw DNA sequences from patients. RNA data are much more readily available as they are less identifiable.

Specifically, here is the type of access you need:

https://gdc.cancer.gov/access-data/obtaining-access-controll...

The reasons for this are good and I’m not trying to say otherwise. My point is just that, from a practical perspective, the analysis itself is technically doable for many non-biomedical people here. Accessing the raw data is much more difficult.

lifeisstillgood · on Oct 11, 2022

I am reaching for something - that software is now enabling a minimal access level to amazing corners of science and technology - we can pluck satellite photos from the ether, have whole DNA sequences on our desktop.

I feel there is a "basic bootcamp for the 21C" that I missed.