

Ask HN: Best resources to learn about protein folding and algorithms for it? - schtog

I have been interested in protein folding for a while and now have the time to dive into it.

I have waded through some Wikipedia articles, watched some YouTube videos, and done some googling, but I haven't found any good resource on algorithms for protein folding or bioinformatics in general.

What can computer scientists do to help?

  * Invent better algorithms so biologists and chemists can run more experiments?
  * Is there some constraint, some condition, some fold we can search for to find a new medicine or something?

A basic introduction to computational biology and relevant algorithms would be much appreciated.

I found this (genetic programming for protein folding):
http://www.techfak.uni-bielefeld.de/bcd/Curric/ProtEn/121.html

Ah, and of course MIT:
http://ocw.mit.edu/OcwWeb/Biology/7-91JSpring2004/CourseHome/
======
acesamped
I'm actually a bioinformatics major, bioinformatician by profession. We do
work with computer scientists a lot, but one thing we find very frustrating is
that 99% of them don't know biology, biochemistry, chemistry, or organic
chemistry to the necessary degree. They also don't know how to read lab tests
we run or how to interpret them.

Personally, I think that computer programming is something that anybody can
do. The advantage computer scientists have is a deep understanding of the
inner workings of computers and computer language structure-- which is how
computer scientists are able to optimize so well.

Since I'm currently working in the field of bioinformatics, I'll tell you
this...KNOW YOUR SCIENCE!! You don't know how many times a computer scientist
will optimize the hell out of an algorithm and make it look great and run like
butter, but only to have it be junked in the end because it doesn't make any
scientific sense.

As for the computer languages we use, we use Perl a lot... too much even. Perl
has become our staple language because it takes less than a minute to write a
good script if you know what you're doing. Python is popular too, but I'd say
the majority of people use Perl.

Another important language most bioinformaticians use is C/C++. Why? Wouldn't
you want to use a faster language (C) to crunch 100 gigabits of genetic data
instead of a slower one (Perl)?

And note, bioinformatics and computational biology are two different fields.
This is a very common misconception. Do a little bit of research and you will
discover this.

Protein folding is one of the more prominent areas of biology being researched
right now. Good luck with the learning and feel free to contact me.

------
schtog
After doing some more research I have found that Perl and Python are very
popular languages in bioinformatics. I love Python and don't know Perl, so the
choice is easy: Python it is.
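To give a feel for the kind of quick script people in this thread are talking about, here is a minimal sketch in plain Python that reads FASTA-formatted text and reports the GC content of each record. The function names `read_fasta` and `gc_content` and the two example records are made up for illustration; a real project would more likely lean on an existing library.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of G and C characters in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical example data, not taken from any real database.
example = """>seq1 hypothetical example
ATGC
GGCC
>seq2
ATAT
""".splitlines()

records = list(read_fasta(example))
for name, seq in records:
    print(name, len(seq), gc_content(seq))
```

The parser takes any iterable of lines, so the same function works on an open file handle as well as on the in-memory example.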

The big well-known library is Biopython: <http://biopython.org/wiki/Main_Page>

Course, Bioinformatics and Python:
<http://www.pasteur.fr/recherche/unites/sis/formation/python/>

The foldit game: <http://fold.it/portal/adobe_main>

------
biohacker42
I suggest you start out with some basic chemistry. There you can find the data
on how the amino acid chains line up and fold and twist. There was that game
somebody released where you "fold" with the mouse. I think it was an
experiment to find out if humans can do the folding faster than computers.
That should give you an idea of the problem.

The core problem in protein folding is that all the parts heavily interact
with each other. And that means that it is not easily split across nodes.
Forget about networked nodes, the latency is way too big.

Some academic groups have proposed a funnel shape to the folding
probabilities. That is to say that initially things could go in any direction
and the possibility space is HUGE. But as the protein grows, the number of
possible moves collapses quickly and you are left with very few moves at the
end.
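The size of that possibility space can be made concrete with a toy model. The sketch below assumes a heavy simplification that is not part of the comment above: the chain lives on a 2D square lattice, each residue sits on a grid point, and no two residues may share a point. Counting the distinct shapes (self-avoiding walks) for a given chain length shows how fast the number of conformations grows. The function name `count_conformations` is made up for illustration, and rotations and mirror images are counted separately.

```python
def count_conformations(n_bonds):
    """Count self-avoiding walks with n_bonds steps on a 2D square lattice."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(pos, visited, remaining):
        # A chain with no bonds left to place is one complete conformation.
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:  # residues may not overlap
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack and try the next direction
        return total

    return extend((0, 0), {(0, 0)}, n_bonds)


for n in range(1, 11):
    print(n, count_conformations(n))
```

Even in this flat toy world the count roughly triples with every added bond, which is why brute-force enumeration stops being an option long before a chain reaches the length of a real protein.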

I think the Futures in Biotech podcast had at least one show on protein
folding, fun to listen to.

Hope this helps, enjoy yourself.

~~~
schtog
Thanks, which bio podcast? Googling results in a lot of them.

~~~
biohacker42
<http://twit.tv/FIB>

------
etal
For an overview, read the DoE's primer:

[http://www.ornl.gov/sci/techresources/Human_Genome/publicat/...](http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer/toc.html)

The U.S. Department of Energy does a surprising amount of research on genetics
and bioinformatics. The reason: while the Manhattan project was running, DoE
scientists were aware that radioactive weapons would cause amazing and lasting
damage, but really didn't know much about how radioactivity would affect
living things specifically. So a parallel project was set up to study the
effects of radiation on cells -- e.g. selectively damaging DNA and proteins
and watching what happens to the organism. Research continued after the
Manhattan project ended, and eventually led to the Human Genome Project.

Another resource you should definitely be familiar with is NCBI:

<http://www.ncbi.nlm.nih.gov/>

Yes, algorithms are an important area of research. Caveat: it's entirely
driven by biology. For example, aligning two partially matching protein
sequences requires a clever algorithm. Sounds like diff, right? The catch is,
related sequences don't match particularly well until you take into account
which transformations are more likely to occur in nature, which takes
significant biochemistry to determine and use properly. So really, your best
bet is to associate yourself with a university of some sort, since that's
where most of the molecular biologists tend to hang out. Learn biology first,
and you'll pick up algorithms in the process.
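To make the diff comparison concrete, here is a minimal sketch of global alignment scoring (Needleman-Wunsch dynamic programming) run with two different scoring functions. The first treats residues like diff treats characters: equal or not. The second gives partial credit when two amino acids fall in the same rough chemical group. The groups, the score values, the gap penalty, and the two short sequences are all made-up assumptions for illustration; real tools use empirically derived substitution matrices instead.

```python
def global_alignment_score(a, b, score, gap=-2):
    """Best global alignment score of a and b (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align the pair
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    return dp[-1][-1]


def identity_score(x, y):
    """Diff-like scoring: residues either match exactly or they don't."""
    return 1 if x == y else -1


# Toy grouping by rough chemical character; illustrative values only.
GROUPS = [set("ILVM"), set("DE"), set("KR")]


def toy_biochemical_score(x, y):
    """Identical residues score best; same-group swaps still score positive."""
    if x == y:
        return 2
    if any(x in group and y in group for group in GROUPS):
        return 1
    return -1


seq_a, seq_b = "LKDV", "IREL"  # hypothetical peptides
print(global_alignment_score(seq_a, seq_b, identity_score))
print(global_alignment_score(seq_a, seq_b, toy_biochemical_score))
```

Under the identity scoring the two sequences look unrelated, while the group-aware scoring rates them as similar. The algorithm is the same in both runs; only the biology encoded in the scoring function changes.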

------
Anon84
Although not exactly my field... This seems to be a good review of some of the
algorithms currently used.

<http://arxiv.org/abs/0707.3382>

By following the references therein you can probably track down the canonical
papers for the area. The OCW course should also give you a broad overview of
the subject, but before you can make any significant contributions in terms of
algorithms and results, you need to thoroughly understand the biology behind
it.

------
weebob
Well, if you want to get a clue about how molecular biologists think you could
do a lot worse than reading "The Eighth Day of Creation." It's a general
history of molecular biology. If you find any of it confusing then a good
introductory textbook may help; Molecular Biology of the Cell or Stryer would
get you started (and will be available in any decent college library).

As for the protein folding or protein function question, google around
computational chemistry, but be warned -- this is tough stuff! But if you are
shit hot, please come as we need the help...

------
tjr
[http://www.amazon.com/Molecular-Biology-Made-Simple-Third/dp...](http://www.amazon.com/Molecular-Biology-Made-Simple-Third/dp/1889899070/)

...seems to be a good introduction to molecular biology in general, depending
on how much background you have.

