

Researchers find a completely new DNA binding protein - darxius
http://arstechnica.com/science/2013/04/researchers-find-a-completely-new-dna-binding-protein/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+arstechnica%2Findex+%28Ars+Technica+-+All+content%29

======
timr
Some perspective on what this means: it's somewhat rare (but not insanely
rare) to find new DNA-binding proteins that aren't very similar to something
that already exists in another organism.

What this paper is saying is that the researchers found a new gene with little
similarity to known DNA-binding proteins, and showed both that it's a
DNA-binding protein and that it binds specific DNA sequences. An implication is
that we might not know as much as we think we do about the _kinds_ of proteins
that bind to DNA, since we're still finding new stuff. However, for all we
know, this protein could work just like one of the known DNA-binding proteins,
just using a different protein sequence. That would be weird, but not
exceptionally weird.

It's possible that this is a pretty rare find, and that we really do know most
of the basic protein structures that bind DNA sequences. But at the same
time...there are tons of weird microbes out there whose genes we know nothing
about. The world is a big place.

That said, we're still not any closer to making velociraptors.

------
jballanc
There's a very interesting rule of thumb that is used in the protein folding
field. If at least 30% of the amino acids in two different proteins are identical,
then you can be very certain that they will have identical (or nearly identical) 3D
shapes. However, it is entirely possible that two proteins have _no_ sequence
in common, yet still fold to the same shape. A friend actually stumbled upon
just such a case when I was still doing protein folding work, and it was
amazing to look at the sequence of these two different proteins and see
_nothing_ in common, then flip to the 3D structure and see that they line up
almost perfectly.
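
For anyone who wants to play with the 30% rule of thumb, here is a minimal sketch of the arithmetic in Python. It assumes the two sequences have already been aligned by some other tool (equal length, `-` for gaps); the function names and the way gaps are skipped are my own choices for illustration, not a standard from any library. Note that it only works in one direction: falling below the threshold tells you nothing, which is exactly the case described above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, gap-free columns where both residues are identical.

    Assumes `a` and `b` are pre-aligned: same length, '-' marks a gap.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only the columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def passes_rule_of_thumb(a: str, b: str, threshold: float = 30.0) -> bool:
    """True if identity meets the (hypothetical, adjustable) 30% threshold.

    A False result does NOT mean the folds differ; it only means the
    sequence-identity heuristic cannot vouch for them.
    """
    return percent_identity(a, b) >= threshold
```

Calling `percent_identity("ACDE", "ACDF")` compares four columns and finds three matches; a real analysis would also want to account for alignment length, since short alignments can hit 30% by chance.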

Proteins are immensely cool.

~~~
stopcodon
That's why ab initio protein structure prediction is such a sought-after tool
in bioinformatics. You can have an incredibly high degree of nucleotide/amino
acid substitution between two proteins with the same structure, function and
evolutionary history, but current methods of detecting these relationships
still largely rely on sequence comparison alone.

~~~
chriscoyfish
The last time I read into this subject, there were neural network models that
achieved around 80% accuracy. There's a research group at my uni dedicated to
this research area.

~~~
cing
I think you're referring to the prediction of "secondary structure" of
proteins, which was attempted using neural network models as early as 1989.
[1] Predicting secondary structure is child's play compared to predicting the
folded tertiary structure of a protein, hence the continued use of insanely
computationally expensive ab initio folding efforts like Folding@Home.

[1] <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC286422/>

------
dkural
The protein families we know of are based on a very small number of full
genomes. Even under conservative assumptions (i.e. that new protein families
are not uniformly distributed but obey some kind of power law, where the first
few sequenced organisms contain most of the families), given how few species
have been sequenced, I expect that there are many more families.

The title would be better if it said "protein family" or "protein class". New
transcription factors are discovered fairly often. Humans have around 2K
transcription factors. New families, less often.

