Researchers find a completely new DNA binding protein (arstechnica.com)
65 points by darxius 1605 days ago | 10 comments

Some perspective on what this means: it's somewhat rare (but not insanely rare) to find new DNA-binding proteins that aren't very similar to something that already exists in another organism.

What this paper is saying is that the researchers found a new gene with little similarity to known DNA-binding proteins, and showed both that it's a DNA-binding protein and that it binds specific DNA sequences. An implication is that we might not know as much as we think we do about the kinds of proteins that bind to DNA, since we're still finding new stuff. However, for all we know, this protein could work just like one of the known DNA-binding proteins, just using a different protein sequence. That would be weird, but not exceptionally weird.

It's possible that this is a pretty rare find, and that we really do know most of the basic protein structures that bind DNA sequences. But at the same time...there are tons of weird microbes out there whose genes we know nothing about. The world is a big place.

That said, we're still not any closer to making velociraptors.

There's a very interesting rule of thumb that is used in the protein folding field. If at least 30% of the amino acids in two different proteins are identical, then you can be very certain that they will have identical (or nearly identical) 3D shapes. However, it is entirely possible that two proteins have no sequence in common, yet still fold to the same shape. A friend actually stumbled upon just such a case when I was still doing protein folding work, and it was amazing to look at the sequences of these two different proteins and see nothing in common, then flip to the 3D structures and see that they line up almost perfectly.

Proteins are immensely cool.
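
To make the 30% rule of thumb concrete, here is a minimal sketch of how percent identity could be computed for two sequences that have already been aligned. The function names, the gap character, and the toy sequences are all made up for illustration, and the denominator (positions where neither sequence has a gap) is just one of several conventions in use. The check only encodes the rule of thumb above; it says nothing about pairs that fall below the cutoff.

```python
# Cutoff taken from the rule of thumb in the comment above, not a hard law.
IDENTITY_RULE_OF_THUMB = 30.0


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned positions where the residues match.

    Assumes the two sequences were aligned beforehand and so have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def meets_rule_of_thumb(aligned_a: str, aligned_b: str) -> bool:
    """True if the pair reaches the 30% identity rule of thumb."""
    return percent_identity(aligned_a, aligned_b) >= IDENTITY_RULE_OF_THUMB


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(percent_identity("MKTAYIAKQR", "MKSAYLAKHR"))
    print(meets_rule_of_thumb("MKTAYIAKQR", "MKSAYLAKHR"))
```

A result of False from `meets_rule_of_thumb` does not mean the folds differ, which is exactly the case described above where two unrelated-looking sequences line up in 3D.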

That's why ab initio protein structure prediction is such a sought-after tool in bioinformatics. You can have an incredibly high degree of nucleotide/amino acid substitution between two proteins with the same structure, function and evolutionary history, but current methods of detecting these relationships still largely rely on sequence comparison alone.

I saw Rhiju Das speak last week (really fantastic talk). He's working on some very cool deterministic ab initio folding approaches. http://www.stanford.edu/~rhiju/research.html

The last time I read into this subject, there were neural network models that achieved around 80% accuracy. There's a research group at my uni dedicated to this research area.

I think you're referring to the prediction of "secondary structure" of proteins, which was attempted using neural network models as early as 1989. [1] Predicting secondary structure is child's play compared to predicting the folded tertiary structure of a protein, hence the continued use of insanely computationally expensive ab initio folding efforts like Folding@Home.

[1] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC286422/
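
For context on what an accuracy figure like "80%" could mean for secondary structure, here is a small sketch that assumes accuracy is scored per residue over three states (H for helix, E for strand, C for coil), a measure often called Q3. The function name and the example strings are invented for illustration and do not come from the linked paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percent of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Made-up 16-residue example with two mispredicted positions.
    observed = "CCHHHHHHCCEEEECC"
    predicted = "CCHHHHHCCCEEEHCC"
    print(q3_accuracy(predicted, observed))
```

Note that a score like this only says which residues sit in helices, strands, or coil. It says nothing about how those elements pack together in 3D, which is the harder tertiary structure problem.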

Indeed! And crazier still, you can have two proteins with substantial structural homology, but with entirely different functions and mechanisms.

When I worked in structural bio, one of my colleagues was studying a protein [complex] that had a prominent structure homologous to a helicase, but no DNA-unwinding activity whatsoever. To this day it's unknown what that structure does.

I'm excited to see more progress linking structure to function, even though this is very, very difficult.

I have no idea what "protein folding" does, but you just sold this story so well that I'm about to look it up.

My point: great comment, makes me want to learn.

protein folding is the process by which a sequence of amino acids - a protein - assumes a 3D structure, made up of "local" units of structure, often turns, sheets (beta sheets) and cylinders (alpha helices), further arranged into a larger superstructure. this is what confers functionality (binding, enzymatic reactions, etc) and dynamics (movements integral to the protein's functionality). the chemical principles that drive this are things like hydrophobic amino acids burying themselves away from a typical aqueous (water-based, or at least highly polar) environment, ionic pairings, hydrogen bonding (non-covalent, reversible bonds), etc.

in terms of predicting the protein's structure, the challenge is the sheer number of computations, the dynamics the protein goes through, and the effects of any environmental factors as the protein is synthesized or folded.
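
a rough back-of-the-envelope sketch of the "sheer number of computations" point, in the spirit of Levinthal's well-known argument. the three states per residue and the sampling rate of 1e13 conformations per second are illustrative assumptions, not measured values, and the function names are made up. real proteins do not fold by exhaustive search and prediction methods do not enumerate every conformation; the point is only that brute force is hopeless.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of backbone conformations if each residue independently picks one state.

    states_per_residue=3 is an illustrative assumption.
    """
    return states_per_residue ** n_residues


def years_to_enumerate(n_conformations: int, samples_per_second: float = 1e13) -> float:
    """Years needed to visit every conformation once at an assumed sampling rate."""
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = conformation_count(100)  # a modest 100-residue protein
    print(f"{n:.2e} conformations")
    print(f"{years_to_enumerate(n):.2e} years to enumerate")
```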

i spent much of a decade (mid 90's to early 00s) studying folding with an aim to getting into enzyme engineering. fun stuff, but i have since left biochem.

The protein families we know of are based on a very small number of full genomes. Even under conservative assumptions (i.e. that new protein families are not uniformly distributed but follow some kind of power law, where the first few sequenced organisms contain most of the families), given how few species have been sequenced, I expect that there are many more families.

The title would be better if it said "protein family" or "protein class". New transcription factors are discovered fairly often. Humans have roughly 2K transcription factors. New families, less often.
