> it can't predict folds that haven't been seen This seems strange to me. The en...

COGlory · on July 28, 2022

AlphaFold figures out that my input sequence (which has no structural data) is similar to this other protein that has structural data. Or maybe different parts of different proteins. It does this extremely well.

flobosg · on July 28, 2022

This is a gross misrepresentation of the method.

COGlory · on July 28, 2022

Perhaps you'd care to explain how? AlphaFold does not work on new folds. It ultimately relies on mapping sequence to structure. It does it better than anyone else, and in ways a human probably couldn't, but if you give it a brand new fold with no relation to other folds, it cannot predict it. I routinely see areas of extremely low confidence in many of my AlphaFold models. I work in organisms that have virtually 0 sequence identity. This is a problem I deal with every day. I wish AlphaFold worked in the way you are suggesting, but it just flat out does not.

flobosg · on July 28, 2022

> It ultimately relies on mapping sequence to structure.

So does every structural prediction method.

> if you give it a brand new fold with no relation to other folds, it cannot predict it

That will depend on the number of effective sequences, not the actual fold.

> I work in organisms that have virtually 0 sequence identity.

Then the problem is low sequence coverage, not the protein fold. On a side note, there are sensitive homology search protocols that rely very little on actual sequence identity.
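To make "number of effective sequences" concrete, here is a minimal sketch of one common way to estimate it from a multiple sequence alignment: each sequence is down-weighted by the number of alignment rows that are at least 80% identical to it, and the weights are summed. The function names, the 0.8 threshold, and the toy alignment are assumptions for illustration only. Tools differ in how they define this quantity, and this is not taken from AlphaFold's code.

```python
from typing import List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither row has a gap."""
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either row
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def effective_sequences(msa: List[str], threshold: float = 0.8) -> float:
    """Sum of per-sequence weights 1 / (size of the sequence's identity cluster).

    The threshold of 0.8 is an assumed, commonly used value, not a fixed standard.
    """
    if not msa:
        return 0.0
    if len({len(row) for row in msa}) != 1:
        raise ValueError("all alignment rows must have the same length")

    neff = 0.0
    for i, row in enumerate(msa):
        # Count the sequence itself plus every other row at or above the threshold.
        neighbours = 1
        for j, other in enumerate(msa):
            if i != j and pairwise_identity(row, other) >= threshold:
                neighbours += 1
        neff += 1.0 / neighbours
    return neff


# Hypothetical toy alignment: three near-identical rows and one unrelated row.
toy_msa = [
    "ACDEFGHIKL",
    "ACDEFGHIKL",
    "ACDEFGHIKV",
    "WYWYWYWYWY",
]
print(effective_sequences(toy_msa))
```

In the toy alignment the three near-identical rows collapse into roughly one effective sequence, so four raw sequences carry about as much information as two. A deep alignment made of close copies can therefore still have a low effective count.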

bamboozled · on July 28, 2022

So then, based on your counterarguments to the OP, have they mapped the entire protein universe? Or should it say the “already known protein universe”?

flobosg · on July 28, 2022

Neither the protein sequence nor structure spaces have been fully explored, and the sequence set of UniProt does not represent every single extant protein. My answer is “no”.

johndfsgdgdfg · on July 28, 2022

There's hype and then there's anti-hype hype, which tries to undermine any genuine progress in a hip contrarian fashion. E.g., "look, I'm the only one who can see the truth." There's AI hype and then there's anti-AI Gary Marcus hype, who never produces any novel criticism. It's the same banal broken record every single time, put in a very self-aggrandizing manner.

DM is probably hyping it up and you are most likely hyping up your own criticism. It's a great symbiotic relationship outwardly presented as opposition.

dekhn · on July 28, 2022

No organisms have virtually 0 sequence identity. That's nonsense. Can you give an example? Even some random million-year-isolated archaeon shares the majority of its genes with common bacteria.

biomcgary · on July 28, 2022

Organisms, yes. Individual genes within an organism may have no sequence identity to genes in other organisms (outside of what you would expect at random). See: https://en.wikipedia.org/wiki/Orphan_gene

dekhn · on July 28, 2022

Yes, that's what I thought. I worked with M. genitalium and we were always looking for proteins that had no homology or no existing structure (https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.20...)

COGlory · on July 28, 2022

MKVLMKKESLPIVKPFDEVIIEVLQAPKEVEREVALKDGTIKKIQDYSIIVKPVSGKFESVTEKVTSKTEDGDEVVKPKKYDASELKDKVVMKLTQKAFEVLYDAWQNKEIGEGTKLKIKVTKKQNKTYFDEITVLDEKEEEETEEEAKVKPKPKLKG

dekhn · on July 28, 2022

That's a single protein, not an organism.

andrewflnr · on July 29, 2022

They obviously mean organisms that have notable numbers of proteins with virtually no sequence identity. The difference is only germane to the conversation if you're looking for something to nitpick. The only point of bringing it up was that they encounter non-trivial numbers of really weird proteins.