This take seems somewhat over-sceptical and slightly over-reaching to me.

> when your entire computational technique is built on finding analogies to known structures, what can you do when there’s no structure to compare to

Lots of people seem focused on the idea that deep networks can't do anything novel and are just fancy search engines that find a similar example and copy it. This is not true. They do learn from much deeper low-level structures in the domain they are exposed to. They can be aware of implicit correlations and constraints that are totally outside what may be recognised in the scientific understanding. Hence AlphaFold is quite capable of predicting a structure for which there is no previous direct "analogy". As long as the protein has to follow the laws of physics, AlphaFold has at least a basis to work from in successfully predicting the structure.
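As a toy illustration of the difference between copying a similar example and learning underlying structure, compare a nearest-neighbour lookup with a fitted model on the same data. Everything here (the sine "law", the data, the polynomial model) is invented purely for the contrast; it's a sketch, not a claim about how AlphaFold works internally:

    import numpy as np

    rng = np.random.default_rng(0)
    x_train = rng.uniform(-3, 3, size=200)
    y_train = np.sin(x_train)  # the hidden "law" of this toy domain

    def nearest_neighbour(xq):
        # "Fancy search engine": copy the output of the most similar known example.
        return y_train[np.argmin(np.abs(x_train - xq))]

    # A model that fits the domain's structure (a stand-in for a deep network).
    coeffs = np.polyfit(x_train, y_train, 7)

    xq = 1.2345  # a query that appears nowhere in the training set
    print(nearest_neighbour(xq))   # whatever the closest memorised point gave
    print(np.polyval(coeffs, xq))  # tracks the underlying rule
    print(np.sin(xq))              # ground truth

The fitted model follows the rule rather than any one memorised point, which is the sense in which a network can have a basis to work from on inputs it has never seen.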

> It is very, very rare for knowledge of a protein’s structure to be any sort of rate-limiting step in a drug discovery project!

This and the following text are very reductive. It's like saying, back in 1945, that nuclear weapons would not be any sort of advantage in WW2 because it was very rare for weapons of mass destruction to win a war. Well, yes, it was rare, because they didn't exist. Likewise, we had no meaningfully accurate way to predict protein structures until AlphaFold. We've barely begun to exploit the new opportunities it opens up, and people have barely scratched the surface in adapting AlphaFold to the related challenges downstream of straight-up structure prediction. Predicting the formation of complexes and interactions is the obvious next step, and it's exactly what people are doing.

That's not to say it will revolutionise drug development, but the author's argument here is that he is confident it will not, and he doesn't really present much evidence for that.




If you’re gonna get mad and quote a sentence to rail on the author, at least quote the full sentence: the author ends it with “and there never will be.” That’s because, among other things, he’s talking about intrinsically disordered proteins [1]. What can the best prediction model do to predict the truly unpredictable? Just tell us that it’s unpredictable.

And what is your second criticism, exactly? The author comes from the drug discovery industry, and what he said can be generalized: even if we knew the perfect, experimentally confirmed 1 Å resolution structure of every protein out there tomorrow, that wouldn’t exactly revolutionize drug discovery. That’s because protein structure gives maybe 10% of the context you need to successfully design a drug. Dynamics, the specifics of higher-order interactions, the complex interplay of signaling pathways in particular cells in particular contexts, and what entire cells and organ systems in THAT PARTICULAR ORGANISM do when the protein is perturbed are what truly affect drug discovery.

If you absolutely want to revolutionize DD, find us a better model to test things on, one that’s closer to the human body as a whole. Currently mice and rats are used, and they’re not cutting it anymore.

This fundamentally goes back to the downfall of the prototypical math or software guy who comes in saying “I’m gonna cure cancer with MATH!” No you’re not. You’re gonna help, and it’s appreciated, but if you’re gonna truly cure cancer you’d better start stomping on a few thousand mice and maybe also get an MD.

1. https://www.nature.com/articles/nrm3920


Curious as to what some medically important examples of disordered proteins might be?


Transcription factors, to name one example, are often partially disordered. A bunch of others are listed here:

https://www.nature.com/articles/nrm3920


As someone who works with Transformers and DL in general, I decided to chime in:

>They can be aware of implicit correlations and constraints that are totally outside what may be recognised in the scientific understanding. Hence AlphaFold is quite capable of predicting a structure for which there is no previous direct "analogy".

Yes. While an expert in their field, I suspect the author doesn't have a full understanding of what neural networks, or ML in general, are capable of. In this case it's not necessary for AlphaFold to have seen fully similar molecules; it is sufficient for it to have encountered familiar basic building blocks of the vocabulary. One can see how silly this argument would be if (hyperbolizing) someone said that, just because we are located in a tiny part of the observable universe, the physics we know to work here wouldn't work in Alpha Centauri.

To make the point simpler: this is no different from suggesting that a model will break if it encounters an out-of-vocabulary word. In reality, a simple tokenization technique ensures that in the majority of these edge cases the unknown word is broken down into subwords, which in turn are still familiar to the model, and the statistics the model has learned still give it a reasonable guess as to what the sum of the parts entails.

There was recent work (https://arxiv.org/abs/2207.05221, on LLMs, not molecular prediction) where models were shown to "know" fairly well what they don't know (oh G-d how tiresome it is to make these disclaimers, but models are NOT sentient). Thus, I wouldn't be surprised if AlphaFold could at least give a confidence score for its prediction of a folded protein, helping the scientists who use it as a TOOL (which it is, not a solution) to exercise caution.
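To make the subword point above concrete, here's a minimal Python sketch of greedy longest-match subword splitting. The toy vocabulary and the subword_tokenize helper are invented for illustration; real schemes like BPE or WordPiece learn their vocabularies from data, but the rescue mechanism for unknown words is the same in spirit:

    # Toy greedy longest-match subword tokenizer (illustrative, not real BPE/WordPiece).
    def subword_tokenize(word, vocab):
        pieces, i = [], 0
        while i < len(word):
            for j in range(len(word), i, -1):  # try the longest candidate first
                if word[i:j] in vocab:
                    pieces.append(word[i:j])
                    i = j
                    break
            else:
                pieces.append("<unk>")  # no known subword starts at this position
                i += 1
        return pieces

    vocab = {"un", "fold", "able", "pro", "tein", "s"}
    print(subword_tokenize("unfoldable", vocab))  # ['un', 'fold', 'able']
    print(subword_tokenize("proteins", vocab))    # ['pro', 'tein', 's']

Even though "unfoldable" never appears in the vocabulary, every piece of it does, so the model still has learned statistics to lean on.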

> As long as the protein has to follow the laws of physics, AlphaFold has at least a basis to work from in successfully predicting the structure.

This is pure speculation. There is no guarantee AT ALL that AlphaFold follows the laws of physics, beyond the way predictions are clipped to reasonable distances between atoms and the other "hacks" the authors have added to the model as "inductive" (actually symbolic) biases.
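To illustrate the kind of "hack" I mean, here's a hedged Python sketch of post-hoc distance clipping. The bounds and the clip_distances helper are my own illustrative assumptions, not AlphaFold's actual geometry pipeline:

    import numpy as np

    MIN_DIST = 1.0   # assumed lower bound in angstroms, roughly a covalent bond
    MAX_DIST = 22.0  # assumed upper bound, e.g. the last bin of a distogram

    def clip_distances(pred):
        # Force predicted pairwise distances into a physically plausible range.
        clipped = np.clip(pred, MIN_DIST, MAX_DIST)
        np.fill_diagonal(clipped, 0.0)  # an atom is at zero distance from itself
        return clipped

    raw = np.array([[ 0.0,  0.2, 30.0],
                    [ 0.2,  0.0,  5.0],
                    [30.0,  5.0,  0.0]])
    print(clip_distances(raw))

A constraint like this keeps outputs from being absurd, but it's a symbolic patch bolted on after the fact, not evidence that the network has internalized physics.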


> This is pure speculation. There is no guarantee AT ALL that AlphaFold follows the laws of physics

I agree with you, but just to point out, that isn't what I wrote. It's not speculation that it has "at least a basis to work from"; whether it actually does or not is speculative. The assertion in the article is that it has no basis to do it.


You make the claim that [They do learn from much deeper low-level structures in the domain they are exposed to], and yet what we see is not learning. All such systems are constrained by the programming that goes into them. As such, these systems may be able to generate data that is then stored, but this is not learning.

In addition, you make the claim that [They can be aware of implicit correlations and constraints that are totally outside what may be recognised in the scientific understanding], and again, this is not awareness in any way. It is still a result of the constraints of the programming that goes into them.

Being able to predict something is not learning, nor is it awareness. Mathematical equations predict things that can occur, and these equations are neither aware nor learning. The programming that underlies these systems is mathematics and is just as constrained.

I have said this elsewhere: all such systems are Artificial Stupidity systems. It takes a human mind (which we still do not understand) to look at the results. To claim that these systems are more than they are is to miss the real usefulness that can be obtained from them. Irrespective of any claims otherwise, all such systems are simple and are in reality GIGO (garbage in, garbage out) systems. To forget this is to walk a dangerous path.

I have, over decades, had to deal with systems that appeared to give the "right" or "reasonable" outputs and yet, when analysed, were found to be in great need of redevelopment or, in some cases, to be thrown away as pure garbage.

Computer systems are useful, but don't depend on them without human checking and control.



