
It's so, so complex! I confess I had a sense of this but had no idea how much. We aren't even told which MSA algorithm is used to align the protein sequences.



Hi, I was one of the authors of this! I think we briefly mentioned this in a footnote somewhere (a lot of things got cut or moved to footnotes, since it is already so long and we wanted to focus on the ML parts that aren't described elsewhere).

But yes, as @Flobosg mentioned, for protein chains they use jackhmmer to search 4 of the databases (the exception is Uniclust30 + BFD, which is searched with HHblits instead), and for RNA chains they use nhmmer to search, then hmmalign to re-align the hits to the query chain.
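
If it helps to see that routing written down, here is a rough sketch in Python. It only encodes the rule from this comment; the function name `msa_tools`, the lowercase database labels, and the placeholder database name in the usage note are made up for illustration and are not taken from any actual pipeline code.

```python
from typing import List

# Databases this comment says are searched with HHblits rather than jackhmmer.
# The lowercase labels are an assumption made for this sketch.
HHBLITS_DATABASES = {"uniclust30", "bfd"}


def msa_tools(chain_type: str, database: str) -> List[str]:
    """Return, in order, the tool names the comment associates with a chain type and database."""
    kind = chain_type.lower()
    if kind == "protein":
        if database.lower() in HHBLITS_DATABASES:
            return ["hhblits"]
        return ["jackhmmer"]
    if kind == "rna":
        # nhmmer finds the hits, then hmmalign re-aligns them to the query chain.
        return ["nhmmer", "hmmalign"]
    # The comment only covers protein and RNA chains.
    raise ValueError(f"no MSA search described for chain type {chain_type!r}")
```

For example, `msa_tools("protein", "BFD")` gives `["hhblits"]`, while `msa_tools("protein", "some_other_db")` gives `["jackhmmer"]`, where `some_other_db` is just a stand-in for one of the other databases.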

Hope that helps!


Input MSAs are generated with jackhmmer and HHblits and then further processed, if I recall AlphaFold’s paper correctly.



