Interesting write-up on the additional work of almost implementing a spreadsheet in the back-end! I'm curious as to what kinds of functions exactly are these.
In the article, you mention that "We chose to compute asynchronously in case computation takes too long or runs into a fatal error" -- coming from Excel, how do you handle errors? And how long exactly do these formulas potentially take? (As someone without a lifesci background, it's hard to wrap my head around the concrete parts)
Hi there, thanks for the feedback! The 2 examples in the write-up are:
1) Molecular Weight: weighted sum across amino acid sequences, using amino acid weights defined in [1]
2) List Concatenation: aggregate lists of added resistances (like ['Ampicillin', 'Kanamycin']) across itself and all ancestors
These are simple, and can be expressed as formulas in Excel or functions. They take < 0.5 s.
A more complex biochemical property for antibodies that can't be as easily expressed in Excel is isoelectric point ([2], see example BioJava implementation at [3]). It requires a binary search, but the search space is constant so usually these calculations finish < 20 s.
Since our implementation is in Python, we can wrap the function in try/catch and, in the catch block, log the error and set the failed computation status.
In the article, you mention that "We chose to compute asynchronously in case computation takes too long or runs into a fatal error" -- coming from Excel, how do you handle errors? And how long exactly do these formulas potentially take? (As someone without a lifesci background, it's hard to wrap my head around the concrete parts)