bioRxiv doesn't expose LaTeX files; they explicitly only use PDFs to make things easier. That means you're going to need to reflow PDFs (a la https://docushow.com/), and I would guess there are a lot more edge cases there.
Thanks for linking to https://docushow.com
It's also a work in progress, but PDF reflow is a hard problem, so you'd never ship if you tried to solve every case first :)
Your solution using the LaTeX source generates really nice HTML, congrats!