Are there any papers that use it for things other than demonstrating JAX? I can't think of one off the top of my head.
Perhaps I should have specified "papers other than those introducing new frameworks or focused on speed benchmarking".
There are a bunch of interesting papers using custom libraries for distributed training, and others aimed at showing off the performance of specific hardware (NVIDIA has a bunch of interesting work in this space, and Intel and some smaller vendors have published work too).