The biggest takeaway is that they claim SOTA for multi-modal input, even ahead of proprietary models, and still released it as open weights. My first tests suggest this might actually be true; I'll keep testing. Wow
Most multi-modal input implementations suck, and a lot of them suck big time.
Doesn't seem to be far ahead of existing proprietary implementations. But it's still good that someone's willing to push that far and release the results. Getting multimodal input to work even this well is not at all easy.
Super interesting that they moved away from their specialized, Lean-based system from last year to a more general-purpose LLM + RL approach. I suspect this improves performance even outside of math competitions. It'll be fascinating to see how much further this frontier can go.
The article also suggests that the system used isn't too far ahead of their upcoming general "DeepThink" model / feature, which they announced for this summer.
The rate limits apply only to the Gemini API. There is also Vertex AI on GCP, which offers the same models (and even more, such as Claude) at the same pricing, but with much higher rate limits (basically none, as long as they don't need to throttle anyone to protect provisioned throughput, iiuc) and with a process to get guaranteed throughput.
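For anyone comparing the two surfaces: the same model is reached through different endpoints and auth schemes. A minimal sketch of the two REST URL shapes (project, region, and model names here are placeholders, and the URL patterns are from memory of the public docs, so double-check before relying on them):

```python
# Sketch: the two REST surfaces that serve the same Gemini models.
# PROJECT / location / model values below are illustrative placeholders.

def gemini_api_url(model: str) -> str:
    # Gemini API (AI Studio): authenticated with an API key,
    # rate-limited per key/tier.
    return (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent"
    )

def vertex_url(project: str, location: str, model: str) -> str:
    # Vertex AI: OAuth / service-account auth, quota managed per
    # GCP project, with provisioned throughput as a paid option.
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )

print(gemini_api_url("gemini-1.5-pro"))
print(vertex_url("my-project", "us-central1", "gemini-1.5-pro"))
```

Same request body either way; only the host path and the credential change.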
Have you checked out https://github.com/prefix-dev/pixi? It's built by the folks who developed Mamba (a faster Conda implementation). It supports PyPI dependencies via UV, offers first-class support for multiple environments and lockfiles, and can manage other system dependencies like CUDA. Its CLI also embraces much of the UX of UV and other modern dependency-management tools.
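Concretely, a single pixi manifest can mix conda and PyPI dependencies and declare a CUDA requirement. This is a hedged sketch from memory of the pixi.toml format (package names and versions are illustrative), so check the pixi docs for the exact keys:

```toml
[project]
name = "demo"
channels = ["conda-forge"]
platforms = ["linux-64"]

# Conda packages, resolved by pixi's solver
[dependencies]
python = "3.11.*"
pytorch = "*"

# PyPI packages, resolved and installed via uv
[pypi-dependencies]
rich = "*"

# Declare the CUDA version the environments expect on the host
[system-requirements]
cuda = "12"
```

Everything resolves into one lockfile, which is what makes the multi-environment story work.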
Mastermind intrigued me in the same way it did the author some time ago, and I've used it as a standard problem when trying out new computational frameworks and methods ever since.
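For anyone who wants to try the same exercise: the core of any Mastermind solver is the feedback function (exact-position "black" pegs, right-colour-wrong-slot "white" pegs). This isn't the author's code, just a minimal Python sketch of that standard scoring rule:

```python
from collections import Counter

def score(secret: str, guess: str) -> tuple[int, int]:
    """Return (black, white) Mastermind pegs for a guess against a secret."""
    # Black pegs: right colour in the right slot.
    black = sum(s == g for s, g in zip(secret, guess))
    # Colour overlap regardless of position; subtracting the exact
    # matches leaves the white pegs.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return black, overlap - black

print(score("1122", "1212"))  # → (2, 2)
```

Once this is in place, Knuth-style solvers reduce to filtering the candidate set by consistency with each observed score, which is why it makes such a tidy benchmark problem.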