Hacker News | neodypsis's comments

Interestingly, o3-mini-high was correct when first thinking about it:

> Okay, we're asked how to get exactly 6 liters of water using an 12-liter and a 6-liter jug. The immediate thought is to just fill the 6-liter jug, but that seems too simple, doesn’t it? So maybe there’s a trick here. Perhaps this is a puzzle where the challenge is to measure 6 liters with some pouring involved. I’ll stick with the simple solution for now—fill the 6-liter jug and stop there.


I have to take all these comparisons with a heap of salt because no one bothers to run the test 20 times on each model to smooth out the probabilistic nature of the LLM landing on the right answer. There must be a name for this fallacy, where you sample once from each model and declare a definitive winner. I see it all the time.
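To put rough numbers on the "run it 20 times" point, here is a minimal sketch using the Wilson score interval for a pass rate. The counts (1 of 1, 14 of 20) are made up for illustration and are not results from any model.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Hypothetical counts, for illustration only.
single = wilson_interval(1, 1)      # one run, one correct answer
repeated = wilson_interval(14, 20)  # 20 runs, 14 correct answers

print("1 run:   %.2f to %.2f" % single)
print("20 runs: %.2f to %.2f" % repeated)
```

With a single run the interval covers most of the range from 0 to 1, so one correct answer says very little about which model is better. Twenty runs narrow it considerably, though two models would still need clearly separated intervals before declaring a winner.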

How does it compare to Jina V3 [0], which also has 8192 context length?

0. https://arxiv.org/abs/2409.10173


They perform different roles, so they're not directly comparable.

Jina V3 is an embedding model, so it's a base model, further fine-tuned specifically for embedding-ish tasks (retrieval, similarity...). This is what we call "downstream" models/applications.

ModernBERT is a base model & architecture. It's not supposed to be used out of the box, but fine-tuned for other use-cases, serving as their backbone. In theory (and, given early signal, most likely in practice too), it'll make for really good downstream embeddings once people build on top of it!


Isn't that what happened to one of these guys? It is reported that one of them caught the fungus due to inspecting his attic (hadn't used the guano yet).


Fungus is bad, but rarely kills healthy adults. These guys weren’t healthy.

Rabies is entirely another matter.


I assume most feces require prior treatment before they can be used as fertilizer.


Cowdung was/is used untreated as manure in India, also as flooring and medicine. IIRC it was/is low in the list of health concerns.


Not really, it's just time before spreading and time for the soil bacteria to do their thing.


Reportedly, one of them had a bat infestation that produced a thick layer of guano in his attic, which he planned to use as fertilizer. He probably should have inspected his attic wearing an N95 respirator.


> I mind that resources about how to use crypto in software applications are often inscrutable, all the way down to library design, for no good reason.

I haven't read it, but I plan to, eventually. There's a book titled "Cryptography Engineering: Design Principles and Practical Applications" that could help you.


Schneier’s book? I can fully recommend that. It comes at the solutions from a practical point of view instead of the theoretical one.


> Schneier’s book?

Yes


Can a ram pump be used as a booster pump?


Someone should finetune an LLM to create a "patentese" assistant writer.


Please no


The reverse would be good though


It is already one of the main applications of LLM systems. Lawyers use them a lot.


> Seems like we still have a long way to go after Adam...

A preprint on arXiv suggests that Adam works better than SGD for training LLMs because of class imbalance [0]. It appears that scaling the gradient step helps with training; see [1] for another approach.

0. https://arxiv.org/pdf/2402.19449

1. https://arxiv.org/pdf/2402.02347
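To make "scaling the gradient step" concrete, here is a minimal sketch of one plain SGD step next to one standard Adam step. It is the textbook Adam update, not the method from either preprint, and the gradient values are hypothetical stand-ins for a frequent-class and a rare-class parameter.

```python
import math

def sgd_step(grads, lr=0.1):
    """Plain SGD: the update is proportional to the raw gradient."""
    return [lr * g for g in grads]

def adam_step(grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update; returns (step, new_m, new_v)."""
    new_m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    new_v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    # Bias correction for the zero-initialised moment estimates.
    m_hat = [mi / (1 - beta1**t) for mi in new_m]
    v_hat = [vi / (1 - beta2**t) for vi in new_v]
    step = [lr * mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
    return step, new_m, new_v

# Hypothetical gradients: a large one (frequent class) and a tiny one (rare class).
grads = [10.0, 0.001]

print("SGD :", sgd_step(grads))
step, m, v = adam_step(grads, m=[0.0, 0.0], v=[0.0, 0.0], t=1)
print("Adam:", step)
```

The SGD updates differ by the same four orders of magnitude as the gradients, so the small-gradient parameter barely moves. On its first step Adam divides each coordinate by its own magnitude estimate, so both updates come out close to the learning rate.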


> I know the one I make isn’t going to be as precise or accurate, the build quality won’t be as good, but it’ll be good enough for my purposes.

Have you had issues with the ESP32's ADCs?


Not that I know of, but truthfully I’m so bad at this stuff that if it was an issue I might not recognize or identify it as such.


Reportedly, the ESP32's ADCs are unreliable.

