
Weights are not a binary. I have no idea why this claim is so often repeated; it's simply not true. You can't do anything with the weights themselves; you can't "run" the weights.

You run inference (via a library) on a model using its architecture (config file) and its tokenizer (which determines what to compute, and when), based on the weights (hardcoded values). That's it.
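To make that concrete, here is a toy sketch (hypothetical names, not any real library's API) of the three pieces: the config describes the architecture, the tokenizer maps text to something computable, and the weights are inert numbers. The code that runs is the inference library, not the weights:

```python
# Toy illustration: "running a model" combines three things.
# None of the three is executable on its own.

config = {"vocab": ["<pad>", "hi", "there"], "hidden": 2}  # architecture: shapes and choices

def tokenize(text, vocab):
    # tokenizer: decides WHAT to compute on (maps text to ids)
    return [vocab.index(w) for w in text.split() if w in vocab]

# weights: just numbers, one row per vocab entry -- inert data, like a PNG's pixels
weights = [[0.0, 0.0], [0.5, -0.2], [0.1, 0.9]]

def run_inference(text):
    # the inference code (the "library") is what actually executes
    ids = tokenize(text, config["vocab"])
    return [weights[i] for i in ids]  # trivial "forward pass": an embedding lookup

print(run_inference("hi there"))  # [[0.5, -0.2], [0.1, 0.9]]
```

Swap in a different config or tokenizer and the same weight file means something else entirely, which is exactly why weights alone can't be "run".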

> but you can’t modify

Yes, you can. It's called finetuning. And, most importantly, that's exactly how the model creators themselves "modify" the weights! No sane lab "recompiles" a model every time they change something. They perform a pre-training stage (feed it everything and the kitchen sink), get the hardcoded values (weights), and then post-train using "the same" techniques (well, maybe theirs are better, but it's the same concept) as you or I would. Just with more compute. That's it. You can make the exact same modifications, using basically the same concepts.
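A minimal sketch of the point above (a toy one-weight "model", not a real training stack): finetuning is nothing more than further gradient updates applied to the released weight values, the same update rule a lab's post-training uses:

```python
# Toy finetuning: modify a "pre-trained" hardcoded value via gradient descent.
weight = 2.0   # the weight as shipped after pre-training
target = 3.0   # the behavior we want after finetuning
lr = 0.1       # learning rate

for _ in range(100):
    # loss = (weight - target)**2, so gradient = 2 * (weight - target)
    grad = 2 * (weight - target)
    weight -= lr * grad  # same update rule, whether it's a lab or you

print(round(weight, 4))  # converges to 3.0
```

Scale this from one scalar to billions of parameters and more compute, and you have the post-training loop: the weights are mutable data, not a frozen artifact.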

> don’t have the source to recompile

In purely practical terms, neither do the labs. Anyone who has trained a big model can tell you that the process is so finicky they'd eat a hat if a big training run could somehow be made reproducible to the bit. Between nodes failing, data points ballooning your loss and forcing a rollback, and the myriad of other problems, what you get out of a big training run is not guaranteed to be the same even across 100-1000 attempts, in practice. It's simply the nature of training large models.
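One concrete source of that bit-level non-reproducibility, beyond hardware failures, is that floating-point addition is not associative: when partial gradient sums arrive from nodes in a different order, the result differs in the last bits, and those differences compound over millions of steps. A minimal demonstration:

```python
# Floating-point addition is not associative, so the order in which
# per-node partial sums are reduced changes the result at the bit level.
a, b, c = 0.1, 0.2, 0.3

order1 = (a + b) + c   # e.g. node A's partial sum arrives first
order2 = a + (b + c)   # e.g. node B's partial sum arrives first

print(order1 == order2)   # False
print(order1, order2)     # 0.6000000000000001 0.6
```

In a distributed run the reduction order depends on network timing, so even identical data and seeds don't guarantee identical weights out the other end.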

A binary does not mean an executable. A PNG is a binary. I could take an SVG file, render it to a PNG, and release that under CC0; that doesn't make my PNG open source. Model weights are binary files.

You can do a lot with a binary, too. That's what game mods are all about.



