Hacker News
anon373839 3 months ago | on: The Era of 1-bit LLMs: ternary parameters for cost...
It also means the largest models can be scaled up significantly with the same inference budget.
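As a rough sketch of the arithmetic behind this claim: a ternary weight carries at most log2(3), about 1.58 bits, of information, against 16 bits for an fp16 weight. The helper names below are hypothetical, and the estimate counts weight storage only. It ignores activations, the KV cache, and the fact that practical packing schemes may spend closer to 2 bits per weight.

```python
import math

BITS_FP16 = 16
# Information-theoretic minimum for a weight in {-1, 0, 1}; real packing may use more.
BITS_TERNARY = math.log2(3)


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def params_for_budget(budget_gib: float, bits_per_weight: float) -> float:
    """Number of parameters whose weights fit in a given memory budget."""
    return budget_gib * 2**30 * 8 / bits_per_weight


# How many times more parameters fit in the same weight-memory budget.
scale_up = params_for_budget(24, BITS_TERNARY) / params_for_budget(24, BITS_FP16)

if __name__ == "__main__":
    print(f"7B fp16 weights:    {weight_gib(7e9, BITS_FP16):.1f} GiB")
    print(f"7B ternary weights: {weight_gib(7e9, BITS_TERNARY):.1f} GiB")
    print(f"Parameter scale-up at equal weight memory: {scale_up:.1f}x")
```

Under these assumptions the same weight-memory budget holds roughly ten times as many parameters, which is the sense in which the largest models could be scaled up.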
llm_trw 3 months ago
Depends. The only paper they cite for training, https://arxiv.org/pdf/2310.11453.pdf, doesn't improve training costs much, and most models are already training-constrained. Not everyone has $200m to throw at training another model from scratch.
arunk47 3 months ago
Is there any scope for indie builders?
llm_trw 3 months ago
Not really. These are slightly better for memory during pre-training and fine-tuning, but not enough to make a 4090 usable even for a 7B model.
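A back-of-the-envelope sketch of why that holds, under stated assumptions: the byte counts below are the common rule of thumb for mixed-precision Adam training, not figures from the thread or the paper. It further assumes that quantization-aware training still keeps full-precision latent weights and optimizer state, so only the low-precision working copy of the weights can shrink. Activations are ignored, which makes the estimate optimistic. All names are hypothetical.

```python
GIB = 2**30

# Assumed bytes per parameter for mixed-precision Adam training (rule of thumb).
BYTES_PER_PARAM = {
    "fp16 weights": 2,
    "fp16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam first moment": 4,
    "fp32 Adam second moment": 4,
}

# An RTX 4090 has 24 GB of VRAM; treated as 24 GiB for this rough estimate.
RTX_4090_VRAM_GIB = 24


def training_state_gib(n_params: float, bytes_per_param: dict) -> float:
    """Memory for weights, gradients, and optimizer state, excluding activations."""
    return n_params * sum(bytes_per_param.values()) / GIB


baseline = training_state_gib(7e9, BYTES_PER_PARAM)

# Hypothetical best case: the low-precision weight copy costs nothing at all.
best_case = training_state_gib(7e9, {**BYTES_PER_PARAM, "fp16 weights": 0})

if __name__ == "__main__":
    print(f"7B baseline training state: {baseline:.1f} GiB")
    print(f"7B best-case with free low-precision weights: {best_case:.1f} GiB")
    print(f"Fits on a 4090: {best_case <= RTX_4090_VRAM_GIB}")
```

Even in the best case the saving is 2 of 16 bytes per parameter, so the training state for 7B parameters stays several times larger than the card's memory.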