woodson on Jan 3, 2023 | on: Some Remarks on Large Language Models
I think quantization (e.g. 4-bit, https://arxiv.org/abs/2212.09720) and sparsity (e.g. SparseGPT, https://arxiv.org/abs/2301.00774) will bring down inference cost.

Edit: This isn’t handwaving, btw; it’s to say that some fairly decent solutions are available now.
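
To make the point concrete, here is a minimal NumPy sketch of the two ideas the comment points at: group-wise 4-bit quantization and unstructured sparsity. This is not the method from either linked paper (those use far more careful procedures than the naive rounding and magnitude pruning shown here); the group size of 64, the 50% pruning ratio, and the 4096x4096 layer shape are purely illustrative assumptions.

    # Toy illustration (not the papers' methods) of why 4-bit weights and
    # sparsity shrink the memory an LLM layer needs at inference time.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4096, 4096)).astype(np.float32)  # one "layer" of weights

    # --- 4-bit absmax quantization, per group of 64 weights (illustrative group size) ---
    GROUP = 64
    flat = W.reshape(-1, GROUP)
    scales = np.abs(flat).max(axis=1, keepdims=True) / 7.0       # int4 range is [-8, 7]
    q = np.clip(np.round(flat / scales), -8, 7).astype(np.int8)  # each value fits in 4 bits
    W_deq = (q * scales).reshape(W.shape)                        # dequantize before the matmul

    # --- unstructured 50% sparsity via simple magnitude pruning (illustrative ratio) ---
    threshold = np.quantile(np.abs(W), 0.5)
    W_sparse = np.where(np.abs(W) >= threshold, W, 0.0)

    # Storage comparison: fp16 baseline vs. packed 4-bit weights plus fp16 group scales.
    fp16_bytes = W.size * 2
    int4_bytes = W.size // 2 + scales.size * 2
    print(f"fp16 layer:       {fp16_bytes / 2**20:.1f} MiB")
    print(f"4-bit quantized:  {int4_bytes / 2**20:.1f} MiB")
    print(f"quantization MSE: {np.mean((W - W_deq) ** 2):.5f}")
    print(f"nonzeros kept:    {np.count_nonzero(W_sparse) / W.size:.0%}")

The relevance to inference cost: at generation time the weights dominate memory traffic, so storing them in 4 bits instead of fp16 cuts the bytes moved per token by roughly 4x, and skipping zeroed weights can reduce it further, provided quality holds up, which is what the linked papers study.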