Hacker News
mark_l_watson | 80 days ago | on: Amazon spends another $2.7B on Anthropic
I have an M2 Pro Mac mini with 32 GB of memory, and I can run Mixtral 8x7B with 3-bit quantization (Q3).
woadwarrior01 | 68 days ago
IIRC, you've mentioned once before that you've used Private LLM. :) Please try the 4-bit OmniQuant-quantized Mixtral 8x7B Instruct model in it. It runs circles around RTN Q3 models in speed and around RTN Q8 models in text-generation quality.
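For context on the RTN (round-to-nearest) quantization mentioned above, here is a minimal sketch of per-tensor RTN in Python. The function name and the per-tensor (rather than per-group) scaling are illustrative assumptions, not how any particular app implements it; the point is just that fewer bits means coarser levels and higher reconstruction error, which is the quality gap methods like OmniQuant try to close.

```python
import numpy as np

def rtn_quantize(w: np.ndarray, bits: int):
    """Round-to-nearest (RTN) quantization: map float weights onto
    2**bits uniform levels using a per-tensor scale and offset.
    (Illustrative sketch; real implementations quantize per group/channel.)"""
    levels = 2 ** bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / levels
    q = np.round((w - w_min) / scale).astype(np.int32)  # integers in 0..levels
    dequant = q * scale + w_min                          # reconstructed floats
    return q, dequant

# Compare reconstruction error at different bit widths.
rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)

for bits in (3, 4, 8):
    _, w_hat = rtn_quantize(w, bits)
    print(f"{bits}-bit RTN mean abs error: {np.abs(w - w_hat).mean():.5f}")
```

Running this shows the error shrinking as the bit width grows, which is why an RTN Q3 model trades noticeable quality for its smaller memory footprint.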