Hacker News
Efficient and Lossless MoE Diffusion LLM Inference with I/O-Aware Expert Offload (tide-paper.vercel.app)
1 point by imalomder 1 day ago
1 comment
imalomder 1 day ago
Hi HN, this is my research project, which lets people deploy MoE diffusion LLMs locally more efficiently. With this method, you can fit the 100B LLaDA2.0-flash model on a PC with an RTX 5090 and run it faster than with other methods.