Ask HN: Cheaper or similar setup like Asus ROG G16 for local LLM development?
3 points by hedgehog0 on March 9, 2024 | 1 comment
Hello everyone,

I’m a math graduate student in Germany, and I’ve recently become interested in developing local and/or web apps with LLMs. I have a 12-year-old MacBook Pro, so I’m thinking about buying something new.

I have searched relevant keywords here, and the “universal suggestion” seems to be to use a laptop to access GPUs in the cloud, instead of running training and/or inference on the laptop itself.

Someone mentioned that the [ASUS ROG G16](https://www.amazon.de/Anti-Glare-Display-i9-13980HX-Windows-Keyboard/dp/B0BZTJKZ5L/) or G14/G15 can be a good local setup for running small models. While I can probably afford this, it’s still slightly more expensive than I expected.

Given that a 3060 is around 300 euros, I was wondering whether a cheaper solution would be to build a PC myself. If so, how much do you think it would cost? I’ll probably move to a new place in the fall semester, so I would like something portable, or at least not too heavy, if possible.

Thank you very much for your time!




Running models on an M1 Mac with 32 GB+ works very well: the CPU and GPU share memory, so you can run some really substantial models with it.
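
If you go the Mac route, the simplest way to try it is llama.cpp or its Python bindings. As a rough sketch (the model path is just a placeholder for whatever quantized GGUF file you download), offloading all layers to the Metal GPU looks something like this:

    # pip install llama-cpp-python
    from llama_cpp import Llama

    # Placeholder path -- point this at any quantized GGUF model you have locally.
    llm = Llama(
        model_path="models/mistral-7b-instruct.Q4_K_M.gguf",
        n_gpu_layers=-1,   # offload all layers to the GPU (Metal on Apple Silicon)
        n_ctx=4096,        # context window
    )

    out = llm("Q: Why does unified memory help with large models? A:", max_tokens=64)
    print(out["choices"][0]["text"])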

Earlier this year, I also looked into building a machine with dual 3090s. Doing it for under $1,000 is fairly challenging once you add a case, motherboard, CPU, RAM, etc.

What I ended up doing was getting a used rackmount server capable of handling dual GPUs, plus two Nvidia Tesla P40s.

Examples: https://www.ebay.com/itm/284514545745?itmmeta=01HRJZX097EGBP... https://www.ebay.com/itm/145655400112?itmmeta=01HRJZXK512Y3N...

The total here was ~$600, and there was essentially no effort in building/assembling the machine, except that I needed to order some Molex power adapters, which were cheap.

The server is definitely compact, but it can get LOUD when it's running under heavy load, so that might be a consideration.

It's probably not the right machine for training models, but it runs inference on GGUF models (using Ollama) quite well. I have been running Mixtral at zippy token rates, and smaller models even faster.
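
For what it's worth, since Ollama exposes a local HTTP API, wiring it into an app from Python is only a few lines. A minimal sketch (the model name is just whatever you've pulled, e.g. mixtral):

    # pip install requests; assumes `ollama serve` is running and the model has been pulled
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "mixtral",                        # any model pulled with `ollama pull`
            "prompt": "Summarize GGUF in one sentence.",
            "stream": False,                           # return a single JSON response
        },
    )
    print(resp.json()["response"])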



