Hacker News | new | past | comments | ask | show | jobs | submit | syntaxing's comments | login

Wasn’t Composer 2 a “fine tune” of Kimi2.5?

Have you used it with any agents or claw? If so, which model do you run?

I have two Strix Halo devices at hand: privately a Framework Desktop with 128GB, and at work a 64GB HP notebook. The 64GB machine can load Qwen3.5 30B-A3B; with VSCode it needs a bit of initial prompt processing to initialize all those tools, I guess. But the model fights with the other resources I need, so I'm not really using it these days. I want to experiment with it on my home machine, I just don't work on it much right now.

Lemonade has a Web UI to set the context size and llama.cpp args. You need to set the context to a proper number, or just to 0 so that it uses the model's default. If it's too low, it won't work with agentic coding.
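For reference, this maps onto llama.cpp's own flag: `llama-server` takes `-c`/`--ctx-size`, where 0 means "take the context length from the model's metadata". A sketch of the underlying invocation (the model filename and port are illustrative, and Lemonade may pass other args on top):

```shell
# -c 0 = use the context length stored in the GGUF metadata; any positive
# number overrides it. Too small a value breaks agentic coding tools.
llama-server -m qwen3-coder-next-Q4_K_XL.gguf -c 0 --port 8080
```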

I will try some Claw app, but first I need to research the field a bit. In the meantime I'm using different models in Open WebUI. GPT 120B is fast, but Qwen3.5 27B is fine too.


Qwen3-Coder-Next works well on my 128GB Framework Desktop. It seems better at coding Python than Qwen3.5 35B-A3B, and it's not too much slower (43 tg/s compared to 55 tg/s at Q4).

27B is supposed to be really good but it's so slow I gave up on it (11-12 tg/s at Q4).
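For anyone comparing these numbers: tg/s here is simply generated tokens divided by wall-clock time. A minimal, model-agnostic sketch of the measurement (`fake_generate` is a hypothetical stand-in so the snippet runs anywhere, not a real inference API):

```python
import time

def tokens_per_second(generate, prompt, n_tokens):
    """Time one generation call and return tg/s = tokens generated / wall time."""
    start = time.perf_counter()
    generate(prompt, n_tokens)
    return n_tokens / (time.perf_counter() - start)

# Toy stand-in for a local model: pretend each token takes ~1 ms to generate.
def fake_generate(prompt, n_tokens):
    time.sleep(0.001 * n_tokens)

print(f"{tokens_per_second(fake_generate, 'hello', 200):.0f} tg/s")
```

Swap `fake_generate` for a call into your local server and the same arithmetic gives the tg/s figures quoted above.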


Agreed. Qwen3-Coder-Next seems like the sweet-spot model on my 128GB Framework Desktop. I seem to get better coding results from it vs. 27B, in addition to it running faster.

The 8-bit MLX Unsloth quant of Qwen3-Coder-Next seems to be a local best on an MBP M5 Max with 128GB memory. With oMLX doing prompt caching I can run two in parallel on different tasks pretty reasonably. I found that lower quants tend to lose the plot after about 170k tokens of context.

That's good to know. I haven't exceeded a 120k context yet. Maybe I'll bite the bullet and try Q6 or Q8. Any of the coder-next quants larger than UD-Q4_K_XL take forever to load, especially with ROCm. I think there's some sort of autotuning or fitting going on in llama.cpp.

As another data point.

Running Qwen3.5 122B at 35 t/s as a daily driver using Vulkan llama.cpp on kernel 7.0.0rc5 on a Framework Desktop board (Strix Halo, 128GB).

Also a pair of AMD Radeon AI Pro R9700 cards as my workhorses for Z-Image Turbo, Qwen TTS/ASR, and other accessory functions and experiments.

Finally, I have a Radeon 6900 XT running Qwen3.5 32B at 60+ t/s as a fast all-arounder.

If I buy anything NVIDIA, it will be only for compatibility testing. AMD hardware is 100% the best option now for cost, freedom, and security for home users.


How is the performance for Z-Image on the R9700s?

About 10 seconds for a 1024x1024 on one, but not found a nice way to scale processing a single image across both.

Are the dedicated GPU cards on another machine or you’re using eGPU with the framework?

A separate machine.

Wow, this is super interesting. It creates a local "Gemini" front end and all; it's more or less a generative-AI aggregator that installs multiple services for different generation modes. I'm excited to try this out on my Strix Halo. The biggest issue I had was image and audio generation, so this seems like a great option.

Super interesting. I'm building their llama.cpp fork on my Jetson Orin Nano to test this out.

Pardon my ignorance, but what’s CCC?

That's a fun one, because none of the three Cs listed on Wikipedia stand out to me.

https://en.wikipedia.org/wiki/CCC?#Politics



What a bummer. FIRST robotics was a big part of why I’m an engineer today.

I was a mentor for an all-girls high school FIRST team, and I have to say: the way they were treated at competition by other teams, and the way the organization handled the sexual objectification of them there, leads me to a "that checks out" conclusion about Kamen and Epstein.

Culture propagates from the top.


How did you rule out the much simpler explanation that the culture propagates from the hormones of high school boys, and going against that is a hard problem? You're going to have to be explicit about the details of "the way the organization handled that", as the obvious assumption is that they'd be stuck between a rock and a hard place trying to post-facto punish at the organizational level (as opposed to proactive policies for team mentors to follow going forward).

I am currently a mentor and previously a judge and volunteer for many years at regional events. In all my years I have never seen anything remotely like sexual objectification. I obviously can't know your experience but I would be very very surprised to find this occurring... especially at competitions.

I believe this implication goes against the core values of the org and certainly its local volunteers. I have no skin here except to defend a program that is doing amazing work. My kids are participants and I have contributed to the org for more than 10 years.

Just offering some more anecdata for passers-by.


Such a shame. I'm a FIRST FRC alum from about 20 years ago. I hope they bring back VEX robots to replace LEGO.

Splitting between RAM and GPU impacts it more than you think. I would be surprised if the red box doesn't outperform yours by 2-3x for both PP and TG (prompt processing and token generation).
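Token generation on these machines is roughly memory-bandwidth bound: each generated token streams the active weights once, so a crude ceiling is bandwidth divided by active-weight bytes. A back-of-envelope sketch (all numbers are assumptions, not measurements: ~256 GB/s for Strix Halo's LPDDR5x, ~3B active params for a 30B-A3B MoE, ~4.5 effective bits/weight for a Q4-ish quant):

```python
def peak_tg(mem_bw_gbs, active_params_b, bits_per_weight):
    """Crude tokens/s ceiling: generation streams the active weights once per token."""
    gb_per_token = active_params_b * bits_per_weight / 8  # GB read per generated token
    return mem_bw_gbs / gb_per_token

# Assumed figures, not measured ones.
print(round(peak_tg(256, 3, 4.5)))  # ceiling ~152 t/s; measured tg/s sits well below
```

Real-world tg/s lands well under the ceiling (the 40-55 t/s figures quoted upthread), and any weights spilled to slower memory drag the effective bandwidth, and hence the ceiling, down.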

Some CIO thought it would be great to get rid of our local in-office IT team and replace them with a multi-million-dollar contract with HP for their "tier 1" support. Their service was absolute garbage. But the CIO got a fat bonus check for the "cost savings."


> Mac: Like CPU - Chat only works for now. MLX training coming very soon

