Hacker News | new | past | comments | ask | show | jobs | submit | vunderba's comments | login

Nice job. Desperately needs an instant-down (perhaps the Enter key) - even with the spacebar to speed things up it felt excruciatingly slow.

Local text-to-video models such as LTX Video 2.3 or even the older WAN could easily handle this; then combine with something like SeedVR2 to upscale.

https://github.com/Lightricks/ComfyUI-LTXVideo


Every continuous shot lasts no more than five to ten seconds. It's not a "give-away" as such, but it's certainly a tell. r/aivideo is chock-full of this crap.

Yeah, even the latest version, Marble 1.1, which this repo uses, can make a royal mess of things, especially in outdoor environments.

This is more like a Claude-based skill set that orchestrates a bunch of different, separate systems. The closest equivalent to Trellis would probably be its use of Hunyuan-3D to create some of the 3D object models.

From what I can tell, it takes an image and first segments it into objects versus environment, then sends the environment to Marble 1.1 to generate a Gaussian splat and sends all the isolated individual objects to Hunyuan to generate GLB model files.
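A rough sketch of that orchestration, as I understand it. The function names (`segment_image`, `marble_generate_splat`, `hunyuan_generate_glb`) are placeholders standing in for the real service calls, and the stubs just illustrate the routing, not the actual APIs:

```python
# Hypothetical orchestration sketch; the three helpers are stand-ins for
# the real segmentation, Marble 1.1, and Hunyuan-3D calls.
from dataclasses import dataclass


@dataclass
class Scene:
    environment: bytes   # background pixels for the splat
    objects: list        # isolated object crops for GLB generation


def segment_image(image: bytes) -> Scene:
    # Placeholder: a real implementation would run a segmentation model.
    return Scene(environment=image, objects=[image])


def marble_generate_splat(environment: bytes) -> str:
    return "scene.splat"   # stub for the Marble 1.1 call


def hunyuan_generate_glb(obj: bytes) -> str:
    return "object.glb"    # stub for the Hunyuan-3D call


def build_world(image: bytes) -> dict:
    """Route environment to Marble and each isolated object to Hunyuan."""
    scene = segment_image(image)
    return {
        "splat": marble_generate_splat(scene.environment),
        "models": [hunyuan_generate_glb(o) for o in scene.objects],
    }
```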


> Here’s what my nephew and I did when we got confused: we picked up the piece and looked at the base. Each figurine has a small chess symbol printed on the base. Chewbacca is a knight. The Stormtrooper is a pawn. Problem solved.

The real question: how is Chewbacca (Wookiees rip your arms out of their sockets when they lose) not a rook? Shouldn't knights be... hmm, I don't know, Jedi knights like Skywalker, Obi-Wan, etc.?

And yes, more related to the article: actuators that can manipulate the world are why it can be interesting to give LLMs access to a domain of commands that map to actions within the world (even a virtual world model for example).


I was literally just about to say this. It’s similar to what a lot of really good teachers do: students use them as a “rubber duck,” and then they respond as an almost Socratic guide.

I've actually used that approach during my years teaching ESL - self-discovery often leads to the most persistent (long-term) lessons.


Smaller models might not make the best agentic coding assistants, but I have a 128GB RAM headless machine serving llama.cpp with a number of local models that handles various tasks on a daily basis and works great.

- Qwen3-VL:30b > A file watcher on my NAS sends new images to it; it auto-captions them, embeds the text description as a hidden EXIF field in the image, and adds an entry to a Qdrant vector database for fuzzy searching and organization.

- Gemma3:27b > Used for personal translation work (mostly English and Chinese). Haven't had a chance to try out the Gemma4 models yet.

- Llama3.1:8b > Performs sentiment analysis on texts / comments / etc.
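For the curious, the captioning leg of a setup like this is pretty simple to sketch. This assumes llama.cpp is serving the vision model behind its OpenAI-compatible `/v1/chat/completions` endpoint; the host, model name, and prompt are placeholders, not my exact config:

```python
# Minimal captioning-client sketch against a llama.cpp server's
# OpenAI-compatible chat endpoint. Host/model names are placeholders.
import base64
import json
import urllib.request

LLAMA_URL = "http://localhost:8080/v1/chat/completions"  # placeholder host


def build_caption_request(image_bytes: bytes) -> dict:
    """Package an image as a base64 data URL for the vision model."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "model": "qwen3-vl-30b",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }


def caption(image_bytes: bytes) -> str:
    """POST the request and return the model's caption text."""
    req = urllib.request.Request(
        LLAMA_URL,
        data=json.dumps(build_caption_request(image_bytes)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The file-watcher, EXIF, and Qdrant pieces would wrap around `caption()`; I've left them out to keep the sketch short.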


Look into updating to Gemma4 and Qwen3.6; they are good at agentic things. qwen36moe with Unsloth's 8-bit quant is my daily driver now.

Have you noticed a gap between the 8-bit and 4-bit quants? I've always run 4-bit because it requires less memory.

I run the biggest quant because it is more capable; the Spark has enough memory for two Qwens at 8-bit with full context length (roughly 48 GB each).
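As a back-of-envelope check on those numbers: weight memory scales linearly with bits per weight, so halving the quant roughly halves the footprint. This ignores KV cache and runtime overhead, so real usage (like the ~48 GB figure above, which includes full context) is higher:

```python
# Rough weight-only memory estimate for a quantized model.
# Ignores KV cache, activations, and runtime overhead.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Gigabytes needed just for the weights at a given quant width."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


print(weight_gb(30, 8))  # 30B model at 8-bit -> 30.0 GB of weights
print(weight_gb(30, 4))  # same model at 4-bit -> 15.0 GB of weights
```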

I find Gemini/Gemma have become worse at coding. They are better for non-coding tasks, but maybe not even that: the hallucinations and instruction following have both degraded, in my experience.


Starting off as a banker is probably the easiest way to "cheat". :)

It's been great for me. I have a secondary PC that's been running Windows 10 LTSC IoT for 5 years now. I’m still getting security updates but nothing else (that's a feature to me).

The only time I had an issue was when a DAW installer required me to upgrade to 22H2. I grabbed the enablement package directly and used the DISM tool to install it.
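For anyone wanting to do the same, the DISM step looks roughly like this (run from an elevated prompt on Windows; the .cab path is a placeholder for wherever you saved the downloaded enablement package):

```shell
:: Install a downloaded enablement package with DISM (elevated prompt).
:: The package path below is a placeholder, not a specific KB.
DISM /Online /Add-Package /PackagePath:C:\downloads\enablement-package.cab
```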

