Hacker News

UI detection is a big focus - we use visual grounding plus structured observations (icons, OCR text, app metadata, window state), so the agent can reason more like a user would. It's surprisingly robust even with layout shifts or new themes.
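To make the "structured observations" idea concrete, here's a minimal sketch of what such an observation might look like and why it helps with layout shifts. Everything here (the `Observation`/`UIElement` shapes, `find_element`) is hypothetical illustration, not the actual system: the point is that if the agent resolves targets by semantic attributes (label text, role) rather than fixed pixel coordinates, a theme or layout change that moves a button doesn't break the lookup.

```python
from dataclasses import dataclass, field

@dataclass
class UIElement:
    role: str           # e.g. "button", "icon", "textbox"
    text: str           # OCR'd text or accessibility label
    bbox: tuple         # (x, y, w, h) in screen pixels

@dataclass
class Observation:
    app_name: str
    window_title: str
    window_state: str   # e.g. "normal", "maximized", "minimized"
    elements: list = field(default_factory=list)

def find_element(obs: Observation, label: str):
    """Resolve a target by label text and not by position, so the
    lookup survives layout shifts and theme changes."""
    needle = label.lower()
    return next((e for e in obs.elements if needle in e.text.lower()), None)

# A hypothetical observation assembled from OCR + window metadata.
obs = Observation(
    app_name="Settings",
    window_title="Settings - Display",
    window_state="normal",
    elements=[
        UIElement("button", "Apply", (840, 600, 80, 28)),
        UIElement("textbox", "Resolution", (120, 200, 240, 24)),
    ],
)

target = find_element(obs, "apply")
print(target.role, target.bbox)  # the bbox is only read at click time
```

Even if a new theme moves "Apply" to a different corner, only `bbox` changes in the next observation; the agent's plan ("click Apply") stays valid.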

