Submissions from thinkwright.ai

		State of the Agent: Do coding agents know what they don't know? (thinkwright.ai)
		2 points by oceanwaves 25 days ago
		Agent-evals: Overlap, boundary, and metacognitive scoring for coding agents (thinkwright.ai)
		1 point by oceanwaves 25 days ago | 1 comment
		Agent-evals: Metacognitive scoring and boundary testing for LLM coding agents (thinkwright.ai)
		2 points by oceanwaves 27 days ago