Submissions from github.com/agi-eval-official

		Show HN: CATArena – Evaluating LLM agents via dynamic environment interactions (github.com/agi-eval-official)
		3 points by jinqueeny 3 months ago
		Stop benchmarking LLMs. Make them fight (github.com/agi-eval-official)
		2 points by jinqueeny 3 months ago