
You need benchmarks with the following three properties:

1) No known solutions, so there's no "ground truth" dataset to train on

2) Presumably hard to solve

3) But easy to verify a solution if one is provided.

This is, of course, easier to do on the STEM side of things, but how do you automatically test creativity or philosophical aptitude?
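To make property 3 concrete on the STEM side, here is a minimal sketch of a cheap verifier. Integer factorization is my own choice of illustration, not something the list above names, and `verify_factorization` is a hypothetical helper: finding the factors of a large semiprime is believed to be hard, while checking a proposed answer takes one multiplication.

```python
from math import prod


def verify_factorization(n: int, factors: list[int]) -> bool:
    """Check a claimed nontrivial factorization of n.

    Hypothetical helper for illustration only. It does not check
    primality, just that the factors are nontrivial and multiply to n.
    """
    # A single factor (n itself) would be a trivial "solution".
    if len(factors) < 2:
        return False
    # Reject 1, 0, and negative values, which would also make the claim trivial.
    if any(f <= 1 for f in factors):
        return False
    # The actual check: one multiplication, however hard the search was.
    return prod(factors) == n


if __name__ == "__main__":
    print(verify_factorization(15, [3, 5]))   # valid claim
    print(verify_factorization(15, [1, 15]))  # trivial claim, rejected
```

A benchmark built this way needs no answer key: the grader only ever runs the verifier on whatever the model submits.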


I guess it's purely subjective. Maybe an internal review panel when it comes to judging the quality of creative work?


