
I agree that for any given test, you could build a specific pipeline to optimize for that test. I suppose that's why it is helpful to have many tests.

However, many people have worked hard to optimize tools specifically for ARC over many years, and it's proven to be a particularly hard test to optimize for. This is why I find it so interesting that LLMs can do it well at all, regardless of whether tests like it are included in training.





The real strength of current neural nets/transformers comes from training on huge datasets.

ARC does not provide that kind of dataset: only a small public set, plus a private one used to run the benchmark.
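For reference, the public set (github.com/fchollet/ARC) is just a few hundred JSON files, each holding a handful of input/output grid pairs. A minimal look at one task, with a made-up filename:

  import json

  # Each task file has "train" and "test" lists of pairs;
  # grids are 2D lists of ints 0-9 (colors).
  with open("data/training/0a1b2c3d.json") as f:  # filename is hypothetical
      task = json.load(f)

  for pair in task["train"]:
      print(pair["input"], "->", pair["output"])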

Building your own large private ARC set does not seem too difficult if you have enough resources.
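To make that concrete, here is a toy sketch of such a generator: pick a rule, apply it to random grids, and emit tasks in the public set's JSON shape. The rule set below is invented for illustration; a real effort would need a far richer, human-curated family of transformations to resemble actual ARC tasks.

  import json, random

  def random_grid(h, w):
      # Random grid of "colors" 0-9, matching the ARC value range.
      return [[random.randint(0, 9) for _ in range(w)] for _ in range(h)]

  # Toy transformations, stand-ins for real ARC-like rules.
  RULES = {
      "flip_h": lambda g: [row[::-1] for row in g],
      "flip_v": lambda g: g[::-1],
      "transpose": lambda g: [list(r) for r in zip(*g)],
  }

  def make_task(n_train=3):
      rule = random.choice(list(RULES.values()))
      pairs = []
      for _ in range(n_train + 1):
          g = random_grid(random.randint(3, 10), random.randint(3, 10))
          pairs.append({"input": g, "output": rule(g)})
      # Same shape as the public ARC JSON: train pairs plus held-out test pair.
      return {"train": pairs[:-1], "test": pairs[-1:]}

  print(json.dumps(make_task(), indent=2))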


How can they keep it private? It's not like they can run these models locally. Do the providers promise not to peek when they are testing?


