Hacker News

I don't think you understand the scope of the training data required for these models. We're talking thousands of lifetimes' worth of reading for ChatGPT (GPT-3, for example, was trained on 45 TB of textual data).
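The "thousands of lifetimes" claim roughly checks out with a quick back-of-envelope calculation. The numbers below are illustrative assumptions, not from the thread: ~5 bytes per English word, a reading speed of ~250 words per minute, 8 hours of reading per day, and an 80-year "reading lifetime".

```python
# Back-of-envelope: how many human lifetimes of reading is 45 TB of text?
# All constants are rough, illustrative assumptions.
BYTES_PER_WORD = 5            # average for English text incl. whitespace
WORDS_PER_MINUTE = 250        # typical adult reading speed
HOURS_PER_DAY = 8             # a very dedicated reader
YEARS_PER_LIFETIME = 80

total_words = 45e12 / BYTES_PER_WORD                       # ~9 trillion words
minutes_needed = total_words / WORDS_PER_MINUTE
years_needed = minutes_needed / (60 * HOURS_PER_DAY * 365)
lifetimes = years_needed / YEARS_PER_LIFETIME

print(f"~{years_needed:,.0f} years of reading, ~{lifetimes:,.0f} lifetimes")
```

With these assumptions it comes out to a couple of thousand lifetimes; tweaking the constants changes the exact figure but not the order of magnitude.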



I was responding to someone claiming humans learn these things with only one or two examples. I am aware that GPT-3 scraped pretty much every bit of text OpenAI could find on the internet, and I agree that probably makes it less example-efficient than humans. But I also think this critique is slightly unfair: your brain has had the benefit of thousands of lifetimes of experience informing its structure and built-in instincts. Yes, it's a bit sad that we haven't done much better, but it's not totally unreasonable that machine learning should need more data than a single human does to catch up.


The human brain hasn't had to "evolve" to learn writing. Our brains haven't really changed for many thousands of years, and writing has only been around for about 5,000 years, so the argument that "human brains have evolved over millions of years to do this" simply isn't true.

GPT-3 essentially needs millions of human-years' worth of data to speak English correctly, yet it still makes mistakes that are obvious to us, so there's clearly something massive still missing.


Writing was specifically designed (by human brains) to be efficiently learnable by human brains.

Same for many other human skills, like speaking English, that we expect GPT to learn.


You are right, as far as we know brains didn’t evolve for writing and language (though there is plenty of evidence that learning to read/write changes the brain). But writing and languages did evolve and adapt FOR humans. They are built to be easy for us; we didn’t care about their mathematical properties.

AI is playing catch up.


The training data is also not great if you want the AI to generalise. A lot of research has shown that smaller datasets with better labelling make a far greater difference.

Remember, humans need fewer examples but far more time. We also don't start from a blank slate: we have a lot of machinery built through evolution available from conception. And when we learn later in life, we have an immense amount of prebuilt knowledge and tools at our disposal. We still need months to learn to play the piano, and years to decades to perfect it.

AI training happens in minutes to hours. I'm not sure we're even spending time researching training algorithms that take years to run.


There's a fun short story by Ted Chiang where the first truly human-like AI results from a few devoted people who keep interacting with and teaching AI pets from a company that has gone out of business. It touches on this idea that humans get a lot of hands-on time compared to AI.

https://en.wikipedia.org/wiki/The_Lifecycle_of_Software_Obje...


I'm certain that humans are trained on far more than 45 TB of data; the vast majority of it is "video", though.
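A rough sketch supports this. Assuming (purely for illustration) that a human's visual stream compresses to something like 1 MB/s, with 16 waking hours a day over an 80-year lifespan:

```python
# Rough comparison: lifetime visual input vs GPT-3's 45 TB of text.
# All constants are illustrative assumptions, not measured values.
BYTES_PER_SECOND = 1e6              # assumed compressed visual bandwidth
WAKING_SECONDS_PER_DAY = 16 * 3600
DAYS = 80 * 365                     # 80-year lifespan

lifetime_bytes = BYTES_PER_SECOND * WAKING_SECONDS_PER_DAY * DAYS
lifetime_tb = lifetime_bytes / 1e12

print(f"~{lifetime_tb:,.0f} TB of visual input per lifetime vs 45 TB of text")
```

Even under these conservative assumptions, the lifetime visual stream comes out well over a thousand terabytes, dwarfing GPT-3's text corpus.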




