
Another angle here is: it is going to be very valuable to some companies to ensure that their datasets go into the LLM training process. For example, if you are AWS, you really want to make sure that the next version of GPT has all of the AWS documentation in the training corpus, because that ensures GPT can answer questions about how to work with AWS tools. I expect OpenAI and others will start to charge for this kind of guarantee.



Well, as for AWS docs, I'd argue it's in OpenAI's interest to include them in the training corpus.

Generally speaking, high-value content will get indexed, whether voluntarily or via paid channels.



