d_burfoot 34 days ago | on: Leaked deck reveals how OpenAI is pitching publish...

Another angle here: it is going to be very valuable to some companies to ensure that their datasets go into the LLM training process. For example, if you are AWS, you really want to make sure that the next version of GPT has all of the AWS documentation in its training corpus, because that ensures GPT can answer questions about how to work with AWS tools. I expect OpenAI and others will start to charge for this kind of guarantee.

nextworddev 34 days ago

Well, as for AWS docs, I'd argue it's in OpenAI's interest to include them in the training corpus.

Generally speaking, high-value content will get indexed, whether voluntarily or via paid channels.
