Bing is now the default search for ChatGPT

ozfive · on May 24, 2023

From what I have experienced, it fails even more often at retrieving data from pages. This is certainly the beginning of the race to the bottom.

ilaksh · on May 24, 2023

I wonder if there are any startups that specialize in extracting the actual content from bloated web pages, caching it, and serving it via API.

To do the extraction effectively you might need some kind of hand-crafted scraper, or maybe access to GPT-4-like image understanding. Or you could just extract all of the text and process it with Claude or another system that has a large context window, to throw out all of the extraneous text like ads, navbars, random junk that is only slightly related, etc.
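
For what it's worth, the crude "hand-crafted" end of that spectrum can be sketched with nothing but the Python standard library. This is only a rough illustration, not how any real service does it: the `SKIP_TAGS` set and the `extract_main_text` helper are names I made up, and the assumption that those tags hold only boilerplate will not hold on every site.

```python
from html.parser import HTMLParser

# Assumption: text inside these tags is usually boilerplate, not article content.
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside", "form"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any of SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # how many skipped tags we are currently inside
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped regions.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_main_text(html: str) -> str:
    """Hypothetical helper: return the visible text with obvious boilerplate removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Something like `extract_main_text(page_html)` would then give you a much smaller blob to hand to a large-context model for the final cleanup. It does nothing about ads or junk that live in plain `div`s, which is where the model (or a per-site scraper) would still have to do the work.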

Or even just serving cached static content would be useful to speed up the process.
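
The caching part is the easy bit. A minimal in-memory sketch, assuming you already have some function that fetches (and maybe cleans) a page: `ContentCache`, its `fetch` callback, and the one-hour default TTL are all hypothetical names and values picked for illustration, not anything Bing or Google exposes.

```python
import time
from typing import Callable, Dict, Tuple


class ContentCache:
    """Tiny in-memory cache keyed by URL, with a time-to-live per entry."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,  # assumed default; tune for your use case
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        # Serve the cached copy while it is still fresh.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise fetch again and remember when we did.
        content = self._fetch(url)
        self._store[url] = (now, content)
        return content
```

A real service would need persistence, eviction, and some respect for the origin's cache headers, none of which this attempts.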

Does Bing search or Google Search give you some of the actual content?