
In using it, I found that my summaries / queries would often come back with "The text did not include (thing you asked for)," because the second half of the page text was unrelated to the main content (the text is split into small chunks so each one fits in the context window).

My solution to this (not yet integrated) is to run all the text through a faster / cheaper model first to check whether each chunk is actually relevant, before running the summarization / task prompt with the main model.
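
Roughly what I mean, as a minimal sketch (assuming the official OpenAI Python client; the model name and the YES/NO prompt are just placeholders):

    # Sketch: ask a cheap model a yes/no relevance question per chunk,
    # so the main model only ever sees text that passed the filter.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def is_relevant(chunk: str, question: str) -> bool:
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",  # the fast / cheap model
            temperature=0,
            max_tokens=1,
            messages=[{
                "role": "user",
                "content": f"Question: {question}\n\nText:\n{chunk}\n\n"
                           "Does this text help answer the question? Answer YES or NO.",
            }],
        )
        return resp.choices[0].message.content.strip().upper().startswith("Y")

    def filter_chunks(chunks: list[str], question: str) -> list[str]:
        # Only the chunks that pass the cheap filter go to the main model.
        return [c for c in chunks if is_relevant(c, question)]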

I realized that this is essentially a semantic search engine for text. Using the same principle, you can feed in a very large text file, ask it a question, and it finds the page that answers it.
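
The search version is basically the same filter pointed at a file; a sketch reusing is_relevant() from above (the page size is arbitrary):

    # Sketch: chunk a large file into fixed-size "pages" and return the
    # indices of the pages the cheap model flags as relevant.
    def find_pages(path: str, question: str, page_chars: int = 2000) -> list[int]:
        text = open(path, encoding="utf-8").read()
        pages = [text[i:i + page_chars] for i in range(0, len(text), page_chars)]
        return [i for i, page in enumerate(pages) if is_relevant(page, question)]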

This would be useful as a layer "beneath" the internet research agent, that it could use to sort through all the noise and answer the question.

https://gist.github.com/avelican/58958a2cf2b7e9f9f555ab94549...

It gives a lot of false positives. I'm not sure if this is a limitation of GPT-3 (I haven't tried GPT-4 for this yet, as it's a bit too expensive for searching books) or a limitation of my implementation.

It's probably slower and more expensive than just using a vector DB; I haven't tried those yet.




You could give this a try: https://www.aomni.com/


Update: Using the embeddings API is an order of magnitude cheaper for filtering text.
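
Something like this sketch, assuming the OpenAI embeddings endpoint (the model name and top-k cutoff are placeholders); a single request can embed a whole batch of chunks, which is where the savings come from:

    # Sketch: embed all chunks in one batched request and rank them by
    # cosine similarity to the question, instead of one chat call per chunk.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
        return np.array([d.embedding for d in resp.data])

    def top_chunks(chunks: list[str], question: str, k: int = 5) -> list[str]:
        vecs = embed(chunks + [question])
        doc_vecs, q_vec = vecs[:-1], vecs[-1]
        # Normalize so the dot product is cosine similarity.
        doc_vecs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
        q_vec = q_vec / np.linalg.norm(q_vec)
        sims = doc_vecs @ q_vec
        return [chunks[i] for i in np.argsort(sims)[::-1][:k]]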



