
Progressive summarization works pretty well - I'm using that for https://findsight.ai

You can even have lesser LLMs do the bulk reduction, then have GPT clean it up on the way down to even less content. Admittedly, that does take a lot of prompt engineering, chunk selection, and reinforcement (LLM supervising LLM).
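A minimal sketch of what that two-stage pipeline can look like, assuming an OpenAI-style chat API; the model names and prompts are just placeholders:

    from openai import OpenAI

    client = OpenAI()

    def complete(model: str, prompt: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def progressive_summarize(chunks: list[str]) -> str:
        # Stage 1: a cheap model does the bulk reduction, chunk by chunk.
        partials = [
            complete("gpt-4o-mini", f"Summarize in 3 sentences:\n\n{c}")
            for c in chunks
        ]
        # Stage 2: a stronger model merges the partial summaries and
        # compresses them down to even less content.
        notes = "\n\n".join(partials)
        return complete(
            "gpt-4o",
            f"Merge these notes into one tight paragraph:\n\n{notes}",
        )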



I understood everything except:

> reinforcement (LLM supervising LLM).

Is there something I can read to understand what that looks like?


I don't think this approach is formalized, but I can give a few examples:

A) Prompt leak prevention: chunk and embed LLM responses, then compare them against the original prompt to filter out chunks that leak it.
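A rough sketch of (A), assuming OpenAI embeddings; the similarity threshold is a guess you'd tune empirically:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(
            model="text-embedding-3-small", input=texts
        )
        return np.array([d.embedding for d in resp.data])

    def strip_prompt_leaks(system_prompt: str, response_chunks: list[str],
                           threshold: float = 0.85) -> list[str]:
        prompt_vec = embed([system_prompt])[0]
        chunk_vecs = embed(response_chunks)
        # Cosine similarity of each response chunk against the original prompt.
        sims = chunk_vecs @ prompt_vec / (
            np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(prompt_vec)
        )
        # Drop chunks that are suspiciously close to the prompt itself.
        return [c for c, s in zip(response_chunks, sims) if s < threshold]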

B) Automatic prompt refinement: prompt a cheap model, then use an expensive model to judge the output and rewrite the prompt (this is in part how Vicuna[1] did eval for their LLaMA fine-tuning).
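And a sketch of (B). The worker/judge split, the 1-10 scale, and the round limit are illustrative assumptions, not exactly how Vicuna did it:

    from openai import OpenAI

    client = OpenAI()

    def complete(model: str, prompt: str) -> str:
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

    def refine(prompt: str, max_rounds: int = 3) -> str:
        for _ in range(max_rounds):
            answer = complete("gpt-4o-mini", prompt)  # cheap worker
            verdict = complete(                       # expensive judge
                "gpt-4o",
                f"Prompt:\n{prompt}\n\nAnswer:\n{answer}\n\n"
                "Rate the answer 1-10, then suggest an improved prompt.\n"
                "Format: SCORE: <n> then PROMPT: <text>",
            )
            score_part, _, prompt_part = verdict.partition("PROMPT:")
            try:
                score = int(score_part.split("SCORE:")[1].strip())
            except (IndexError, ValueError):
                break  # judge didn't follow the format; stop refining
            if score >= 8:
                return answer
            # Feed the judge's rewritten prompt back into the next round.
            prompt = prompt_part.strip() or prompt
        return answer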

Basically, it's using LLMs in the feedback loop.

[1] https://vicuna.lmsys.org



