
Progressive summarization works pretty well - I'm using that for https://findsight.ai

You can even have lesser LLMs do the bulk reduction, then have GPT clean it up on the way down to even less content. Admittedly, that does take a lot of prompt engineering, chunk selection, and reinforcement (LLM supervising LLM).
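A minimal sketch of what that two-stage pipeline can look like, assuming an OpenAI-style chat API; the model names and prompts are just placeholders:

    from openai import OpenAI

    client = OpenAI()

    def complete(model: str, prompt: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def progressive_summarize(chunks: list[str]) -> str:
        # Stage 1: a cheap model does the bulk reduction, chunk by chunk.
        partials = [
            complete("gpt-4o-mini", f"Summarize in 3 sentences:\n\n{c}")
            for c in chunks
        ]
        # Stage 2: a stronger model merges the partial summaries and
        # compresses them down to even less content.
        notes = "\n\n".join(partials)
        return complete(
            "gpt-4o",
            f"Merge these notes into one tight paragraph:\n\n{notes}",
        )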



I understood everything except:

> reinforcement (LLM supervising LLM).

Is there something I can read to understand what that looks like?


I don't think this approach is formalized, but I can give a few examples:

A) Prompt leak prevention: chunk and embed LLM responses, then compare them against the original prompt to filter out chunks that leak it.
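A rough sketch of (A), assuming OpenAI embeddings; the similarity threshold is a guess you'd tune empirically:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(
            model="text-embedding-3-small", input=texts
        )
        return np.array([d.embedding for d in resp.data])

    def strip_prompt_leaks(system_prompt: str, response_chunks: list[str],
                           threshold: float = 0.85) -> list[str]:
        prompt_vec = embed([system_prompt])[0]
        chunk_vecs = embed(response_chunks)
        # Cosine similarity of each response chunk against the original prompt.
        sims = chunk_vecs @ prompt_vec / (
            np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(prompt_vec)
        )
        # Drop chunks that are suspiciously close to the prompt itself.
        return [c for c, s in zip(response_chunks, sims) if s < threshold]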

B) Automatic prompt refinement: prompt a cheap model, then use an expensive model to judge the output and rewrite the prompt (this is in part how Vicuna[1] did eval for their LLaMA fine-tuning).
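And a sketch of (B). The worker/judge split, the 1-10 scale, and the round limit are illustrative assumptions, not exactly how Vicuna did it:

    from openai import OpenAI

    client = OpenAI()

    def complete(model: str, prompt: str) -> str:
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

    def refine(prompt: str, max_rounds: int = 3) -> str:
        for _ in range(max_rounds):
            answer = complete("gpt-4o-mini", prompt)  # cheap worker
            verdict = complete(                       # expensive judge
                "gpt-4o",
                f"Prompt:\n{prompt}\n\nAnswer:\n{answer}\n\n"
                "Rate the answer 1-10, then suggest an improved prompt.\n"
                "Format: SCORE: <n> then PROMPT: <text>",
            )
            score_part, _, prompt_part = verdict.partition("PROMPT:")
            try:
                score = int(score_part.split("SCORE:")[1].strip())
            except (IndexError, ValueError):
                break  # judge didn't follow the format; stop refining
            if score >= 8:
                return answer
            # Feed the judge's rewritten prompt back into the next round.
            prompt = prompt_part.strip() or prompt
        return answer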

Basically, it's using LLMs in the feedback loop.

[1] https://vicuna.lmsys.org



