Could you indulge a bit in describing the workflow?
For example, the "story info extraction" and its "results evaluation" deserve some more in-depth explanation.
How do you have the LLM summarize the article and the HN submission page: do you put the whole text in the input (as in: "Summarize the following text: <page text here>")?
For the story summary: send a prompt to the Perplexity API (which can access URLs) requesting an extraction of the article's content (a sort of very detailed technical brief). Then use Gemini to classify the result from the Perplexity API as valid or invalid (sometimes Perplexity can't access the URL, and it doesn't reliably return the marker string I've asked it to return in that case). After that, send a more detailed prompt to Gemini requesting that the story summary be generated in a specific style and format (Markdown).
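A rough sketch of that pipeline might look like the following. The model names, prompts, sentinel string, and validity check are assumptions for illustration; only the overall extract-validate-summarize flow comes from the description above.

```python
import os
import requests
import google.generativeai as genai

PPLX_KEY = os.environ["PERPLEXITY_API_KEY"]          # assumed env var names
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
gemini = genai.GenerativeModel("gemini-1.5-flash")   # assumed Gemini model

def extract_article(url: str) -> str:
    """Ask Perplexity (which can fetch URLs) for a detailed technical brief of the article."""
    resp = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {PPLX_KEY}"},
        json={
            "model": "sonar",  # assumed Perplexity model
            "messages": [{
                "role": "user",
                "content": (
                    f"Extract the content of {url} as a very detailed technical brief. "
                    "If you cannot access the URL, reply only with: UNREACHABLE"
                ),
            }],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def extraction_is_valid(extraction: str) -> bool:
    """Perplexity doesn't reliably emit the sentinel on failure, so classify the result with Gemini."""
    verdict = gemini.generate_content(
        "Does the following text contain a real article extraction, or is it an "
        "error/apology/empty response? Answer VALID or INVALID only.\n\n" + extraction
    )
    return verdict.text.strip().upper().startswith("VALID")

def summarize_story(extraction: str) -> str:
    """Generate the final story summary with specific style and format guidance, in Markdown."""
    summary = gemini.generate_content(
        "Summarize the following technical brief as a Markdown story summary, "
        "following these style and formatting instructions: <instructions here>\n\n" + extraction
    )
    return summary.text
```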
For the comments summary: use the HN API to pull all comments, then prepare a Markdown document containing either all comments or, if the thread exceeds a given size, a subset selected by user karma, reply count, and thread depth. Annotate each comment with an index (e.g., 1.1) to convey the nesting to the LLM. Along with some formatting and stylistic guidance, send that document to the LLM requesting a summary.
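A minimal sketch of the comment-gathering step, using the public HN Firebase API (https://hacker-news.firebaseio.com/v0). The ranking below is a placeholder: the real selection also weighs user karma, which would need extra /user/&lt;id&gt;.json lookups, and the size threshold and output formatting are assumptions.

```python
import requests

HN_API = "https://hacker-news.firebaseio.com/v0"

def fetch_item(item_id: int) -> dict:
    """Fetch a story or comment from the HN Firebase API."""
    return requests.get(f"{HN_API}/item/{item_id}.json", timeout=30).json()

def collect_comments(item: dict, index: str = "", depth: int = 0) -> list[dict]:
    """Walk the comment tree, tagging each comment with a nesting index like 1.1.2."""
    out = []
    for i, kid_id in enumerate(item.get("kids", []), start=1):
        kid = fetch_item(kid_id)
        if not kid or kid.get("deleted") or kid.get("dead"):
            continue
        kid_index = f"{index}.{i}" if index else str(i)
        out.append({
            "index": kid_index,
            "by": kid.get("by", ""),
            "text": kid.get("text", ""),
            "replies": len(kid.get("kids", [])),
            "depth": depth,
        })
        out.extend(collect_comments(kid, kid_index, depth + 1))
    return out

def comments_markdown(story_id: int, max_comments: int = 200) -> str:
    """Build the annotated Markdown document that gets sent to the LLM for summarization."""
    comments = collect_comments(fetch_item(story_id))
    if len(comments) > max_comments:
        # Placeholder ranking; the real selection uses user karma, replies, and thread depth.
        comments = sorted(comments, key=lambda c: (c["replies"], -c["depth"]), reverse=True)[:max_comments]
        comments.sort(key=lambda c: [int(p) for p in c["index"].split(".")])  # restore thread order
    return "\n\n".join(f"**{c['index']}** ({c['by']}): {c['text']}" for c in comments)
```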
While the story summaries are generally static once generated, regenerate the comment summaries on an interval (this could be optimized).
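The regeneration can be as simple as a per-story staleness check; this is a hypothetical sketch, since the actual scheduling mechanism and interval aren't described above.

```python
import time

REFRESH_SECONDS = 30 * 60              # assumed interval
_last_refresh: dict[int, float] = {}   # story_id -> timestamp of last comment-summary run

def maybe_refresh_comments(story_id: int, regenerate) -> None:
    """Regenerate a story's comment summary only if it is older than the interval."""
    now = time.time()
    if now - _last_refresh.get(story_id, 0.0) >= REFRESH_SECONDS:
        regenerate(story_id)           # e.g. summarize comments_markdown(story_id) with the LLM
        _last_refresh[story_id] = now
```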