This combines a bunch of different things, but the key idea appears to be a Flink (a stream-processing framework) SQL query set up so that new documents added to the data store trigger a query, which then uses an LLM (via a custom SQL function) to e.g. generate a summary of a paper, and feeds the result on to something that sends an alert to Slack or similar.
So this is about running LLM prompts as part of an existing streaming data processing setup.
I guess you could call a trigger-based SQL query an "agent", since that term is wide open to being defined however you want to use it!
Author here, thanks for reading and commenting! Indeed my conclusion is that "agent" means different things to different people. The idea for this post was to explore what's there already and what may be missing for using SQL in this context. Following Anthropic's taxonomy, as of today SQL lets you get quite far for building workflows. For agents in their terminology, some more work is needed to integrate things like MCP, but I don't see any fundamental reason why this couldn't be done.
I’ve built lots of pre-LLM data processing pipelines like this, and the more I read about people putting “agents” into this kind of context, the less they resemble agents as the Anthropics of the world define them and the more they resemble plain functions. I wonder if eventually there won’t be a distinction, and it’ll just be a way to make processing and branching nodes in a pipeline less deterministic when you need more flexibility than pure code-rules can give you.
Perhaps a charitable take is that this is a musing on how to use immutability for building agents? Maybe a functional approach would be less restrictive than SQL.
Indeed. I'm very curious about the original source of inspiration for the post, "That microservice should have been a SQL query". It's not as catchy without AI though.
An interesting thought experiment is that a service that can both be triggered by, and write to, the same persistent queue can call an LLM in each iteration and create a self-evolving agentic workflow. Much like a Scrapy spider can recursively find new things to crawl, an agent can recursively find new things to do.
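The loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration: an in-memory deque stands in for the persistent queue, and `fake_llm` is a stand-in for a real LLM call that "discovers" follow-up work, the way a Scrapy spider discovers new links.

```python
from collections import deque

def fake_llm(task):
    # Stand-in for a real LLM call; here it just "discovers"
    # one follow-up task per input until a depth limit is hit.
    depth = task.count("/")
    return [f"{task}/sub"] if depth < 2 else []

def run(seed_tasks):
    # The same queue is both the trigger source and the sink:
    # each iteration may enqueue new work it just discovered,
    # so the workflow evolves as it runs.
    queue = deque(seed_tasks)
    processed = []
    while queue:
        task = queue.popleft()
        processed.append(task)
        queue.extend(fake_llm(task))  # recursive discovery
    return processed

print(run(["paper-123"]))
# → ['paper-123', 'paper-123/sub', 'paper-123/sub/sub']
```

The workflow terminates only because the stub imposes a depth limit; with a real LLM in the loop you would need an explicit budget or stopping condition instead.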
SQL is a great language to express “choose something from a priority queue and do something with it” but if you try to think about predefining your entire computation DAG as a single query rather than many queries over time, you lose the loose asynchronous iteration that gives an agent the same grace you’d give to a real-life task assignment. And that grace is what makes agentic workflows powerful.