This combines a bunch of different things, but the key idea appears to be a Flink (a stream-processing framework) SQL query set up so that new documents added to the data store trigger a query, which then uses an LLM (via a custom SQL function) to e.g. generate a summary of a paper, and feeds the result on to something that sends an alert to Slack or similar.
So this is about running LLM prompts as part of an existing streaming data processing setup.
I guess you could call a trigger-based SQL query an "agent", since that term is wide open to being defined however you want to use it!
Author here, thanks for reading and commenting! Indeed my conclusion is that "agent" means different things to different people. The idea for this post was to explore what's there already and what may be missing for using SQL in this context. Following Anthropic's taxonomy, as of today SQL lets you get quite far for building workflows. For agents in their terminology, some more work is needed to integrate things like MCP, but I don't see any fundamental reason why this couldn't be done.
I’ve built lots of pre-LLM data processing pipelines like this, and the more I read about people putting “agents” into this kind of context, the less they resemble agents as the Anthropics of the world define them and the more they resemble plain functions. I wonder if eventually there won’t be a distinction, and it’ll just be a way to make processing and branching nodes in a pipeline less deterministic when you need more flexibility than pure code-rules can give you.
Perhaps a charitable take is that this is a musing on how to use immutability for building agents? Maybe a functional approach would be less restrictive than SQL.
Indeed. I'm very curious about the original source of inspiration for the post, "That microservice should have been a SQL query". It's not as catchy without AI though.
An interesting thought experiment is that a service that can both be triggered by, and write to, the same persistent queue can call an LLM in each iteration and create a self-evolving agentic workflow. Much like a Scrapy spider can recursively find new things to crawl, an agent can recursively find new things to do.
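The loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration: an in-memory deque stands in for the persistent queue, and `fake_llm` is a stand-in for a real LLM call that "discovers" follow-up work, the way a Scrapy spider discovers new links.

```python
from collections import deque

def fake_llm(task):
    # Stand-in for a real LLM call; here it just "discovers"
    # one follow-up task per input until a depth limit is hit.
    depth = task.count("/")
    return [f"{task}/sub"] if depth < 2 else []

def run(seed_tasks):
    # The same queue is both the trigger source and the sink:
    # each iteration may enqueue new work it just discovered,
    # so the workflow evolves as it runs.
    queue = deque(seed_tasks)
    processed = []
    while queue:
        task = queue.popleft()
        processed.append(task)
        queue.extend(fake_llm(task))  # recursive discovery
    return processed

print(run(["paper-123"]))
# → ['paper-123', 'paper-123/sub', 'paper-123/sub/sub']
```

The workflow terminates only because the stub imposes a depth limit; with a real LLM in the loop you would need an explicit budget or stopping condition instead.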
SQL is a great language to express “choose something from a priority queue and do something with it” but if you try to think about predefining your entire computation DAG as a single query rather than many queries over time, you lose the loose asynchronous iteration that gives an agent the same grace you’d give to a real-life task assignment. And that grace is what makes agentic workflows powerful.