We are building this learned software system at Docflow Labs to solve the integration problem in healthcare at scale, i.e. systems that can only talk to other systems via web portals. RPA has historically been awful to build and maintain, so we've needed to build this to stay above water. Happy to answer any questions!
We are building this at Docflow Labs, i.e. a self-healing system that can respond to customer feedback automatically. And you're right that not all customers know what they want, or even how to express it when they do, which is why the agent loop we have facing them is far more discovery-focused than the internal one.
And we currently still have humans in the loop for everything (for now!) - e.g., the agent does not move on to implementation until the root cause has been approved
Cool, I tried something similar over a couple of weeks, but the problem I ran into was that beyond a fairly low level of complexity, the English spec became more confusing than the code itself. Even for a simple multi-step KYC workflow, it got very convoluted and hard to make precise, whereas in code it's a couple of loops and if/else blocks with no possibility of misinterpretation. Have you encountered that at all, or found any techniques that help in these situations?
That's why I feel like iterative workflows have won out so far. Each step gets you x% closer, so you close in on your goal exponentially, whereas the one-shot approach closes in much more slowly, and each attempt starts from scratch. The advantage of one-shotting is that you end up with a spec for the whole system, though you can also just generate that from the code if you write the code first.
that's right, and agents turning specs into software can go in all sorts of directions especially when we don't control the input.
what we've done to mitigate that is essentially back every entry point (customer comment, internal ticket, etc.) with a remote Claude Code session with persistent memory - that session essentially becomes the expert in the case. And we've developed checkpoints that we know from experience work (e.g. the root-cause one) where a human has the opportunity to take the wheel, so to speak, and drive in a different direction with all the context/history up to that point.
basically, we are creating an assembly line where agents do most of the work and humans do less and less as we continue to optimize the different parts of the assembly
as far as techniques, it's all boring engineering
* Temporal workflow for managing the lifecycle of a session
* complete ownership of the data model e2e. We don't use Linear, for example; we built our own ticketing system so we could represent Temporal signals, GitHub webhooks, and events from the remote Claude sessions exactly how we wanted
* incremental automation gains over and over again. We do a lot of the work manually first (like old-fashioned hand coding lol) before trying to automate, so we become experts in that piece of the assembly line and it becomes obvious how to incrementally automate... rinse and repeat
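To make the checkpoint idea concrete, here's a minimal sketch of the kind of session lifecycle a workflow like that might manage. The state names and transitions are hypothetical for illustration, not our actual schema, and this deliberately leaves out the Temporal specifics:

```typescript
// Hypothetical session lifecycle; states and transitions are illustrative.
type SessionState =
  | "triage"
  | "root_cause_proposed"
  | "root_cause_approved" // human checkpoint sits here
  | "implementing"
  | "in_review"
  | "done";

// Allowed transitions. The human-approval checkpoint is the gate between
// root_cause_proposed and implementing.
const transitions: Record<SessionState, SessionState[]> = {
  triage: ["root_cause_proposed"],
  root_cause_proposed: ["root_cause_approved", "triage"], // human can send it back
  root_cause_approved: ["implementing"],
  implementing: ["in_review"],
  in_review: ["done", "implementing"], // review can bounce it back
  done: [],
};

function advance(from: SessionState, to: SessionState): SessionState {
  if (!transitions[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}
```

The point of modeling it this way is that "where can a human take the wheel" becomes an explicit, auditable property of the state machine rather than something buried in agent prompts.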
Ooh, it sounds like you've already got most of the groundwork done for something I was wondering about yesterday: I'd love it if there were some way, during an incident, for a system to pull all the PRs included in the latest release, check which agents worked on them (i.e. a line in the commit message with an identifier that corresponds to the agent's LLM context and any other data at the time of commit), "rehydrate" those agents from the corresponding stored context, feed them the relevant incident data, and ask whether it could be related to their changes and what to do about it.
In most cases it might not be much more valuable than just looking through the diffs from scratch with a new agent, but there are probably going to be some cases where a rehydrated agent is like "Doh, I meant to do X but it looks like I hallucinated Y instead. Here's a PR to fix it!"
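As a sketch of just the commit-identifier part (the `Agent-Session` trailer name here is something I'm making up), mapping a release's commits back to the agent sessions that produced them could be as simple as:

```typescript
// Hypothetical: recover agent session IDs from commit-message trailers so a
// rehydration step knows which stored contexts to load.
interface CommitRef {
  sha: string;
  message: string;
}

// Group commit SHAs by the "Agent-Session: <id>" trailer, if present.
function agentSessions(commits: CommitRef[]): Map<string, string[]> {
  const bySession = new Map<string, string[]>();
  for (const c of commits) {
    const m = c.message.match(/^Agent-Session:\s*(\S+)\s*$/m);
    if (!m) continue; // manual commit, no agent involved
    const shas = bySession.get(m[1]) ?? [];
    shas.push(c.sha);
    bySession.set(m[1], shas);
  }
  return bySession;
}
```

From there each session ID would key into whatever store holds the agent's context at commit time.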
I know that's just a small piece of what you're doing, but I think it's something that would be valuable on its own, and soon something that is likely to be "standard infrastructure" for any company that does even a little agentic coding (assuming it works). It'd probably even be "required infrastructure" in regulated industries; the fact that all these agent contexts are ephemeral has to be a red flag from a regulatory perspective.
totally, it's like AI-native GitHub with some Linear mixed in, plus some ability to push the ball forward autonomously. This doesn't exist yet, so we had to build a version internally, but we also built it pretty specifically for our needs. The general version might have to be more componentized, not sure. We also, as an industry, probably need some version-control protocol above git that includes all the history around the commit, so we don't have to string together root-cause documents and conversation history in S3 linked via relational entities in Postgres.
> Perhaps the key to transparent/interpretable ML is to just replace the ML model with AI-coded traditional software and decision trees. This way it's still fully autonomously trained but you can easily look at the code to see what is going on.
For certain problems I think that's completely right. We're still not going to want that, of course, for classic ML domains like vision and now coding, etc. But for those domains where a software substrate is appropriate, software has a huge interpretability and operability advantage over ML
> We still are not going to want that of course for classic ML domains like vision
It could make sense to decompose one large opaque model into code with decision trees calling out to smaller models having very specific purposes. This is more or less science fiction right now, 'mixture of experts' notwithstanding.
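As a purely speculative sketch of that decomposition (the routing and the "experts" here are stubbed as plain functions, not real models), the interpretable part would just be ordinary code you can read:

```typescript
// Hypothetical: narrow "expert" models only at the leaves, readable
// decision logic on top. The experts are stand-in stubs.
type Expert = (input: string) => string;

const experts: Record<string, Expert> = {
  ocr: (img) => `text-from:${img}`, // stand-in for a small vision model
  sentiment: (txt) => (txt.includes("!") ? "excited" : "neutral"), // stand-in classifier
};

// The interpretable layer: you can read the routing directly.
function classify(input: { kind: "image" | "text"; payload: string }): string {
  if (input.kind === "image") return experts.ocr(input.payload);
  return experts.sentiment(input.payload);
}
```

The bet would be that most of the opaque model's "glue" reasoning can live in the readable layer, leaving only genuinely perceptual work to the small models.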
You could potentially get a Turing award by making this work for real ;)
I don’t think that’s the right take. Poetry manipulates common grammatical rules and still communicates meaning from the writer to the reader, perhaps in an even deeper way because of that manipulation. Of course in Java and many other programming languages, grammatical errors will simply not compile. LISP is one of those few languages where grammar can change from program to program, much like with poetry
Even though there is much more freedom in poetry, it is still defined by a specific set of rules/features: verses, rhythm, stanzas, spacing, meter, and rhyming. It's only because of these restrictions that it is so obvious when writing is or isn't poetry. These features and forms can be stretched, but unlike Lisp they cannot be completely redefined.
You’ve got to read more poetry before making assertions like this. In practice, the definition is more fluid than that.
Lisp cannot be completely redefined. You can’t avoid parentheses, and if you stray too far from common idiom, you’re no longer writing Lisp, you’re writing something else using Lisp syntactic forms.
> It's only because of these restrictions that it is so obvious when writing is or isn't poetry. These features and forms can be stretched, but unlike Lisp they cannot be completely redefined.
I disagree here. To take rhyming as an example: it's possible to have a poem where every line rhymes AND a poem where there is no rhyme at all. It's not as simple as saying 'okay, the lines in this text don't rhyme, so it can't be a poem'. The same is true of things like spacing and meter. These are all massively variable, and the result doesn't even have to be bound by the usual rules of grammar. English - or any other natural language - is much more variable than Lisp.
For me the defining feature of poetry is that the form and nature of the language used in a text may suggest meaning over and above what the individual words say. This definition is subjective, and suggests that the poetry is in the eye of the beholder, but is more honest than a simplistic checklist of features to look out for.
I didn't say poetry has to have all of those things, but it has to contain some of them or it simply isn't poetry. I would challenge you to find me one good example of poetry that has none of the features I listed.
This whole poetry topic is really beside the point anyway.
> English - or any other natural language - is much more variable than Lisp.
I don't feel like you are actually addressing what I'm saying, so let me reiterate it more clearly. I'm not making any assertions about the absolute creative power of Lisp or writing. It is the author of the article who points out that Lisp's distinguishing feature, compared to other programming languages, is its ability to specialize and mutate its own verbiage/syntax to better fit certain problems or modes of thinking. I am simply pointing out the irony that this characteristic of Lisp also distinguishes it significantly from natural language, even though the author is attempting to argue that programming Lisp and writing literature are similar.
There's a new mode of programming (with AI) that doesn't require English and also results in massive efficiency gains. I now only need to begin a change, and the AI can normally pick up on the pattern and do the rest via subsequent "tab" key hits as I audit each change in real time. It's like I'm expressing the change I want via a code example to a capable intern who quickly picks up on it and can type at 100x my speed, but not faster than I can read.
I'm using Cursor btw. It's almost a different form factor compared to something like GH copilot.
I think it's also worth noting that I'm using TypeScript with a functional programming style. The state of the program is immutable and encoded via strongly typed inputs and outputs. I spend (mental) effort reifying use-cases via enums or string literals, enabling a comprehensive switch over all possible branches as opposed to something like imperative if statements. All this to say, that a lot of the code I write in this type of style can be thought of as a kind of boilerplate. The hard part is deciding what to do; effecting the change through the codebase is more easily ascertained from a small start.
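As a rough sketch of the style I mean (the `KycStep` names are invented for illustration): use-cases reified as string-literal discriminants, with an exhaustive switch instead of imperative ifs, so the compiler flags any unhandled case:

```typescript
// Illustrative discriminated union; case names are made up.
type KycStep =
  | { kind: "collect_documents"; required: string[] }
  | { kind: "verify_identity"; provider: string }
  | { kind: "manual_review"; reviewer: string };

function describe(step: KycStep): string {
  switch (step.kind) {
    case "collect_documents":
      return `collect: ${step.required.join(", ")}`;
    case "verify_identity":
      return `verify via ${step.provider}`;
    case "manual_review":
      return `escalate to ${step.reviewer}`;
    default: {
      // Compile error if a new variant is added but not handled above.
      const _exhaustive: never = step;
      return _exhaustive;
    }
  }
}
```

The `never` trick in the default branch is what makes the switch comprehensive: adding a fourth variant breaks the build until every consumer handles it, which is exactly the kind of mechanical boilerplate an AI can fill in from a small start.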
Provided that we ignore the ridiculous waste of energy entailed by calling an online LLM every time you type a word in your editor, I agree that LLM-assisted programming as "autocomplete on steroids" can be very useful. It's awfully close to a good editor using the type system of a good programming language to provide suggestions.
I too love functional programming, and I'm talking about Haskell-levels of programming efficiency and expressiveness here, BTW.
This is quite a different use case than those presented by the post I was replying to though.
The Go programming language has this mantra of "a little bit of copy and paste is better than a little bit of dependency on other code". I find that LLM-derived source code takes this mantra to an absurd extreme, and furthermore that it encourages a thought pattern that never leads you to discover, specify, and use adequate abstractions in your code. All higher-level meaning and context is lost in the end product (your committed source code) unless you already think like a programmer _not_ being guided by an LLM ;-)
We do digress though - the original topic is that of LLM-assisted writing, not coding. But much of the same argument probably applies.
TypeScript is intentionally _structurally_ typed (except for enums). The example you describe would only be caught by a _nominally_ typed language (think Java).
Structural typing is great when programming functionally (i.e. with immutability) when the most important thing is the shape of inputs and outputs of functions, instead of named objects (like Person) and properties. In a functional program, for example, "printAge" would likely be called something like "print" or "printNumber" since that is what it is doing to its input _value_.
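A minimal illustration of the difference, along with the common "branding" trick for the cases where you do want nominal-ish behavior (`Person`/`Employee`/`UserId` are just example names):

```typescript
// Structurally identical types are interchangeable in TypeScript.
interface Person { name: string; age: number }
interface Employee { name: string; age: number }

// The function only cares about the shape of its input value.
function printAge(p: { age: number }): string {
  return `age: ${p.age}`;
}

const e: Employee = { name: "Ada", age: 36 };
const p: Person = e; // fine: same shape, different declared names

// "Branding" simulates nominal typing when two same-shaped types
// must not be mixed up (e.g. different kinds of IDs).
type UserId = string & { readonly __brand: "UserId" };
const asUserId = (s: string) => s as UserId;
```

The brand is erased at runtime; it exists purely so the compiler refuses to pass a plain `string` (or some other branded string) where a `UserId` is expected.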
I think a lot of the misunderstanding I've seen recently around TypeScript (like from the Rails creator) comes from misuse of TypeScript - if you use TypeScript in an object-oriented way, it's going to be significantly less helpful.
I don't follow this one. I've never seen anyone use TypeScript with an OO approach aside from, ironically, the .NET folks.
The code I wrote has nothing OO about it, and we can already see the issues. The majority of TS I've ever worked with was written for React, and it would still benefit greatly from nominal types, as you call them (thanks, I didn't know that terminology).
I don't see everyone misusing TS. For me, it's simply a very limited language as far as typed languages go. As a result, it's a shame that it's what is being touted as a good example of why you should use typed languages.
That hasn't been my experience. ReScript and Elm are much better compared to the fragile types I've encountered with TypeScript (where I needed to write code to do the type checking). Happy if you've found something that works for you though.