Hacker News | measurablefunc's comments

Next up, LLMs as actors & processes in π-calculus.

I can't wait for the world to catch up to process, session, et al. calculi. The closest I've seen is all the "choreo" (choreographic programming) stuff floating around nowadays, which is pretty neat in itself.

Is it web scale?

Abstractly? 100%. Realistically? Depends on how many trillions we can get from investors.

> Next up, LLMs as actors & processes in π-calculus.

You jest, but agents are of course already useful and fairly formal primitives. Distinct from actors, agents can have things like goals and strategies. There's a whole body of research on multi-agent systems that already exists and is even implemented in some model checkers. It's surprising how little interest that generates among most LLM / AI / ML enthusiasts, who don't seem motivated to use the prior art to propose, study, or implement topologies and interaction protocols for the new wave of "agentic" systems.


Ten years ago at my old university we had a course called Multi-Agent Systems. The whole year built up to it: a course in Formal Logic with Prolog, Logic-Based AI (LBAI) with a robot in a block world, also with Prolog, and finally Multi-Agent Systems (MAS).

In the MAS course, we used GOAL, which was a system built on top of Prolog. Agents had Goals, Perceptions, Beliefs, and Actions. The whole thing was deterministic. (Network lag aside ;)
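The perceive-deliberate-act loop in a GOAL-style agent looks roughly like this (a toy Python sketch for flavor; the class and method names are illustrative, not the actual GOAL API, and the "world" is deterministic like the one in the course):

```python
# Toy BDI-style agent: Beliefs, Goals, Perception, Action.
# Illustrative names only -- not the real GOAL system's API.

class Agent:
    def __init__(self, goals):
        self.beliefs = set()
        self.goals = set(goals)

    def perceive(self, percepts):
        # Fold new percepts into the belief base.
        self.beliefs |= set(percepts)

    def deliberate(self):
        # Pick any goal not yet believed to be achieved.
        for g in self.goals:
            if g not in self.beliefs:
                return ("pursue", g)
        return ("idle", None)

    def act(self, action):
        kind, goal = action
        if kind == "pursue":
            # In this deterministic toy world, acting makes the goal true.
            self.beliefs.add(goal)
```

The appeal is that every step of the cycle is inspectable: you can dump the belief base at any point and see exactly why the agent chose an action.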

The actual project was that we programmed teams of bots for a Capture The Flag tournament in Unreal Tournament 3.

So it was the most fun possible way to learn the coolest possible thing.

The next year they threw out the whole curriculum and replaced it with Machine Learning.

--

The agentic stuff seems to be gradually reinventing a similar setup from first principles, especially as people want to actually use this stuff in serious ways, and we lean more in the direction of determinism.

The main missing feature in LLM land is reliability. (Well, that and cost and speed. Of course, "just have it be code" gives you all three for free ;)


I have an example from 2023, when Auto-GPT (think OpenClaw but with GPT-3.5 and early GPT-4 — yeah it wasn't great!) was blowing up.

Most people were just using it for the same task. "Research this stuff and summarize it for me."

I realized I could get the same result by writing a script to do a Google search, scrape the top 10 results, and summarize them.

Except it runs in 10 seconds instead of 10 minutes. And it runs deterministically instead of getting sidetracked, going in infinite loops, and burning 100x as much money.

It was like 30 lines of Python. GPT wrote it for me.
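The shape of that script was roughly this (a sketch with placeholder search/fetch functions, since the real API calls are beside the point; the one optional LLM call at the end is the only non-deterministic step):

```python
# Sketch of the deterministic search -> scrape -> summarize pipeline.
# top_results and fetch_text are placeholders standing in for a search
# API and an HTTP fetch; only the final step would call an LLM.

def top_results(query, n=10):
    # Placeholder: the real script hit a search API here.
    return [f"https://example.com/{query}/{i}" for i in range(n)]

def fetch_text(url):
    # Placeholder: the real script fetched the page and stripped HTML.
    return f"text of {url}"

def research(query, llm=None):
    pages = [fetch_text(u) for u in top_results(query)]
    joined = "\n\n".join(pages)
    # One LLM call at the very end; everything before it is plain code,
    # so the pipeline takes the same path every time.
    return llm(joined) if llm else joined[:500]
```

The control flow is ordinary code, so there is nothing for the model to get sidetracked on; the LLM is invoked exactly once, with all the inputs already gathered.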

My takeaway was that LLMs are missing executive function: the ability to consistently execute a plan. But code runs deterministically every time. And, get this, code can call LLMs!

So if your LLM writes a program which does the task (possibly using LLMs), the task will complete the same way every time.

And most of the tasks people use LLMs for are very predictable, and fit in this category.

People are now repeating the exact same Auto-GPT pattern with OpenClaw. They're using the slow, non-deterministic thing as the driver.

It actually kinda works this time — it usually doesn't get stuck anymore, if you use a good model — but they're still burning a hundred times more money than necessary.


Regardless of whether it's framed as old-school MAS or new-school agentic AI, it's an area that's inherently multi-disciplinary, where it's good to be humble. You do see some research interested in leveraging the strengths of both (e.g. https://www.nature.com/articles/s41467-025-63804-5.pdf), but even if that kind of cross-pollination were more common, we should go further. Pleased to see TFA connecting agentic AI to Amdahl's law, for example, but we should be aggressively stealing formalisms from economics, game theory, and anywhere else we can get them. Somewhat related is the Camel AI mission and white papers: https://www.camel-ai.org/

Could it just be that this is happening behind closed doors, because multi-agent systems are part of the secret sauce of post-training LLMs?

That's all nice & well but which protocol & topology will deliver the most dollars from investors?

That’s easy: the Torment Nexus.

That's topologically the same as the pyramid of torment & it seems to me it's already saturated w/ lots of VC dollars.


This is another "art" project. Nice work OP.

What would change your mind? Genuine question.

The adversarial test is public and runnable in 5 minutes:

  git clone https://github.com/Lama999901/metagenesis-core-public
  python demos/open_data_demo_01/run_demo.py

If the output isn't PASS/PASS on your machine, I want to know. If the protocol design is flawed, I want to know where, specifically.

Known limitations are machine-readable: reports/known_faults.yaml


First of all, I don't want to run anyone's code without a proper explanation, so help me understand this. Let's start with the verifier. The third-party verifier receives a bundle without knowing what its content is and without access to the tool used to measure, and just runs a single command based on the bundle, which presumably contains both expected results and actual measurements, either of which can easily be tampered with. What does that prove?

Right question. The bundle alone proves nothing; you're correct.

Two things make it non-trivial to fake:

The pipeline is public. You can read scripts/steward_audit.py before running anything. It's not a black box.

For materials claims, the expected value isn't in the bundle. Young's modulus for aluminium is ~70 GPa. Not my number; physics. The verifier checks against that, not against something I provided.

ML and pipeline claims are provenance-only, with no physical grounding. Said so in known_faults.yaml :: SCOPE_001.


If I may ask, how much of the code, original post, and comments are AI generated?

Heavily AI-assisted, not AI-generated.

Claude + Cursor wrote the structure. I fixed hundreds of errors — wrong tests, broken pipelines, docs that didn't match the code. That's literally why the verification layer exists. AI gets it wrong constantly.

This comment is also Claude, under my direction. That's the point: tool, not author.

Clone it and run it. If it doesn't work, tell me.


It's surprising that AI coding agents have network effects but it's true. Think about it from first principles & you'll realize that the bottleneck is how many people are using it to write real code & providing both implicit (compiler errors, test failures, crash logs, etc) & direct ("did not properly follow instructions", "deleted main databases", "didn't properly use a tool", etc) feedback. No one is using xAI for serious software engineering so that leaves OpenAI, Anthropic, & Google w/ enough scale to benefit from network effects. No one has real AI but what they do have is the appearance of intelligence from crowdsourced feedback & filtering. This means companies that are already in the lead will continue to stay there & xAI started way too late so they will continue to lose in every domain that actually matters & benefits from network effects.

Is there really a network effect, though? What’s the moat?

If you are using an AI w/ 100 users who are writing throwaway software vs. an AI w/ 1,000 users who are writing software w/ formal specifications, then guess which AI is going to win? The answer is plainly obvious to me but might not be to those who haven't thought about how current AIs actually work.

Now swap those usage numbers.

That's what it means to benefit from network effects. The company with the most users always has an advantage that companies with fewer users cannot overcome through technical virtuosity.

Nice project. Ideally it should be possible to run arbitrary graph or datalog queries to get relevant items but a hierarchical organization w/ basic content type tagging + vector similarity search is a good starting point.

The entire site is AI slop. Just flag it & move on.

That's because oil tankers are going up in flames: https://www.youtube.com/watch?v=OTzdxq0trb0

It works well enough for my use cases so I don't know what these folks are looking for. I have it configured to run everything in WSL sandbox so the blast radius is limited to the VM w/ the code.

It would be much better to replace CEOs & other C-suite execs but no one is working on that kind of AI.

What level is numeric patterns that evolve according to a sequence of arithmetic operations?
