Hacker News | sidk24's comments

Author here. IMO, we have better observability for a Node.js service than for an AI agent.

I build AI agent infrastructure. The post came from a real debugging session. An agent modified 47 files, the build failed, and I spent twenty minutes scrolling terminal output before giving up and starting over.

The core argument: we solved observability for microservices over the last decade (OpenTelemetry, Datadog, Honeycomb, Grafana). AI agents are also distributed systems. Multiple LLM calls, tool invocations, file operations, decision points. But there is no structured trace, no cost attribution per task, no permission audit trail, and no session replay.

Four questions you cannot answer today:

1. What did the agent do? (no structured trace)
2. Why did it do it? (context is ephemeral)
3. What did it cost? (no per-task attribution)
4. What was it allowed to do? (no permission audit trail)

The patterns exist in distributed systems observability. They need to be adapted, not invented. OpenTelemetry's data model (trace IDs, spans, parent-child relationships) maps directly to agent execution.

Happy to discuss the technical details. Particularly interested in hearing from teams that have built ad-hoc agent logging and what they learned.


Author here. Wrote this after my "AI fatigue is real" post hit #1 here earlier this year.

The DMs from that post (hundreds of engineers describing the same problems) made it clear there was no single reference covering the full agent stack for engineering teams.

33 chapters, 10 parts. Early version, open source.

There are rough edges and likely mistakes. PRs welcome:

https://github.com/Siddhant-K-code/agentic-engineering-guide

Happy to answer questions about any of the chapters.


I spent the last few months writing about the layer underneath individual agent usage: what happens when you go from one developer using a coding agent to a team shipping agents in production. Authorization, context engineering, cost control, observability, incident response, adoption.

The result is a 33-chapter guide, free to read online: https://agents.siddhantkhare.com

Early version — open source, CC BY-NC-SA 4.0.

There are rough edges and likely mistakes. Corrections welcome: https://github.com/Siddhant-K-code/agentic-engineering-guide

Individual patterns for using agents well are one piece. The infrastructure, security, and team practices for running them safely at scale are another. This covers the second part.


Author here. I work on authorization infrastructure (I maintain OpenFGA, a CNCF incubating project) and have been building agent security tooling.

The Check Point disclosure this week (CVE-2025-59536, CVE-2026-21852) showed that malicious repo configs could execute shell commands and steal API keys before the trust prompt even appeared. Anthropic patched the specific bugs. But the underlying problem is architectural.

Claude Code gives you two options: approve every mkdir and npm test individually, or pass "--dangerously-skip-permissions" and give the agent unrestricted access to your filesystem, network, and shell. Most devs end up on the second option within a week.

We solved this for CI/CD and service accounts decades ago. Declarative policies, scoped permissions, audit trails. None of that exists for AI agents yet.

The post lays out what a real permission model would look like: declarative policy files per project, relationship-based scoping (so a feature branch agent gets different access than a production hotfix agent), and structured audit logs by default.
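
A toy sketch of that model (all names and patterns here are hypothetical, not Claude Code's actual config): a declarative policy keyed by agent role, checked before each tool call, with every decision appended to an audit log by default.

```python
import fnmatch
from datetime import datetime, timezone

# Hypothetical per-project policy: what each agent role may do, declared up front.
POLICY = {
    "feature-branch": {
        "fs.write": ["src/**", "tests/**"],          # may edit source and tests
        "shell.exec": ["npm test", "npm run lint"],  # allowlist, not open shell
    },
    "prod-hotfix": {
        "fs.write": ["src/**"],
        "shell.exec": ["npm test"],
    },
}

AUDIT_LOG = []

def check(role: str, action: str, target: str) -> bool:
    """Allow the action only if the policy matches; log the decision either way."""
    patterns = POLICY.get(role, {}).get(action, [])
    allowed = any(fnmatch.fnmatch(target, p) for p in patterns)
    AUDIT_LOG.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "role": role, "action": action, "target": target, "allowed": allowed,
    })
    return allowed
```

With this shape, check("feature-branch", "fs.write", "src/index.ts") passes while check("feature-branch", "fs.write", ".env") is denied and logged, which is the same declarative-policy-plus-audit-trail pattern CI/CD service accounts already use.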

Happy to answer questions about the auth patterns or the Check Point findings.


Author here. Wrote the "AI fatigue" post that was on HN a few weeks ago. This is the follow-up.

The short version: AI made code generation fast, but nobody invested in making code verification fast. The human became the bottleneck. That's what causes the fatigue.

The fix is backpressure, a concept borrowed from systems engineering: automated feedback (types, tests, linters, architectural rules) that catches agent mistakes before they reach you.

A few things I learned from talking to teams:

- One team cut their test suite from 15 min to 90 seconds specifically for agent iteration speed. Paid for itself in a week.

- Pre-commit hooks went from "annoying" to essential. Agents don't complain. Turn everything on.

- BoundaryML calls this "agentic backpressure" - they did a whole podcast on it. The Ralph Wiggum loop community builds workflows around the same idea.

- The post has a hierarchy (types > tests > linters > architectural rules > human review last) and a Monday-morning checklist.

Backpressure won't catch everything; an agent can pass every test and still take the wrong approach. But it reduces the noise so you can focus on the signal.

Curious what feedback loops you've built around your agents.


Real talk: this isn't about latency, it's about authorship.

“My machine” is identity and control. That worked when the human was the execution engine. With agents, undocumented setup turns from annoyance into hard failure.

The larger point is that agent workflows expose environment debt the way CI exposed testing debt.

If an environment can’t be provisioned programmatically, it’s not infrastructure, it’s folklore.


I work at Ona (formerly Gitpod). Background agents from Stripe, Ramp, and Shopify are shipping real code in production, but they all depend on one thing: a fully automated, reproducible dev environment. No localhost, no local state, no "works on my machine."

The companies moving fastest right now have all done the boring work first: standardizing their dev environments. The agent harness is a thin layer on top.

Let us know what you think!


Author here: it was written almost entirely by a human; AI was used only to improve the English and grammar.


...and I usually come to doubt my own intuition when people say things like this, but in my experience the LLM is doing more of the heavy lifting than you realise.

> Distill - deterministic context deduplication for LLMs. No LLM calls, no embeddings, no probabilistic heuristics. Pure algorithms that clean your context in ~12ms.

I simply do not believe that this is human-generated framing. Maybe you think it said something similar before. But I don't believe that is the case. I am left trying to work out what you meant through the words of something that is trying to interpret your meaning for you.


Author here. Not an anti-AI post. It's about the cognitive cost - faster tasks lead to more tasks, reviewing AI output all day causes decision fatigue, and the tool landscape churns weekly. Wrote about what actually helped. Curious if others are hitting similar walls.


Why did you use an LLM to write/change the words in your blog and your post? It really accentuates the sense of fatigue when I can tell I'm not interacting with a human on the other side of a message.



Some of the points raised in the article resonate with me, but I see a lot of trademark phrases inserted by LLMs ("it's not X, it's Y" being the most obvious). Can you share your writing process? How much did you write yourself? Did you use an LLM to proofread, to write the entire text from bullet points, or not at all?


Great post, I certainly feel you. It's not just the anxiety but the need to push myself to accomplish more now that I have some help. Setting the right expectations about what is actually practical, and accepting that not every "AI magic" post is worth my attention, has helped with the anxiety and the FOMO.


Thanks <3

I've started doing it now, though I still need to work on it. Thanks for the tip, I hope it's working well for you!


Isn't it a bit too ironic that you expect us to read your AI-generated slop about AI fatigue?

