schipperai · 2026-05-11T03:13:19 1778469199

This recent article from SemiAnalysis did a great job explaining part of it: https://newsletter.semianalysis.com/p/are-ai-datacenters-inc...

schipperai · 2026-05-11T02:39:51 1778467191

Very cool. How do you classify negative signals?

davideuler · 2026-05-11T03:33:26 1778470406

I've gone through several iterations to improve accuracy for release stability. I've also open sourced the project so that you can contribute to the dashboard and make it more useful: https://github.com/davideuler/agent-watch

THANK YOU to everyone who gave feedback on this tiny project.

davideuler · 2026-05-11T02:44:14 1778467454

GPT analyzes each issue to decide whether it is negative, and also whether it relates to core features. I iterated on it several times, and the dashboard now seems more reasonable than the initial version. I plan to open source the project soon so that others can contribute to building a better stability dashboard for the agents we use daily.

schipperai · 2026-05-11T02:34:46 1778466886

Which platform have you found is most hackable? I have Garmin atm and like it but there’s no easy way to pipe my data into my agent or server for offline analysis.

Orelus · 2026-05-11T07:46:36 1778485596

I’ve only really had trouble integrating Withings.

Working with Apple was also challenging because I had to purchase an Apple Watch or iPhone (the data is stored locally only, with no server or API to call, which is great from a privacy perspective) and then deploy specific code on the device.

I’m not sure if this helps your use case, but I was planning to make the API public and create a CLI (similar to Sentry or Grafana’s gcx) to access it. If you want a local-first option, though, it’s not the best solution.

schipperai · 2026-05-11T02:32:30 1778466750

I like the overall premise and would be curious to learn more. The Amazon overview reads like it was written with or by AI though.

jimako · 2026-05-12T12:18:22 1778588302

Ross, who kindly wrote the first review, was a reviewer of the book before it was published. He is a real person with over 30 years of experience in software development.

schipperai · 2026-05-11T02:17:11 1778465831

A better permissions layer for coding agents. The tool works like auto-mode for Claude Code, so you can stay in the flow and only get prompted to allow or deny tool calls when it truly matters, but it is fully deterministic. My benchmarks surfaced that most Bash calls don’t need an LLM to be classified as safe, ambiguous, or dangerous. A deterministic classifier can auto-allow or block 95% of Bash tool calls as safe or dangerous, with only the remaining 5% being truly ambiguous or unknown.

The conclusion is that permission reviews with LLMs, like Claude’s auto mode or Codex auto review, are like using a data center to flip a light switch - overkill.

The main benefit is that your agent’s autonomy can be governed deterministically through policies that can be stored at the user and repo level. The bonus is that you save tokens vs using auto modes.

https://nah.build

brianjlogan · 2026-05-11T15:20:58 1778512858

You know, I'd love the ability to "lock" a file from being read by agents.

Casual browsing of a .env is probably my top pet peeve of coding agents.

Every time a secret gets slurped into an API I have to go roll secrets.

Does this tool solve that use case?

schipperai · 2026-05-11T17:01:14 1778518874

Yes, you can define sensitive paths and assign 'ask' or 'block' policies to them.

.env, .ssh, and others are treated as sensitive filenames by default.

Similarly, with hosts and network access - unknown hosts pause, trusted hosts can be configured.
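
To make the idea concrete, here is a rough sketch of how sensitive-path and trusted-host policies could be evaluated deterministically. This is not nah's actual config format or code: the table names, the `path_policy` / `host_policy` helpers, and the example hosts are all hypothetical, chosen only to illustrate the 'ask' / 'block' behavior described above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical policy tables (assumed shape, not nah's real config).
SENSITIVE_PATHS = {".env": "ask", ".ssh": "block"}
TRUSTED_HOSTS = {"github.com", "pypi.org"}


def path_policy(path: str) -> str:
    """Return the policy for a file path: 'ask', 'block', or 'allow'."""
    parts = PurePosixPath(path).parts
    for name, policy in SENSITIVE_PATHS.items():
        # Match if any path component equals a sensitive name.
        if name in parts:
            return policy
    return "allow"


def host_policy(url: str) -> str:
    """Trusted hosts pass; unknown hosts pause and ask."""
    host = urlparse(url).hostname
    return "allow" if host in TRUSTED_HOSTS else "ask"


if __name__ == "__main__":
    for p in ["project/.env", "/home/u/.ssh/id_rsa", "src/main.py"]:
        print(p, "->", path_policy(p))
    for u in ["https://github.com/x", "https://example.org/y"]:
        print(u, "->", host_policy(u))
```

Matching on path components rather than substrings keeps a file like `dotenv_notes.md` from tripping the `.env` rule.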

schipperai · 2026-04-29T17:33:55 1777484035

With most OSS releases being MoEs, and modern GPUs optimized for MoEs, can somebody with knowledge of the topic explain or speculate why Mistral might have opted for a dense model?

ac29 · 2026-04-29T18:02:26 1777485746

Modern GPUs aren't optimized for MoEs though?

The advantage of a dense model like this Mistral one is that it is as smart as a much larger MoE model, so it can fit on fewer GPUs. The tradeoff is that it is much slower, since it has to read 100% of its weights for every token, whereas MoE models typically read only about a tenth (though sparsity levels vary).
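
A back-of-envelope sketch of that tradeoff, with made-up sizes: the 100B dense and 400B MoE parameter counts, the 2 bytes per parameter, and the helper names are all assumptions for illustration, not figures for any real model. Only the "about a tenth" active fraction comes from the comment above.

```python
def footprint_gb(total_params_b: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed just to hold the weights, in GB (params given in billions)."""
    return total_params_b * bytes_per_param


def read_per_token_gb(total_params_b: float, active_fraction: float,
                      bytes_per_param: float = 2.0) -> float:
    """Weight bytes that must be read to generate one token, in GB."""
    return total_params_b * active_fraction * bytes_per_param


if __name__ == "__main__":
    # Hypothetical models: a 100B dense model vs. a 400B MoE with ~10% active.
    models = {"dense-100B": (100, 1.0), "moe-400B": (400, 0.1)}
    for name, (params_b, active) in models.items():
        print(name,
              "footprint GB:", footprint_gb(params_b),
              "read per token GB:", read_per_token_gb(params_b, active))
```

Under these assumed numbers the dense model needs a quarter of the memory but reads more weight bytes per token, which is the fewer-GPUs-but-slower tradeoff.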

schipperai · 2026-04-29T23:34:54 1777505694

Thanks, makes sense. I meant Blackwell is explicitly optimized for MoEs.

schipperai · 2026-04-27T10:45:53 1777286753

100%. The exclusivity of the network is the differentiator here.

schipperai · 2026-04-27T09:45:18 1777283118

Agent permission layers are broken. We need a better permissions layer that doesn’t get in the way but stops destructive commands. Devs get pushed into running yolo mode because classifying allow / deny by command is not enough. A sandbox would not have prevented this either.

“nah” is a context-aware permission layer that classifies commands based on what they actually do

nah exposes a type taxonomy: filesystem_delete, network_write, db_write, etc

so commands get classified contextually:

git push ; Sure. git push --force ; nah?

rm -rf __pycache__ ; Ok, cleaning up. rm ~/.bashrc ; nah.

curl harmless url ; sure. curl destroy_db ; nah.
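
A toy sketch of what classifying by action type, rather than by command name, could look like. This is not nah's implementation: the rules, the `DISPOSABLE_DIRS` set, and the `git_write` / `git_history_rewrite` type names are hypothetical, and only `filesystem_delete` is taken from the taxonomy above.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical set of directories that are safe to delete (assumption).
DISPOSABLE_DIRS = {"__pycache__", "node_modules", ".pytest_cache"}


def classify(command: str) -> tuple[str, str]:
    """Return (action_type, decision) with decision in allow / ask / block."""
    argv = shlex.split(command)
    if not argv:
        return ("unknown", "ask")
    prog, args = argv[0], argv[1:]

    if prog == "git" and args[:1] == ["push"]:
        if "--force" in args or "-f" in args:
            return ("git_history_rewrite", "ask")
        return ("git_write", "allow")

    if prog == "rm":
        targets = [a for a in args if not a.startswith("-")]
        # Allow only relative paths whose final component is disposable.
        safe = bool(targets) and all(
            PurePosixPath(t).name in DISPOSABLE_DIRS
            and not t.startswith(("/", "~"))
            for t in targets
        )
        return ("filesystem_delete", "allow" if safe else "block")

    # Anything the rules do not recognize stays ambiguous.
    return ("unknown", "ask")


if __name__ == "__main__":
    for cmd in ["git push", "git push --force",
                "rm -rf __pycache__", "rm ~/.bashrc"]:
        print(cmd, "->", classify(cmd))
```

The same program (`rm`, `git`) lands on different decisions depending on its arguments, which is the point of classifying by effect instead of by command.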

https://github.com/manuelschipper/nah

Better permission layers are part of the answer here, and a space that has been only narrowly explored.

schipperai · 2026-03-13T13:57:11 1773410231

nah inspects Write and Edit content before it hits disk, so destructive patterns like os.unlink, rm -rf, and shell injection get flagged. And executing the result (./evil) classifies as unknown, which resolves to ask, which the LLM can choose to block or ask you to approve.

But yeah, a truly adversarial agent needs a sandbox. It's a different threat model - nah is meant to catch the trusted but mistake-prone coding CLI, not a hostile agent.
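
For a sense of what inspecting Write/Edit content before it hits disk might involve, here is a minimal sketch. The pattern table and the `flag_content` helper are hypothetical and far cruder than a real checker would be; only the os.unlink and rm -rf examples come from the comment above.

```python
import re

# Hypothetical pattern table (assumption, not nah's rule set).
DESTRUCTIVE_PATTERNS = {
    "os.unlink": re.compile(r"\bos\.unlink\s*\("),
    "rm -rf": re.compile(r"\brm\s+-(?:rf|fr)\b"),
}


def flag_content(content: str) -> list[str]:
    """Return the names of destructive patterns found in file content."""
    return [name for name, pattern in DESTRUCTIVE_PATTERNS.items()
            if pattern.search(content)]


if __name__ == "__main__":
    sample = "import os\nos.unlink('data.db')\n"
    print(flag_content(sample))
```

Regex matching like this catches the careless case, not deliberate obfuscation, which fits the trusted-but-mistake-prone threat model rather than a hostile agent.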

schipperai · 2026-03-13T13:45:09 1773409509

Great callout - tool calls can have side effects outside your box. So unless you run a sandbox with no internet access, you aren't ever 100% safe.

nah does guard some of this - reading .env or ~/.aws/credentials gets flagged, and Write/Edit content is inspected for secrets before it leaves the tool.

Docker + filtered mounts + something like nah on top is a solid layered approach that is still practical.