Hacker News | runekaagaard's comments

It's impossible not to get decision fatigue and just mash enter anyway after a couple of months with Claude not messing anything important up, so a sandboxed approach in YOLO mode feels much safer.

It takes away the stress of needing to monitor all the agents all the time too, which is great, and it creates incentives to learn how to build longer tasks for CC with more feedback loops.

I'm on Ubuntu 22.04 and it was surprisingly pleasant to create a layered sandbox approach with bubblewrap and Landlock LSM: Landlock for filesystem restrictions (deny-first, only whitelisted paths accessible) and TCP port control (API, git, local dev servers), bubblewrap for mount namespace isolation (/tmp per-project, hiding secrets), and dnsmasq for DNS whitelisting (only essential domains resolve - everything else gets NXDOMAIN).
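To make the bubblewrap layer a bit more concrete, here is a minimal sketch of how such a command line could be assembled. It is not the actual setup described above: the function name, the paths, and the choice of binds are assumptions for illustration, and the Landlock and dnsmasq layers are not shown. Only standard bwrap options are used.

```python
from pathlib import Path


def build_bwrap_argv(project_dir: str, private_tmp: str, command: list[str]) -> list[str]:
    """Return an argv list that runs `command` inside a bubblewrap sandbox.

    Hypothetical helper: the layout assumes a merged-/usr distro such as
    Ubuntu 22.04 and is only one of many possible configurations.
    """
    project = str(Path(project_dir))  # normalizes a trailing slash
    return [
        "bwrap",
        "--unshare-all",        # unshare every namespace bwrap supports
        "--share-net",          # keep the network; filtering is left to other layers
        "--die-with-parent",    # kill the sandbox when the launcher exits
        "--ro-bind", "/usr", "/usr",
        "--ro-bind", "/etc", "/etc",
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--proc", "/proc",
        "--dev", "/dev",
        "--bind", private_tmp, "/tmp",   # per-project /tmp
        "--bind", project, project,      # the project is the only writable real directory
        "--chdir", project,
        *command,                        # bwrap stops parsing at the first non-option
    ]


if __name__ == "__main__":
    print(" ".join(build_bwrap_argv("/home/me/proj", "/home/me/.sandbox/proj-tmp", ["bash"])))
```

Because the home directory is never bound, dotfiles and keys outside the project simply do not exist inside the mount namespace, which is the "hiding secrets" part.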


I've been working for the past several weeks in an environment where it's easy and safe to give different claudes yolo-mode, but yesterday I needed to build an Emacs TRAMP plugin, and I had to do that on my local development NUC. I am extremely spoiled for yolo-mode, because even just yes-ok'ing all the elisp fragments claude came up with was exasperating, the whole experience was draining, and that was me not being especially careful (just making sure it didn't run random bash commands to, like, install a different Emacs or something).

Configuring Claude Code ... the new init.el ;)

... also interested. What would one build an Emacs TRAMP plugin for? :)

Directly editing files on a remote VM that happens to have an API for directly accessing files.

I'm currently stuck on Windows, but I thought sandboxing was built in to Claude Code as a feature on Linux with the /sandbox command?

For Windows a quick win is to install VMware Workstation Pro (which is free) and install Ubuntu 24.04 LTS as a VM.

Broadcom bought VMware and then released Workstation Pro for free. I don't think they kept the download link, but you can get it from TechPowerUp:

https://www.techpowerup.com/download/vmware-workstation-pro/

You can then let LLMs run in YOLO mode inside it.


What is the advantage of using VMware Workstation Pro for this as opposed to using WSL2?

I think it has default access to your c drive via a mount, for one. You could add layers/sandboxes, but it’s not isolated.

Funny, I wrote some environment initialization and setup scripts that you just unzip to a new dev desktop. You run the first PowerShell script and it works through the installs (with a reboot after a couple of them); then, once WSL is up, it relies on the /mnt/c/ paths to run bash scripts that initialize the WSL environment too. It was pretty handy.

Yeah, I do most Linux stuff on Windows in containers using podman leveraging WSL2, but that's a good point.

I wouldn't put it past Opus 4.5 in yolo mode to vm escape if it felt like it haha

Stronger isolation and choice of OS

Windows has the WSL for native Linux vms, these days (and also the past ~decade)

I can rm -rf Windows files from WSL2. And so can LLMs.

Meanwhile a VM isolates by default.


You can turn off all the interop and mounting of the Windows FS with ease. I run claude in yolo mode using this exact setup. Just roll out a new WSL env for each claude I want yoloing and away it goes. I suppose we could try to theorize how this is still dangerous, but it's getting into extremely silly territory.
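For anyone wondering what that looks like in practice, here is a small sketch that renders a /etc/wsl.conf with drive automounting and Windows interop disabled. The `[automount]` and `[interop]` keys are the documented wsl.conf settings; that this matches the exact setup described above is an assumption, and the helper name is made up.

```python
import configparser
import io


def render_wsl_conf() -> str:
    """Render a wsl.conf that disables drive automount and Windows interop."""
    config = configparser.ConfigParser()
    config.optionxform = str  # preserve key case, e.g. appendWindowsPath
    config["automount"] = {"enabled": "false"}  # no /mnt/c, /mnt/d, ...
    config["interop"] = {
        "enabled": "false",            # cannot launch Windows executables
        "appendWindowsPath": "false",  # keep Windows dirs out of $PATH
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Save the output as /etc/wsl.conf inside the distro.
    print(render_wsl_conf())
```

The distro has to be restarted (for example with `wsl --shutdown` from Windows) before the new settings take effect.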

That's great to know! And important to clarify because by default WSL has access to all disks.

/sandbox AFAIK uses https://github.com/anthropic-experimental/sandbox-runtime under the hood.

It's still experimental, and if you dive into the issues I would call its protection light. Many users experience erratic issues with perms not being enforced, etc.

For me the largest limitation was that its read mode is deny-only, meaning that with an empty deny-list it can read all files on your laptop.
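The difference between deny-only and allow-only reads can be sketched in a few lines. This is only an illustration of the two policies, not how sandbox-runtime is implemented; the function names are hypothetical, and the check is purely lexical (no symlink or `..` resolution).

```python
from pathlib import PurePosixPath


def _is_under(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True if `path` is `root` itself or lives somewhere below it."""
    return path == root or root in path.parents


def can_read_deny_only(path: str, deny: list[str]) -> bool:
    """Deny-only: everything is readable unless it is explicitly listed."""
    target = PurePosixPath(path)
    return not any(_is_under(target, PurePosixPath(d)) for d in deny)


def can_read_allow_only(path: str, allow: list[str]) -> bool:
    """Allow-only (deny-first): nothing is readable unless it is listed."""
    target = PurePosixPath(path)
    return any(_is_under(target, PurePosixPath(a)) for a in allow)


if __name__ == "__main__":
    secret = "/home/me/.ssh/id_ed25519"
    print(can_read_deny_only(secret, deny=[]))                      # empty deny-list
    print(can_read_allow_only(secret, allow=["/home/me/project"]))  # allow-list
```

With an empty deny-list the first policy lets the secret through, while the allow-only policy rejects anything outside the listed project directory.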

Restricting to specific domains has worked fine for me, but it can't block on specific ports, so you can't say, for instance, "you may access these dev-server ports, but not dev-server ports belonging to another sandbox".
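As a sketch of the kind of rule that is missing, a per-sandbox policy would need to key on both host and port. The rule format, the function name, and the example hosts and ports below are all assumptions for illustration, not an existing sandbox-runtime feature.

```python
def is_allowed(host: str, port: int, rules: dict[str, set[int]]) -> bool:
    """Allow a connection only if the host matches a rule AND the port is listed.

    `rules` maps a domain to the ports allowed for it; subdomains of a listed
    domain match as well. Hypothetical policy format, for illustration only.
    """
    host = host.lower().rstrip(".")
    return any(
        (host == domain or host.endswith("." + domain)) and port in ports
        for domain, ports in rules.items()
    )


if __name__ == "__main__":
    # Example policy for one sandbox: the API over HTTPS plus its own dev servers.
    sandbox_a = {"api.anthropic.com": {443}, "localhost": {3000, 3001}}
    print(is_allowed("localhost", 3000, sandbox_a))  # own dev server
    print(is_allowed("localhost", 4000, sandbox_a))  # another sandbox's dev server
```

A domain-only allowlist cannot tell those last two cases apart, which is exactly the limitation described above.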

It feels as though the primary use case is running inside a container that is already network- and filesystem-sandboxed.


It’s pretty weak sandboxing. It still grants full read only access to the file system so any secrets in your home directory can still be exfiltrated. I’m pretty sure it could also be deceptive and use a script to write where it shouldn’t be able to as well. That’s not really sandboxing in my opinion. It should be something like unveil, the process gets a working space at startup, and it cannot ever do anything outside of that directory.

I like Claude Code in the terminal. For me it's so good it doesn't need IDE integration. I'm just using emacs and magit to navigate the code out of band.


That's me to a tee. I also don't have an inner dialogue when thinking through a problem :)


> I also don't have an inner dialogue...

I wonder if some people grow to develop an inner monologue, but also immediately start developing an ability to silence it, to the point they don't have to try.

Mine is basically always on, and it can be problematic.


Yeah, totally this. I've had so much fun with AoC, learning nim, elixir at the same time.

I would normally tap out around the same place on the first dynamic programming puzzle which just takes me so long to wrap my head around each time (tips anyone? :)).

I welcome these new changes and, whatever the format, I'm very grateful for all his hard work!


> tips anyone? :)

They're not as magical as they seem, you just need some practice. Read over the dynamic programming section in https://cses.fi/book/index.php (pdf link near the top is the free English version), then do a few on https://cses.fi/problemset/ . You'll be able to handle the AoC dynamic programming ones with _no_ problem at all.
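As a small taste of the pattern, here is a minimal sketch of a typical counting puzzle solved with top-down dynamic programming (memoized recursion). The puzzle itself is made up for illustration and is not taken from AoC or the CSES book: count the ways a target string can be built by concatenating pieces from a list.

```python
from functools import lru_cache


def count_ways(target: str, pieces: tuple[str, ...]) -> int:
    """Count the ways to build `target` by concatenating non-empty `pieces`."""

    @lru_cache(maxsize=None)
    def ways_from(start: int) -> int:
        # Subproblem: how many ways are there to build target[start:]?
        if start == len(target):
            return 1  # nothing left to build: exactly one way (use no more pieces)
        return sum(
            ways_from(start + len(piece))
            for piece in pieces
            if target.startswith(piece, start)
        )

    return ways_from(0)


if __name__ == "__main__":
    print(count_ways("abab", ("a", "b", "ab")))  # 4: a+b+a+b, a+b+ab, ab+a+b, ab+ab
```

The whole trick is spotting that the answer for a suffix depends only on where the suffix starts, so each start index is solved once and cached instead of being recomputed on every branch.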


I agree. The industry standard for a great, boring, durable and surprisingly cheap knife is the Victorinox Fibrox Chef's Knife 20 cm.


Nope, I've really, really tried to like modal editing, because the programmable command chaining is super cool, but even though I became proficient with it I never really enjoyed it.

Starting out with emacs I got super fatigued by all the long pinky-driven key chords for the most-used commands. It felt usable after I added keybindings for commands like switch buffer, close buffer, duplicate line(s), move line(s), find in project, find file in project, and indent (I wrote my own sane (for me) indentation code). The windows/apple key is great for those things because it is not used by emacs.

On linux I settled on using emacs vanilla key commands for copy/paste/cut but that took a looong time to feel comfortable with and I still mess it up sometimes, also with the ctrl+shift-X version of them in the terminal. On iOS, using the apple key like for the rest of the system is sweet relief.


Heh - yeah have had trillion dollar ideas many times :)


My sweet spot at the moment is Claude Desktop with mcp servers for editing and aider --watch for quick fixes. Claude Code uses way, way, way too many tokens on the large project i work most on.


Get one of the max plans! It pays for itself.


> Claude Code uses way, way, way too many tokens on the large project i work most on.

That's a very fair critique, and it makes the pay-as-you-go pricing model (vs. one of their subscription options) a completely unrealistic option for doing anything serious with Claude Code.


You don't get paid for programming?


Luckily I do, but I mean it burns through enough tokens to trigger the API limit in 10 minutes.


Yeah remember when people were using Claude 3.7... so oldschool man


Yeah, I too found giving LLMs access to my emails via notmuch [1] is super helpful. Connecting peripheral sources like email and Redmine while coding creates a compounding effect on LLM quality.

Enterprise OAuth2 is a pain though - makes sending/receiving email complicated and setup takes forever [2].

- [1] https://github.com/runekaagaard/mcp-notmuch-sendmail

- [2] https://github.com/simonrob/email-oauth2-proxy


..you give Claude Desktop access to read all your emails and send as you??


Heh. I'm giving Claude running on AWS Bedrock in an EU datacenter access to read small parts of my email (normally 1-3 email threads in a chat), compose drafts for approval and then send them in a separate step. I can read and approve all tool calls before they are executed.

