It's impossible not to get decision fatigue and just mash Enter anyway after a couple of months of Claude not messing up anything important, so a sandboxed approach in YOLO mode feels much safer.
It also takes away the stress of needing to monitor all the agents all the time, which is great, and it creates an incentive to learn how to build longer tasks for CC with more feedback loops.
I'm on Ubuntu 22.04 and it was surprisingly pleasant to create a layered sandbox approach with bubblewrap and Landlock LSM: Landlock for filesystem restrictions (deny-first, only whitelisted paths accessible) and TCP port control (API, git, local dev servers), bubblewrap for mount namespace isolation (/tmp per-project, hiding secrets), and dnsmasq for DNS whitelisting (only essential domains resolve - everything else gets NXDOMAIN).
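For anyone curious what the bubblewrap layer of a setup like this might look like, here is a minimal sketch that only builds the `bwrap` command line. It is not the exact setup described above: the function name `build_bwrap_argv`, the choice of read-only paths, and the example project path are assumptions for illustration, and the Landlock and dnsmasq layers are not shown.

```python
from pathlib import Path


def build_bwrap_argv(project_dir, command, ro_paths=("/usr", "/etc")):
    """Build a bwrap argv that exposes only system dirs and one project dir.

    Hypothetical helper for illustration. The home directory is never
    mounted, so files outside the project (SSH keys, tokens) are invisible.
    """
    project = str(Path(project_dir))
    argv = ["bwrap", "--die-with-parent", "--unshare-pid", "--new-session"]
    # System directories are visible but read-only.
    for path in ro_paths:
        argv += ["--ro-bind", path, path]
    # On merged-/usr distros these top-level dirs are symlinks into /usr.
    for name in ("bin", "sbin", "lib", "lib64"):
        argv += ["--symlink", f"usr/{name}", f"/{name}"]
    argv += ["--proc", "/proc", "--dev", "/dev"]
    # A fresh, private /tmp for this sandbox.
    argv += ["--tmpfs", "/tmp"]
    # The project directory is the only writable host path.
    argv += ["--bind", project, project, "--chdir", project]
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_bwrap_argv("/home/me/proj", ["bash"])))
```

The network namespace is deliberately left shared here, on the assumption that port and DNS restrictions come from the other layers.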
I've been working for the past several weeks in an environment where it's easy and safe to give different claudes yolo-mode, but yesterday I needed to build an Emacs TRAMP plugin, and I had to do that on my local development NUC. I am extremely spoiled for yolo-mode, because even just yes-ok'ing all the elisp fragments claude came up with was exasperating, the whole experience was draining, and that was me not being especially careful (just making sure it didn't run random bash commands to, like, install a different Emacs or something).
Funny, I wrote some environment initialization and setup scripts that you just unzip onto a new dev desktop. You run the first PowerShell script and it works through everything (you have to reboot after a couple of the installs); then, once WSL is up, it relies on the /mnt/c/ paths to run bash scripts that initialize the WSL environment too... it was pretty handy.
You can turn off all the interop and mounting of the Windows FS with ease. I run Claude in yolo mode using this exact setup. Just roll out a new WSL env for each Claude I want yoloing and away it goes. I suppose we could try to theorize how this is still dangerous, but it's getting into extremely silly territory.
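As a rough sketch of the "turn off interop and mounting" part: the settings live in `/etc/wsl.conf` inside the distro. The snippet below just renders such a file as text. The helper name `render_wsl_conf` is made up for illustration, and the exact set of keys worth disabling is an assumption, not necessarily what the setup above uses.

```python
import configparser
import io


def render_wsl_conf():
    """Render a wsl.conf that disables Windows drive mounts and interop.

    Hypothetical helper; write the result to /etc/wsl.conf inside the distro.
    """
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep the camelCase key below as written
    # Do not auto-mount Windows drives under /mnt.
    cfg["automount"] = {"enabled": "false"}
    # Do not allow launching Windows executables or inherit the Windows PATH.
    cfg["interop"] = {"enabled": "false", "appendWindowsPath": "false"}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_wsl_conf())
```

Changes to this file generally only take effect after the distro has been restarted.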
It's still experimental, and if you dive into the issues I would call its protection light. Many users experience erratic issues with permissions not being enforced, etc.
For me the largest limitation was that its read mode is deny-only, meaning that with an empty deny-list it can read all files on your laptop.
Restricting to specific domains has worked fine for me, but it can't block specific ports, so you can't say, for instance, "you may access these dev-server ports, but not the dev-server ports belonging to another sandbox."
It feels as though the primary use case is running inside a container that is already network- and filesystem-sandboxed.
It’s pretty weak sandboxing. It still grants full read-only access to the file system, so any secrets in your home directory can still be exfiltrated. I’m pretty sure it could also be deceptive and use a script to write where it shouldn’t be able to.
That’s not really sandboxing in my opinion. It should be something like unveil, the process gets a working space at startup, and it cannot ever do anything outside of that directory.
I like Claude Code in the terminal. For me it's so good it doesn't need IDE integration. I'm just using Emacs and Magit to navigate the code out of band.
I wonder if some people grow to develop an inner monologue, but also immediately start developing an ability to silence it, to the point they don't have to try.
Mine is basically always on, and it can be problematic.
Yeah, totally this. I've had so much fun with AoC, learning Nim and Elixir at the same time.
I would normally tap out around the same place on the first dynamic programming puzzle which just takes me so long to wrap my head around each time (tips anyone? :)).
I welcome these new changes, and whatever the format, I'm very grateful for all his hard work!
They're not as magical as they seem, you just need some practice. Read over the dynamic programming section in https://cses.fi/book/index.php (pdf link near the top is the free English version), then do a few on https://cses.fi/problemset/ . You'll be able to handle the AoC dynamic programming ones with _no_ problem at all.
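To make the pattern concrete, here is a small bottom-up example (a classic coin-change problem, not taken from any specific AoC or CSES task): define the answer for each smaller amount once, then build larger answers from them. The function name `min_coins` is just for illustration.

```python
def min_coins(coins, target):
    """Return the fewest coins summing to target, or None if impossible."""
    INF = float("inf")
    # best[a] = fewest coins needed to make amount a (INF if unreachable).
    best = [0] + [INF] * target
    for amount in range(1, target + 1):
        for coin in coins:
            # Reuse the already-solved subproblem for (amount - coin).
            if coin <= amount and best[amount - coin] + 1 < best[amount]:
                best[amount] = best[amount - coin] + 1
    return None if best[target] == INF else best[target]


if __name__ == "__main__":
    print(min_coins([1, 3, 4], 6))  # 3 + 3 -> prints 2
```

The same shape (a table indexed by "state", filled in from smaller states) covers most of the puzzles people get stuck on; usually the hard part is deciding what the state is.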
Nope, I've really, really tried to like modal editing, because the programmable command chaining is super cool, but even though I became proficient with it I never really enjoyed it.
Starting out with Emacs I got super fatigued by all the long, pinky-driven key sequences for the most-used commands. It felt usable after I added keybindings for commands like switch buffer, close buffer, duplicate line(s), move line(s), find in project, find file in project, and indent (I wrote my own sane (for me) indentation code). The Windows/Apple key is great for those things because it is not used by Emacs.
On linux I settled on using emacs vanilla key commands for copy/paste/cut but that took a looong time to feel comfortable with and I still mess it up sometimes, also with the ctrl+shift-X version of them in the terminal. On iOS, using the apple key like for the rest of the system is sweet relief.
My sweet spot at the moment is Claude Desktop with MCP servers for editing and aider --watch for quick fixes. Claude Code uses way, way, way too many tokens on the large project I work on most.
> Claude Code uses way, way, way too many tokens on the large project I work on most.
That's a very fair critique, and it makes the pay-as-you-go pricing model (vs. one of their subscription options) a completely unrealistic option for doing anything serious with Claude Code.
Yeah, I too found giving LLMs access to my emails via notmuch [1] is super helpful. Connecting peripheral sources like email and Redmine while coding creates a compounding effect on LLM quality.
Enterprise OAuth2 is a pain though - makes sending/receiving email complicated and setup takes forever [2].
Heh. I'm giving Claude, running on AWS Bedrock in an EU datacenter, access to read small parts of my email (normally 1-3 email threads in a chat), compose drafts for approval, and then send them in a separate step. I can read and approve all tool calls before they are executed.