I've been using Claude for a little over a year, but the recent events with DoW are making me want to explore European alternatives. I'm willing to give Devstral 2 a try, but I'm not sure what to expect. In terms of tool calling and coding abilities, should I expect something closer to Sonnet 3.5 or to Sonnet 4.5?
Domestic mass surveillance might feel tolerable when you live in the country conducting it. But how would you feel about other countries adopting similar policies, and thereby mass-surveilling the American people? Because that's exactly what these policies authorize when applied to the rest of the world.
I would feel much better about other countries' mass surveillance than about the US's. China, for instance, can’t do nearly as much to me as the US justice system can.
Ok so now connect the mass surveillance system to an automated killing system that can blow you up in the grocery store because you're standing in line next to its target.
Given a choice between someone blowing me up because I’m next to a high-value asset and worrying about jackbooted masked thugs with qualified immunity killing me and being cheered by 40% of the population, I’ll take my chances with China having my info before ICE or the local police.
The way the Anthropic statement was written really stood out to me.
How they position themselves in favour of surveillance of foreign countries, or of the existence of fully autonomous weapons, as long as these don't threaten US citizens' lives.
I wonder if this is how a non-trivial share of Americans think, or whether it was just worded like that to try to appeal to the "most radical patriots".
I'm pretty neutral in this fiasco, but if a company is willing to consider *in principle* providing services to the *Department of War*, they'd better be OK with their services being used to conduct surveillance or kill people of other countries...
I think war is bad and generally a stupid thing to do, but my point is that if they were negotiating terms with the department at all, it's really a given they'd be OK with the stuff you took issue with.
The bad news for American people is that "others" are pretty good at these technologies. When I read an important AI paper chances are all the names on it are non-American, even for papers from American labs. In a real war, this becomes problematic.
Every nation has some bias, but I think Americans have power poisoning from being the dominant power for so long. They think they are entitled to do anything and believe they are the good guys of history. Well...
How else do you suggest common folks are supposed to view the world, or, well, anything?
Americans do the same, hence the whole world got Trump. 95% of the world isn't the US, so such logic is even easier for almost all of mankind: is the US a force for good or evil? Different places would give you different answers, and most Americans would not like the actual spread these days.
Their "power poisoning" is warranted. The USA's military capabilities dwarf those of the rest of the world. There is not a single country on earth that can stand up to America militarily.
We are lucky that they never went full Roman Empire on us. That's only due to their own restraint. We may see them falter increasingly often as their economic power gets eroded by other nations. Just look at Venezuela.
I don't think it will feel even remotely tolerable in the US. I've been heavily critical of Trump on a regular basis on the public internet ever since he showed up 10 years ago. I doubt a government surveillance AI would miss this. Of course, there are probably millions of people like me, but given the behavior of the government recently, I really have to wonder what they might do to people like me once we've been put on a list.
Other countries can't send armed thugs to my door over petty stuff like my local government can.
Nobody in the history of ever has been concerned that the agents of some foreign country may know what they read, who they associate with or what kind of penis pills they buy or whatever, the threat has always been that those local enough to do violence on you might come into that information.
On my personal coding agent I've introduced a setup phase inside skills.
I distribute my skills with a flake.nix and a lock file. The flake installs the required dependencies and sets them up. A frontmatter field defines the names of the secrets that need to be passed to the flake.
As it is, it works for me because I trust my skill flakes and skills are static in my system:
- I build a Docker image for the agent, into which I inject the skills directory.
- Each skill is set up when building the image.
- Secrets are copied before the setup phase and removed right after.
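For illustration, a minimal Python sketch of the secrets handling described above. The frontmatter field name `secrets`, its inline list syntax, and both function names are assumptions made for the example; the actual skill format is not shown in this comment.

```python
import contextlib
import os
import shutil
import tempfile


def read_secret_names(skill_text: str) -> list[str]:
    """Return the secret names declared in a skill's frontmatter.

    Assumes (hypothetically) a frontmatter block delimited by '---' lines
    that contains a line such as:  secrets: [GITHUB_TOKEN, NPM_TOKEN]
    """
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter reached without a secrets field
        key, sep, value = line.partition(":")
        if sep and key.strip() == "secrets":
            inner = value.strip().strip("[]")
            return [name.strip() for name in inner.split(",") if name.strip()]
    return []


@contextlib.contextmanager
def staged_secrets(names, source):
    """Copy the named secrets into a temp dir, then delete it afterwards.

    `source` is any mapping from secret name to value (for example os.environ).
    """
    staging_dir = tempfile.mkdtemp(prefix="skill-secrets-")
    try:
        for name in names:
            with open(os.path.join(staging_dir, name), "w") as handle:
                handle.write(source[name])
        yield staging_dir
    finally:
        # Runs even if the setup phase fails, so secrets never linger.
        shutil.rmtree(staging_dir, ignore_errors=True)
```

The skill's setup step would run inside the `with staged_secrets(...)` block, so the secrets exist only for the duration of that step.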
I've been paying for a Max subscription for a long time. I like their model but I hate their tools:
- Claude Desktop looks like a demo app. It's slow to use and so far behind the Codex app that it's embarrassing.
- Claude Code is buggy as hell and I think I've never used a CLI tool that consumes so much memory and CPU. Let's not talk about the feature parity with other agents.
- Claude Agent SDK is poorly documented, half finished, and is just a thin wrapper around a CLI tool…
Oh and none of this is open source, so I can do nothing about it.
My only option to stay with their model is to build my own tool. And now I discover that using my subscription with the Agent SDK is against the terms of use?
I'm not going to pay 500 USD of API credits every month, no way. I have to move to a different provider.
I haven't used opencode but pi agent runs rings around claude code. Never eats tons of CPU on big outputs, no flickering, open source, tree-based context instead of claude's linear context, easy to toggle collapsing/expanding tool outputs, built for extension with runtime reloading of extensions and skills, etc. You can easily build your own amp-code like handoff mechanism, customize the UI (i see models' edit diffs syntax-highlighted with delta, and just added a keybind to list session-edited files + files from git status in fzf), etc.
Meanwhile with Claude Code I've had to get claude to decompile the editor (extract JS from the bun executable) _twice_ to diagnose weird things like why some documented config flags were not taking effect.
Opus is great - but I'd rather use a different model than be forced back into Claude Code.
I regret ever promoting that Claude Code crap. I remember when it was nothing but glowing reviews everywhere. Honestly AI companies should stick to what they are good at: direct API interface to powerful models.
We are heading toward a $1000/month model just to use LLMs in the cloud.
I got so tired of cursor that I started writing down every bug I encountered. The list is currently at 30 entries, some of them major bugs such as pressing "apply" on changes not actually applying changes or models getting stuck in infinite loops and burning 50 million tokens.
It's not HTML purism. It's simply recognizing that HTML and CSS have evolved a lot and many things don't need (or are close to not needing) JS anymore.
This shouldn't be taken as an anti-JS article; everyone benefits from these gradual improvements. Especially our users, who can now get a uniform experience.
I'm consistently hitting weird bugs with opencode, like escape codes not being handled correctly so the tui output looks awful, or it hanging on the first startup. Maybe after they migrate to opentui it'll be better
I do like the model selection with opencode though
There's a bit of UI around it where you can accept the plan. I personally stopped using it and instead moved to a workflow where I simply ask it to write the plan in a file. It's much easier to edit and improve this way.
Yeah, I just have it generate PRDs/high-level plans, then break it down into cards in "Kanban.md" (a bunch of headers like "Backlog," "In-Progress", etc).
To be honest, Claude is not great about moving cards when it's done with a task, but this workflow is very helpful for getting it back on track if I need to exit a session for any reason.
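Since the agent does not reliably move cards itself, the move can also be scripted. A minimal Python sketch, assuming columns are `## ` headings and cards are `- ` bullets directly under them; the exact layout of "Kanban.md" is not given above, and `move_card` is a hypothetical helper.

```python
def move_card(board: str, card: str, target: str) -> str:
    """Move one card to the top of the target column in a Kanban markdown file.

    Assumed layout: columns are '## <name>' headings, cards are '- <title>'
    bullets. Raises ValueError if the card or the column is missing.
    """
    lines = board.splitlines()
    bullet = f"- {card}"
    heading = f"## {target}"
    if bullet not in lines:
        raise ValueError(f"card not found: {card}")
    if heading not in lines:
        raise ValueError(f"column not found: {target}")
    lines.remove(bullet)  # take the card out of its current column
    lines.insert(lines.index(heading) + 1, bullet)  # place it under the target
    return "\n".join(lines) + "\n"
```

Running this on the file after each finished task keeps the board trustworthy when resuming a session.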
i've experienced the same thing. usually i try to set up or have it set up a milestone/phase approach to an implementation with checklists (markdown style) but it's 50/50 if it marks them automatically upon completion.
I have this in my CLAUDE.md and it works better than 50/50. Still not 100% though:
### Development Process
All work must be done via TODO.md. If the file is empty, then we need to write our next todo list.
When TODO.md is populated:
1. Read the entire TODO.md file first
2. Work through tasks in the exact order listed
3. Reference specific TODO.md sections when reporting progress
4. Mark progress by checking off todos in the file
5. Never abbreviate, summarize, or reinterpret TODO.md tasks
A TODO file is done when every box has been checked off due to completion of the associated task.
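Because the model still skips a checkbox now and then, the "done" rule can be verified outside the session. A small Python sketch, assuming GitHub-style task list items (`- [ ]` / `- [x]`); treating a file with no checkboxes as not done is a choice made for this example, and the function names are hypothetical.

```python
import re

# Matches task list items such as "- [ ] Add tests" or "2. [x] Write parser".
CHECKBOX = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+\[( |x|X)\]\s+(.*)$")


def todo_status(markdown: str) -> tuple[list[str], list[str]]:
    """Split the tasks in a TODO file into (checked, unchecked) lists."""
    checked, unchecked = [], []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if not match:
            continue  # headings, prose and blank lines are ignored
        task = match.group(2).strip()
        if match.group(1) == " ":
            unchecked.append(task)
        else:
            checked.append(task)
    return checked, unchecked


def is_done(markdown: str) -> bool:
    """True when the file has at least one checkbox and none is unchecked."""
    checked, unchecked = todo_status(markdown)
    return bool(checked) and not unchecked
```

Calling `todo_status` on the contents of TODO.md at the end of a session lists whatever was left unchecked.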