Hacker News | rstuart4133's comments

The true irony is that 10% to 15% of importers to the USA (the "Importer of Record", in the lingo) are foreign entities. For them, the tariff refund is a transfer directly from the USA consumer to the very foreign countries Trump was trying to penalise.

> I've seen this claimed, but I'm not sure it's been true for my use cases?

I'd be surprised if it isn't true for your use cases. If you give GLM-5.1 and Opus 4.6 the same coding task, they will both produce code that passes all the tests. In both cases the code will be crap, as no model I've seen produces good code. GLM-5.1 is actually slightly better at following instructions exactly than Opus 4.6 (but maybe not 4.7, as that's an area they addressed).

I've asked GLM-5.1 and Opus 4.6 to find a bug caused by a subtle race condition (the race condition leads to a number being 15172580 instead of 15172579 after about 3 months of CPU time). Both found it, in a similar amount of time. Several senior engineers had stared at the code for literally days and didn't find it.

There is no doubt the models vary in performance at various tasks, but we are talking about the difference between Ferrari and Mercedes in F1. While the differences are undeniable, this isn't F1; things there take a year to change. The performance of the models from Anthropic and OpenAI literally changes day by day, often not because of the model itself but because of the horsepower those companies choose to give it on the day, or because they tweaked their own system prompts. You can find no end of posts here from people screaming in frustration that the thing that worked yesterday doesn't work today, or that they suddenly find themselves running out of tokens, or that their favoured tool is blocked. It's not at all obvious the differences between the open-source models and the proprietary ones are bigger than the day-to-day ones the proprietary companies inflict on us.


> In both cases the code will be crap, as no model I've seen produces good code.

I'm wondering if you have actually used Claude Code, because the results are not as catastrophic as you describe them.


I've used LLMs to write what seems like far too many lines of code by now. This is an example Opus 4.6, running at maximum, wrote in C:

    if (foo == NULL) {
       log_the_error(...);
       goto END;
    }
    END:
    free(foo);
If you don't know C: in older versions of the language that can be a catastrophic failure, because `foo` is NULL when `free(foo)` runs on the error path. (The issue is so serious that modern C makes `free(NULL)` a no-op.) If it's difficult to trigger `foo == NULL` without extensive mocking (this is often the case), most programmers won't do it, so it won't be caught by unit tests. The LLMs almost never get unit test coverage high enough to catch issues like this without heavy prompting.

But that's the least of it. The models (all of them) are absolutely hopeless at DRY'ing out the code, and when they do try they turn it into spaghetti, because they seem almost oblivious to isolation boundaries, even when those boundaries are spelt out for them.

None of this is a problem if you are vibe coding, but you can only do that when you're targeting a pretty low quality level. That's entirely appropriate in some cases of course, but when it isn't you need heavy reviews from skilled programmers. No senior engineer is going to stomach the repeated stretches of almost the "same but not quite" code they churn out.

You don't have to take my word for it. Try asking Google "do LLMs produce verbose code".


Is foo a pointer in your example? Is free(NULL) not a valid operation?

Yes `foo` is a pointer.

`free(NULL)` is harmless in C89 onwards. As I said, programmers freeing NULL caused so many issues they changed the API. It doesn't help that `malloc(0)` returns NULL on some platforms.

If you are writing code for an embedded platform with some random C compiler, all bets on what `free(NULL)` does are off. That means a cautious C programmer who doesn't know who will be using their code never allows NULL to be passed to `free()`.

In general, most good C programmers are good because they suffer a sort of PTSD from the injuries the language has inflicted on them in the past. If they aren't avoiding passing NULL to `free()`, they haven't suffered long enough to be good.


> That means a cautious C programmer who doesn't know who will be using their code never allows NULL to be passed to `free()`.

If your compiler chokes on `free(NULL)` you have bigger problems that no LLM (or human) can solve for you: you are using a compiler that was last maintained in the 80s!

If your C compiler doesn't adhere to the very first C standard published, the problem is not the quality of the code that is written.

> If they aren't avoiding passing NULL to `free()`, they haven't suffered long enough to be good.

I dunno; I've "suffered" since the mid-90s, and I will free NULL, because it is legal in the standard, and because I have not come across a compiler that does the wrong thing on `free(NULL)`.


So what would be the best practice in a situation like that? I would (naively?) imagine that a null pointer would mostly result from a malloc() or some other parts of the program failing, in which case would you not expect to see errors elsewhere?

> imagine that a null pointer would mostly result from a malloc() or some other parts of the program failing, in which case would you not expect to see errors elsewhere?

Oh yes, you probably will see errors elsewhere. If you are lucky it will happen immediately, but often enough it happens millions of executed instructions later, in some unrelated routine that had its memory smashed. It's not "fun" figuring out what happened. It could even be nothing: bit flips are a thing, and once you get the error rate low enough, the frequency of bit flips and the frequency of bugs start to converge. You could waste days of your time chasing an alpha particle.

I saw the author of curl post some of his code here a while back. I immediately recognised the symptoms. Things like:

    if (NULL == foo) { ... }
Every second line was code like that. If you are wondering, he wrote `(NULL == foo)` so that if he dropped an `=` it became `(NULL = foo)`. That version is a syntax error, whereas `(foo = NULL)` is a runtime disaster. Most of it was unjustified, but he could not help himself. After years of dealing with C, he wrote code defensively, even where it wasn't needed. C is so fast and the compilers so good that the coding style imposes little overhead.

Rust is popular because it gives you a similar result to C, but you don't need to have been beaten by 10 years of pain in order to produce safe Rust code. Sadly, it has other issues. Despite them, it's still the best C we have right now.


C is fundamentally a bad target for LLMs. Humans get C wrong all the time, so we cannot expect the nascent LLM, trained on code that is 95% automatically memory-managed, to excel here.

I always found myself writing verbose copypasta code first, then compressing it down based on the emerging commonalities. I think doing it the other way around is likely to lead to a worse design. Can you not tell the LLM to do the same? Honest question.


> I always found myself writing verbose copypasta code first, then compress it down based on the emerging commonalities. I think doing it the other way around is likely to lead to a worse design.

I do pretty much the same thing, which is to say I "write code using a brain dump", "look for commonalities that tickle the neurons", then "refactor". Lather, rinse, and repeat until I'm happy.

> Can you not tell the LLM to do the same?

You can tell them until you're blue in the face. They ignore you.

I'm sure this is a temporary phase. Once they solve the problem, coding will suffer the same fate as blacksmiths making nails. [0] To solve it they need to satisfy two conflicting goals: DRY the code out, while keeping interconnections between modules to a minimum. That isn't easy. In fact it's so hard that people who do it well, and can do it across scales, are called senior software engineers. Once models master that trick, they won't be needed any more.

By "they" I mean "me".

[0] Blacksmiths could produce 1,000 or so a day, but it must have been a mind-numbing day even if it paid the bills. Then automation came along, and produced them at over a nail per second.


> C is fundamentally a bad target for LLMs.

I found it exceptionally good, because:

a) The agent doesn't need to read the implementation of anything: you can stuff the entire project's headers into the context, and the LLM can have a better bird's-eye view of what is there and what is not, what goes where, etc.

and

b) It enforces "Parse, don't validate" using opaque types: the LLM writing a function that uses a user-defined composite datatype has no knowledge of the implementation, because it has read only the headers.


Having used Claude Code extensively, catastrophic is a perfect word to describe it.

I gave up on the app completely when I placed an order via the app, was waved past the payment window, then the order window denied it was placed (and paid for). I showed them the phone with the order number still on it. They said it could be a screenshot. After arguing for a while, I drove away without food.

I eventually got a refund after digging through their website for an email address and emailing them the statement showing where it had been paid. With the back and forth while they asked for evidence, it took over an hour of my time in the end to get the refund. It wasn't the money; it was the principle.

The app is by far the slowest, most unreliable way to place an order with them. Period. The next slowest (although far better) is the kiosks. They are also unreliable when the printer doesn't work (which is most of the time) and you make the mistake of forgetting the receipt number. Other fast food outlets have solved this problem by getting you to enter your name; that's beyond McDonald's, apparently. The fastest and most reliable way, by far, is to talk to a human.

The order should be the reverse. It is beyond me how they get it so badly wrong. Maybe price discrimination is the reason; nothing else makes much sense for an organisation with the size and resources of McDonald's.


There is some truth to that, but to give a different perspective ...

A long, long time ago, back when VCSs were novel enough to be of academic interest, I read numerous papers describing what these VCSs could be. They talked in terms of change sets, and applying them like a calculus to source code. In the meantime, those of us writing code people actually used were using sccs and rcs, or, if you were really unlucky, Visual SourceSafe. To us in the trenches those academics were dreamers in ivory towers.

With the passage of time we got new VCSs that gradually knocked the rough edges off those early ones: svn gave us atomic commits, hg gave us a distributed VCS, git gave us that plus speed. But none came close to realising the dreams of those early academics. It seemed like an impossible dream. But then along came jj ... and those dreams were realised.


They do publish binaries, and they work perfectly well on Linux. No need for cargo:

https://github.com/jj-vcs/jj/releases/tag/v0.40.0


> Except there isn't any concept of a remote jj so you have to go through the painful steps of manually naming commits, pushing, pulling, then manually advancing the current working point to match.

All true. I ended up writing my own `jj push` and `jj pull` aliases that automated all this. They aren't simple aliases, but it worked. `jj push`, for example, "feels" very like `git push --force-with-lease`, except that if you've stacked PRs it pushes all the branches you're on. It hasn't been a problem since.

I ended up wondering if they deliberately left the `jj pull` and `jj push` commands unimplemented just so you could write something custom.
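For anyone curious what such aliases look like: jj reads them from its TOML config. The sketch below is far simpler than the aliases described above; the names `push` and `pull` and the flag choice are my own guesses at a starting point, not the author's actual configuration:

```toml
# ~/.config/jj/config.toml (location varies by platform)
[aliases]
# "jj pull": fetch from the default remote
pull = ["git", "fetch"]
# "jj push": push, allowing new bookmarks to be created on the remote
push = ["git", "push", "--allow-new"]
```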

> All files automatically committed is great until you accidentally put a secret in an unignored file in the repository folder.

    jj abandon COMMIT && jj git garbage-collect
> And adding to .gitignore fails if you ever want to wind back in history - if you go back before a file was added to .gitignore, then whoops now it isn't ignored

True, but how is this different to any other VCS?


> True, but how is this different to any other VCS?

What other VCS behaves this way by default? If a file isn't in e.g. .gitignore, "git status" will show that git is aware of it, but git won't automatically absorb it into a commit. https://github.com/jj-vcs/jj/issues/5596 doesn't seem to be the oldest instance of this problem, but it does seem to be the current discussion, and there seems to be no consensus on how to fix it.

Oh, and ISTR submodules were a pain to work with.

Edit: https://github.com/jj-vcs/jj/issues/323#issuecomment-1129016... seems likely to be the earliest mention of this problem


> https://github.com/jj-vcs/jj/issues/5596 doesn't seem to be the oldest instance of this problem, but does seem to be the current discussion, and there seems to be no consensus on how to fix.

Ahh, sorry - I didn't understand the issue. Yes, that is going to catch you unawares.

I can't see a better solution than a warning. It looks to be an unavoidable negative side effect of jj's auto-commit behaviour, but auto-commits bring so many positives I'll wear the occasional stray add. Usually they are harmless and trivial to reverse, but I've made a mental note to revise my jj workflow: when I change .gitignore, do it in the oldest mutable commit.

> Oh, and ISTR submodules were a pain to work with.

Yes, they are, although that's no different to git. JJ's tracking of submodule hashes in the parent repository behaves exactly as you would expect. What jj doesn't have is a command to bring the submodule into sync with the parent repository, like git's submodule command.

However, I find git's submodule command behaviour borders on inexplicable at times. I wrote my own `jj submodule` alias that does one simple job - it checks out every submodule the parent repository owns, at the hash recorded by the parent. I find that far better than the git command, which follows the usual git pattern of being a complex mess of sub-sub-commands with a plethora of options.

If jj automatically recursively updated submodules when you moved to a different change-id, it would be most of the way there. If it added commands to add and remove submodule directories, it would be there, and the reasons people dislike git's submodules would be gone.


What makes jj better requires a mindset change. When you start with jj, you use it as an alternate porcelain for git, meaning you just use the jj commands that map onto the way you used git. You have to let go of that mindset. Until you do, you are still using those old git commands; they just have prettier clothing. The prettier clothing alone is not worth the effort.

I don't know how to explain a mindset to you, so I'll give one example of something that sounds so grand, it seems impossible. (There are so many unusual aspects to jj, but hopefully this is one you can immediately relate to.) Git famously makes it hard to lose work, but nonetheless there are commands like `git reset --hard` that make you break out in a sweat. There is no jj command that destroys information another jj command can't bring back. And before you ask - yes of course jj has the equivalent of `git reset --hard`.


I'm not sure what you are asking, but if you want to look at how a particular jj change-id evolved over time, use `jj evolog`. By default that evolution is hidden. The `change-id/N` syntax has uses beyond conflicts.

> Obviously I can’t argue against your lived experience, but it is neither typical nor common.

I consider myself a proficient jj user, and it was my lived experience too. Eventually you get your head around what is going on, but even then it requires a combination of jj and git commands to bring them into sync, so `jj status` and `git status` say roughly the same things.

The friction isn't that jj handles state differently, it's that `jj git export` doesn't export all of jj's state to git. Instead it leaves you in a detached HEAD state. When you are a newbie looking for reassurance this newfangled tool is doing what it claims by cross checking it with the one you are familiar with, this is a real problem that slows down adoption and learning.

There are good reasons for `jj git export` leaving it in a detached head state of course: it's because jj can be in states it can't export to git. If we had a command, say `jj git sync` that enforced the same constraints as `jj git push` (requiring a tracked branch and no conflicts) but targeted the local .git directory, it would bridge the conceptual gap for Git users. Instead of wondering why git status looks wrong, the user would get an immediate, actionable error explaining why the export didn't align the two worlds.


> and it does, in my understanding, work.

I use submodules with jj, and jj saves and restores submodule hashes perfectly. What it doesn't do is manipulate the sub-repository from its parent. You can do that yourself using jj or git of course, which is what I ended up doing using a few scripts. The result ended up being more reliable than using git's submodule commands directly.

They can take all the time in the world to implement submodules as far as I'm concerned. jj's implementation of workspaces removes all of the hairs of git's worktrees, and git's submodule implementation has even more hairs than its worktrees. If the jj developers need time to do as good a job on submodules as they did with workspaces, then I say give it to them.


Gotcha, thank you for the context.
