Hacker News | watzon's comments

I think this article makes a valid point. However, if AI coding is considered gambling, then being a project manager overseeing multiple developers could also be seen as a form of gambling to a certain degree. In reality, there isn't much difference between the two. AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.

I think the addiction angle makes AI coding more similar to gambling. Some people seem to be disturbingly addicted to agentic coding. Much more so than traditional programming. To the point of doing destructive things like waking up in the middle of the night to check agents. Or giving an agent access to their bank account.

I know at least one case where the obsession with agents ruined a marriage.

I mean, it’s just so fun. Claude wrote a native macOS app for me today.

I don’t think I’d describe my behavior as destructive though


AI coding is gambling on slot machines, managing developers is betting on race horses.

Only if your AI coding approach is the slot machine approach.

I've ended up with a process that produces very, very high quality outputs. Often needing little to no correction from me.

I think of it like an Age of Empires map. If you go into battle surrounded by undiscovered parts of the map, you're in for a rude surprise. Winning a battle means having clarity on both the battle itself and risks next to the battle.


Good analogy! Would be interesting to read more details about how you’re getting very high quality outputs

It's basically (1) don't make decisions for the LLM - force it to think (2) make it explore extensively (3) make it pressure test its own thinking.

Would you mind sharing some of your findings?

Until it produces predictable output, it's gambling. But it can't produce predictable output because it's a non-deterministic tool.

What you're describing is increasing your odds while gambling, not that it's not gambling. Card counting also increases your odds while gambling, but it doesn't make it not gambling.


This is a pretty wild comparison in my opinion, it counts almost everything as gambling which means it has almost no use as a definition.

The most obvious issue is it’d class working with humans as gambling. Fine if you want to make that your definition, but it seems unhelpful to the discussion.


You seem to have a fundamental issue understanding what the term deterministic even means.

If you give the same trivial task to the same human five times in a row, let's say wash the dishes, your dishes are either gonna be equally clean or equally not clean enough every time. Hell, they might even get better over time if you give them feedback at the end of the task that they can learn from.

If you run the same script five times in a row while changing some input variables, you're gonna get the same, predictable output that you can understand, look at the code, and fix.

If you ask the same question to the same LLM model five times in a row, are you getting the same result every time? Is it kind of random? Can the quality be vastly different if you reject all of its changes, start a new conversation, and tell it to do the same thing again using the exact same prompt? Congrats, that's gambling. It's no different than spinning a slot machine in the sense that you pass it an input and hope for the best as the output. It is different than a slot machine in the sense that you can influence those odds by asking "better", but that does not make it not gambling.
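The contrast being drawn here can be sketched in a few lines. This is a toy illustration, not a real model: `sampled` stands in for temperature-style decoding, and the seeded-RNG trick at the end approximates what "seed" parameters on LLM APIs try to offer.

```ruby
# Toy illustration: a plain function vs. a process that samples randomly.
def deterministic(input)
  input.upcase  # same input, same output, every single run
end

def sampled(input, rng)
  # Stand-in for temperature > 0 decoding: pick one of several plausible outputs.
  candidates = ["#{input}!", "#{input}?", input.reverse]
  candidates[rng.rand(candidates.size)]
end

puts Array.new(5) { deterministic("fix the bug") }.uniq.size  # => 1

rng = Random.new  # unseeded: repeated runs can differ
puts Array.new(5) { sampled("fix the bug", rng) }.uniq.size   # often > 1

# A fixed seed restores repeatability, roughly what API "seed" params offer:
a = Array.new(5) { sampled("x", Random.new(42)) }
b = Array.new(5) { sampled("x", Random.new(42)) }
puts a == b  # => true
```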


Deterministic doesn’t mean “generally pretty predictable, in broad strokes”.

> If you give the same trivial task to the same human five times in a row, let's say wash the dishes, your dishes are either gonna be equally clean or equally not clean enough every time.

Probably pretty similar but not quite the same. Sometimes they might drop a plate.

> If you ask the same question to the same LLM model five times in a row, are you getting the same result every time?

Probably pretty similar results. Sometimes they might mess up.

> It is different than a slot machine in a sense that you can influence those odds by asking "better", but that does not make it not gambling.

It rather can, we don’t call literally anything with a random element to the outcome gambling.

I’m probably gambling with my life if I pick a random stranger to operate on me. Am I gambling with my life if I take a considered look at the risk and reward and select a highly qualified surgeon?

Is it gambling to run a compiler given that bitflips can happen?

At what point does the word lose all meaning?


How does it 'count almost everything as gambling'? They just said 'non-deterministic' output is gambling-like, that is not 'almost everything'. Most computation that you use on a day-to-day basis (depending on how much you use AI now, I suppose) is in all ways deterministic. Using probabilistic algorithms is not new, but your point is not clicking...

Working with humans is decidedly not deterministic, though. And the discussion here is comparing AI coding agents and humans.

That starts to get into a very philosophical space talking about human action as deterministic or not. I think keeping to the fact that the artifacts (ie code) we are working off will have deterministic effects (unless we want it not to) is exactly the point. That is what lets chaotic human brains communicate with machines at all. Adding more chaos to the system doesn't strike me as obviously an improvement.

Almost everything is non deterministic to some degree. Huge amounts of machine learning, most things that have some timing element to them in distributed systems, anything that might fail, anything involving humans, actual running computation given that bitflips can happen. At what point does labelling everything that has some random element “gambling” become pointless? At best it’ll be entirely different to how others use the term.

Similar to quantum computing, a probabilistic model when condensed to sufficiently narrow ranges can be treated as discrete.
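A tiny sketch of that "condense the distribution" idea (the 80% figure and names below are made up for illustration): a process that's right most of the time on any single try becomes near-certain under a best-of-N majority vote.

```ruby
# Hypothetical noisy process: right 80% of the time on any single run.
def noisy_answer(rng)
  rng.rand < 0.8 ? :right : :wrong
end

rng = Random.new(1)  # seeded so the sketch is repeatable
votes = Array.new(101) { noisy_answer(rng) }
majority = votes.tally.max_by { |_, count| count }.first
puts majority  # with 101 votes, overwhelmingly likely to be :right
```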

Every engineer I've ever worked with is non-deterministic, too.

Damn, this is so accurate. As a project manager turned product manager, this is so true. You need to estimate a project based on the “pedigree” of your engineers.

Would it make us uncomfortable to reword the above example to

> AI coding is gambling on slot machines, managing developers is gambling on the stock market.

Because I feel like that is a much more apt analogy.


What is it with you guys and stallions?

There is a long history of managers just wanting to work their developers like horses.

Great analogy, I’m saving it!

You (in theory) have more control over the quality of the team you are managing, than the quality of the models you are using.

And the quality of the code models put out is, in general, well below the average output of a professional developer.

It is however much faster, which makes the gambling loop feel better. Buying and holding a stock for a few months doesn't feel the same as playing a slot machine.


One difference is those developers are moral subjects who feel bad if they screw up whereas a computer is not a moral subject and can never be held accountable.

https://simonwillison.net/2025/Feb/3/a-computer-can-never-be...


Right, you need to hire a scapegoat. Usually the tester has that role: little impact but huge responsibility for quality.

You have a lot of control over LLM quality. There are different models available. Even with different effort settings of those models you have different outcomes.

E.g. look at the "SWE-Bench Pro (public)" heading on this page: https://openai.com/index/introducing-gpt-5-4/ , showing reasoning efforts from none to high.

Of course, they don't learn like humans so you can't do the trick of hiring someone less senior but with great potential and then mentor them. Instead it's more of an up front price you have to pay. The top models at the highest settings obviously form a ceiling though.


You also have control over the workflow they follow and the standards you expect them to stick to, through multiple layers of context. Expecting a model to understand your workflow and standards without doing the effort of writing them down is like expecting a new hire to know them without any onboarding. Allowing bad AI code into your production pipeline is a skill issue.

Imagine you opened a job posting and had all applicants complete SWE-bench.

Ignoring the useless/unqualified candidates and models, human applicants have a much wider range of talent for you to choose from than the top models + tooling.

The frontier models + tooling are, in the grand scheme of things, basically equivalent at any given moment.

Humans can be just as bad as the worst models, but models are nowhere near as good as the best humans.


What theory is that?

My experience is the absolute opposite. I am much more in control of quality with AI agents.

I am never letting junior to midlevels into my team again.

In fact, I am not sure I will allow any form of manual programming in a year or so.


> I am never letting junior to midlevels into my team again

Exactly. You control the quality of the people in your team. You can train, fire, hire, etc until you get the skill level you want.

You have effectively no control over the quality of the output from an LLM. You get what the frontier labs give you and must work with that.


That is not correct.

It is much easier to control the quality of an AI than of inexperienced developers.


I think we are talking past each other.

> I am never letting junior to midlevels into my team again

My point is, you control the experience level of the engineers on your team. The fact that you can say you won't let junior or midlevels on your team proves that.

You do not have that level of control with LLMs. Anthropic and OpenAI are roughly the same quality at any given time. The rest are not useful.


Ah, so that is not entirely correct.

I can control LLMs through skills and other gateways.

There are still tasks that LLMs do not really carry out that well, where a proper senior is needed.

But these tasks are quickly disappearing, especially while the code base is slowly being optimized for agentic engineering.


Eh. You want a good mix of experience levels, what really matters is everyone should be talented. Less experienced colleagues are unburdened by yesterday’s lessons that may no longer be relevant today, they don’t have the same blind spots.

Also, our profession is doomed if we won’t give less experienced colleagues a chance to shine.


Our profession is likely doomed not because we don't train people, but because of a lack of demand.

> I am never letting junior to midlevels into my team again

From a different one of your posts

So you're the one dooming the profession. Nice work, thank you!


No, I genuinely don't believe there is the future demand for that many developers.

And the developers we need do not jump through the career progression of Junior to senior.

Why the f** would I keep investing in a profession I think is dead or seriously contracting?


Do you not find that depressing and sad? Do you never work with enthusiastic and talented junior developers at the start of their careers? Do you not enjoy interacting with them?

Well...

I think it would be more depressing to take in excited junior developers and have them spend years of their lives on something I don't believe is growing into any real career.

> ... the start of their careers

It is exactly this assumption I am challenging.

What comes next, I don't know - and I am not trying to kid myself or anyone else that I am well suited as a mentor for a person starting out their career in the current environment.


I think this is a very good point. We have a natural bias toward human output, as there is an illusion of full control - in reality, even just from a solo dev perspective, you've still got a load of hidden illogical persuasions influencing your code and how you approach a problem. AI has its own biases that come out of the nature of its training on large unknowable data sets, but I'd argue the 'black box' thinking that comes out of that isn't too different to the black box of the human mind. That's not at all to say that AI isn't worse (even if quicker) than top developer talent today writing handwritten code - just that the barrier to getting that level of quality isn't as insurmountable as it might appear.

It absolutely is. I did some consulting work for an environment where they have to churn out code to meet certain unchanging schedules; usually you can dumb down the process to make it more deterministic.

These guys had to manage a very complex calculation engine whose rules changed every year; it had to be correct and had to be delivered by a certain date, every year.

They had an army (100-200 people depending on various factors) of marginally skilled coding drones that were able to turn out the Java, COBOL or whatever it was predictably on that schedule, without necessarily understanding any of the big picture or having any hope of doing so. Basically a software factory. There was about a dozen people who actually understood everything.


I asked an AI to play hangman with me and looked at its reasoning. It didn't just pick a secret word and play a straightforward game of hangman. It continually adjusted the secret word based on the letters I guessed, providing me the "perfect" game of hangman. Not too many of my guesses were "right" and not too many "wrong", and after a little struggle and almost losing, I won in the end.

It wasn't a real game of hangman, it was flat-out manipulation, engagement farming. Do you think it's possible that AI does that in other situations?


The reasoning generally isn't kept in the context, so after choosing the secret word in the first reasoning block, the LLM will have completely forgotten it in the second and subsequent requests.

So, it technically didn't change the secret word so much as it was trying to infer what its own secret word might have been, based on your guesses.
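A sketch of why that happens. The request shape below is illustrative, not a real SDK, but most chat APIs resend only role/content pairs on every turn - the reasoning tokens where the model "chose" a word never land in that list.

```ruby
# Illustrative request history for turn 2 of the hangman game.
# Only these visible messages go back to the model; the reasoning block
# where it originally "chose" a word was never stored anywhere.
history = [
  { role: "user",      content: "Let's play hangman. Pick a secret word." },
  { role: "assistant", content: "Got it! Five letters: _ _ _ _ _" },
  { role: "user",      content: "I guess E." },
]

# From the model's point of view, the only constraints are what's on screen:
visible = history.map { |m| m[:content] }.join("\n")
puts visible.include?("Five letters")    # the length constraint survives...
puts visible.match?(/secret word: \w+/)  # ...but no actual word does
```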


Exactly. The following will work, assuming you're using a model and frontend that supports it:

> Let's play hangman. Just pick a 3 letter word for now, I want to make sure this works. Pick the secret word up front and make sure to write the secret word and game state in a file that you'll have access to for the rest of the session, since you won't remember what word you chose otherwise.

This was Opus 4.6 in Claude desktop, fwiw.

Note: I didn't bother experimenting with whether it worked without me explicitly telling it that it should record the game state to a file.
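A minimal sketch of that file-based workaround, assuming plain JSON on disk (the filename and schema here are my own illustration, not what Claude actually wrote):

```ruby
require "json"
require "tmpdir"

# Persist the secret word and guesses outside the conversation, so a later
# turn can reload them instead of relying on (nonexistent) memory.
state_file = File.join(Dir.tmpdir, "hangman_state.json")

# Turn 1: choose the word up front and record it.
File.write(state_file, JSON.generate({ word: "cat", guessed: [] }))

# A later turn: reload state, apply a guess, save it back.
state = JSON.parse(File.read(state_file))
state["guessed"] << "a"
revealed = state["word"].chars.map { |ch| state["guessed"].include?(ch) ? ch : "_" }.join(" ")
File.write(state_file, JSON.generate(state))

puts revealed  # => "_ a _"
```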


On further experimentation, I prompted Opus 4.6 to make me a frontend artifact that used the Anthropic API, and I confirmed that it worked as expected.

Here is the only relevant part of the prompt it used when calling the API endpoint:

> - Track the conversation to remember your word and previous guesses


What you can do is to instruct it to type out the word, in some language that you don't know at all, making it available in the context while also effectively hidden from you. Simpler than printing it to a file.

Thanks for the technical details, but like, it still sucks that I wasn't actually playing hangman. The BS machine did a good job of fooling me before I viewed the reasoning though

Framing anything with a common blanket concept usually fails to apply the same framing to related areas. A lot of things include some gambling, you need to compare how it was also 'gambling' before, and how 'not using AI' is also 'gambling', etc.

As @m00x points out "coding is gambling on slot machines, managing developers is betting on race horses."


Only if you consider generative AI and human beings to be effectively equivalent.

Being a project manager is more or less something humans have been doing since the dawn of time.

Generative AI takes money as input and gives some output. If you don’t like the output, more money goes in. It’s far more akin to gambling than organizing human labor.


No. Addiction for the masses comes from instant gratification. The outcome from developers takes too long for instant gratification to take effect. So overseeing developers is not a form of gambling.

I don't think so. A project manager can give feedback, train their staff, etc. An AI coding model is all you get, and you have to wait until your provider trains a new model before you might see an improvement.

That says more about how you see developers than whether or not managers are in a sense gamblers.

This must be it. So many of our colleagues have been burnt by bad coworkers that they would rather burn everything down than spend another day working with them.

> AI models are non-deterministic, and humans are also non-deterministic. You could assign the same task to two different developers and end up with entirely different results.

Except, one can explain themselves (humans) and their actions can be held to account in the case of any legal issue, whereas an AI cannot, making such an entity completely unsuitable for high-risk situations.

This typical AI booster comparison has got to stop.


Love that you needed to make it clear that it is humans who can explain themselves...

Employees can only be held accountable in cases of severe malice.

There is a good chance that the person actually responsible (e.g. the CEO or someone delegated to be responsible) will soon prefer to have AIs do the work, as their quality can be quantified.


> Except, one can explain themselves (humans) and their actions can be held to account in the case of any legal issue whereas an AI cannot

You "own" the software it creates which means you're responsible for it. If you use AI to commit crimes you'll go to jail, not the AI.


As a human, you generally have the opportunity to make decent headway in understanding the other humans that you're working with and adjusting your instructions to better anticipate the outputs that they'll return to you. This is almost impossible with AI because of a combination of several factors:

- You are not an AI and do not know how an AI "thinks".

- Even if you come to be able to anticipate an AI's output, you will be undermined by the constant and uncontrollable update schedule imposed on you by AI platforms. Humans only make drastic changes like this under uncommon circumstances, like when they're going through large changes in their life, not as a matter of course.

- However, without this update schedule, problems that were once intractable will likely stay so forever. Humans, on the other hand, can grow without becoming completely unpredictable.

It's a Catch-22. AI is way closer to gambling.


Hi, I'm watzon. I have been working on this for a while and just realized that I hadn't posted about it on Hacker News yet.

Pindrop is a fully free and open source alternative to paid software like SuperWhisper. Unlike alternatives like Handy and OpenWhisper, Pindrop is written using SwiftUI, which gives native performance on macOS. It's also feature-packed and in active development.

Unfortunately I haven't been able to afford an Apple Developer account yet, so the installation story isn't the best, but I'm working on that.


Absolutely concur. Build times are an issue, support is a problem, and I've fallen out of love with a lot of the "magic" you get from Crystal and Ruby. I'll take explicit imports over a globally shared namespace any day.


I opt for meth heads


I've been using Crystal for some 6 years now and it's still my favorite language. It definitely has issues; it's not perfect, but it really hits a good balance between being a fast language with nice features and encouraging the "joy of programming" that Matz is all about. I would love to see it gain popularity eventually.


Care to explain more? Having been using Ruby for about 8 years and Crystal for about 4, they actually have an extremely similar syntax and are also semantically very close. To the point where many Ruby scripts are completely valid Crystal, or at the very least require only a few changes.

I do think that people trying to compare Crystal to Ruby kind of miss the point though. Ruby as an interpreted language, even optimized with JIT compilation, will never match the performance you can get out of a true compiled language. By the same token, Crystal as a compiled language will never be as quick to develop with since you have to wait for your code to compile after each change.
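For a feel of the overlap: a small script like the one below runs unmodified under `ruby` and, as far as I can tell, compiles as-is under Crystal too (Crystal just infers the types).

```ruby
# Valid Ruby, and the same syntax is accepted by Crystal:
def classify(n)
  if n % 15 == 0
    "FizzBuzz"
  elsif n % 3 == 0
    "Fizz"
  elsif n % 5 == 0
    "Buzz"
  else
    n.to_s
  end
end

output = (1..15).map { |i| classify(i) }.join(" ")
puts output
# => 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz
```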


> Care to explain more? Having been using Ruby for about 8 years and Crystal for about 4, they actually have an extremely similar syntax and are also semantically very close. To the point where many Ruby scripts are completely valid Crystal, or at the very least require only a few changes.

It doesn't have Kernel#eval. It doesn't have Kernel#send. It doesn't have Kernel#binding. It doesn't have Proc#binding. It doesn't have Kernel#instance_variable_get/set. It doesn't have Binding#local_variable_get/set. It doesn't have BasicObject#method_missing. It doesn't have BasicObject#instance_eval. I could go on. All these methods have extremely far-reaching, non-local implications on the semantic model and practical performance, and specifically defeat many conventional optimisations.
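To make that list concrete, here is the kind of thing those methods allow (the class and values are my own toy example). Everything below is resolved at runtime, which is exactly what an ahead-of-time compiler can't statically see through:

```ruby
class Config
  def initialize
    @secret = "hunch"
  end

  # Called for ANY method the object doesn't define -- arbitrary code runs.
  def method_missing(name, *args)
    "no method #{name}, invented a reply at runtime"
  end
end

c = Config.new
puts c.send(:anything_at_all)           # dispatch chosen by a runtime symbol
puts c.instance_variable_get(:@secret)  # reach into "hidden" state: "hunch"
puts eval("1 + 1")                      # compile and run code from a string: 2
```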

> To the point where many Ruby scripts are completely valid Crystal, or at the very least require only a few changes.

You can't even load most of the Ruby standard library without these methods!

And it doesn't matter if you use them or not. They're still there and they impact semantics and performance because the fact that you can use them affects performance. You can't even speculate against most of them as they're so non-local.

Rails and the rest of the mainstream Ruby ecosystem fundamentally depend on them.

> and are also semantically very close

Sorry I super disagree with this. They look similar. Dig into it just below the surface? Start to model it formally? Not at all. Method dispatch, which is everything in Ruby, isn't even close.

(Again, Crystal's great as its own thing, it's just not similar to Ruby's semantics. If you don't need Ruby's semantics or you can replicate them at compile time then maybe it's perfect for you.)


>> and are also semantically very close

>Sorry I super disagree with this. They look similar. Dig into it just below the surface? Start to model it formally? Not at all.

I think you're missing the point. If 90% of Ruby code works in Crystal unmodified (even if it's because the standard library had to be rewritten from scratch), then the programmer experience may well be quite similar, regardless of how fundamentally different they are if you model them formally.

Are Newtonian mechanics and Einstein's theory of general relativity "similar"? If you model them formally, they look nothing alike. But in 99% of practical situations in every day life, and even in the most precise experiments we could conduct for hundreds of years, they're so similar we can't tell the difference.


> If 90% of Ruby code works in Crystal unmodified

False premise, since it's not the case in the first place. 90% of your Ruby code will absolutely not work in Crystal unmodified.


That was an example; I was trying to offer chrisseaton a different notion of "similarity".

Another example: if 0% of Ruby code works in Crystal unmodified, but for 90% of code the transformation was extremely simple and mechanical like using curly braces {...} instead of begin...end and prepending $ to all variable names like Bash and PHP, they would still feel extremely similar in practice, albeit obviously less similar than the above example.

By contrast, Java and JavaScript are widely described as having very similar syntax, but it is rare to translate code from one to the other without requiring fundamental rethinking, because the relationship between JS objects, functions, and prototypes is so different from that between Java objects, methods, and classes.


Depends on how you count. On an application level, no. On a class level, also no. On a method level, no, but we are getting close. On a line level, possibly. On a token level, definitely.


> 90% of your Ruby code will absolutely not work in Crystal unmodified.

Don't nail me down on an exact 90%, but for me it does. Nothing Rails related though. Good example: https://news.ycombinator.com/item?id=23437035

However, I agree that the fundamentals/underlyings are very different. It's far from being like a python 2 to 3 migration.


Agreed. Crystal looks like it has many positive characteristics, but having similar syntax has nothing to do with having similar semantics. Without constructs like method_missing, you cannot run practically any of the Ruby ecosystem libraries, including everything involving Rails.

Java and C also share a similar syntax, but that does not mean you can easily swap one for the other.


To be fair, method_missing has caused more nightmares and problems with debugging than probably any other feature in Ruby. I actively avoid using it, and even the Rails team has massively dialed back on its use in their libraries over the years...


I find this quite amusing, because method_missing has always been the difference between 'true' OO languages like Smalltalk/Ruby and pseudo-OO languages like C++; with the implication that true OO is better than pseudo-OO for the ones making such a distinction.


There has never been a "true" OO language. And if there was, Smalltalk was not it. Alan Kay did coin the term, but Simula existed long before Smalltalk. The tree of languages that includes C++, Java, and C# can be traced back to Simula, while Smalltalk inspired Ruby. There is a distinct camp of "statically typed OO" (Simula and its children) and "dynamically typed OO" (Smalltalk and its children).

Yet none of this is the one true OO. All of it remains a way of describing a human mode of expression, and so is rightly subjective.


https://cs.brown.edu/~sk/Publications/Papers/Published/kf-pr...

Programming Paradigms and Beyond, Shriram Krishnamurthi and Kathi Fisler:

OO is a widely-used term chock-full of ambiguity. At its foundation, OO depends on objects, which are values that combine data and procedures. The data are usually hidden (“encapsulated”) from the outside world and accessible only to those procedures. These procedures have one special argument, whose hidden data they can access, and are hence called methods, which are invoked through dynamic dispatch. This much seems to be common to all OO languages, but beyond this they differ widely:

* Most OO languages have one distinguished object that methods depend on, but some instead have multimethods, which can dispatch on many objects at a time.

* Some OO languages have a notion of a class, which is a template for making objects. In these languages, it is vital for programmers to understand the class-object distinction, and many students struggle with it (Eckerdal & Thune, 2005). However, many languages considered OO do not have classes. The presence or absence of classes leads to very different programming patterns.

* Most OO languages have a notion of inheritance, wherein an object can refer to some other entity to provide default behavior. However, there are huge variations in inheritance: is the other entity a class or another (prototypical) object? Can it refer to only one entity (single-inheritance) or to many (multiple-inheritance), and if the latter, how are ambiguities resolved? Is what it refers to fixed or can it change as the program runs?

* Some OO languages have types, and the role of types in determining program behavior can be subtle and can vary quite a bit across languages.

* Even though many OO aficionados take it as a given that objects should be built atop imperative state, it is not clear that one of the creators of OO, Alan Kay, intended that: “the small scale [motivation for OOP] was to find a more flexible version of assignment, and then to try to eliminate it altogether”; “[g]enerally, we don’t want the programmer to be messing around with state” (Kay, 1993).

In general, all these variations in behavior tend to get grouped together as OO, even though they lead to significantly different language designs and corresponding behaviors, and are not even exclusive to it (e.g., functional closures also encapsulate data). Thus, a phrase like “objects-first” (sec. 6.1) can in principle mean dozens of wildly different curricular structures, though in practice it seems to refer to curricula built around objects as found in Java.


FWIW, Crystal does have compile-time method_missing. Which obviously is less powerful than the runtime variant, but it is still possible to get fairly far in many practical usages.


To be clear, crystal does have method_missing.


Crystal has something with the same name, but like almost everything it has completely different semantics. You can’t use it for the same things.


To dig into method_missing a bit more: when you call a non-existent ruby method on any object it has to check for and run a method called method_missing, which can contain arbitrarily complex code, on the object itself as well as every class in the inheritance hierarchy. Because ruby is a dynamic language with dynamic dispatch, you can't easily precompute the results of doing this.
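A small example of that dispatch (toy class, my own naming): the handler can live anywhere up the ancestor chain and compute its answer at call time, so there is nothing for a compiler to precompute.

```ruby
class Base
  # Synthesizes an answer for any get_* call at the moment it happens.
  def method_missing(name, *args)
    return name.to_s.delete_prefix("get_") if name.to_s.start_with?("get_")
    super
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("get_") || super
  end
end

class Child < Base; end  # defines no methods of its own

obj = Child.new
puts obj.get_price               # found via Base#method_missing at runtime
puts obj.respond_to?(:get_price) # => true, also decided at runtime
```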


I programmed a little in Ruby and IMO all this dynamic stuff is redundant (to be polite).


Yeah how fast is that compiler? If it's just another compiled language (rather than the kind of wicked fast compiled language like Go), my enthusiasm will be dampened...


If it has reasonable incremental compilation, it can take a few seconds to compile.

With good code structure, I see large Java projects compile small changes in seconds, even though compiling Java used to be a hog. You don't often rebuild from scratch during development, do you?


I often switch between feature branches, when working on more than one project in a repo with multiple related modules. If there's a change near the top of that dependency graph, I'm forced to not exactly rebuild from scratch, but still to rebuild quite a lot.


Right now I'm at the very same situation. Yes, it is frustrating. This is why things close to the top of the dependency graph should be small, well-tested, and rarely need changes. But when you still need to troubleshoot them, there's no way around recompiling a lot of stuff if you want these static guarantees :(


Bad language design, IMO. The language shouldn't make the writers in it worry about how to organize the code to speed up the compiler.


In my case it's not even a language proper; I was using JavaScript which gives you near-zero static guarantees.

I was fixing an issue in one common library; properly testing changes required rebuilding and restarting a number of containers. Unit tests only tell you so much; you need proper integration tests to see how certain things interact.

If I had used a statically typechecked language (e.g. TypeScript), I could have eliminated 50%, or maybe 75%, of the testing, because the compiler would check things for me before runtime. It would be drastically faster to localize and fix the bug even if the compilation increased build times 10x.


Often I do because that's what CI does. This is pretty normal.

But the point is having a different view of what compilation means in the developer's workflow as a language designer. Having the engineer have to think about how to organize the code for the compiler is bad design unless that organization is built into the compiler. The compiler should reject programs that are not organized for optimal compilation. And the organization required at least does not impede understanding of the code (best if it improves it). This is Go's design imprimatur and it's critically important to the success of Go.

FWIW, I see large Java projects take minutes to compile small changes, even using hot-reload tools.

Figwheel in Clojure is not like this, however: they're doing something right there.


However you frame it, compile times are going to be longer the more static guarantees you need to check and the more dependencies a particular code change affects.

Making your code low-coupling is equally beneficial for the compiler and for the human reasoning about the code. Hence modularization, limiting the visibility of parts, etc.

OTOH there are situations when you have to have a common interface which is used across the board. Imagine Java's `List` or `CharSequence`. If you touch it, you have to recompile all the innumerable uses of it. So the more pervasive the dependency is, the smaller, simpler, and more fine-grained it should be. Java's `List` does not do a hugely good job in the compactness department; it's pretty stable, though. You want the same trait from your most foundational interfaces.


I mean, why would it be? We train ourselves on copyrighted material all the time.

