rotcev's comments

This is exactly what I first thought. “The user appears to be attempting to decode my previous thought process, …” The question is whether the model will be able to internalize this in a way that is undetectable to the aforementioned technique.

PS: Just to be clear: even the most expensive humans are unreliable, make stupid mistakes, and their output MUST be reviewed carefully, so you’re not any different either. You’re just a random next-thought generator based on neuron-firing distributions with no real thought process, trained on a few billion years of evolution like all other humans.

Looks like you have either not worked with any humans or not worked with an LLM; otherwise, arriving at such a conclusion is damn near impossible.

The humans I did work with were very, very bright. No software developer in my career ever needed more than a paragraph in a JIRA ticket for the problem statement. They figured out domains that were not even theirs to begin with without making any mistakes, and not only identified edge cases but sometimes actually improved the domain processes by suggesting what was wasteful and what could be done differently.


I think you are very fortunate. I have worked with plenty of software developers like that; in fact, the overwhelming majority of them have been like that.

Then again, the other possibility could be that I was not the smartest person in the room.

And yes, there were always incompetent folks, but those were steered by smarter ones to contain the damage.


I can't tell if you're joking...

I have worked with people like this frequently. The ones you're always happy to see on the team.

Also worked with people who were frustrated that they had to force-push git to "save" their changes. Honestly, a token box I can just ignore would be an upgrade over this half of the team.


I and everybody else here call BS on that. People make mistakes all the time. Arguably at similar or worse rates.

Uhh, what? I speak to LLMs in broken English with minimal details, and they figure it out better than I would have if you'd told me the same garbage.

> The humans I did work with [...] figured out domains that were not even theirs to begin with without making any mistakes

Seriously? I would like to remind you that every single mistake in history until the last couple of years has been made by humans.


Holy shit, you've never worked with anyone who made ANY mistakes? You must be one of those 10x devs I hear about. Wow, cool, please stay away from my team.

They're not, but all of their colleagues are.

I'm still not sure what people who declare that human cognition is equivalent to a large language model think they are contributing to the conversation when they do so.

Never mind the fact that they are literally able to introspect human cognition and presumably find non-verbal and non-linear cognition modes.


> Never mind the fact that they are literally able to introspect human cognition and presumably find non-verbal and non-linear cognition modes.

Are they, though? Or are they just predicting their own performance (and an explanation of that performance) on input the same way they predict their response to that input?

Humans say a lot of biologically implausible things when asked why they did something.


I said introspect, not talk about introspection.

Humans can be held accountable. States have not yet shown the will to hold anyone accountable for LLM failures.

They are tools. You hold the human using it accountable. If that means it's the executive who signed the PO, so be it.

Until LLMs, I'd never in my life heard someone suggest we lock up the compiler when it goofs up and kills someone, but now, because the compiler speaks English, we suddenly want to let people use it as a get-out-of-jail-free card when they use it to harm others.


You're free to hold an LLM accountable in the exact same way: fire it if you don't like its work.

Giving something that has no internal concept of time (or identity for that matter) a prison sentence of n years seems kinda ineffectual.

Prison sentence? For writing sloppy code? Now that's an interesting idea...

“Generate 100,000 tokens about why you feel bad.” :P

As fallible as they may be, I've never had a next-thought generator recommend me glue as a pizza ingredient.

No big brother or big sister?

Are you making the pizza for eating or for menu photography? I seem to recall glue being used in menu photography ‘food’ a lot.

You must not have kids

But once a human learns a function, their errors become more predictable. And they can predict their own errors before an operation and escalate or seek outside review/advice.

For example, ask any model: "Which classes of problems and domains do you have a high error rate in?"


Amusing and directionally correct, but as random next-thought generators connected to a conscious hypervisor with individual agency,* humanity still has a pretty major leg up on the competition.

*For some definitions of individual agency. Incompatibilists not included.


Equating human thought to matrix multiplication is insulting to me, you, and humanity.

I hate that I agree with you. But there's a difference between whether AI is as powerful as some say, and whether it's good for humanity. A cursory review of human history shows that some revolutionary technologies make life as a human better (fire, writing, medicine) and others make it worse (weapons, drugs, processed foods). While we adapt to the commoditization of our skills, we should also be questioning whether the technologies being rolled out right now are going to do more harm than good, and we should be organizing around causes that optimize for quality of life as a human. If we don't push for that, then the only thing we're optimizing for is wealth consolidation.

Errr... No. Please take this bullshit propaganda to a billionaire's Twitter feed.

I use o3-pro not as a coding model but as a strategic assistant. For me, the long delay between responses makes the model unsuitable for coding workflows; however, it is actually a feature when it comes to getting answers to hard questions affecting my (or my friends'/family's) day-to-day life.


This is the first article I’ve come across that truly utilizes LLMs in a workflow the right way. I appreciate the time and effort the author put into breaking this down.

I believe most people who struggle to be productive with language models simply haven’t put in the necessary practice to communicate effectively with AI. The issue isn’t with the intelligence of the models—it’s that humans are still learning how to use this tool properly. It’s clear that the author has spent time mastering the art of communicating with LLMs. Many of the conclusions in this post feel obvious once you’ve developed an understanding of how these models "think" and how to work within their constraints.

I’m a huge fan of the workflow described here, and I’ll definitely be looking into Aider and repomix. I’ve had a lot of success using a similar approach with Cursor in Composer Agent mode, where Claude 3.5 Sonnet acts as my "code implementer." I strategize with larger reasoning models (like o1-pro, o3-mini-high, etc.) and delegate execution to Claude, which excels at making inline code edits. While it’s not perfect, the time savings far outweigh the effort required to review an "AI Pull Request."

Maximizing efficiency in this kind of workflow requires a few key things:

- High typing speed – Minimizing time spent writing prompts means maximizing time generating useful code.

- A strong intuition for "what’s right" vs. "what’s wrong" – This will probably become less relevant as models improve, but for now, good judgment is crucial.

- Familiarity with each model’s strengths and weaknesses – This only comes with hands-on experience.

Right now, LLMs don’t work flawlessly out of the box for everyone, and I think that’s where a lot of the complaints come from—the "AI haterade" crowd expects perfection without adaptation.

For what it’s worth, I’ve built large-scale production applications using these techniques while writing minimal human code myself.

Most of my experience using these workflows has been in the web dev domain, where there's an abundance of training data. That said, I’ve also worked in lower-level programming and language design, so I can understand why some people might not find models up to par in every scenario, particularly in niche domains.


> “I appreciate the time and effort the author put into breaking this down.”

Let’s be honest. The author was probably playing Cookie Clicker while this article was being written.


You might be interested in clojure.spec [1] too :)

[1] https://clojure.org/guides/spec
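
For anyone who hasn't tried it, here's a minimal sketch of what spec usage looks like (the ::layer shape and its keys are made up purely for illustration):

    (require '[clojure.spec.alpha :as s])

    ;; Hypothetical specs, just to show the shape of the API.
    (s/def ::weights (s/coll-of number?))
    (s/def ::biases  (s/coll-of number?))
    (s/def ::layer   (s/keys :req-un [::weights ::biases]))

    (s/valid? ::layer {:weights [0.2 -0.5] :biases [0.1]}) ;; => true
    (s/explain ::layer {:weights [:oops]})                 ;; prints what failed and why

The nice part is that specs are ordinary data checked at runtime, so you can validate the shapes flowing through a numeric pipeline without committing to a static type system.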


From personal experience, a functional programming style helps one reason about the math used in artificial intelligence. This might be why Lisp is considered one of the original AI languages. The power of being able to express networks purely as lists of numbers is amazing, in my opinion.
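
A toy sketch of that idea in Clojure (the weights, biases, and tanh activation below are arbitrary choices, just to show the shape of the thing):

    ;; A layer as plain data: one list of weights per neuron, plus biases.
    (def layer {:weights [[0.2 -0.5]
                          [0.7  0.1]]
                :biases  [0.1 -0.3]})

    (defn dot [xs ys]
      (reduce + (map * xs ys)))

    (defn forward
      "Apply one layer to an input vector."
      [{:keys [weights biases]} input]
      (mapv (fn [ws b] (Math/tanh (+ (dot ws input) b)))
            weights biases))

    (forward layer [1.0 2.0]) ;; => the activations, themselves just a list of numbers

A whole network is then just a list of such layers, and a forward pass is a reduce over it.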

Correct me if I'm wrong, but I actually think the reason LISP was created was for AI.


