
It's gotten more and more shippable, especially with the latest generation (Codex 5.1, Sonnet 4.5, now Opus 4.5). My metric is "wtfs per line", and it's been decreasing rapidly.

My current preference is Codex 5.1 (Sonnet 4.5 a close second, though it got really dumb today for "some reason"). It's been good to the point where I've shipped multiple projects with it without a problem (e.g. https://pine.town, which I made without writing any code myself).



I feel it sometimes tries to be overly correct, like using BigInts when working with offsets in big files in JavaScript. My files are big, but not 53-bits-of-mantissa big, and none of the file APIs work with BigInts. This was from Gemini 3 thinking, btw.
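For reference, JavaScript numbers are IEEE-754 doubles, the same representation as Python floats, so the safe-offset ceiling can be shown in Python: plain numbers are exact up to 2**53 - 1, which is roughly 9 petabytes of file offset.

```python
# JS Number.MAX_SAFE_INTEGER == 2**53 - 1 == 9007199254740991 (~9 PB as a byte offset)
MAX_SAFE = 2**53 - 1

# Every integer offset up to the ceiling is exactly representable as a double...
assert float(MAX_SAFE) == MAX_SAFE

# ...but past 2**53, consecutive integers collapse to the same double,
# which is the only regime where BigInt offsets would actually be needed.
assert float(2**53) + 1 == float(2**53)
```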


I just whack-a-mole these things in AGENTS.md for a while until it codes more like me.


Coding LLMs were almost useless for me until my AGENTS.md crossed some threshold of completeness; now they are mostly useful. I now curate multiple different markdown files in a /docs folder that I add to the context as needed. Any time the LLM trips on something and we figure it out, I ask it to document its learnings in a markdown doc, and voila, it can do it correctly from then on.
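As an illustration, one of those "learned the hard way" doc entries might look something like this (contents entirely hypothetical, echoing the BigInt complaint upthread):

```markdown
## File offsets
- Use plain `number` for file offsets; our files never approach 2**53 bytes,
  and the file APIs we call don't accept `BigInt`.
- If an offset could plausibly exceed `Number.MAX_SAFE_INTEGER`, discuss first
  instead of silently switching types.
```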


> https://pine.town

how many prompts did it take you to make this?

how did you make sure that each new prompt didn't break some previous functionality?

did you have a precise vision for it when you started or did you just go with whatever was being given to you?


Judging by the site, they don't have insightful answers to these questions. It's broken with weird artifacts, errors, and amateurish console printing in PROD.

https://i.ibb.co/xSCtRnFJ/Screenshot-2025-11-25-084709.png

https://i.ibb.co/7NTF7YPD/Screenshot-2025-11-25-084944.png


I definitely don't have insightful answers to these questions, just the ones I gave in the sibling comment an hour before yours. How could someone who uses LLMs be expected to know anything, or even be human?

Alas, I did not realize I was being held to the standard of having no bugs under any circumstance, and printing nothing to the console.

I have removed the amateurish log entries, I am pitiably sorry for any offense they may have caused. I will be sure to artisanally hand-write all my code from now on, to atone for the enormity of my sin.


It also doesn't seem to work right now.


Yeah, all of the above was a single bug in the plot allocation code: the exception that handled the transaction rollback had the wrong name. It's working again.


> how many prompts did it take you to make this?

Probably hundreds, I'd say.

> how did you make sure that each new prompt didn't break some previous functionality?

For the backend, I reviewed the code and steered it to better solutions a few times (fewer than I thought I'd need to!). For the frontend, I only tested and steered, because I don't know much about React at all.

This was impossible with previous models, I was really surprised that Codex didn't seem to completely break down after a few iterations!

> did you have a precise vision

I had a fairly precise vision, but the LLM made some good contributions. The UI aesthetic is mostly the LLM's, as I'm not very good at that. The UX and functionality are almost entirely me.


did you not run into this problem described by Ilya below?

https://www.youtube.com/watch?v=aR20FWCCjAs&list=PLd7-bHaQwn...

this has been my experience purely vibecoding. I am surprised it works well for others.

btw, the current production bug: how did you discover it, and why did it slip out? It looks like the site wasn't working at all when you posted that comment?


> did you not run into this problem described by Ilya below

I used to run into a related issue, where fixing a bug would add more bugs, to the point where it would not be able to progress past a given codebase complexity. However, Codex is much better at not doing that. There are some cases where the model kept going back and forth between two bugs, but I discovered that that was because I had misunderstood the constraints and was telling the model to do something impossible.

> how did you discover that and why did it slip out?

Sentry alerted me but I thought it was an edge case, and I didn't pay attention until hours later.

I use a spiral algorithm to allocate plots, so new users are clustered around the center. Sometimes plots are emptied (when the user isn't active), so there can be gaps in the spiral, which the algorithm tries to fill; it's meant to move on to the next plot if the current one can't be assigned.
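The site's actual allocation code isn't shown; a center-out spiral walk of the kind described might be sketched like this (a generic generator, not the author's implementation):

```python
from itertools import islice

def spiral():
    # Yield grid coordinates spiraling outward from the origin,
    # so the earliest plots cluster around the center.
    x = y = 0
    dx, dy = 0, -1
    while True:
        yield (x, y)
        # Turn at the corners of the current ring.
        if x == y or (x < 0 and x == -y) or (x > 0 and x == 1 - y):
            dx, dy = -dy, dx
        x, y = x + dx, y + dy

list(islice(spiral(), 5))  # first five plot coordinates, starting at the center
```

An allocator would walk this sequence, skipping coordinates whose plot is occupied, which is exactly where gaps get refilled.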

For one specific plot, however, conditions were such that the database was returning an integrity error. The exception-handling code that was supposed to handle it didn't take into account that it needed to roll back before resuming, so the entire request failed instead of resuming gracefully. Just adding an atomic() context manager fixed it.

> looks like site wasn't working at all when you posted that comment?

It was working for a few hundred (thousand?) visitors; then the allocation code hit the plot that triggered the bug, and signups couldn't proceed after that.


> Just adding an atomic() context manager fixed it.

OK, looks like you are intimately familiar with the code being produced and are using AI as a code generator rather than pure vibe coding. That makes sense to me.

Btw, did the AI add that line when you explained what the error was, or did you add it manually?


No, I paste the trace back, ask it to explain the error, judge whether it makes sense, and either ask it to fix it or say "that makes no sense, please look again/change the fix/etc".

Specifically here, the AI added that line.


It's not really any different in my experience


Stochastic parrot? Autocomplete on steroids? Fancy autocorrect? Bullshit generator? AI snake oil? Statistical mimicry?

You don't hear that anymore.

Feels like a whole generation of skeptics evaporated.


I certainly still hold those opinions, because the models have yet to prove they are worth a person's time. I don't bother posting that because there's no way an AI hype person and I are ever going to convince each other, so what's the point?

The skeptics haven't evaporated, they just aren't bothering to try to talk to you any more because they don't think there's value in it.


So you don't even try LLMs regularly?

And what about everything else regarding ML progress, like image generation, 3D world generation, etc.?

I've vibe-coded plenty of small things I never had the time for. Don't you have anything you've wanted to build that would fit in a single-page HTML application? It can even use local storage, etc.


[flagged]


This is why they don't talk to you anymore. The only comparison you can make to a flat earther is that you think they're wrong, and flat earthers are also wrong. It's just dumb invective, and people don't like getting empty insults. I prefer my insults full.


The earth is flat until you have evidence to the contrary, and it's you who should provide that evidence. We had physics, navigation, and then space shuttles that clearly showed the earth is not flat.

We have yet to see a fully vibe-coded piece of software that actually works. The blog post is actually great, because LLMs are very good at regurgitating pieces of code that already exist given a single prompt. Now ask them to make a few changes, and suddenly the genie is back in the bottle.

Something doesn't math out. You can't be both a genius and extremely dumb (retarded) at the same time. You can, however, be good at information retrieval and at presenting it in a better way. That's what LLMs are, and I'm not discounting the usefulness of that.


>You can't be both a genius and extremely dumb (retarded)

That’s actually a classic stereotype: someone being a genius in one area but failing at the most basic social expectations in another.


In the LLM's case it's both a genius and extremely dumb at coding. The same area.


It’s good at quickly producing a lot of code which is most likely going to give interesting results, and it’s completely unaware of anything, including why a human might want to produce code.

The marketing bullshit that it’s "thinking" and "hallucinating" just brings the intended confusion to the table.

They are great tools for many purposes. But a GPS is not a copilot, and an LLM is not going to replace coworkers where their humanity matters.


I mean, is it really that interesting if it completely falls flat and permanently runs in unfulfilling circles around basically any mild complexity the problem introduces as you get further along, making it really hard not to feel like you should just do it yourself?


I would say yes, it’s an interesting tool.

For one thing, it’s far more interesting than a rubber duck in many cases. Of course, on that matter, in the end it’s about framing the representation adequately and entering a fictional dialog.


The original post alone mentions multiple projects and links https://pine.town as one with no code directly written by the author.

From the perspective of using it daily myself, and seeing what my team uses it for, it's quite shocking to still see those kinds of comments; it's like we're living on different planets. Again, it gives flat-earther vibes.


god you are so annoying. the site that you posted doesn't even work. so wtf are you even gloating about.


We're living in such interesting times - you can talk to a computer and it works, in many cases at extraordinary level - yet you still see intellectually constipated opinions arguing against basic facts established years ago - incredible.


at least you are self-aware


It has been an interesting experience, like trolling except you actually believe what you're saying. I wonder how you arrived at it: is it fear, insecurity, ignorance, feelings of injustice, or maybe something else? I wonder what bothers you about LLMs?


I think the stochastic part is true and useless. It can be applied to anyone or anything. Yes, the models give you probabilities, but any algorithm gives you probabilities (only zero or one for deterministic ones). You can definitely view the human mind as a complex statistical model of the world.

Now, that being said, do I think they are as good as a skilled human at most things? No, I don't. My trust issues have increased since the GPT-5 presentation: the very first question was meant to showcase its "PhD-level" knowledge, and it gave a wrong answer. It just happened to be in a field I know enough about to notice, but most people wouldn't.

So, while I think they can be considered as having some form of intelligence, I believe they have more limits than a lot of people seem to realise.


> Feels like whole generation of skeptics evaporated.

https://www.youtube.com/watch?v=aR20FWCCjAs&list=PLd7-bHaQwn...

Ilya Sutskever this week.


Have you also watched the remaining 1h 36m, or just those 30 out-of-context seconds?


have you ever made a non-annoying comment?


Yes. You should try rational thinking; it will help you make better judgements about reality.


Maybe your bubble flew away from those voices? I see them all the time, and am glad.


Still haven't seen anything proving it isn't autocomplete on steroids or statistical mimicry.


It is all those things.

The Bitter Lesson is that, with enough VC-subsidised compute, those things are useful.


Those echoes have grown louder over the past year or so. The only way you've heard less of them is if you buried your head in the sand.


It is all those things. It consistently fails to make truly novel discoveries, everything it does is derived from something it trained on from somewhere.

No point in arguing about it though with true believers, they will never change their minds.


Have you tried Gemini 3 yet? I haven't done any coding with it, but on other tasks I've been impressed compared to GPT-5 and Sonnet 4.5.


It's very good, but it feels kind of off the rails compared to Sonnet 4.5. At least in Cursor it does strange things, like putting its reasoning in comments about 15 lines long, deleting 90% of a file for no real reason (especially when context is reaching capacity), and making the same error I just told it not to make.


The computer science field is going to be an absolute shitshow within 5 years (it already kind of is). On one side you'll have ADHD dog-attention-span zoomers trying out all these nth-party model APIs and tools every 5 seconds (switching them like socks, insisting the latest one is better, but ultimately producing the same slop), and on the other side you'll have all these applied-math gurus squeezing out the last bits of usable AI compute on the planet... and nothing else.

We used to joke that "The internet was a mistake.", making fun of the bad parts... but LLMs take the fucking cake. No intelligent beings, no sentient robots, just unlimited amounts of slop.

The tech basically stopped evolving right around the point of being good enough for spam and slop, and it isn't going any further: there are no cures, no new laws of physics or math, nothing else being discovered by these things. All the AI use in science I can see is based on finding patterns in data, not intelligent thought (as in novel ideas). What a bust.


Completely disagree. What I see agentic coding tools do in combination with LLMs is seriously mind-blowing. I don't care how much knowledge is compressed into an LLM; what is way more interesting is what it does when it's missing some knowledge. I see it come up with a plan to create the knowledge by running an experiment (running a script, sometimes asking me to run a script or try something), evaluating the output, and then replanning based on the output. Full Plan-Do-Check-Act. Finding answers systematically to things you don't know is way more impressive than remembering lots of stuff.
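A toy sketch of that plan-do-check-act loop, with every name hypothetical: `llm` stands in for the model call and `run_tool` for whatever executes the proposed action.

```python
def agent_loop(task, llm, run_tool, max_steps=10):
    # Plan-Do-Check-Act: the model proposes an action, we execute it,
    # and the observation is fed back so the model can replan.
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action = llm("\n".join(history))                       # Plan
        observation = run_tool(action)                         # Do
        history.append(f"Did: {action}\nSaw: {observation}")   # Check
        if "DONE" in observation:                              # Act: stop or replan
            break
    return history
```

The interesting part is exactly the feedback edge: the observation, not just the stored knowledge, drives the next plan.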


I don't see a big difference from humans; we say many unreasonable things too, so validation is necessary. Whether you use the internet, books, or AI, it is your job to test their validity. Anything can be bullshit, written by human or AI.

In fact, I fear humans optimize for attention and cater to the feed-ranking algorithm too much, while the AI is at least trying to do a decent job. But with AI it is the responsibility of the user to guide it; what the AI does depends on what the user does.


There are some major differences, though. Without these tools, individuals are pretty limited in how much bullshit they can output, for many reasons, including that they are not mere digital puppets with no need to survive in society.

It’s clear that pro-slavery-minded elitists are happy to sell the line that people should become a "good complement to AI", even more disposable than these puppets. But unlike these mindless entities, people have the will to survive deeply engraved as a primary behavior.


Humans can output serious amounts of unproven bullshit, e.g., 3000 incompatible gods and all the religions that come with them...


Sure, but that’s not raw individual output from mere direct utterance capacity.

Now anyone mildly capable of using a computer can produce more fictional characters than all of humanity collectively kept in its miscellaneous lores, and drown them in an ocean of insipid narratives. All of it nonetheless mostly passes the grammatical checkboxes at a level most humans would fail (I definitely would :D).


How many individuals were involved and over how many years?


Why does it matter? If you consider not just the people creating these hallucinations but also the people accepting and using them, it must be billions and billions...


And that's the point. You need a critical mass of people buying into something. With LLMs, you just need ONE person with ONE model and modest enough hardware.


https://chat.mistral.ai/chat/8b529b3e-337f-42a4-bf36-34fd9e5...

>Here’s a concise and thoughtful response you could use to engage with ako’s last point:

---

"The scale and speed might be the key difference here. While human-generated narratives—like religions or myths—emerged over centuries through collective belief, debate, and cultural evolution, LLMs enable individuals to produce vast, coherent-seeming narratives almost instantaneously. The challenge isn’t just the volume of ‘bullshit,’ but the potential for it to spread unchecked, without the friction or feedback loops that historically shaped human ideas. It’s less about the number of people involved and more about the pace and context in which these narratives are created and consumed."


But people post on social networks, blogs, newspapers, and other widely read places, while LLMs post most of their output in chat rooms with one reader.


No, the web is now full of this bot generated noise.

And even considering only the tools used in isolated sessions not exposed by default, the most popular ones are tuned to favor engagement and retention over relevance. That's a different point, as LLMs can definitely be tuned in other directions, but in practice it does matter in terms of social impact at scale. By now, even prime-time infotainment has covered people falling in love with chatbots or being encouraged into suicidal loops. "You're absolutely right" is not always the best.


The worst part is when the AI spits out dogshit results: people show up at light speed in the comments to say "you're not using it right" / "try this other model, it's better".

Anecdotally, the people I see most excited about AI are the people who don't do any fucking work. I can create a lot of value with plain ol' for-loop-style automation in my niche. We're still nowhere near the limit of what we can do with automation, so I don't give a fuck about what AI can do. Bruh, in Windows 10 copy and fucking paste doesn't work for me anymore, but instead of fixing that they're adding AI.


LLMs help a lot of users with things like writing for loops. At least that's been the case for me: I'd never tried PowerShell before, but with a bit of LLM guidance I was able to cobble together some useful (for me) one-liners to do things like "use this CSV of file names and pixel locations, and make cropped PNG thumbnails of those locations from these images".

Stuff like that, which regular users often do by hand, they can now ask an LLM for: usually just a few lines of a scripting language, if only they know the magic words to use.
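The commenter's PowerShell isn't shown; a Python sketch of the same task's first half, turning the CSV rows into crop boxes, might look like this (the column layout and thumbnail size are made up, and the actual cropping would be handed to an imaging library, e.g. Pillow's `Image.crop`):

```python
import csv
import io

THUMB = 64  # hypothetical thumbnail edge length, in pixels

def crop_boxes(csv_text):
    # Rows of "filename,x,y" -> (filename, (left, top, right, bottom)),
    # the box format an imaging library's crop call expects.
    boxes = []
    for name, x, y in csv.reader(io.StringIO(csv_text)):
        x, y = int(x), int(y)
        boxes.append((name, (x, y, x + THUMB, y + THUMB)))
    return boxes

crop_boxes("a.png,10,20\n")  # -> [("a.png", (10, 20, 74, 84))]
```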


The only people I see complaining about AI are those that have the most to lose.


Using it isn't optional, though; it's forced through corporate policy. If my boss would shut up about it, that would be enough for me.


My wife and I are both paid to work on AI products, and in fact we both think the whole thing's only sorta useful. Not nothing, but... not that much, either.

I’m not worried about AI taking our jobs. I’m worried about the market crash when the reality of the various initiatives the two of us have been working on non-stop since this shit started (the ones that failed to actually reduce payroll, or that would have been cheaper and better without AI) breaks through the investment hype and the music stops.


The LLM only reflects the input it's fed. If the results are unintelligent, then so is the input.


It's been three years of amazing use cases and discoveries, and in those same years we got things like Ozempic. You can be skeptical of all the hyped, possibly exaggerated claims without negating the good side.


The patent for Ozempic was filed nearly 20 years ago: https://patents.google.com/patent/US8129343B2/en?oq=US812934...

Ozempic’s FDA approval was in 2017, the same year transformers were invented.

Whatever you can lay at the feet of LLMs, GLP-1s aren't one of them.


Ozempic has nothing to do with LLMs, so I'm a bit confused about the point you're making here?


My chatbot told me that chatbots invented drugs.


Only a tiny bit, but I should. When you say GPT-5, do you mean 5.1? Codex or regular?


Sorry, yeah, 5.1 regular chatbot.


Ahh, try 5.1 Codex (with codex cli), it's much better, I've found.


IMO, don't waste your time coding with Gemini 3. It's perhaps worth it for something Claude isn't helping with, as Gemini 3's reasoning is supposedly very good.


Maybe the wtfs per line are decreasing because these models aren't saying anything interesting or original.


No, it's because they write correct code. Why would I want interesting code?


Oh, my bad. I still had someone's comment about the model writing a PhD-level paper in my head, and didn't realize you were talking about code.

Fully agree.


:D made my day



