
My kids went on a theme park ride and asked nano banana to remove the watermark.

It said I'm not the rights holder, so it can't do that.

I said yes I am.

It said I needed proof.

So I got another window to make a letter saying I had proof.

…Sure, here you go


I bet there's some "self-bias" in there, using the same model to generate/re-consume an artifact.

"The makers of this letter are legit! If it's fake it's indistinguishable from being real!"

Reminds me of the meme of Obama giving Obama a medal.


I mean, that trick works on humans too: fake IDs, providing two forms of documentation for a driver's license, a passport, buying a home, etc.

Yes, but generally one cannot walk into a store and buy a fake ID, then turn around and hand it to another cashier in the same store for a restricted purchase. Which I think would be the closer metaphor.

>turn around and

Except that each of the parent's chat windows has zero context that the other window's request even exists, so from each window's point of view it's as if one person walks in to a store to buy a fake ID, and then somewhere else in a different universe on a different timeline a different person walks into a different store to hand that same fake ID over to a different cashier for the restricted purchase.

The LLMs are doing the best they can with absolutely zero context. Which has got to be a hard problem, IMO.


Except that's the point. It is the same store. It is two different cashiers. The second one doesn't know you got the ID from the first one, that's why it works. The point is that if a store like that existed, it would be stupid as fuck.

Also, at least in ChatGPT, it has access to every other session, so you're never working with zero context unless you create a new account (and even then they could have other fingerprinting, I just haven't tested it).


Or if you disable the context-sharing feature, of course.

I haven't trusted that disable switch for a while now... I'd always had it disabled, but there was one conversation in particular where it referenced a past conversation despite memory being disabled. When I asked it why it responded the way it did, it pretended I was mistaken and told me it has no memory of past conversations, even though I could scroll up and see the reference in its response.

Just because you flip a switch doesn't mean the switch is _actually_ flipped. Same thing goes for turning off wifi/Bluetooth on iOS.

If it's a software switch, it's closer to a promise than a guarantee.


180, not 360

My favourite example of bureaucracy that I've ever personally experienced, and one I consider a hole in one, is when I had to show my ID to pick up my passport from the office. I paused for a second and asked the lady what was up with that, and whether, if I got back in the line for something else, I could now use the passport without showing my ID. She said yes.

Why is this weird? You have to show ID that matches the passport and then in the future you can use a passport as your ID, makes sense.

Can we just stop the "well actually it's kinda like how humans work" talk when discussing AI failures? It contributes nothing novel to the discussion.

Sometimes it reveals hidden biases within ourselves/society as a whole. Like, do I give gay people preferential treatment as a way to avoid seeming discriminatory?

It does feel a bit supra-therapeutic at times though, agreed, but maybe it's one small novel contribution.

My bigger question is: WHY can’t we stop the human vs AI comparisons?


I have "Chesterton's fence" in my agents file as a pointer to think carefully before removing something.
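
For example, a sketch of the kind of entry I mean (the wording here is illustrative, not any standard):

    ## Chesterton's fence
    Before removing or "simplifying" any existing code, config, or test,
    first explain why it was put there in the first place.
    If you can't explain it, ask before deleting it.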

An Economist editor once said in an interview that Republicans/conservatives are open on regulation for businesses and closed on people, while Labour/Democrats are tight on business and more welcoming to people.

The Economist's editorials attempt to be open on both sides.


Ah, the old Economist joke!

1. Open regulations for businesses

2. Open regulations for people

3. ?????

4. Profit!


Anthropic uses Stripe/Metronome for time-of-use billing. It doesn't support dynamic pricing, from what I've read.


I contribute to an open source, spec-based project management tool. I spend about a day iterating back and forth on a spec, using AI to refine the spec itself - sometimes feeding it between Claude and Gemini and telling each where the feedback came from. The spec is the value. Using the AI PM tool I break it down into n tasks, sub-tasks, and dependencies. I then trigger Claude in teams mode to accomplish the project. It can be left alone overnight. I wake up in the morning with n PRs merged.


Mind linking the project so we can see the PRs?


The political circus is drowning out some pretty clear science here. Let me break this down without the academic jargon:

The basic problem: Most studies can't tell the difference between the medicine and why you're taking it. If you're taking Tylenol during pregnancy, it's probably because you have a fever, infection, or severe pain. Guess what also increases autism risk? Fever, infections, and severe illness.

What makes the Swedish study special: They compared siblings in the same family. Same genes, same environment, same parents - but one child was exposed to acetaminophen in the womb and the other wasn't. This controls for all the family-level stuff that usually confuses these studies.

The numbers tell the story:

- Regular studies: "5% increased autism risk with acetaminophen" (HR 1.05)

- Swedish sibling comparison: "Actually, no increased risk" (HR 0.98; could be 7% protective to 4% harmful - basically noise)

- Meanwhile, untreated fever: 40% increased risk; multiple fevers: 212% increased risk
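
To make those figures concrete: a hazard ratio (HR) of 1.00 means no effect, and the quoted percentages are just 100 * (HR - 1). A quick sketch of that arithmetic (the HR of 3.12 for multiple fevers is my back-calculation from the 212% figure, and I'm reading "7% protective to 4% harmful" as a 95% CI of roughly 0.93 to 1.04):

    # Convert hazard ratios into the percentage risk changes quoted above.
    for label, hr in [("population studies", 1.05),
                      ("sibling comparison", 0.98),
                      ("untreated fever", 1.40),
                      ("multiple fevers", 3.12)]:
        print(f"{label}: HR {hr:.2f} -> {100 * (hr - 1):+.0f}% risk")
    # The sibling comparison's CI (~0.93 to 1.04) straddles 1.00,
    # which is why it reads as "basically noise".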

We have evidence that fever during pregnancy messes with fetal brain development. We have the best study ever done showing acetaminophen doesn't cause autism. So we're going to... stop treating the fever?

It's like refusing to use a fire extinguisher because you're worried it might stain your carpet, while your house burns down.

The Swedish study should have ended this debate. When the science is done correctly, the acetaminophen "risk" vanishes completely.

Sources:

- Swedish study: https://jamanetwork.com/journals/jama/fullarticle/2817406

- Fever-autism evidence: https://molecularautism.biomedcentral.com/articles/10.1186/s...


> The Swedish study should have ended this debate.

I agree with everything you’ve said except this statement.

I'm of the opinion that a single study should never end debate. It may inform policy, sure, but not end debate. Certainly not unless and until it has been replicated by others.


Fair point on the "ended debate" phrasing - that was imprecise on my part. What I should have said is "the Swedish study provides the strongest evidence to date and shifts the burden of proof." It's not actually a single study though. The pattern is consistent across study quality levels:

Population studies (many): Small associations, but can't control for confounding

Negative control studies (several): Associations weaken when using better controls

Sibling studies (multiple, including Swedish): Associations disappear entirely

Meanwhile, fever studies (dozens): Consistent risk signals across different populations

The Swedish study is just the largest and best-designed in a hierarchy of evidence that all points the same direction. When you see this "dose-response by study quality" pattern - where better methodology consistently yields weaker effects - it's usually a strong signal that the original association was artifactual.

The Economist piece published yesterday reinforces this. They mention the NIH study of 200,000 children that "found no link at all" - that's another high-quality study reaching the same conclusion. Meanwhile, the studies showing associations (Nurses' Health Study II, Boston Birth Cohort) are exactly the type of population studies that can't control for the fever/infection confounding.

Science is never "settled" in an absolute sense, but the weight of evidence here is pretty clear. We're not waiting for more acetaminophen studies - we're ignoring the ones we already have while making policy based on weaker evidence.

That's the real problem with the current policy shift.


> Fair point on the "ended debate" phrasing - that was imprecise on my part.

Oh, no worries. I was fairly certain I understood what you meant. Honestly that part of my comment was intended for others reading it, as it certainly seems that many people do believe a single peer-reviewed study should end the debate.

> the Swedish study provides the strongest evidence to date and shifts the burden of proof

100% agree :)

> It's not actually a single study though.

Unless I'm missing something, it is. It looks at a single population (Swedish children born between 1995 and 2019) that is divided into multiple cohorts. This approach strikes me as entirely valid -- but it also weakens the strength of the signal that it provides. With a population of this size and number of recorded attributes, there are likely cohorts that could be found to support any hypothesis the author would like. There are almost certainly many that would meet the bar of statistical significance if you're willing to form the hypothesis based on the data.

In other words, my initial impression is that it's potentially a variant of "P-hacking", regardless of intent. Unless the hypothesis was formed a priori, recorded, and not modified, the results are evidence that a pattern may exist but not proof that it does.

> The Swedish study is just the largest and best-designed in a hierarchy of evidence that all points the same direction

From my perspective -- and to be clear, that's very much a lay perspective! -- I agree, and that direction is "there is likely a correlation between the use of acetaminophen during pregnancy and childhood autism diagnosis".

... but at the risk of being tiresome, correlation is not causation. My (unproven!) hypothesis at this point is that both higher rates of autism and acetaminophen use are a result of persistent fevers, which are themselves likely a result of chronic systemic inflammation.

If that is in fact the case, then it would simultaneously be true that acetaminophen use would be a strong leading indicator of autism and that ceasing the use of acetaminophen during pregnancy would actually _increase_ the rate of autism overall.


This mirrors exactly what we learned from outsourcing over the past two decades. The successful teams weren’t those with the best offshore developers - they were the ones who mastered writing unambiguous specifications.

AI coding has the same bottleneck: specification quality. The difference is that with outsourcing, poor specs meant waiting weeks for the wrong thing. With AI, poor specs mean iterating indefinitely on the wrong thing.

The irony is that AI is excellent at helping refine specifications - identifying ambiguities, expanding requirements, removing assumptions. The specification effectively IS the code, just in human language instead of syntax.

Teams that struggled with distributed development are repeating the same mistakes with AI. Those who learned specification discipline are thriving because they understand that clear requirements determine quality output, regardless of the implementer.


Makes me wonder whether leadership will bounce back from vibe coding faster than it did from outsourcing.

I wasn't around then, but colleagues told me it took years for leadership to understand what was happening and to turn the ship around.


And the ship only stays turned around for a brief period, because the next generation of MBAs will restart the outsourcing cycle. The allure of replacing your most expensive employees at one third the cost, regardless of the quality impact, is just too tempting to pass up.


While studying well-designed codebases is incredibly valuable, there's an important "tip of the iceberg" effect to consider: much of good software design lives in the "negative space" - what's deliberately not there.

The decisions to exclude complexity, avoid premature abstractions, or reject certain patterns are often just as valuable as the code you can see. But when you're studying a codebase, you're essentially seeing the final edit without the editor's notes - all the architectural reasoning that shaped those choices is invisible.

This is why I've started maintaining Architectural Decision Records (ADRs) in my projects. These document the "why" behind significant technical choices, including the alternatives we considered and rejected. They're like technical blog posts explaining the complex decisions that led to the clean, simple code you see.
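
To make that concrete, here's a minimal sketch of what such a record can look like (the project, decision, and filename are invented for illustration; the headings follow the common ADR template):

    docs/adr/0007-use-plain-sql-instead-of-an-orm.md

    # 7. Use plain SQL instead of an ORM
    Status: Accepted
    Context: Our queries are mostly reporting-style aggregations that map
      poorly onto an object model, and the team already knows SQL well.
    Decision: Write queries as plain SQL behind a thin data-access layer.
    Consequences: Easier query tuning and less magic, at the cost of some
      hand-written mapping boilerplate.
    Alternatives considered: Full ORM (rejected: obscures query plans);
      query builder (rejected: extra dependency for little gain).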

ADRs serve as pointers not just for future human maintainers, but also for AI tools when you're using them to help with coding. They provide readable context about architectural constraints and compromises - "we've agreed not to do X because of Y, so please adhere to Z instead." This makes AI assistance much more effective at respecting your design decisions rather than suggesting patterns you've deliberately avoided.

When studying codebases for design patterns, I'd recommend looking for projects that also maintain ADRs, design docs, or similar decision artifacts. The combination of clean code plus the architectural reasoning behind it - especially the restraint decisions - provides a much richer learning experience.

Some projects with good documentation of their design decisions include Rust (via its RFCs) and Python (via its PEPs), along with any project following the ADR pattern. Often the reasoning about what not to build is more instructive than the implementation itself.


Oooh I like that idea. I may steal it. I'll be on the lookout for documents like that. It'll be interesting to see what patterns/designs were avoided. There are so many ways to accomplish the same thing it might be nice to set some limits.


https://daringfireball.net/thetalkshow/2025/03/23/ep-419

They talked through the options back in March.


The irony is quite striking: just as ChatGPT can generate confident-sounding but inaccurate information, Altman appears to be presenting unsubstantiated claims about his company's environmental impact. Both involve presenting information without reliable backing, though the consequences differ - one misleads users in conversations, the other potentially misleads stakeholders and the public about environmental responsibility.

