Tade0's comments

> Interesting, so someone submitting a paper for review could also submit one with hidden instructions for LLMs to summarise or review it in a very positive light.

I may or may not know a guy who added several hidden sentences in Finnish to his CV that might have helped him in landing an interview.


>several hidden sentences in Finnish

Is this a reference to something?


My understanding is that something along those lines happened:

> All Policy A (no LLMs) reviews that were detected to be LLM generated were removed from the system. If more than half of the reviews submitted by a Policy A reviewer were detected to be LLM generated, then all of their reviews were deleted, and the reviewer themselves was removed from the reviewer pool.

Half is a bit lenient in my view, but I suppose they wanted to avoid even a single false positive.


Here's a chart that presents some context:

https://www.reddit.com/r/texas/comments/1grxqur/the_austin_t...

This appears to be a correction to an unsustainable market.


I may or may not know a guy who bought a laptop identical to his work machine from place A to do projects for B while still physically being at A.

That trick only works until place B demands RTO as well!

(And this might be why the CEOs all seem to have coordinated changing back to RTO at the same time).


Smart advice!

Also not all "WFH" is from home.

I for one am renting a desk at an office. I have all the usual office amenities and an environment in which I can focus properly, but I don't have to involve myself geographically with the company I work for.


> I've seen Claude go and lazily fix a test by loosening invariants.

He does pull a sneaky on you from time to time, even nowadays, in v4.6, doesn't he?

To me it's analogous to the current situation at the Strait of Hormuz: it's an enormous crisis, but since almost everyone has a buffer of oil stockpiles, we can pretend it's not there.


A huge part of such stuff is deliberately hidden to avoid getting the government too involved in day to day lives.

Case in point: for a while we had an arrangement with our neighbours that we would pick up their child from preschool and stay with her until her parents got home, and in exchange they would prepare dinner for us.

No money exchanged hands, so no GDP generated, yet everyone's quality of life improved.


I guess a lot of the 'free market' stuff is also about avoiding too much government involvement. It tends to be a pain in the neck when you have to file tax returns and apply for permits.

> I'm really struggling to understand what we've grown into

A population of eight billion, at least three billion of whom live in industrialized regions.

My currently 99yo grandpa was born when there were approximately two billion people. He spent a huge chunk of his childhood running around barefoot.

Whenever he talks about that time I can't help but think this world doesn't exist any more and hasn't for a long time now.


Running barefoot can feel nice, but in Africa wearing shoes could eliminate a significant source of tropical diseases. https://www.theguardian.com/global-development/2022/nov/17/h...

Daily Claude user via Cursor.

What works:

-Just pasting the error and asking what's going on here.

-"How do I X in Y considering Z?"

-Single-use scripts.

-Tab (most of the time), although that doesn't seem to be Claude.

What doesn't:

-Asking it to actually code. It's not going to do the whole thing, and even if it does, it will take shortcuts, occasionally removing legitimate parts of the application.

-Tests. Obvious cases it can handle, but once you reach a certain threshold of coverage, it starts producing nonsense.

Overall, it's amazing at pattern matching, but doesn't actually understand what it's doing. I had a coworker like this - same vibe.


Opus 4.5 max (1m tokens) and above were the tipping point for me, before that, I agree with 100% of what you said.

But even with Opus 4.6 max / GPT 5.4 high it takes time: you need to provide the right context, add skills / subagents, include tribal knowledge, and have a clear workflow, just like when you onboard a new developer. But once you get there, you can definitely get it to do larger and larger tasks, and you definitely get (at least the illusion) that it "understands" what it's doing.

It's not perfect, but it can definitely code entire features that pass rigorous code review (by more than one human, plus security scanners and several AI code reviewers that review every single line and ensure the author also understands what they wrote).


No we don't.

For one, I never saw a "full spec" (if such a thing even exists) back in my days of making 8k. Annually.

