Hacker News | ceroxylon's comments

I once saw "now that I've slept on it" in Gemini's CoT... baffling.

This was a couple of years ago, but I remember using ChatGPT to try and study for a certification by generating quiz questions.

It would always start to make every correct answer option "C" over time, no matter what I tried. Eventually I was so focused on whether or not it was stuck in a "C" loop that I started overthinking all of the questions and wasting time.

Flash forward to recently, when I tested Sonnet 4.6 to see if it could effectively teach me something new: I got about 5 prompts in before I had to point out an oversight, and it gave me the classic "you're absolutely right, ignore that suggestion".

This is anecdotal of course, but at least LLMs are helping to build my skills of fact verification and citation checking!


It is not working on Firefox 147.0.4 either.

Strangely enough, my first test with Sonnet 4.6 via the API for a relatively simple request was more expensive ($0.11) than my average request to Opus 4.6 (~$0.07), because it used way more tokens than what I would consider necessary for the prompt.

This is an interesting trend with recent models. The smarter ones get away with far fewer thinking tokens, which partially or even fully negates the speed/price advantage of the smaller models.
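To make that trade-off concrete, here is a rough sketch of the arithmetic. The per-million-token prices and token counts are made-up placeholders, not the actual rates or usage of any model, and it only counts output/thinking tokens for simplicity.

```python
def request_cost(output_tokens: int, price_per_million: float) -> float:
    """Dollar cost of one request, counting only output/thinking tokens."""
    return output_tokens * price_per_million / 1_000_000


def break_even_token_ratio(price_small: float, price_large: float) -> float:
    """How many times more tokens the cheaper model can use before it costs more."""
    return price_large / price_small


# Hypothetical prices in dollars per million output tokens (placeholders only).
PRICE_SMALL = 3.0
PRICE_LARGE = 5.0

# Hypothetical token counts: the smaller model "thinks" twice as long.
small_cost = request_cost(20_000, PRICE_SMALL)
large_cost = request_cost(10_000, PRICE_LARGE)
ratio = break_even_token_ratio(PRICE_SMALL, PRICE_LARGE)

print(f"small: ${small_cost:.2f}, large: ${large_cost:.2f}, break-even at {ratio:.2f}x tokens")
```

With these placeholder numbers the cheaper model stops being cheaper once it uses more than about 1.67x the tokens, so a 2x token blow-up is enough to flip the comparison.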

Just like humans :-)

E.g., a smart person will automate a task instead of executing the task repeatedly.


Reminds me of Dan Harumi

> Tech people are always talking about dinner reservations . . . We're worried about the price of lunch, meanwhile tech people are building things that tell you the price of lunch. This is why real problems don't get solved.


Good estimate; the official website for the lamp says 580 W.


Didn't Thariq make it clear three weeks ago when they shut down 3rd party tool access and the OpenCode users were upset?

> Third-party harnesses using Claude subscriptions create problems for users and are prohibited by our Terms of Service.

https://xcancel.com/trq212/status/2009689809875591565


i think that's conflating two things (am not an expert). opencode exploited unauthorized use/api access, but obviously whatever is using the claude code sdk is kosher because it's literally anthropic's blessed way to do this

thariq did a good intro here https://www.youtube.com/watch?v=TqC1qOfiVcQ


What are you building with the code you are generating?


Adblockers lit up when accessing it so I believe there is something going on.


I found this SKYbrary article on cockpit automation really interesting, since aviation literature is so thorough in its detail and even small topics like this get considered carefully.

