This was a couple of years ago, but I remember using ChatGPT to try and study for a certification by generating quiz questions.
Over time it would start making "C" the correct answer for every question, no matter what I tried. Eventually I was so focused on whether it was stuck in a "C" loop that I started overthinking every question and wasting time.
Fast forward to recently testing Sonnet 4.6 to see if it could effectively teach me something new: I got about 5 prompts in before I had to point out an oversight, and it gave me the classic "you're absolutely right, ignore that suggestion".
This is anecdotal of course, but at least LLMs are helping to build my skills of fact verification and citation checking!
Strangely enough, my first test with Sonnet 4.6 via the API, for a relatively simple request, was more expensive ($0.11) than my average request to Opus 4.6 (~$0.07), because it used far more tokens than I would consider necessary for the prompt.
This is an interesting trend with recent models: the smarter ones get away with far fewer thinking tokens, which partially or even fully negates the speed/price advantage of the smaller models.
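A back-of-the-envelope sketch of the effect (all prices and token counts below are made-up placeholders, not real rates for either model): a model that is cheaper per token but thinks longer can still cost more per request.

```python
# Rough sketch: per-request cost = tokens * per-token price.
# All numbers are illustrative placeholders, NOT real pricing or measured usage.

def request_cost(input_tokens, output_tokens, in_price_per_mtok, out_price_per_mtok):
    """Cost in dollars for one request, given per-million-token prices."""
    return (input_tokens * in_price_per_mtok + output_tokens * out_price_per_mtok) / 1_000_000

# Hypothetical "smaller" model: cheaper per token, but spends many more thinking tokens.
small = request_cost(input_tokens=2_000, output_tokens=6_500,
                     in_price_per_mtok=3.0, out_price_per_mtok=15.0)

# Hypothetical "larger" model: pricier per token, but answers with far less thinking.
large = request_cost(input_tokens=2_000, output_tokens=500,
                     in_price_per_mtok=15.0, out_price_per_mtok=75.0)

print(f"small model: ${small:.2f}, large model: ${large:.2f}")
# With these placeholder numbers the cheaper-per-token model ends up the more
# expensive request, which is the effect described above.
```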
> Tech people are always talking about dinner reservations . . . We're worried about the price of lunch, meanwhile tech people are building things that tell you the price of lunch. This is why real problems don't get solved.
i think that's conflating two things (am not an expert). opencode exploited unauthorized use/api access, but obviously anything using the claude code sdk is kosher, because it's literally anthropic's blessed way to do this
I found this skybrary article on cockpit automation really interesting, since the detail in aviation literature is so thorough and small topics like this get considered carefully.