Hacker News | new | past | comments | ask | show | jobs | submit | sbinnee's comments

I have been telling my team that 1,000 lines of instructions are bound to fail, no matter how good a model's instruction-following capability is. I have been reviewing hundreds of lines of changes on a daily basis for about a month. I couldn't help but start praying.

After replacing Gemini 3 Flash Preview with DeepSeek v4 Flash as my chat model for a while, the biggest difference is the automatic reasoning effort. Gemini Flash is super fast and perfect as a chat model, but when I need to run thought experiments with a handful of constraints, it struggles a bit and I switch to Sonnet. DeepSeek v4 Flash, on the other hand, can do long, complex reasoning and often gets things right. Generating a lot of reasoning tokens takes a lot of time, of course. But I am happy to have found a cheaper model and excited to try something other than Gemini Flash, which has been so good that I was locked onto it for a while.

GitHub has become a place where you seek people's attention, but there are other places where you can freely host your projects. GitLab was always available; I just haven't logged in for I don't know how long. An open source project is essentially a shop window onto the internet for a lone developer, and Ghostty has already established a great community. It's already on display on a skyscraper. The project is mature enough that it needs a dedicated discussion forum or something like that. I am excited to see where it will find a home and how it will evolve.

In Korean, we have the adjective "푸르다". It is somewhere between blue and green: you can say trees are 푸르다, and oceans are 푸르다. It also means unripe.

Yeah, so to me, tortoise is definitely blue.

Edit: typo tortoise -> turquoise


> (academia, hating everything modern, will also hate you if you use typst)

I chuckled. I'd love to try out Typst when the time comes. But for writing a journal paper, it's still going to be LaTeX.


I've been testing it out by using it to create the quizzes for a course I'm teaching this semester. My conclusion is that it's well worth finding a way to try it out. It drastically reduces the amount of boilerplate.

(I haven't yet tried to write a full paper in it.)


The price is appealing to me. I have been using Gemini 3 Flash mainly for chat. I may give it a try.

Input/output pricing: $0.14/$0.28 (whereas Gemini is $0.5/$3)

Does anyone know why output prices have such a big gap?


Output is what the compute is used for above all else; it basically costs more hardware time than prompt processing (input), which is a lot faster.


Input tokens are processed at 10-50 times the speed of output tokens, since you can process them in batches rather than one at a time like output tokens.
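The pricing gap shows up directly in the bill. A minimal sketch of the per-request arithmetic, using the $0.14/$0.28 figures quoted upthread and assuming (as is common, though not stated here) that they are per million tokens; the function name and token counts are illustrative:

```python
# Illustrative cost model for one LLM API request.
# Prefill (input) processes the whole prompt in parallel batches;
# decode (output) generates tokens one at a time, which is why
# providers typically charge more per output token.

def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request, with prices given per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical request: an 8k-token prompt producing 1k output tokens,
# at the quoted $0.14 input / $0.28 output rates.
cost = call_cost(8_000, 1_000, 0.14, 0.28)
print(f"${cost:.4f}")  # a long reasoning trace would multiply the output term
```

Even though output is priced at twice the input rate here, a prompt-heavy request is still dominated by the input term; models that emit long reasoning traces shift the balance toward output.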


I don’t think I ever heard you say "excellent" for the pelican test. It looks excellent indeed!

The trend went to MoE models for some time, and this time around it's dense models again. I wonder if closed models also follow this trend: MoE for the faster ones and dense for the pro models.


The comment section is already long, but I knew I would find comments about the "hmm" that I had started noticing. Yes, it is so irritating to me too. One additional thing I noticed is that verbose information is being more and more obfuscated. I have run CC with the --verbose option for months, and I can see that verbose mode is not verbose anymore. I wish I could do -vvv for maximum verbosity.


I was going to try the Hermes agent after hearing that OpenClaw constantly breaks and that the Hermes buzz suggests it's the better one. It all sounds like a lot of maintenance work.


It looked cool, and I thought it might be a new community where articles belong to the site itself. But when I clicked two articles, Seoul and Singapore, both were behind paywalls. So it seems it's just an aggregation of internet articles?


