More

XCSme · 2025-12-25T21:17:38 1766697458

Not sure if it's a joke, but I don't think LLM is a bijective function.

croemer · 2025-12-25T23:34:43 1766705683

If you had all the token probabilities it would be bijective. There was a post about this here some time back.

XCSme · 2025-12-26T01:15:42 1766711742

Kind of, LLMs still use randomness when selecting tokens, so the same input can lead to multiple different outputs.

XCSme · 2025-12-25T21:15:53 1766697353

I am not sure what's the stereotype, but I tried using langchain and realised most of the functionality actually adds more code to use than simply writing my own direct API LLM calls.

Overall I felt like it solves a problem doesn't exist, and I've been happily sending direct API calls for years to LLMs without issues.

teruakohatu · 2025-12-25T21:57:58 1766699878

JSON Structured Output from OpenAI was released a year after the first LangChain release.

I think structured output with schema validation mostly replaces the need for complex prompt frameworks. I do look at the LC source from time to time because they do have good prompts backed into the framework.

majormajor · 2025-12-25T22:28:19 1766701699

IME you could get reliable JSON or other easily-parsable output formats out of OpenAI's going back at least to GPT3.5 or 4 in early 2023. I think that was a bit after LangChain's release but I don't recall hitting problems that I needed to add a layer around in order to do "agent"-y things ("dispatch this to this specialized other prompt-plus-chatgpt-api-call, get back structured data, dispatch it to a different specialized prompt-plus-chatgpt-api-call") before it was a buzzword.

nostrebored · 2025-12-26T02:55:30 1766717730

Can guarantee this was not true for any complicated extraction. You could reliably get it to output json but not the json you wanted

Even on smallish ~50k datasets error was still very high and interpretation of schema was not particularly good.

avaer · 2025-12-26T06:04:07 1766729047

It's still not true for any complicated extraction. I don't think I've ever shipped a successful solution to anything serious that relied on freeform schema say-and-pray with retries.

avaer · 2025-12-26T06:01:40 1766728900

To this day many good models don't support structured outputs (say Opus 4.5) so it's not a panacea you can count on in production.

The bigger problem is that LangChain/Python is the least set up to take advantage of strong schemas even when you do have it.

Agree about pillaging for prompts though.

teruakohatu · 2025-12-26T10:00:20 1766743220

> so it's not a panacea you can count on in production.

OpenAI and Gemini models can handle ridiculously complicated and convoluted schemas, if I needed complicated JSON output I wouldn’t use anything that didn’t guarantee it.

I have pushed Gemini 2.5 Pro further than I thought possible when it comes to ridiculously over complicated (by necessity) structured output.

jmogly · 2025-12-27T00:32:00 1766795520

100% Gemini + pydantic you don’t need a wrapper library in 2025

Insanity · 2025-12-25T23:21:40 1766704900

When my company organized an LLM hackathon last year, they pushed for LangChain.. but then instead of building on top of it I ended up creating a more lightweight abstraction for our use-cases.

That was more fun than actually using it.

XCSme · 2025-12-24T23:39:08 1766619548

I am coaching table-tennis, and sometimes I tell people that we only actually "learn" while we sleep. So, without sleeping, the brain doesn't have time to "save" the new information for future use.

Not sure if it's factually correct, but it seems about right, sleeping seems to be the magic sauce, and the time when all memories are written from RAM to disk.

XCSme · 2025-12-23T23:05:55 1766531155

Only 10b active params and close to SOTA?

XCSme · 2025-12-22T20:29:32 1766435372

Funny how they didn't include Gemini 3.0 Pro in the bar chart comparison, considering that it seems to do the best in the table view.

jychang · 2025-12-22T20:44:08 1766436248

Also, funny how they included GPT-5.0 and 5.1 but not 5.2... I'm pretty sure they ran the benchmarks for 5.0, then 5.1 came out, so they ran the benchmarks for 5.1... and then 5.2 came out and they threw their hands up in the air and said "fuck it".

rynn · 2025-12-22T23:17:36 1766445456

gpt-5.2 codex isn't available in the API yet.

If you want to be picky they could've compared it against gpt-5 pro gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5.2 pro

all depending on when they ran benchmarks (unless, of course, they are simply copying OAI's marketing).

At some point it's enough to give OAI a fair shot and let OAI come out with their own PR, which they doubtlessly will.

XCSme · 2025-12-22T20:50:25 1766436625

I didn't even notice that, I assumed it was the latest GPT version.

amelius · 2025-12-22T22:35:52 1766442952

after or before running the benchmarks?

guluarte · 2025-12-22T21:49:27 1766440167

Gemini is garbage and does it's own thing most of the time ignoring the instructions

XCSme · 2025-12-21T21:34:28 1766352868

Good point. It also reminded me of when I was trying to optimize my app for some scenarios, then I realized it's better to optimize it for ALL scenarios, so it works fast and the servers can handle no matter what. To be more specific, I decided NOT to cache any common queries, but instead make sure that all queries are fast as possible.

XCSme · 2025-12-21T21:26:10 1766352370

I've recently added error tracking to my self-hosted analytics app (UXWizz), and the way I did it is simply add extra events to each user/session. Once you have the concept of a session or user, you can simply attach errors or logs as Events stored for that user. This solves the main problem mentioned in the article, where you don't know what happened, plus being an Event stored in a MySQL database, you can still query it.

Why not simply use Events for logging, instead of plain strings?

XCSme · 2025-12-19T17:13:49 1766164429

I hope they'll release a new AM4 CPU

Something like 5900x on 2nm or 4nm

XCSme · 2025-12-19T16:46:03 1766162763

Awesome!

Is it actually working well? Not really, at least not at this stage. But it's cool to see a new UX idea.

XCSme · 2025-12-19T15:08:35 1766156915

Anyone else getting ERR_TOO_MANY_REDIRECTS trying to access the post?

moebrowne · 2025-12-19T15:36:44 1766158604

https://web.archive.org/web/20251219055833/https://www.jeffg...

pstoll · 2025-12-19T15:09:10 1766156950

Yes only on that page, not the rest of his blog. Guessing he ansible’d it to redirect ;)

XCSme · 2025-12-19T16:30:20 1766161820

It's working now.