> whether the final result is actually better, or whether it is just a more polished hallucination
Agents sampled from the same base model agreeing with each other isn't validation, it's correlation. Cheaper orchestration mostly amplifies whatever bias the model already has. Neat hack though.
MFN clauses are common in retail; what's different about Amazon's is the enforcement. A manufacturer's MAP policy threatens "we won't ship you more," which you can recover from (unlike a listing demotion).
>Generation became cheap, validation didn't
this is basically the whole post imo. It also maps to why productivity hasn't really moved despite 93% adoption: the oversight bandwidth eats the generation gains.
KL(P||Q) penalizes Q heavily when it assigns low probability to things P considers likely, but barely cares when Q wastes probability on rare events. That's why KL regularization in RLHF pushes models toward typical, average-sounding outputs.
I think you probably meant this, but as used in RL it's usually KL(π || π_ref), which is large when the in-training policy π produces output that's unlikely under the reference. But as you noted, this also means there's no penalty when π simply _fails to produce_ outputs that π_ref would, which leads to a form of mode collapse.
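The asymmetry both comments describe is easy to check numerically. A minimal sketch (toy two-outcome distributions, not an RLHF setup): P covers two modes, Q collapses onto one. Forward KL(P||Q) punishes the missing mode hard; reverse KL(Q||P), the direction used against a reference policy, barely notices it.

```python
import math

def kl(p, q):
    """KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# P spreads mass over two modes; Q has collapsed onto the first.
p = [0.5, 0.5]
q = [0.99, 0.01]

forward = kl(p, q)  # large: Q assigns ~0 to a mode that P considers likely
reverse = kl(q, p)  # small: Q dropping one of P's modes costs almost nothing

print(f"KL(P||Q) = {forward:.2f}")  # ≈ 1.61
print(f"KL(Q||P) = {reverse:.2f}")  # ≈ 0.64
```

Reading q as the trained policy π and p as π_ref, the cheap `reverse` direction is exactly why a KL(π || π_ref) penalty tolerates the policy abandoning modes of the reference.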
This collapse in variety matches some studies I've seen suggesting that "sloppification" is not present in the base model and is only introduced during the RL phase.
Auto-switching across model providers basically concedes the model layer is commodity, which I think is right (1)
tbd whether the skill registry develops network effects or just stays a flat directory. Portable skills as APIs tracks with the broader pattern of agent stacks decomposing into specialized swappable layers, where the defensible asset is whatever process knowledge orgs encode, not the deployment infra.
I agree on the commodity point, that's why I went multi-model from start.
The registry question is the one I'm thinking about the most. Right now it's flat. I plan to integrate usage data (success rates, cost, trust scores). So the registry tells you which skills actually work well, and that's valuable.
It's already been beaten into acceptance that I have to use the Ticketmaster app (shockingly awful) or Dice app (not quite as bad but still sucks) to get into a lot of music venues in Boston.
But at one club they wanted me to install yet another app just to check my coat. I elected to hide it under some furniture instead lol
Looking at the commit dates (which seem to be derived from the original publication dates), the history seems quite sparse/incomplete(?). I mean, there have only been 26 commits since 2000.
It's related to commits actually having a parent-child structure (forming a graph), with the timestamps (commit/author) being just metadata. So commits 1->2->3->4 could be modified to have timestamps 1->3->2->4. I know GitHub prefers sorting by author date over commit date, but I don't know how topology is handled.
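A quick sketch of that 1->2->3->4 example, with author dates swapped so date order disagrees with the graph. The commit ids, dates, and both helper functions here are hypothetical, just to show that the parent links alone pin down one order while the timestamp metadata can say something else:

```python
from datetime import datetime

# Hypothetical linear history 1 -> 2 -> 3 -> 4 (parent -> child),
# with author dates edited so commit 3 "predates" commit 2.
commits = {
    1: {"parent": None, "author_date": datetime(2024, 1, 1)},
    2: {"parent": 1,    "author_date": datetime(2024, 1, 3)},
    3: {"parent": 2,    "author_date": datetime(2024, 1, 2)},
    4: {"parent": 3,    "author_date": datetime(2024, 1, 4)},
}

def topo_order(commits, tip=4):
    """Walk parent links back from the tip: the graph fixes this order."""
    order, node = [], tip
    while node is not None:
        order.append(node)
        node = commits[node]["parent"]
    return list(reversed(order))

def date_order(commits):
    """Naive sort by author-date metadata, ignoring the graph."""
    return sorted(commits, key=lambda c: commits[c]["author_date"])

print(topo_order(commits))  # [1, 2, 3, 4]
print(date_order(commits))  # [1, 3, 2, 4]
```

(For real repos, `git log` exposes the same distinction via `--topo-order` vs `--date-order`/`--author-date-order`, the latter of which still never shows a child before its parent.)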
> It's related to commits actually having a parent-child structure (forming a graph) and timestamps (commit/author) being metadata.
Yeah, I think everyone is aware. It's just that the last couple dozen commits looked, to me, like they had been created in chronological order, so that topological order == chronological order.
> I know GitHub prefers sorting with author over commit date, but don't know how topology is handled.
Amendment 34 (by Markéta Gregorová, Greens/EFA) to the Sippel Report A10-0040/2026 significantly restricts the ePrivacy derogation (the chat-control extension until 2027).
It replaces Art. 3 para. 1 lit. a of Regulation (EU) 2021/1232: Processing (scanning) may only be
- strictly necessary for technologies that detect/remove known CSAM (hash matching, no unknown content),
- proportionate,
- limited to necessary technologies and content data.