Hacker News | 7777777phil's comments

> whether the final result is actually better, or whether it is just a more polished hallucination

Agents sampled from the same base model agreeing with each other isn't validation, it's correlation. Cheaper orchestration mostly amplifies whatever bias the model already has. Neat hack though.
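A hypothetical toy simulation of the point: if every "validator" samples from the same biased model, a majority vote concentrates on the shared bias instead of correcting it. All numbers here (the 30% per-agent accuracy, the binary answer space) are made up for illustration:

```python
import random

random.seed(0)

TRUE_ANSWER = 1
P_CORRECT = 0.3   # assumed shared bias: each agent is wrong 70% of the time

def agent():
    # every agent samples from the SAME biased distribution
    return TRUE_ANSWER if random.random() < P_CORRECT else 0

def majority_vote(n_agents):
    votes = [agent() for _ in range(n_agents)]
    return max(set(votes), key=votes.count)

def accuracy(n_agents, trials=2000):
    return sum(majority_vote(n_agents) == TRUE_ANSWER
               for _ in range(trials)) / trials

# More correlated agents -> MORE confident in the shared bias, not less.
for n in (1, 5, 25):
    print(n, accuracy(n))
```

Independent validators with uncorrelated errors would show the opposite trend; the correlation is what breaks the vote.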


In 2018 the Swiss voted on whether cows should be allowed to keep their horns (1). It was called the horned cow initiative.

(1) https://www.admin.ch/en/horned-cow-initiative


Switzerland? Cows? I can’t help but be reminded of https://m.youtube.com/watch?v=ySxyPMZkrwU

MFN clauses are common in retail; what's different about Amazon's is the enforcement. Manufacturer MAP policies threaten "we won't ship you more," which you can recover from (unlike a listing demotion).

> Generation became cheap, validation didn't

This is basically the whole post imo. It also maps to why productivity hasn't really moved despite 93% adoption: the oversight bandwidth eats the generation gains.

Jevons for code is right but the bottleneck is review, not typing: https://philippdubach.com/posts/93-of-developers-use-ai-codi...


KL(P||Q) penalizes Q heavily when it assigns low probability to things P considers likely, but barely cares when Q wastes probability on rare events. That's why KL regularization in RLHF pushes models toward typical, average-sounding outputs.

I think you probably meant this, but when used with RL it's usually KL(π || π_ref), which has high loss when the in-training policy π produces output that's unlikely under the reference. But yeah, as you noted, this also means there is no penalty if π _does not_ produce outputs that π_ref would, which leads to a form of mode collapse.

This collapse in variety matches what some studies have shown: the "sloppification" is not present in the base model and is only introduced during the RL phase.
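The forward/reverse asymmetry is easy to check numerically. A minimal sketch with made-up toy distributions: P keeps some mass on rare outcomes, Q concentrates on the common ones and starves the tail.

```python
import numpy as np

# Toy distributions (illustrative numbers only):
# P spreads some mass over rare outcomes; Q nearly ignores them.
P = np.array([0.5, 0.3, 0.15, 0.05])
Q = np.array([0.6, 0.38, 0.01, 0.01])

def kl(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i)"""
    return float(np.sum(p * np.log(p / q)))

# Forward KL punishes Q for starving P's tail; reverse KL barely notices.
print(f"KL(P||Q) = {kl(P, Q):.3f}")
print(f"KL(Q||P) = {kl(Q, P):.3f}")
```

In the RLHF setup the first argument plays the role of the in-training policy π: putting mass where π_ref is tiny is heavily penalized, while dropping modes of π_ref costs almost nothing, which is the mode-collapse direction described above.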


Auto-switching across model providers basically concedes that the model layer is a commodity, which I think is right (1).

tbd whether the skill registry develops network effects or just stays a flat directory. Portable skills as APIs tracks with the broader pattern of agent stacks decomposing into specialized swappable layers, where the defensible asset is whatever process knowledge orgs encode, not the deployment infra.

(1) wrote about it here from an enterprise perspective: https://philippdubach.com/posts/dont-go-monolithic-the-agent...


I agree on the commodity point; that's why I went multi-model from the start.

The registry question is the one I'm thinking about the most. Right now it's flat. I plan to integrate usage data (success rates, cost, trust scores). So the registry tells you which skills actually work well, and that's valuable.

Your article looks interesting, I'll read it.


This sentiment will probably resonate with a lot of people here. I literally won’t use a service if they try to force me onto their app.


I've already been beaten into accepting that I have to use the Ticketmaster app (shockingly awful) or the Dice app (not quite as bad, but still sucks) to get into a lot of music venues in Boston.

But at one club they wanted me to install yet another app just to check my coat. I elected to hide it under some furniture instead lol


Cool idea, how far back (in time) do those 27k commits go?

Just thinking how this could maybe be used for (automated) research / visualization of the evolution of (Spanish, in this case) law.


> how far back (in time) do those 27k commits go

Looking at the commit dates (which seem to be derived from the original publication dates), the history seems quite sparse/incomplete(?). I mean, there have only been 26 commits since 2000.


It seems the commits aren't in proper date order. Here are some newer changes, placed before the latest commits: https://github.com/EnriqueLop/legalize-es/commits/master/?af...


It's related to commits actually having a parent-child structure (forming a graph) and timestamps (commit/author) being metadata. So commits 1->2->3->4 could be modified to have timestamps 1->3->2->4. I know GitHub prefers sorting by author date over commit date, but I don't know how the topology is handled.
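To illustrate with a toy model (not real git internals): topological order follows the parent pointers, date order follows the metadata, and the two need not agree. The commit names and timestamps below mirror the 1->3->2->4 example:

```python
from dataclasses import dataclass, field

# Toy commit model: parents define the graph; dates are just metadata.
@dataclass
class Commit:
    name: str
    date: int                               # pretend unix timestamp
    parents: list = field(default_factory=list)

c1 = Commit("1", date=10)
c2 = Commit("2", date=30, parents=[c1])     # dates swapped vs. topology
c3 = Commit("3", date=20, parents=[c2])
c4 = Commit("4", date=40, parents=[c3])

def topo_order(tip):
    """Oldest-to-newest walk of the commit graph (parents before children)."""
    seen, out = set(), []
    def visit(commit):
        if commit.name in seen:
            return
        seen.add(commit.name)
        for parent in commit.parents:
            visit(parent)
        out.append(commit.name)
    visit(tip)
    return out

by_topology = topo_order(c4)
by_date = [c.name for c in sorted([c1, c2, c3, c4], key=lambda c: c.date)]
print(by_topology)   # graph order
print(by_date)       # metadata order, disagrees
```

Real git exposes the same choice via `git log --topo-order` vs `--date-order`.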


> It's related to commits actually having a parent-child structure (forming a graph) and timestamps (commit/author) being metadata.

Yeah, I think everyone is aware. It's just that the last couple dozen commits, to me, looked like commits had been created in chronological order, so that topological order == chronological order.

> I know GitHub prefers sorting with author over commit date, but don't know how topology is handled.

Commits are usually sorted topologically.


A quick reminder of this gem: This Video Has 74,991,092 Views (https://www.youtube.com/watch?v=BxV14h0kFs0)


That would mean he left, but people did not...


Amendment 34 (by Markéta Gregorová, Greens/EFA) to the Sippel Report A10-0040/2026 significantly restricts the ePrivacy derogation (the chat control extension until 2027).

It replaces Art. 3 para. 1 lit. a of Regulation (EU) 2021/1232. Processing (scanning) may only be:

- strictly necessary for technologies to detect/remove known CSAM material (hashes, no unknown content),
- proportionate,
- limited to necessary technologies and content data.

