Why not rev the numbers? "3.5" vs. "3.5 New" feels weird -- is there a particula...

abeppu · 2024-10-22T15:51:15 1729612275

The confusing choice they seem to have made is that "Claude 3.5 Sonnet" is a name, rather than 3.5 being a version. In their view, the model "version" is now `claude-3-5-sonnet-20241022` (and was previously `claude-3-5-sonnet-20240620`).

https://docs.anthropic.com/en/docs/about-claude/models

dragonwriter · 2024-10-22T16:33:48 1729614828

OpenAI does exactly the same thing, by the way; the named models also have dated versions. For instance, there current models include (only listing versions with more than one dated version for the same "name" version):

  gpt-4o-2024-08-06 
  gpt-4o-2024-05-13
  gpt-4-0125-preview
  gpt-4-1106-preview
  gpt-4-0613
  gpt-4-0314
  gpt-3.5-turbo-0125
  gpt-3.5-turbo-1106

coder543 · 2024-10-22T17:01:51 1729616511

On the one hand, if OpenAI makes a bad choice, it’s still a bad choice to copy it.

On the other hand, OpenAI has moved to a naming convention where they seem to use a name for the model: “GPT-4”, “GPT-4 Turbo”, “GPT-4o”, “GPT-4o mini”. Separately, they use date strings to represent the specific release of that named model. Whereas Anthropic had a name: “Claude Sonnet”, and what appeared to be an incrementing version number: “3”, then “3.5”, which set the expectation that this is how they were going to represent the specific versions.

Now, Anthropic is jamming two version strings on the same product, and I consider that a bad choice. It doesn’t mean I think OpenAI’s approach is great either, but I think there are nuances that say they’re not doing exactly the same thing. I think they’re both confusing, but Anthropic had a better naming scheme, and now it is worse for no reason.

dragonwriter · 2024-10-22T17:16:17 1729617377

> Now, Anthropic is jamming two version strings on the same product, and I consider that a bad choice. It doesn’t mean I think OpenAI’s approach is great either, but I think there are nuances that say they’re not doing exactly the same thing

Anthropic has always had dated versions as well as the other components, and they are, in fact, doing exactly the same thing, except that OpenAI has a base model in each generation with no suffix before the date specifier (what I call the "Model Class" on the table below), and OpenAI is inconsistent in their date formats, see:

  Major Family  Generation    Model Class Date
  claude        3.5           sonnet      20041022
  claude        3.0           opus        20240229
  gpt           4             o           2024-08-06
  gpt           4             o-mini      2024-07-18
  gpt           4             -           0613
  gpt           3.5           turbo       0125

coder543 · 2024-10-22T17:19:19 1729617559

But did they ever have more than one release of Claude 3 Sonnet? Or any other model prior to today?

As far as I can tell, the answer is “no”. If true, then the fact that they previously had date strings would be a purely academic footnote to what I was saying, not actually relevant or meaningful.

nisten · 2024-10-22T15:22:40 1729610560

For a company selling intelligence, that's a pretty stupid way of labelling a new product.

riffraff · 2024-10-22T15:24:23 1729610663

"computer use" is also as bad a marketing choice as possible for something that actually seems pretty cool.

pglevy · 2024-10-22T16:27:39 1729614459

I had no idea what the headline meant before reading the article. I wasn't even sure how to pronounce "use." (Maybe a typo?) I think something like "Claude adds Keyboard & Mouse Control" would be clearer.

barrell · 2024-10-22T18:11:29 1729620689

I read the headline 5-10 times trying to make sense of it before even clicking on the link.

Native English speaker, just used the other “use” many times

accrual · 2024-10-22T16:11:18 1729613478

I'm not sure what a better term is. It's kind of understated to me. An AI that can "use a computer" is a simple straightforward sentence but with wild implications.

ok_dad · 2024-10-22T16:05:42 1729613142

It’s simple and easy to understand what it is, that’s good marketing to my ears.

swyx · 2024-10-22T16:22:09 1729614129

it makes sense in contrast to "tool use". basically, either fly-by-vision or fly-by-instruments, same dilemma you have in self driving cars

dartos · 2024-10-22T15:39:37 1729611577

It worked for Nintendo.

The 3ds and “new 3ds” were both big sellers.

Zambyte · 2024-10-22T15:54:30 1729612470

3ds doesn't have a version number to bump. Claude 3.5 does.

cooper_ganglia · 2024-10-22T16:03:01 1729612981

I hear the Nintendo 4DS was very popular with the higher dimensional beings!

dartos · 2024-10-22T17:25:15 1729617915

The 3 was the version number ;)

Ds and ds lite were version 1

Dsi was 2 (as there was dsi software that didn’t run on ds or ds lite)

And the 3ds was version 3.

Zambyte · 2024-10-31T20:40:56 1730407256

https://www.perplexity.ai/search/what-is-the-meaning-of-the-...

kurisufag · 2024-10-22T17:42:06 1729618926

there /was/ a 2DS, though, and it came after the 3DS.

r00fus · 2024-10-22T16:18:26 1729613906

You can always add a version number (e.g. 3DS2) or a changed moniker (3DS+).

dragonwriter · 2024-10-22T16:37:51 1729615071

Every major AI vendor seems to do it with hosted models; within "named" major versions of hosted models, there are also "dated" minor versions. OpenAI does it. Google does it (although for Google Gemini models, the dated instead of numbered minor versions seem to be only for experimental versions like gemini-1.5-pro-exp-0827, stabled minor versions get additional numbers like gemini-1.5-pro-002.)

quantadev · 2024-10-22T17:08:16 1729616896

Speaking of "intelligence", isn't it ironic how everyone's only two words they use to describe AI is "crazy" and "insane". Every other post on Twitter is like: This new feature is insane! This new model is crazy! People have gotten addicted to those words almost as badly as their other new addiction: the word "banger".

fragmede · 2024-10-22T18:01:38 1729620098

Well yeah. This new model is mentally unwell! and This model is a total sociopath! didn't test as well in focus groups.

HarHarVeryFunny · 2024-10-22T16:30:02 1729614602

Well, by calling it 3.5, they are telling you that this is NOT the next-gen 4.0 that they presumably have in the works, and also not downplaying it by just calling it 3.6 (and anyways they are not advancing versions by 0.1 increments - it seems 3.5 was just meant to convey "half way from 3.0 to 4.0"). Maybe the architecture is unchanged, and this just reflects more pre and/or post-training?

Also, they still haven't released 3.5 Opus yet, but perhaps 3.5 Haiku is a distillation of that, indicating that it is close.

From a competitive POV, it makes sense that they respond to OpenAI's 4o and o1 without bumping the version to Claude 4.0, which presumably is what they will call their competitor to GPT-5, and probably not release until GPT-5 is out.

I'm a fan of Anthropic, and not of OpenAI, and I like the versioning and competitive comparisons. Sonnet 3.5 still best coder, better than o1, has to hurt, and a highly performant cheap Haiku 3.5 will hit OpenAI in the wallet.

therealmarv · 2024-10-22T15:28:47 1729610927

exactly my thought too, go up with the version number! Some negative examples: Claude Sonnet 3.5 for Workstations, Claude Sonnet 3.5 XP, Claude Sonnet 3.5 Max Pro, Claude Sonnet 3.5 Elite, Claude Sonnet 3.5 Ultra

xnx · 2024-10-22T15:47:47 1729612067

Claude Sonnet 3.5 360, Claude Sonnet 3.5 One

scrlk · 2024-10-22T15:38:58 1729611538

2007: "Choose a Vista" - https://www.youtube.com/watch?v=5-feCRQBkSs

2024: "Choose a Claude"?

r00fus · 2024-10-22T16:20:34 1729614034

Super Claude Sonnet 3.5 Champion Edition, Alpha 3

oezi · 2024-10-22T15:55:08 1729612508

Let's just say that the LLM companies still are learning how to do versioning in a customer friendly way.

GaggiX · 2024-10-22T15:38:39 1729611519

Similar to OpenAI when they update their current models they just update the date, for example this new Claude 3.5 Sonnet is "claude-3-5-sonnet-20241022".

afro88 · 2024-10-22T18:18:17 1729621097

Just guessing here, but I think the name "sonnet" is the architecture, the number is the training structure / method, and the model date (not shown) is the data? So presumably with just better data they improved things significantly? Again, just a guess.

KaoruAoiShiho · 2024-10-22T15:44:32 1729611872

My guess is they didn't actually change the model, that's what the version number no change is conveying. They did some engineering around it to make it respond better, perhaps more resources or different prompts. Same cutoff date too.

m3kw9 · 2024-10-22T15:32:08 1729611128

Maybe they notice 3.5 Sonnet has become a brand and pivot it away from a version

sureIy · 2024-10-22T15:36:17 1729611377

Is it OS X all over again?

pella · 2024-10-22T15:47:37 1729612057

claude-3-5-sonnet-20241022

moffkalast · 2024-10-22T16:15:29 1729613729

claude-3-5-sonnet-20241022-final-final-2

bloedsinnig · 2024-10-22T15:37:15 1729611435

Because its a finetune of 3.5 optimized for the use case of computer use.

Its actually accurate and its not a 3.6.

therealmarv · 2024-10-22T15:38:05 1729611485

So 3.5.1 ?

dotancohen · 2024-10-22T16:10:07 1729613407

I think that was the last version number for KDE 3.

Stands out for me as I once replaced a 2.3 Turbo in a TurboCoupe with a 351 Windsor ))

r00fus · 2024-10-22T16:21:04 1729614064

For networks

usaar333 · 2024-10-22T15:41:58 1729611718

I don't think that's correct. This looks like a new model. Significant jump in math and gpqa scores.

diggan · 2024-10-22T15:45:45 1729611945

If the architecture is the same, and the training scripts/data is the same, but the training yielded slightly different weights (but still same model architecture), is it a new model or just a iteration on the same model?

What if it isn't even a re-training from scratch but a fine-tune of an existing model/weights release, is it a new version then? Would be more like a iteration, or even a fork I suppose.

cooper_ganglia · 2024-10-22T16:04:25 1729613065

Yes, it's a new model, but not a Claude 4.

It's the same, but a bit different; Claude 3.6 makes sense to me.

bloedsinnig · 2024-10-23T08:06:31 1729670791

I would assume that 3.5 means that the base training (which takes weeks/month) wasn't changed but only finetuning happened.

HarHarVeryFunny · 2024-10-22T17:39:27 1729618767

Could be just additional post-training (aka finetuning) for coding/etc.