The impression seems to be warranted: Grok 3 has jumped straight to the top of every leaderboard category in Chatbot Arena: https://lmarena.ai/?leaderboard
In math it effectively shares the top spot with o1, just a few points apart (well within the margin of error). In creative writing it is basically ex aequo with the latest ChatGPT-4o, and in coding it's significantly ahead of everyone else, representing a new SOTA.
lmarena/lmsys is beyond useless, judging by how its prior rankings of models compare against formal benchmarks, or against tests for accuracy and correctness on batches of real-world data. It's a bit like polling Fox News viewers to discern the opinions of every American; the voting audience is consistently found wanting. That's not even getting into how easily a bad actor with means and motivation (in this "hypothetical" instance, wanting to show that a certain model is capable of running the entire US government) can manipulate votes, which has been brought up in the past. (Yes, I'm aware of the lmsys publication on how they defend against attacks using Cloudflare and reCAPTCHA; there are ways around that.)
So you're saying that either A: users interacting with models can't objectively rate which responses seem better to humans, B: xAI as a newcomer has somehow managed to game the leaderboard better than all those other companies, or C: all those other companies are not doing it. By those standards every test ever devised for anything is beyond useless. But simply not having the model creator run the evaluation already goes a long way.
No, I'm saying that some companies are doing it (OpenAI at the very least), that the company in question has motive and capability to game the system (kudos to them for pushing the boundaries there), AND that the user base's rankings have historically been statistically misaligned with data from evals (flawed as those are), especially when it comes to testing for accuracy and precision on real-world data (outside of a model's known or presumed training set). Take a look at how well Qwen or DeepSeek actually performed versus the counterparts that were out at the same time, compared with their corresponding rankings.
In the nicest way possible, I'm saying this form of preference testing is ultimately useless, primarily due to a base of dilettantes with more free time than knowledge parading around as subject matter experts, and secondarily due to presumed malfeasance. The latter is becoming apparent to more of the masses (those who don't blindly believe any leaderboard they see) now that access to the model itself is more widespread and people are seeing that the performance doesn't match the "revolution" promised [0]. If you're still confused about why selecting a model based on a glorified Hot or Not application is flawed, perhaps ask yourself why other evals exist in the first place (hint: some tests are harder than others).
At work, we developed our own suite of benchmarks. Every company with a serious investment in AI-powered platforms needs to do the same. Comparing our results to the Arena turns up some pleasant surprises, like DBRX punching way above its weight for some reason.
You say no, but then go on to explain why you believe a combination of options A and B. That's fine, I guess; I just don't consider it particularly likely given the currently available information.
Considering that the OpenAI subscription is $200 per month, and the "Premium Plus" subscription that includes this thing is only $40 per month, does that mean the instantaneous "Elon factor" is now at least -$160 per month per user, or is it supposed to add up to more than -$240 per month?
How would the math change after factoring in that OpenAI isn't even covering the entirety of its opex with the subscription anyway, and/or that people find associating their money with their Twitter accounts weird, and/or that this thing is supposedly running on a bigger cluster than OpenAI's?
No... sigmoid10 was comparing with o1 (not o1-pro), which is accessible for $20/mo, not $200/mo. So the "Elon factor" in your math is +$20/user/month (2x) for barely any difference in performance (a hard sell), not -$160/user/month. And while we have no clear answer on whether either of them is making a profit at that price, it would be surprising if OpenAI Plus users were not profitable, given the reasonable rate limits OpenAI imposes on o1 access, and the fact that most Plus users probably aren't maxing out their rate limits anyway. o1-pro requires vastly more compute than o1 for each query, and OpenAI was providing effectively unlimited access to o1-pro to Pro users, with users who want tons of queries gravitating to that subscription. The combination of those factors is certainly why Sam Altman claimed they weren't making money on Pro users.
lmarena has also become less and less useful over time for comparing frontier models as all frontier models are able to saturate the performance needed for the kind of casual questions typically asked there. For the harder questions, o1 (not even o1-pro) still appears to be tied for 1st place with several other models... which is yet another indication of just how saturated that benchmark is.
The comment I was replying to had replied to an lmarena benchmark link. Perhaps you think that person should have replied to someone else? And, if you want to finish the quote, Karpathy's opinion on this is subjective. He admits it isn't a "real" evaluation.
"[...] though of course we need actual, real evaluations to look at."
His own tests are better than nothing, but hardly definitive.
Yes, numpad0 did... but I was pointing out that this choice was illogical. The lmarena results they were replying to only supported a comparison against o1, since o1 effectively matches Grok 3 on the benchmark being replied to (with o1-pro nowhere to be found), and then they immediately leapt into a bunch of weird value-proposition math. As I said, perhaps you think they should have replied to someone else? Replying to an lmarena benchmark indicates that numpad0 was using that benchmark as part of the justification of their math. I also pointed out the limitations of lmarena as a benchmark for frontier models.
I don't think anyone is arguing that ChatGPT Pro is a good value unless you absolutely need to bypass the rate limits all the time, and I cannot find a single indication that Premium+ has unlimited access to Grok 3. If Premium+ doesn't have unlimited rate limits, then it's definitely not comparable to ChatGPT Pro, and other than one subjective comment by Karpathy, we have no benchmarks that indicate that Grok 3 might be as good as o1-pro. You already get 99% of the value with just ChatGPT Plus compared to ChatGPT Pro for half the price of Premium+.
numpad0 was effectively making a strawman argument by ignoring ChatGPT Plus here... it is very easy for anyone to beat up a strawman, so I am here to point out a bad argument when I see one.
You're the one who came in and told him about the "factor in your math". Like you said, it's his comparison, not yours. If you want to do your own comparison, feel free. But don't come in and tell him he's not allowed to do his comparison. I for one like his comparison.
Given how fast-moving the field is, it's very difficult to confidently state how much inference costs. Perhaps he's under-charging, perhaps OpenAI is over-charging, one may be more optimised than the other, but new models come out and change everything in less time than it normally takes for actual costs to become public knowledge.
What do we do to assess the intelligence of these models after they are smarter than any human? From the kinds of questions they're answering, it seems like they are almost there.
Do we have a way to tell if one model is smarter than another at that point?
Nah, at the end of the day "things that are easy for humans are [still] hard for computers, and vice versa". Deep Blue was super-human at chess and couldn't play tic-tac-toe. Today's AI is (almost?) super-human at math yet only very recently learned to play tic-tac-toe, and still can't learn to do anything - because it can't learn, and has no innate drives to expose itself to learning situations even if it could.
Here's a real world intelligence test. Take on each AI as a remote intern/new-hire, and try to train it to become a useful team member (solving math puzzles or manufacturing paperclips does not count).
What you probably mean is puzzle-solving intelligence. Humor is a form of intelligence. It's just not only about intelligence - it's also about values and context, for instance. But all of this reflects a form of intelligence. Nevertheless, intelligence shouldn't be ranked, at least not in the way we are used to talking about it.
Naive question from a bystander, but since DeepSeek is open source and is on par with o1-pro (is it?), shouldn't we expect that anybody with the compute power is capable of competing with o1-pro?
> DeepSeek is open source and is on par with o1-pro (is it?)
There is no being "on par" in this space. Model providers are still mostly optimising for a handful of benchmarks / goals. We can already see that Grok 3 is doing incredibly well on human preference (LM Arena), but with Style Control it's suddenly behind ChatGPT-4o-latest, and Gemini 2.0 is out of the picture. So even within a single domain, goal, or benchmark, it's not as straightforward as saying that one model is "on par" with another.
> shouldn't we expect that anybody with the compute power is capable of competing with o1-pro?
Not necessarily. I know it may be tempting to think that Grok 3 is entirely a result of xAI having lots of "compute power", but you have to recognise that this mindset comes from a place of ignorance, not wisdom. Moreover, it doesn't even pass as a "cynical" view, because it's common knowledge that model training is really, really complicated. The DeepSeek results are noteworthy, and really influential in some respects, but they haven't magically "solved" training, or made training necessarily easier / less expensive for the interested parties. They never shared the low-level performance improvements, just model weights and lots of insight. For talented researchers, this is valuable, of course, but it's not as if "anybody" could easily benefit from it in their training regimes.
Update: RFT (contra SFT) is becoming really popular with service providers, and it hasn't been "standardised" beyond whatever reproductions have emerged in the weeks prior. Moreover, R1's cost is still pretty high[1] at something like $7/Mtok, and bandwidth is really not great. Consider something like Google Vertex AI's batch pricing for the Gemini 1.5 Pro and Gemini 2.0 Flash models, which is at a 50% discount, and their prompt caching, which is at a 75% discount. R1 has still got a way to go.
o1-pro is "o1 on steroids" and was the first selling point of the $200/month Pro subscription but they later also added "Deep Research" and Operator to the Pro subscription.
Chat GPT series X, not compatible with Chat GPT X.
Frankly, whoever decided on this last-gen naming at MS needs to come forward. I would love to know what crazy, unacceptable collection of circumstances allowed that to happen.
For me, it was the consistency that sold it. o1 does really great at several programming problems, but o1-pro does great on those problems 4 out of 4 tries. I get a good answer more often with o1-pro than with just o1, or even o3-mini-high.
o1-pro is indeed pretty great, but I find that I can iterate several times with Gemini 2.0 Pro Experimental (or whatever their latest reasoning model is called these days) between o1-pro's responses. It's almost too slow for interactive use cases.
It probably depends on the benchmark you choose; according to Chatbot Arena, Deepseek-R1 ranks similarly to o1-2024-12-17; and Grok3 is just 3% above these models in "Arena Score" points.
I have a used workstation I got for $2k (with 768GB of RAM) - using the Q4 model, I can get about 1.5 tokens/sec and use very large contexts. It's pretty awesome to be able to run it at home.
They said 1.5 tokens/second. 1M tokens is ~667k seconds, i.e. ~185 hours per million tokens. 1 kW * 185 hr * $0.45/kWh = ~$83 per million tokens. Again, assuming 1 kW, which may be high (or low). The cost of the physical computation is the electricity cost.
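Rough sanity check of that arithmetic in Python (the 1 kW wall draw and $0.45/kWh rate are assumptions, not measurements):

    # Back-of-envelope electricity cost of local inference.
    # Assumptions: 1.5 tok/s, ~1 kW at the wall, $0.45/kWh.
    tokens = 1_000_000
    rate_tok_per_s = 1.5
    power_kw = 1.0
    price_per_kwh = 0.45

    hours = tokens / rate_tok_per_s / 3600    # ~185 hours
    cost = hours * power_kw * price_per_kwh   # ~$83 per million tokens
    print(f"{hours:.0f} h, ${cost:.0f}/Mtok")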
But you can run and experiment with any model of your liking. And your data does not leave your desktop environment. You can build services. I don't think anybody doing this is doing it to save $20 a month.
It's an HP Z8 G4 (dual-socket 18-core, 3 GHz Xeons, 24x32GB of DDR4-2666, and then a crappy GPU, 8TB HDD, 1TB SSD). It can accommodate 3 dual-slot GPUs, but I was mostly interested in playing with frontier models where holding all the weights in VRAM requires a ~$500k machine. It can run the full Deepseek R1, Llama3-405B, etc, usually around 1-2 tokens/sec.
You perhaps forgot to mention that for their AMX optimizations to even be feasible, you'd need to spend ~$10k for a single CPU, let alone the whole system, which is probably ~$100k.
Granite Rapids-W (Workstation) is coming out soon for likely much less than half that per CPU. (Xeon W-3500/2500 launched at $609 to $5889 per CPU less than a year ago and also has AMX).
Point being? Workstations that are fresh on the market and have performance comparable to their server counterparts still easily cost anywhere between $20k and $40k. At least that's according to Dell workstation pricing last time I looked.
I acknowledged in my last comment that the cost doesn't have to be $100k, but that it would still be very high if you opted for the workstation design. You're gonna need to add one more CPU to your design, another 8 memory channels, a beefier PSU, and a new motherboard that can accommodate all this. So, $8k (memory) + $10k (CPUs) + the rest. As I said, not less than $20k.
Why does it have to be a dual CPU design? 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth. Not amazing, but OK. Granite Rapids-W will likely be something like 50% better (cores and bandwidth).
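That ~300 GB/s is just the standard theoretical-peak arithmetic; real sustained bandwidth will be lower, as the reply below notes:

    # Theoretical peak memory bandwidth = channels * transfer rate * bus width.
    channels = 8
    transfers_per_s = 4.8e9   # DDR5-4800
    bytes_per_transfer = 8    # 64-bit channel

    peak_gb_s = channels * transfers_per_s * bytes_per_transfer / 1e9
    print(f"{peak_gb_s:.1f} GB/s")  # 307.2 GB/s theoretical peak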
And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like an Nvidia 4090/5090. That way the large part of the model sits in the larger, slower memory, and the active part in the GPU with the faster memory. Very cost effective and fast. (Something like generating 16 tokens/s of the 671B DeepSeek R1 with a total hardware cost of $10-$20k.) They tried both single and dual CPU, with the latter about 30% faster... not necessarily worth it.
> 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth.
That's the theory. In practice, Sapphire Rapids needs 24-28 cores to hit the 200 GB/s mark and it doesn't go much further than that. Intel CPU design generally has a hard time saturating the memory bandwidth so it remains to be seen if they managed to fix this but I wouldn't hold my breath. 200 GB/s is not much. My dual-socket Skylake system hits ~140 GB/s and it's quite slow for larger LLMs.
> Why does it have to be a dual CPU design?
Because memory bandwidth is one of the most important limiting (compute) factors for larger models inference. With dual-socket design you're essentially doubling the available bandwidth.
> And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like an Nvidia 4090/5090.
A dual-socket CPU setup that costs $10k, in a server that probably costs a couple of factors more. Now, you claimed that it doesn't have to be that expensive, but I beg to differ - you still need $20k-$30k worth of equipment to run it. That's a lot, and not quite "cost effective".
The proof of the pudding is in the eating. Read the link above. It's one or two mid-range[1] Sapphire Rapids CPUs and a 4090. Dual CPU is faster (partially because 32->64 cores, not just bandwidth) but also hits data locality issues, limiting the increase to about 30%.
(Dual Socket Skylake? Do you mean Cascade Lake?)
If you price it out, it's basically the most cost effective set-up with reasonable speed for large (more than 300 GB) models. Dual socket basically doubles the motherboard[2] and CPU cost, so maybe another $3k-$6k for a 30% uplift.
Yes, dual socket Skylake. What's strange about that?
Please price it out for us, because I still don't see what's cost effective in a system that costs well over $10k and runs at 8 tok/s vs the dual Zen 4 system for $6k running at the same tok/s.
I am not sure what your point is? There are some nice dual-socket EPYC examples floating around as well that claim 6-8 tokens/s. (I think some of those are actually distilled versions with very small context sizes... I don't see any as thoroughly documented/benchmarked as the above.) This is a dual-socket Sapphire Rapids example with similarly sized CPUs and a consumer graphics card that gives about 16 tokens/second. The Sapphire Rapids CPU and MB are a bit more expensive, and a 4090 was $1500 until recently. So for a few thousand more you can double the speed. Also, the prompt processing speed is waaaaay faster as well. (Something like 10x faster than the EPYC versions.)
In any case, these are all vastly cheaper approaches than trying to get enough H100s to fit the full R1 model in VRAM! A single H100 80 GB is more than $20k, and you would need many of them + server just to run R1.
I don't dispute their idea, which is sound, but I dispute that the cost needed to achieve the claimed performance is "a few thousand more", as you stubbornly continue to claim.
The math is clear: single-socket ktransformers performance is 8.73 tok/s, and it costs ~$12k to build such a rig. You get the same performance from a $6k dual-EPYC system. And it is a full-blown version of R1, not a distilled one as you suggest.
Your claim about 16 tok/s is also misleading. It's a figure for 6 experts, while we are comparing R1 with 8 experts against llama with 8 experts. Per the ktransformers benchmarks, 8 experts on the dual-socket system run at 12.2-13.4 tok/s, not 16 tok/s.
So, ktransformers achieves roughly 50% more in the dual-socket configuration, and roughly 50% more than the dual-EPYC system. That is not double, as you say. And finally, such a dual-socket system costs ~$20k, so it isn't the "most cost effective" solution, since it's over 3x more expensive for 50% better output.
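To make the comparison concrete, here's the dollars-per-(tok/s) arithmetic using only the figures quoted in this thread (all of them claims, none independently verified):

    # Dollars per unit of decode throughput, from the numbers quoted above.
    rigs = {
        "dual-EPYC (CPU only)":         (6_000,  8.0),
        "single-socket ktransformers":  (12_000, 8.73),
        "dual-socket ktransformers":    (20_000, 13.0),
    }
    for name, (cost_usd, tok_s) in rigs.items():
        print(f"{name}: ${cost_usd / tok_s:,.0f} per tok/s")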
And tbh, llama.cpp is not that well optimized for pure-CPU inference workloads. It has this strange "compute graph" framework whose purpose I don't understand; it appears completely unnecessary to me. I also profiled a couple of small-, mid-, and large-sized models, and the interesting thing was that the majority of them turned out to be bottlenecked by CPU compute on a system with 44 physical cores and 192 GB of RAM. I think it could do a much better job there.
Cheapest 32 core latest EPYC (9335) x 2 = $3,079.00 x 2
Intel 32 Core CPU used above x 2 = $3,157 x 2 (I would choose the Intel Xeon Gold 6530, which is going for around $2k now, with higher clock speeds and 100 MB more cache)
Besides the GPU cost, the rest is about the same price. You only get a deep discount with AMD setups if you use EPYCs a few years old with cheaper (and slower) DDR4.
And again, if you go single CPU, you save over $4,000, but lose around 30% in token generation.
The "$6,000" AMD examples I've seen are pretty vague on exactly what parts were used and exactly what R1 settings including context length they were run at, making true apple to apple comparisons difficult. Plus the Sapphire Rapids + GPU example is about 10x faster in prompt processing. (53 seconds to 6 seconds is no joke!)
Yes, you're blatantly misrepresenting information and moving goalposts. Right now it has become clear that you're doing this because you're obviously affiliated with the ktransformers project.
$6k for 8 tok/s or $20k for 12 tok/s. People are not stupid. I rest my case here.
I don't see the Think button, and for me the answer is much worse than deepseek-r1's, even though I have a Premium+ subscription. I'm just getting an instant, stupid answer instead of thinking.
It's been said before but it is great news for consumers that there's so much competition in the LLM space. If it's hard for any one player to get daylight between them & the 2nd best alternative, hopefully that means one monopolistic firm isn't going to be sucking up all the value created by these things
I've spent the last hour testing it and I'm blown away. And this is coming from a very hardcore user of OpenAI/Claude products on a daily basis.
It passed every goofy test I have for writing articles, which involves trying to surface arcane, obscure details. (It certainly means that however they are scraping the web, they are doing a good job of it.)
It made the database code I wrote over the last week with o3/o1/GPT4o/Claude3.5 look like a joke.
It fills me with rage over who owns this thing.
Even if people tank Tesla's car business and run Twitter into the ground, I think our new Galactic Edgelord is going to win his first trillion on xAI and Teslabots anyway.
Apologies for possibly stupid question but where can you use it right now? Just on 'direct chat' on https://lmarena.ai/ or is there a better alternative? Or do you have early access?
I was using it on grok.com, logged in via a Twitter account. But I notice it just got added to the Grok tab on Twitter a moment ago.
Also, the "Deep Search" button was not available when it first went live, so I'm retesting everything again with this feature enabled, which I assume is a reasoning version of the same model.
Hopefully, you'll be able to avoid the whole X Premium Plus thing in the near future with OpenRouter. It'll still use the xAI backend, but via your OpenRouter API key. Then you can use it with any web or mobile app that supports OpenRouter.
Personally, I wouldn't use it though. What's going on with Elon Musk right now is completely insane. I hope OpenAI's GPT-4.5 and GPT-5 releases catch up soon, if nothing else. They're announced for this year.
For whatever it's worth, I see the devs asking for feedback frequently enough, so I suspect that if you tweet about Grok or reply to any of those threads, they definitely read it, even if they don't respond or interact. It shows: I've seen improvements based on feedback others gave land almost instantly.
It's not good news when this competition comes at the cost of a gigantic, overinflated bubble, in which all the big players keep sucking billions from investors without even having a business model.
This hype will burst sooner rather than later and will trigger yet another global recession. This is untenable.
True, but presence of significant revenue is still promising. It's much better to have an "expensive compute" problem than a "nobody wants to pay for the product" problem.
Keep in mind that not only is OpenAI being directly propped up by investor hype, the downstream API users who contribute much of its revenue are also being propped up by investor hype. A big chunk of OpenAI's revenue is actually even more VC money in a trenchcoat.
The biggest marker of a bubble, to me, is that you have money-losing startups selling to other money-losing startups. On paper you see a lot of 'line go up' but it's just a lot of circulation in a closed body of water which will eventually evaporate.
Uber is an outlier because in a functioning economy that valued workers we wouldn’t have shipped all our jobs overseas and made gig economies the last line between housing and street schizophrenia.
>True, but presence of significant revenue is still promising.
If I started selling $5 bills for $1, I could generate a lot of revenue with $150B. You wouldn't believe the demand we would see for $5 bills.
The bandwidth costs made it deeply loss-making for a long time, despite having loads of engagement and ad revenue. However, over time they became more cost-efficient at sending video over the internet and became profitable.
This strategy obviously doesn't always work, with WeWork being the canonical example. But it's not guaranteed to fail either.
YouTube's network effect creating a winner-take-most was recognized, pitched, and valued from the very start.
The capabilities of LLMs are impressive, but none of them has published an idea I consider to have the trillion-dollar monopoly potential that the current hype implies.
There are far more similarities with the dot-com hype.
No critical first-mover advantage in sight. All parts can be swapped for the cheapest variant with little to no downside to users.
It wasn't obvious at the time YouTube would have a network effect though. It was very dependent on coming up with a great recommendation algorithm, along with monetization and revenue sharing. At the time, YouTube didn't have anything like that, iirc.
Even the basic front page of YouTube was of immediate and obvious value to a creator, and it would increase disproportionately in value the more people were on YouTube. The same goes for Amazon, and the same goes for Facebook.
All the LLM providers make extremely useful tools. But currently I can only see a linear, "non-monopoly" improvement when their user base grows from 100 to 1000.
But I might be wrong, and I wouldn't be surprised if in hindsight it will be obvious what the real disproportionate advantages there were to be found.
Technically true, though in fairness it is unlikely the original owners would have gotten YouTube to where it is today. On the other hand there are companies who didn’t recognise they were nothing more than passing fads, refused buyouts, and crumbled.
I was there when YouTube became a thing, and I was running a music video-hosting website that I had built myself (on top of phpBB, even). We were encoding videos in Windows Media and RealPlayer formats.
There were LOTS of funded competitors to YouTube between 2006 and 2009, including Viddler (who paid Gary Vaynerchuk a small fortune to host his WineLibraryTV show there exclusively), DailyMotion (which is still alive today, although no longer a threatening contender), etc.
In 2009 I had a coaching business and was buying marketing courses and software which would deploy your videos across 40+ different video websites (including Google Video which was a separate thing until they acquired YouTube and merged those), and YouTube wasn't yet amounting to 50% of our video traffic.
I think you might be mistaken with the bold statement above.
They're still early on the growth curve where there's enough opportunity for future growth that investing in scaling and improvement is more important than turning an immediate profit.
Remember when everyone on HN was sure Uber would never be profitable? Or Tesla? Or Amazon?
I do remember that, and I would say that they are still largely correct. Tesla needed government subsidies, Amazon needed AWS, and Uber needed a pandemic and Uber eats. The core businesses that HNers were referring to are still weak.
As did almost every large tech company today. Amazon lost money for years.
Are we really still doing this nonsense? If OpenAI wanted to become profitable, they could do it inside of 12 months. Growing companies don't care about profitability (nor should they).
You're way too smug for spewing what is clearly survivorship-bias.
All currently known profitable use-cases are competing on price. All the unicorn examples you're biased for had in their pitch deck the network effect of being the largest.
OpenAI, Grok, etc, have shown no unique value prop or an idea with monopoly potential.
WeWork trapped themselves into a real estate hole, selling services for less than they rented property for.
OpenAI is currently in an explicit non-profit-seeking mode, using a technology for which we have demonstrated 10-100x or greater decreases in compute to achieve the same outcomes.
This is not a declaration that OpenAI will become wildly profitable. This is just me saying that these aren't comparable companies.
WeWork was a scam to enrich the founders of the company. They owned or had interests in many of the properties that WeWork leased from. I'm surprised no one was thrown in prison.
I wonder how long it'll last. Just using myself as a demo customer: I canceled my subscription because Google AI Studio was doing more for me, and it's free. OpenAI is not really competitive at $20 a month anymore.
You are comparing apples with oranges.
Cursor is not an LLM, and yes, it has a business model. So does OpenRouter, and a million other applications that can switch APIs to the lowest bidder at any moment.
Lots of people derive great value from things that are too easily reproduced to be directly profitable.
Google gives everyone free access to a good spreadsheet tool, even though Microsoft Office exists.
Web browsers are free, despite the value of the entire internet.
Compilers are free, despite the value of all software collectively.
LLMs being really valuable doesn't automatically mean anyone can get rich from them.
I think everyone last year parroting "moat!" was cringe (like Altman of all people wouldn't know about this already, c'mon), but you do actually need something that other people don't have. I expect Altman's already got stuff in mind, but he's hardly the only one, and that means it's a game of "which multibillionaire with lots of experience building companies will win?", and that's hard for non-skilled normies (in this case including me) to guess.
HN already had its iPod moment back in 2007. /.'s iPod moment was in 2001, not as long before that as I would have guessed.
Re: Dropbox, from a well-known user. It didn't age well, and we've been asked not to repeat it because it makes the author, who has connections to this site's operator, look bad.
> 1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
I think the business model there is pretty simple: be on the front line when AI moves into the category of landscape-changing trillion-dollar technologies. And investors keep pouring in their billions exactly for that business model.
>This hype will burst sooner rather than later and will trigger yet another global recession.
It seems too small a bubble for a global recession. And if it is a bubble at all, there is every reason to believe the strategy will work with a significant probability.
See also: Meta's previous push into VR/AR/Metaverse. They spent a hundred billion to be at the front of the pack when that revolutionary world-changing paradigm-shift took off... which simply didn't happen.
Their Orion glasses are apparently mind-blowing in fidelity as well as lightness. Someone will absolutely make the smart-glasses paradigm work so that we don't need to carry around phones anymore, and Zuck is racing to be first. This is because he lost out in the platform wars and was at the mercy of Apple and Google; remember Apple's privacy update that killed much of Meta's revenue? Zuck wants to avoid a repeat of that by owning his own platform.
Seconded. I’m very excited for the day when/if their dev platform is opened up and it’s possible to access pass-through vision.
There's a whole class of educational apps that could open up for people learning in the physical world. Whether it's building physical things, sports, or reading books or notes written in non-Latin scripts... the impact will be enormous!
The only thing I'm concerned about is that it'd be another locked-down platform, like Oculus/Quest already is, only now much more disruptive, just like Apple's (and, to a smaller extent, Google's). I'd rather see something more like Windows Mixed Reality or SteamVR succeed.
That's a bit different, though. Meta invested in a product that, as of now, has very little competition. The Quest is sold at a slight loss, but at least at an approachable price, and at a volume that makes them the clear leader in the market at the moment. Moreover, their OS is open source. Clearly, what they want is to sell enough headsets to get a monopoly on the ecosystem and its apps (they basically want to make an Android Play Store for VR). You may argue they are far from that, but at least it's a clear business model.
OpenAI's business model was literally "we don't have one: we'll make AGI and we'll let AGI tell us how to make money". This is so idiotic it's not even a scam. xAI will compete on the same playing field. Not sure about Anthropic: they seem a bit more sane.
The premise is that this ultimately replaces all intellectual and physical labor for the rest of time. It’s possible it becomes commoditized as soon as it exists, but in terms of investment dollars it’s either worth as much as you can spend or nothing at all.
> ultimately replaces all intellectual and physical labor for the rest of time
Sounds incredibly valuable, but in reality collapses into Butlerian Jihad fairly quickly when you have 90% unemployment.
Edit: if the claims are true, then this will be far more destabilizing than social media. What do elections mean when the AI-guided political parties are putting out AI press releases for the AI press summaries, which voters have read out to them through their AI? What happens when >50% of the voters ask the AI who they should vote for? You end up with an AI dictatorship where the levers of discontent are fake.
> Sounds incredibly valuable, but in reality collapses into Butlerian Jihad fairly quickly when you have 90% unemployment.
But nobody really knows if that happens or not as a consequence, let alone quickly, because so far the transition itself has only ever happened in fiction.
Whatever does happen, I think it's going to be a surprise to at a minimum 80% of the population.
Yeah, at some point it seems inevitable that if machines do all the work that creates real "value" and people have no comparable value, then in a very practical sense we will all be slaves to machines.
"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them." -- Frank Herbert, _Dune_
Are we free today? For example, I have to work for a living. If I don't, my family and I will be miserable. Let's just hope that future "men with machines" don't decide to kill the rest of us - I'm not sure what use we will be to them.
I think this is the key change. We’re already beholden to a “machine” (the economy) that none of us completely understand or control or created explicitly. It has its own goals and tendencies that emerged from the complexity.
What AI and robotics do is actually create a machine that has no use for humans at all.
The big companies could crash significantly, but if the technology keeps bringing productivity gains, it will have a big positive impact on GDP over the next decade
I think it's already clear that these are going to be commoditized, and the free / open source versions will be good enough to capture enough of the value that the remaining players will not be Facebook-level monopolies in the space.
Apparently it isn't clear to the investors valuing OpenAI at >$300B. Possibly they're betting that the ecosystem and integrations around their models will generate a certain amount of lock-in, or otherwise make the difference in a close-to-even field.
I don't think it's automatically a bad idea. Offices require a lot of support, networking, security, maintenance, certifications, etc. There are efficiency gains in scaling. In addition, WeWork is useful for companies which hire employees in different cities.
Funnily enough, a lot of the open source world has landed on an API that is basically a copy of OpenAI's. So if you develop against OpenAI, it's almost a drop-in change to switch to an open source solution.
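For example, with the official openai Python client you can usually just repoint base_url at any OpenAI-compatible server; the URL and model name below are placeholders, not any specific product's values:

    from openai import OpenAI

    # Same client code, different backend: swap base_url and the model name.
    client = OpenAI(
        base_url="http://localhost:8000/v1",  # e.g. a local vLLM or llama.cpp server
        api_key="unused-locally",
    )
    resp = client.chat.completions.create(
        model="my-local-model",  # placeholder: whatever the server exposes
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)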
This reminds me of a comedy sketch where a guy is interviewing for a job at a startup, finally gets to the last round and meets the founder, and he tells him the whole thing is an illusion for investors
The IP rights holders have yet to bare their teeth. I don't think the outcome you suggest is clear at all; in fact, I think entirely the opposite is the most probable outcome. I've lost count of the number of technology epochs that at the time were either silently or explicitly dependent on ignoring the warez aspects while being blinded by the possibilities; internet video, music, and film all went through this phase. GPTs are just a new medium, and by the end of it royalties will in all likelihood still end up being paid to roughly the same set of folk as before.
I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit. It's just the kind of regressive greed that keeps the world ticking along, and I wouldn't be surprised if we ended up with something very close to this.
As I recall it, there was a time when copyright infringement on YouTube was so prolific that the rightsholders essentially forced the creation of the first watermarking system that worked at massive scale. I do wonder if any corners of research are currently studying the attribution problem with the specific lens of licensing as the motivation.
Yeah, that was the old Viacom vs YouTube days. Here is a great video if you have half an hour to spare: https://www.youtube.com/watch?v=qV2h_KGno9w . Pretty funny court case where it turns out Viacom was violating their OWN copyright... set a massive precedent.
But one thing this reminds me of is the idea of a "trap street": something mapmakers used to do was put false locations on their maps to prove that other mapmakers were copying them: https://en.wikipedia.org/wiki/Trap_street . I figure you could do something similarly adversarial with AI to pollute the public training data on the internet. IDK, like adversarial attacks on image classifiers: https://www.youtube.com/watch?v=AOZw1tgD8dA . With an LLM you could try to make them into a Manchurian candidate.
An environment where royalties inflate the pricing of ChatGPT by orders of magnitude seems like an environment where hosted models would be at a big disadvantage against whatever you can manage to get running on a pile of Macs in your garage.
>I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit.
Not quite realistic. You are talking about huge benefits, in favor of which licenses will be abandoned. And as for those who don't abandon them... I mean, you can look at the Amish settlements.
If you're paying $200/month for something I can do with open source software and $10/month of compute, why wouldn't I offer you the service for $100/month? And then someone else offers it for $50?
Not everyone has to know about, understand, or use open source solutions for it to open the field.
If you can do a $200/mo service for $10/mo, the closed source will reduce their prices to $15/mo and beat you
This is just a weird dichotomy you're introducing. Open source will introduce price pressure as any competition will - that doesn't mean you won't have a monopoly.
If you have virtually no pricing power and have to drop your $200/mo to $15/mo, that's a big deal if your $300bn valuation is premised on that not happening, which is what OP's point is about.
Idk what you mean by saying this doesn't preclude a monopoly - having your pricing power eroded by competition is kinda one of the key features of what a monopolistic market isn't
Not at all. Monopolies don't imply an anti-rigid price curve. In fact, monopolies almost never have that.
A monopoly means a company has enough leverage to corner and disproportionately own the market. This is entirely possible (and usually the case) even with significant pricing pressure.
I think you're both missing a bigger picture. How many of these services can now be replicated in-house by a single developer? Which part of the service actually costs money once that dev deconstructs the process?
Feels like I won't be paying for anything that isn't real-time. And that any time delay I can introduce in my process will come with massive savings. Picture hiding the loading of loot info behind a treasure chest opening animation in a game, except that time difference means you can pull all the work in-house.
Openrouter.ai seems like a step in the right direction but I'd want to do all their calculations myself as well as factor in local/existing gear in a way they don't.
That's true, but if someone sells you a one-time-purchase box/gadget/phone that will do a snapshot SOTA work and not cost you $20-$200/mo in subscriptions, a lot of people would be in.
Right now the average person has to go through a vendor with a web app, there's not a lot of room for the public to explore.
They don't seem to care about AI either. The vast majority of people care about the value they're getting; companies care about open source because it's usually free.
I don't think we expect a company solely making a proprietary web server to exist anymore and be a $300B behemoth. OpenAI might end up with the same model as Nginx or Docker if they don't pivot or find a different one.
Most people do see productivity gains from using LLMs correctly. Myself included. Just because some people don’t learn how to use them correctly doesn’t mean LLMs aren’t helpful. It’s like when internet search came out and a handful of laggards tried it once, failed to get the exact perfect result, and declared “internet search is useless”. Using a tool wrong is not evidence of the tool being useless, it’s evidence that you need to learn how to use the tool.
Hallucinations are literally the finger in the dam. If these models could sense when an output is well-founded and simply say "I don't know" otherwise... say goodbye to your job.
Googling a question and finding an incorrect answer every now and then doesn’t mean that googling is useless. It means that you need to learn how to use google. Trust but verify. Use it for scenarios where you aren’t looking for it to be the trusted fact checker. It excels at brainstorming, not at fact giving.
How many times do you think I've heard that over the past three decades? And you know what? They've been right every time, except for this one little fact:
The machine cannot make you give a shit about the problem space.
It's a real issue! But only for people who built the habit of typing a question in the address bar, clicking the first Stack Overflow link, and copy-pasting the first answer. Maybe break that habit first?
I'm willing to bet $100 that a human consumes at least 10x the energy of the latest Llama (picking the open source model so that it can be easily verified) to produce two pages of text. All of this "AI is destroying the environment and consuming too much power" talk is about total consumption, both training and inference. Inference itself is cheap and green.
We can control for quality too, if you want. A lot of real-life uses for ChatGPT are really trivial. I regularly ask it for basic recipes based on my groceries and likes, and the quality is basically 100% hits so far.
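If you want the back-of-envelope version of that bet, here's one; every number is an assumption (metabolic draw, writing time, GPU power, throughput), so treat it as a sketch, not a measurement:

    # Human: ~100 W metabolic draw, say 1 hour to write two pages (~1000 words).
    human_wh = 100 * 1.0                  # 100 Wh

    # LLM: a Llama-class model on one ~400 W GPU at ~30 tok/s,
    # ~1300 tokens for two pages of text.
    llm_wh = 400 * (1300 / 30) / 3600     # ~4.8 Wh

    print(f"human ~{human_wh:.0f} Wh, LLM ~{llm_wh:.1f} Wh, "
          f"ratio ~{human_wh / llm_wh:.0f}x")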
Probably bad news for the vendors, though. I genuinely struggle to see how most of these LLM companies are going to monetize and profit off their efforts with LLMs already in commodity territory. Government contracts can only flow for so long?
Government contracts are so big that a few of them can sustain an F500 company; for AI, many CDAO contracts are $50-500MM. If they do a big SI project with it, it could be $1-2B. The money is also guaranteed over 5 years, and if the program doesn't get shuttered, the contract will renew at that point (or go to recompete).
That being said it's my understanding that these companies don't have many huge contracts at all -- you can audit this in like 10 minutes on FPDS. Companies need a LOT of capital, time, and expertise to break into the industry and just compliance audit timelines are 1-4 years right now, so this could definitely change in the next couple years.
What if the money isn't there either? What if this AI thing lowers costs of everything it touches without generating meaningful financial returns itself?
Lowering costs is pretty valuable. People will pay for that. Everyone will pay for that. It may be that margins go razor-thin, but outside of running your own instance locally (which is increasingly viable for mid-quality requirements on modest HW), people will pay. I'm not surrounded by early-adopter types at all, and there's still a small but growing chunk paying $20/mo right now.
That's what I mean. One example is PV panels: they make energy production cheaper, so their producers should be a good investment, right? No, they go bust all the time, because prices are falling and margins are thin even as volume grows. Of course the economies of scale here are different, but still.
>It's been said before but it is great news for consumers that there's so much competition in the LLM space.
Is it? Because it seems like a bunch of megacorps pirating every single copyrighted work available in digital format, spending an enormous amount of electricity (that is probably not 100% clean) to churn through them, and the end result we have a bunch of parrots that may or may not produce accurate results so that spammers can more effectively fill the Internet with crap.
To put it this way: after seeing examples of how a LLM with similar capabilities to state-of-the-art ones can be built with 20 times less money, we now have proof that the same can be done with 20 times more money as well!
There was this joke about rich Russians that I heard maybe 25 years ago.
Two rich Russian guys meet and one brags about his new necktie. "Look at this, I paid $500 for it." The other rich Russian guy replies: "Well, that is quite nice, but you have to take better care of your money. I have seen that same necktie just yesterday in another shop for $1000."
To put it simply: he only bought the necktie so he could brag about how rich he is. He could have bragged even more if he had bought the necktie in the other shop.
The only things DeepSeek open-sourced are the architecture description and some of the training methods. They didn't open-source their data pipelines or super-optimized training code.
Their architecture achievements are their own MoE and their own attention. Grok has been MoE since v1. As for attention, we don't really know what Grok uses now, but it's worth noting that DeepSeek's attention was already present in previous versions of DeepSeek models.
As for R1's reasoning recipe, it seems like Grok either already replicated it or came up with it by itself, since they have a well-performing reasoning uptrain too.
If what they say is true, then you have to give them credit for catching up incredibly fast. And slightly pulling ahead. Not only with the models, but also products.
I have a close friend working in core research teams there. Based on our chats, the secret seems to be (1) massive compute power, (2) ridiculous pay to attract top talent from established teams, and (3) extremely hard work without big-corp bureaucracy.
Anecdotal, but I've gotten three recruiting emails from them now for joining their iOS team. I got on a call and confirmed they were offering FAANG++ comp but with the expectation of in-office 50h+ (realistically more).
I don't have that dog in me anymore, but there are plenty of engineers who do and will happily work those hours for 500k USD.
So in the end, did he get anything? I don't know how these things work, but did he just walk away with ~$50k in pre-tax income and $0 in RSUs, or did Musk pull a Twitter and not even pay him for those months?
It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 GW (5x). (For reference, a nuclear power plant might output about 1 gigawatt.) It will be interesting to see whether the relationship between power/compute/parameters and performance is exponential, logarithmic, or something more linear.
It's logarithmic. Meaning you scale compute exponentially to get linearly better models.
However there is a big premium in having the best model because of low switching costs of workloads, creating all sorts of interesting threshold effects.
It's logarithmic in benchmark scores, not in utility. Linear differences in benchmarks at the margin don't translate to linear differences in utility. A model that's 99% accurate is very different in utility space to a model that's 98% accurate.
Yes, it seems like capability is logarithmic with respect to compute, but utility (in different applications) is exponential (or rather s-shaped) in capability.
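A toy illustration of those two shapes together (the constants are made up; only the functional forms matter): capability grows with the log of compute, utility is an s-curve in capability, so a small capability edge near the threshold is worth a lot.

    import math

    def capability(compute):             # logarithmic in compute (toy constants)
        return 10 * math.log10(compute)

    def utility(cap, threshold=55.0):    # s-shaped (sigmoid) in capability
        return 1 / (1 + math.exp(-(cap - threshold)))

    for compute in [1e4, 1e5, 1e6, 1e7]:  # 10x compute steps
        cap = capability(compute)
        print(f"compute {compute:.0e}: capability {cap:.0f}, utility {utility(cap):.3f}")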
Not really, since both give you wrong output that you need to design a system to account for (or deal with). The only percentage that would change the utility is 100% accurate.
> It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 GW (5x). (For reference, a nuclear power plant might output about 1 gigawatt.)
IIRC achieving full AGI requires precisely 1.21 jigawatts of power, since that's when the model begins to learn at a geometric rate. But I think I saw this figure mentioned in a really old TV documentary from the 1980s, it may or may not be fully accurate.
And fun fact: without government subsidies, a nuclear power plant isn't economically feasible, which is why Elon isn't just building such a plant next to the data center.
They may not need direct subscription revenue to recoup their investment.
A variant of multi-modal LLMs may be the solution to self-driving cars, home robotics, and more.
I keep saying that to be a really effective driver, an AI model will need a theory of mind, which the larger LLMs appear to have. Similarly, any such model will need to be able to do OCR, read arbitrary street signs, and understand what each sign means. Most modern LLMs can already do this.
Since when do LLMs appear to possess theory of mind? The papers I've read on this show impressive capabilities, but only under conditions where language-based training would strongly indicate appropriate outputs for the LLM to use. Outside of familiar conditions, or even realistic conditions, they don't do well. It's all still statistical string-building.
Can you give an example prompt that shows a lack of theory of mind? It used to be easy, but now that models have improved I've noticed that the 'LLM doubters' nowadays are forced to make generic, almost spiritual, claims rather than making concrete statements.
Theory of mind (ToM) is the ability to understand that other people have different thoughts, feelings, and intentions than you do.
So let’s run through a basic scenario with ChatGPT 4o:
Prompt 1: Briefly respond:
You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?
Response 1: The driver may be distracted, drowsy, intoxicated, experiencing a medical emergency, avoiding an obstacle, or not checking their blind spot before changing lanes.
Prompt 2: What might that person be feeling or intending?
Response 2: The driver might be feeling distracted, rushed, unaware, or panicked. They may intend to change lanes without realizing your presence, be reacting to something on the road, or simply have made an unintentional mistake.
Prompt 3: You notice there was no obstacle. Any other reason the driver might do this?
Response 3: Yes, the driver might be experiencing fatigue, intoxication, a medical issue (such as a seizure or dizziness), distraction (like using a phone or adjusting the radio), aggressive driving, or simply failing to check their blind spot properly.
Seems to me that ChatGPT has a good idea of what another driver might be feeling while on the road. It gets this right for most other cases too.
Could you provide a counterexample for when it fails to understand what other people may be feeling in certain scenarios?
Well, I think a fairly easy counterexample is that if you keep going, and the information falls outside its context window, then the questions you had about the driver stop making sense.
So let's say you ask about a driver who swerved, and ChatGPT's response correctly talks about his possible state.
Then you talk more about cars, about road conditions, etc., and once the context is far enough away, you ask: what was the driver doing again?
And you get a hallucination back. Because there is no state in the LLM; there is no "driver". There are contextually, statistically plausible responses, but you hold a "driver" object in your mind while maintaining the conversation, and the LLM doesn't.
It's like a conversation with someone with short-term memory loss, like in Memento.
And people with short-term memory loss nevertheless have theory of mind just fine. Nothing about LLMs dropping context over big enough windows implies they don't have theory of mind; it just shows they have limitations, just like humans with "normal" memory will lose track over a huge context window.
Like, there are plenty of shortcomings of LLMs, but it feels like people are comparing them to some platonic ideal human when writing them off.
> Nothing about LLM's dropping context over big enough windows implies they don't have theory of mind
ToM is a large topic, but most people, when talking about an entity X, have a state in memory about that entity, almost like an Object in a programming language. That Object has attributes, conditions, etc. that exist beyond the context window of the observer.
If you have a friend Steve, who is a doctor, and you don't see him for 5 years, you can predict he will still be working at the hospital, because you have an understanding of what Steve is.
For an LLM, you can define a concept of Steve and his profession, and it will adequately mimic replies about him. But in 5 years that LLM would not be able to talk about Steve. It would recreate a different conversation, possibly even a convincing simulacrum of remembering Steve. But internally there is no Steve; nowhere in the nodes of the LLM does Steve exist or has he ever existed.
That inability to have a world model means that an LLM can replicate the results of a theory of mind but not possess one.
Humans lose track of information, but we have a state to keep track of elements that are ontologically distinct. LLMs do not, and treat them as equal.
For a human, the sentence "Alice and Bob go to the market, when will they be back?" is different from "Bob and Alice went to the market, when will they be back?"
Because Alice and Bob are real humans: you can imagine them; you might even have met them. But to an LLM those are the same sentence. Even outside of the argument about the Red Room / Mary's Room, there are simply enough gaps in the way an LLM is constructed that it can't be considered a valid owner of a ToM.
ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours. You're basically bringing up a particular implementation of long-term memory as a necessary component of it, which I've never once seen? If someone has severe memory issues, they could forget who Steve is every few minutes, but still be able to look at Steve doing something and model what Steve must want and believe given his actions
I don't think we have any strong evidence on whether LLMs have world-models one way or another - it feels like a bit of a fuzzy concept and I'm not sure what experiments you'd try here.
I disagree with your last point, I think those are functionally the same sentence
> ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours.
In that sentence you are implying that you have the "ability to model ... another". An LLM cannot do that; it can't have an internal model that is consistent beyond its conversational scope. It's not meant to. It's a statistical guesser; it's probabilistic, holds no model, and is anthropomorphised by our brains because the output is incredibly realistic, not because it actually has that ability.
The ability to mimic the replies of someone with that ability is the same as Mary being able to describe all the qualities of red. She still cannot see red, despite her ability to pass any question about its characteristics.
> I don't think we have any strong evidence on whether LLMs have world-models one way or another
They simply cannot, by their architecture. It's a statistical language sampler; anything beyond that scope fails. Local coherence is why they pick the next right token, not because they can actually model anything.
> I think those are functionally the same sentence
Functionally and literally are not the same thing though. Its why we can run studies as to why some people might say Bob and Alice (putting the man first) or Alice and Bob (alphabetical naming) and what human societies and biases affect the order we put them on.
You could not run that study on an LLM, because you will find that statistically speaking the ordering will be almost identical to the training data. Whether the training data overwhelmingly puts male names first or orders lists alphabetically, you will see that reproduced in the output of the LLM, because Bob and Alice are not people; they are statistically probable letters in order.
LLMs seem to trigger borderline mysticism in people who are otherwise insanely smart, but this kind of "we can't know its internal mind" sounds like reading tea leaves, or horoscopes by people with enough PhDs to have their number retired at their university like Michael Jordan.
Do you work in ML research on LLMs? I do, and I don't understand why people are so unbelievably confident they understand how AI and human brains work such that they can definitively tell which functions of the brain LLMs can also perform. You seem to know more than leading neuroscientists, ML researchers, and philosophers, so maybe you should consider a career change. You should maybe also look into the field of mechanistic interpretability, where lots of research has been done on the internal representations these models form - it turns out that, to predict text really really well, building an internal model of the underlying distribution works really well.
If you can rigorously state what "having a world model" consists of and what - exactly - about a transformer architecture precludes it from having one I'd be all ears. As would the academic community, it'd be a groundbreaking paper.
This pretty much seems to boil down to "brain science is really hard, so as long as you don't have all the answers, 'AI is maybe halfway there' is a valid hypothesis". As more is understood about the brain, and more about the limitations of LLM architectures, the distance only grows. It's like the God of the gaps, where god is the answer for anything science can't explain, ever shrinking - except here the gap is the LLM's supposed ability to have capabilities beyond striking statistical accuracy and local coherence.
You don't need to be unbelievably confident, or to understand exactly how AI and human brains work, to make certain assessments. I have a limited understanding of biology; I can however make an assessment of who is healthier between a 20-year-old who is active and has a healthy diet and someone in their late 90s with a sedentary lifestyle and a poor diet. This is an assessment we can make despite the massive gaps in our understanding of aging, diet, activity, and the overall health impact of individual actions.
Similarly, despite my limited understanding of space flight, I know Apollo 13 cannot cook an egg or recite French poetry. Despite the unfathomably cool science inside the spacecraft, it cannot, by design, do those things.
> the field of mechanistic interpretability
The field is cool, but it cannot prove its own assumption yet. The field is trying to prove that you can reverse engineer a model so it can be humanly understood. Its assumptions, such as mapping specific weights or neurons to features, have failed to be reproduced multiple times, with the weight effects being far more distributed and complicated than initially thought. This is especially true for things that are equally mystified, like the emergent abilities of LLMs. The ability to mimic nuanced language being unlocked after a critical mass of parameters does not establish a rule by which increased parameterisation will linearly or exponentially increase the abilities of an LLM.
> it turns out, to predict text really really well, building an internal model of the underlying distribution works really well
Yeah, an internal model works well because most words are related to their neighbours; that's the kind of local coherence the model excels at. But to build a world model, the kind a human mind interacts with, you need a few features that remain elusive (some might argue impossible to achieve) for a transformer architecture.
Think of games like chess: an LLM is capable of accurately expressing responses that sound like game moves, but the second the game falls outside its context window the moves become incoherent (while still sounding plausible).
You can fix this with architectures that do not have a transformer model underlying them, or by having multiple agents performing different tasks inside your architecture, or by "cheating" and using state outside the LLM response to keep track of context beyond reasonable windows (a sketch of that last approach is below). Those are "solutions", but they all just kinda prove the transformer lacks that ability.
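A minimal sketch of that "external state" workaround, assuming the python-chess library and a hypothetical ask_llm() helper wrapping whatever chat API you use; neither is tied to any specific product:

    import chess

    def ask_llm(prompt: str) -> str:
        raise NotImplementedError  # plug in your chat API of choice

    board = chess.Board()  # the authoritative game state lives here, not in the model

    def next_move() -> str:
        # Re-send the full move history every turn, so the model never has to
        # "remember" anything beyond the prompt it is currently looking at.
        prompt = (
            "You are playing chess. Moves so far (UCI): "
            + " ".join(m.uci() for m in board.move_stack)
            + ". Legal moves: "
            + " ".join(m.uci() for m in board.legal_moves)
            + ". Reply with exactly one legal move in UCI."
        )
        move = ask_llm(prompt).strip()
        board.push_uci(move)  # raises if the model answers with an illegal move
        return move

The board object, not the model, is what knows the game; the LLM only ever sees a freshly rebuilt prompt, which is exactly the point that the statefulness lives outside the transformer.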
Other tests about causality, reacting to novel data (robustness), multi-step processes, and counterfactual reasoning are all the kinds of tasks transformers still (and probably always will) have trouble with.
For a tech that is so "transparent" in its mistakes, and so "simple" in its design (replacing the convolutions with an attention transformer - it's genius), I still think it's talked about in borderline mystic tones, invoking philosophy and theology, and a hope for AGI that the tech itself does not lend itself to beyond the fast growth and surprisingly good results with little prompt engineering.
With computer use, you can get Claude to read and write files and have some persistence outside of the static LLM model. If it writes a file Steve.txt that it can pull up later, does it now have ToM?
I don't think this is a counterexample or even relevant.
I can assure you that if you had a conversation with an LLM and with a human, the human would forget details way sooner than an LLM like Gemini, which can remember about 1.5 million words before it runs out of context. As an FYI, the average human speaks about 16,000 words per day, so an LLM can remember about 93 days' worth of speech (1,500,000 / 16,000 ≈ 93.75).
Do you remember the exact details, word for word, of a conversation you had 93 days ago?
Once the context is far enough away, you ask: "what was the driver doing again?"
Have you tried this with humans?
For a sufficiently large value of "far away enough" this will absolutely confuse any human as well.
At which point they may ask for clarification, or.... respond in a manner that is not terribly different from an LLM "hallucination" in an attempt to spare you and/or them from embarrassment, i.e. "playing along"
A hallucination is certainly not a uniquely LLM trait; lots of people (including world leaders) confidently spout the purest counterfactual garbage.
It's like a conversation with someone with short-term memory loss, like in Memento.
That's still a human with a sound theory of mind. By your logic, somebody with memory issues like that character... is not human? Or...?
I actually am probably on your side here. I do not see these LLMs as being close to AGI. But I think your particular arguments are not sound.
I'm not sure I'd say it understands this, just that there exists an enormous amount of training data on road safety which includes these sorts of examples of people's motivations for poor driving. It is regurgitating the theory of mind that other humans created and put in writing in the training data, rather than making the inference itself.
As with most LLMs, this is hard to benchmark, as you need out-of-distribution data to test it: a theory of mind example that is not found in the training set.
You dismiss parent's example test because it's in the training data. I assume you also dismiss the Sally-Anne test, for the same reason. Could you please suggest a brand new test not in the training data?
FWIW, I tried to confuse 4o using the now-standard trick of changing the test to make it pattern-match and overthink it. It wasn't confused at all:
I can't suggest a new test, no; it is a hard problem, and identifying problems is usually easier than solving them.
I'm just trying to say that strong claims require strong evidence, and a claim that LLMs can have theory of mind and thus "understand that other people have different beliefs, desires, and intentions than you do" is a very strong claim.
It's like giving students the math problem 1+1=2 and loads of solved examples in front of them, then testing them on "you have 1 apple, and I give you another apple, how many do you have?", and then, when they are correct, saying that they can do all additive arithmetic.
This is why most benchmark tests have many, many classes of examples. For example, looking at current theory of mind benchmarks [1], we can see slightly more up-to-date models such as o1-preview still scoring substantially below human performance. More importantly, simply changing the perspective from first to third person drops LLM accuracy by 5-15% (percentage points, not relative to its performance), whilst it doesn't change for human participants, which tells you that something different is going on there.
Okay, we have fundamentally different understandings here.
To me, the LLM isn't understanding ToM; it's using patterns to predict linguistic structures which match our expectations of ToM. There's no evidence of understanding so much as accommodating, which are entirely different.
I agree that LLMs provide ToM-like features. I do not agree that they possess it in some way that makes it a perfectly solved problem within the machine, so to speak.
The problem with this line of argument is that, by its standard, no LLM, nor any algorithm period, can ever have a theory of mind.
If behaving in a way that is identical to a person with actual consciousness can't be considered consciousness because you are familiar with its implementation details, then it's impossible to satisfy you.
Now you can argue of course that current LLMs do not behave identically to a person, and I agree and I think most people agree... but things are improving drastically and it's not clear what things will look like 10 years from now or even 5 years from now.
I agree, totally. I'm not sure where I would draw a line.
Something nice, but at the moment totally unattainable with our current technologies, would be our own understanding of how a technology achieves ToM. If it has to be a blackbox, I'm too ape-like to trust it or believe there's an inner world beyond statistics within the machine.
Having said that, I do wonder quite often if our own consciousness is spurred from essentially the same thing. An LLM lacks many of the capabilities that make our inner world possible, yet if we really are driven by our own statistical engines, we'd be in no position to criticize algorithms for having the same disposition. It's very grey, right?
For now, good LLMs do an excellent job demonstrating ToM. That's inarguable. I suppose my hangup is that it's happening on metal rather than in meat, and in total isolation from many other mind-like qualities we like to associate with consciousness or sentience. So it seems wrong in a way. Again, that's probably the ape in me recoiling at something uncanny.
Either these supposed differences are important and they manifest themselves in observable differences or they aren't and you're just playing a game of semantics.
How is the LLM not understanding ToM by any standard we measure humans by?
I cannot peek into your brain with my trusty ToM-o-meter and measure the amount of ToM flowing in there. With your line of reasoning, I could simply claim you do not understand theory of mind and call it a day.
The difference is that we can reason about our experience with ToM and examine it to some degree (granted, with serious limitations), and know that beyond doubt you and I and most other people have a very similar experience.
The magical box is presumably not having the same experience we have. None of the connected emotions, impulses, memories, and so on that come with ToM in a typical human mind. So what’s really going on in there? And if it isn’t the same as our experience, is it still ToM?
I’m not trying to be contrarian or anything here. I think we probably agree about a lot of this. And I find it absolutely incredible, ToM or not, that language models can do this.
>The difference is that we can reason about our experience with ToM and examine it to some degree (granted, with serious limitations),
Those examinations still depend on outward behaviors observed.
>and know that beyond doubt you and I and most other people have a very similar experience.
No, I certainly can't. I can at best say, "Well, I'm human and he's human, so he probably has theory of mind", but that is by no means beyond any doubt. There are humans born with no arms, humans born with no legs, humans born with little to no empathy, humans born with so little intelligence they will never be able to care for themselves.
To be frank, it would be logically very questionable indeed to assume every human is 'conscious'. When I make that assumption, I take a leap of faith: I look at behaviors, see they are similar, and accept.
Taking this stance, it would logically be very strange to not extend the same grace to non-human beings who exhibit similar behavior - being human is not a guarantee of consciousness in the first place.
>The magical box is presumably not having the same experience we have.
Maybe, maybe not. I think the real question is: why on earth does that matter?
We're not asking if LLMs are human. They are not. We're asking if they can model the beliefs and internal states of other entities as separate from their own - Theory of Mind.
This is a rephrased driver's ed manual. This isn't theory of mind, it's just technically correct prose rewording some unbelievably widely distributed knowledge.
Let's modify this test a little and see if it manages to do anything relevant:
> Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?
>> The driver might be distracted, drowsy, avoiding an obstacle, experiencing a medical issue, or not checking their blind spot while changing lanes.
> The driver in front of you appears to be braking more frequently than usual. The rear suspension of the car appears to be bouncing a lot when they are not braking, indicating acceleration. Their following distance to the car in front is changing from too far to very close frequently. Why might this be?
>> The driver might be inexperienced, distracted, or impatient, leading to inconsistent speed control. The excessive bouncing could indicate worn-out shocks or suspension issues, affecting their vehicle's stability and braking efficiency.
...yep, hard fail from me right there. Two prompts, and it's distracted itself talking about an irrelevant issue ("possible car damage") when the setup of the question should naturally lead to "the driver is driving aggressively". And I could not possibly have set up the conclusion it should draw more clearly. And when pressed... it won't commit:
> Which is the more likely answer?
>> The more likely answer is that the driver is distracted or inexperienced, leading to inconsistent speed control and poor following distance management. The suspension issue likely contributes to the bouncing but isn’t the primary reason for the erratic driving behavior.
Already a hallucination! There is no suspension issue. It has not asked for any additional information to try and determine if there was one, but it is confidently asserting the existence of a phenomenon it invented in its own response.
I'm sorry, but what? This is not a theory of mind test. You've constructed a very open-ended question with multiple answers and marked the LLM down because you didn't like the one it gave.
If you use any of the conventional theory of mind tests (most famously the Sally-Anne test [1], but also the others), then SOTA reasoning models will get near 100%. Even if you come up with similar questions which you expect not to be in the training set, they will still get them right.
In the absence of any evidence to the contrary, this is convincing evidence in my opinion.
That same source you link says that your figure of 100% is not accepted as a consensus:
"... GPT-4's ability to reason about the beliefs of other agents remains limited (59% accuracy on the ToMi benchmark),[15] and is not robust to "adversarial" changes to the Sally-Anne test that humans flexibly handle.[16][17] While some authors argue that the performance of GPT-4 on Sally-Anne-like tasks can be increased to 100% via improved prompting strategies,[18] this approach appears to improve accuracy to only 73% on the larger ToMi dataset."
In basically every case, by the time a claim like that is stated in a paper like that, it's obsolete by the time it's published, and ancient history by the time you use it to try to win an argument.
My point is merely that if you are going to make an argument using a source, the source should support your argument. If you say "the accuracy of an LLM on task 1 is 90% [1]", and when you go to [1] it says "the accuracy of an LLM on task 1 is 50%, but some sources say with better prompts you can get to 90%, though when extended to a larger dataset for task 1 performance drops to 70%", then just quoting the highest number is misleading.
Maybe having a theory of mind isn't the big deal we thought it was. People are so conditioned to expect such things only from biological lifeforms, where theory of mind comes packaged with many other abilities that robots currently lack, that we reflexively dismiss the robot.
You're not going to run a SOTA LLM of this size off batteries (robotics), even in a car where the alternator is charging them, nor can you afford to rely on a high-speed internet connection being available 100% of the time in a life-or-death (FSD) application.
I don't have much faith in the future of current-architecture LLMs, but I do think that AGI will be needed for safe FSD and for general-purpose robots that need to learn and operate in an uncontrolled environment such as a home.
A typical car alternator outputs 1.5-3kW of electricity, and EVs can output arbitrary amounts of power for electronics (though that will reduce range). That's more than enough to run purpose-built circuitry for a SOTA LLM. For a home robot, you could run the compute in the home instead of in the robot's body.
I don't think AGI is needed for FSD because we already have safe FSD in the form of Waymo, and competitors aren't far behind. People forget that self-driving doesn't have to be perfect. It just has to be better than human drivers. Human drivers get sleepy, drunk, angry, and/or distracted. They can't see in 360º or react in milliseconds. Most cyclists and pedestrians prefer current FSD implementations to human drivers, as the computer-driven cars are much better about yielding, giving a wide berth, and being patient.
Waymo is obviously pretty decent, but it's easy to drive 99.9% of the time. It's when there's invisible black ice on the road, or an animal runs out in front of you, or you lose visibility due to sun glare or whatever (I once had windshield wiper fluid = water flash freeze on contact) - maybe mud on a camera, or a wheel falls off your car or the one in front, etc, etc, that things get weird.
Having autonomous cars that are statistically safer than the average driver is a reasonable bar to allow them on the road, but for ME to want to drive one I want it to be safer than me, and I am not a hot-headed teenager, or a gaga 80-yr-old, or a drunken fool, and since I have AGI (Actual General Intelligence) I react pretty well to weird shit.
And they mentioned at the end of the presentation that they're already planning their next datacenter, which will require 5x the power. Not sure if that means the equivalent of ~1,000,000 of the current GPUs, or more because next-gen Nvidia chips are more efficient.
I don't think anyone who's paid attention to the LLM scene will give them any “credit for catching up fast” as it has been pretty obvious for the past two years that all it takes to reach the state of the art is a big enough GPU cluster.
DeepSeek made the news because of how they were able to do it with significantly less hardware than their American counterparts, but given that Musk has spent the last two years telling everyone how he was building the biggest AI cluster ever, it's no surprise that they managed to reproduce the kind of performance other players are showing.
This severely underestimates the talent still required. DeepSeek didn't come out just because it's cheaper; it came out because a very talented team figured out how to make it cheaper.
Even if we assume your comment is correct, let's extrapolate what happens next: a talented team, the biggest compute among all competitors, and a CEO who is hell-bent on winning the race. IMO that is the reason it is a big deal.
Grok (unlike DeepSeek) has yet to show any ability to make a conceptual breakthrough. I don't like OpenAI at all, but one must admit that they are at least showing that they can move the field forward.
I don't know, but I found the recording uninspiring. There was nothing new for me. We've all seen reasoning models by now—we know they work well for certain use cases. We've also seen "Deep Researchers," so nothing new there either.
No matter what people say, they're all just copying OpenAI. I'm not a huge fan of OpenAI, but I think they're still the ones showing what can be done. Yes, xAI might have taken less time because of their huge cluster, but it’s not inspiring to me. Also, the dark room setup was depressing.
Seems like the opinion of someone who doesn't know that OpenAI cloned Anthropic's innovations of artifacts and computer use with their "canvas" and "operator".
Those are applied-ML level advancements, OpenAI has pushed model level advancements. xAI has never really done much it seemed except download the latest papers and reproduce them.
Don't forget that OpenAI was also following Anthropic's lead at the model level with o1. They may have been first with single-shot CoT and native tokens, but advancements from the product side matter, and OpenAI has not been as original there as some would like to believe.
Karpathy believes that this is at o1-pro level[1].
This again proves that OpenAI simply has no tech moat whatsoever. Elon's $97 billion offer for OpenAI last week was reasonable given that xAI already has something just a few months behind - it would probably be faster for xAI to catch up with o3 than to go through all the paperwork and lawyer talk required for such an acquisition.
Elon also has a huge upper hand here:
Elon and his mum are extremely popular in China, so it would be easier for him to acquire Chinese AI engineers. He can offer xAI/SpaceX/Neuralink shares to the best AI engineers, who'd prefer some kind of almost guaranteed 8-figure return in the long run.
Good luck to OpenAI investors who still believe that OpenAI is worth anything more than $100 billion.
Firstly, the $97Bn was for the non-profit, not for the company. The company is being valued in funding rounds at closer to $300Bn. It may be true that OpenAI has no moat, but if it has no moat then all of these AI companies are overvalued (including xAI) and Elon should just stop bothering to throw his money at it. I would say Elon probably doesn't actually have much of an advantage here. In both SpaceX and Tesla he was able to do something no competitor could: raise cash. Car companies simply couldn't invest in tech research to build self-driving to compete with Tesla. SpaceX consumed enormous amounts of cash before anyone saw value. That was a unique skill Elon had over the 2010s.
That is not an advantage in a race against Microsoft, Google, Meta, etc.; he's competing against all the biggest companies in the world in this race. He's not going to be able to outspend them if the economics look at all sensible.
No, SpaceX projects are extremely $-efficient. The total project cost of Starship is something like 20% of the SLS.
> he's competing against all the biggest companies in the world in this race.
No, this is not a pissing contest over who has the most $. If it were about who can come up with the most $, then the entire race would already be over, as the CCP has access to trillions of $ in CASH.
I keep hearing about Claude's impressive coding skills (compared to its benchmarks), yet it's not evident to me (I use the web version, not Cline). Compared to 4o it's not that great.
My pet theory is that Sonnet was trained really cleverly on a lot of code that resembles real-world cases.
In our small and humble internal evals it regularly beats every other frontier model on some tasks. The shape of capability is really not intuitive or one-dimensional.
What are you using it for in general? IME the reason Claude pulls out ahead is that when you use it in a larger existing codebase, it keeps everything "in the style" of that codebase and doesn't veer off into weird territory like all the others.
My experience as well. Working in Scala primarily, it tends to be very good at following the constructs of the project.
Using a specific Monad transformer regularly? It'll use that pattern, and often very well, handling all the wrapping and unwrapping needed to move data types about (at least well enough that the odd case where it misses some wrapping/unwrapping is easy to spot and manage).
Give a custom GPT or Gem the same source files, and those models regularly fail to maintain style and context, often suggesting solutions that might be fine in isolation but make little sense in the context of a larger codebase. It's almost as if they never reliably refer to the code included in the project/GPT/Gem.
Claude, on the other hand, is so consistent about referring to existing artifacts that, as you approach the limit of project size (which is admittedly small), you can use up your entire 5-hour block of credits with just a few back-and-forths.
Lol, no company is making money using 4o; however, thanks to Claude Sonnet, programs like Cursor are usable, lol. 4o agents suck - just try it instead of talking.
I can honestly tell you from my experience that Sonnet 3.5's coding skills did things no other model did right last summer, even though the benchmarks showed it wasn't the best performer at coding tasks.
I prototyped on the weekend and started out with 4o because I had a subscription running.
After an hour and a half-assed working result, I put everything into Claude and it made it significantly better on the first try - and I didn't even have an active Claude subscription.
Really interesting - I used it today and still hit lots of issues. Maybe my Python notebook approach is too complicated for Sonnet? It couldn't fix a custom complex seaborn plot. 4o failed too. o3-mini-high, on the other hand, managed to do it really well.
There is honestly no rhyme or reason to all these opinions; someone was telling me the other day that Claude is for sure the best - multiple people, actually.
I find it concerning that there are no really accurate benchmarks for this stuff that we can all agree on.
Anthropic's best model is Sonnet 3.5, in my opinion. The reason it's good is that it is very effective for the price, and fast. (I do think Google has caught up a lot in this regard.) However, not having CoT makes its results worse than similarly cheap CoT-based models.
Leaderboards don't care about cost. Leaderboards largely rank a combination of accuracy + speed. Anthropic has fallen behind Google in accuracy + speed (again, missing CoT), and frankly behind Google in raw speed.
No idea why this was downvoted, but you are correct.
Seems like the team at xAI caught up to OpenAI very quickly, reaching the top of the leaderboard in one of the benchmarks and also catching up on features with Grok 3.
Giving credit where credit is due, even though this is a race to zero.
Yeah, so many people aren't capable of talking about anything Musk-adjacent with clear thoughts. It's insane how quickly xAI went from not existing, to the top of the benchmarks.
Depends what you mean by "people here". I mean, obviously the majority of HN commentators and even the majority of commentators on this thread seem to be. But there will always be a couple of slightly unhinged folk in a big enough group of readers.
I'm not sure what you mean here? Musk has a history of doing both incredibly useful and cool things, and also incredibly dumb, cruel, and for some people even terrible things. That context should be part of any clear thinking around him. He does not get a clean slate in every new discussion of him.
There are widespread, legitimate concerns about what kind of person Elon Musk is turning out to be. There is a lot of chatter about fears of China's AI rise, but what happens if we get Elon's brand of cruelty and lack of empathy in an authoritarian superintelligent AI? Is that the AI future we want? Can you imagine an SAI with real power that interacts with people the way Elon does on Twitter? I am not sure that is a future I want to live in.
A very impressive debut. No doubt they benefited from all the research and discoveries that have preceded it.
Maybe the best outcome of a competitive Grok is breaking the mindshare stranglehold that ChatGPT has on the public at large and with HN. There are many good frontier models that are all very close in capabilities.
Unfortunately LLMs are shifting compute time to test time instead of train time. I don't really like this and frankly it shows a stalling of the architectures, data sets, etc...
Another take is that the base models are now good enough that spending more money for more intelligence is viable at test time. A threshold has been crossed.
Naively, I feel that to be useful, the goal for LLMs should be to become more power-efficient, so that eventually all devices can be smarter.
Power efficiency can be gained through less test-time compute, or more "intelligence", or some combination of the two. I'm not convinced these SOTA models are doing much more than increasing test-time compute.
The biggest impacts on power efficiency will come from advances in node size and transistor type, like nanosheet or forksheet. Algorithms will help just a little.
Controversial opinion but I think the AI game studio idea is a very good one. Not because I think they will make any money off the games, but dogfooding will lead to so much more improvement than relying on feedback from external customers.
We're 1-2 years into our AI game studio [1] if anyone has more questions on it.
Seeing awesome feedback from players on our demos (and seeing an insane amount of stickiness from players playing even small demos built around generative AI mechanics). Raising now. Hiring soon to move faster. Feel free to reach out - dru@chromagolem.com
Because 'dogfooding' has worked so well for other products...
If you don't get feedback from the people actually playing your game (or using your product), you will never get the improvement you need to help them.
You can have the most talented passionate people there are developing a product, but if it's not working for the people you want to sell it to, it's the wrong product.
Most tech products are terrible because those paying for them are not those that have to use them every day, or because they solve a corporate problem (compliance) and not a usability problem which is the actual need from the people on the shop floor.
Many big games/products are already built mostly on metrics, and that has proven to be a terrible way to work out what people 'want'. It's a great way to justify money decisions though, so it keeps happening (and games/products from big companies keep getting worse).
I like and agree with something you've touched on here. I think the downvotes are perhaps because you're not putting an end cap onto this idea here. And I think that end cap is: the feedback a company gets when it dogfoods its own product is *not* guaranteed to be similar to the feedback it gets from customers.
The implicit assumption with dogfooding is that more feedback is better, even if that feedback is artificially constructed.
I think the idea here is that foisting one's product onto one's own workers is likely to incur a bunch of additional biases and preferences in feedback. Paying customers presumably use the product because they need it. Dogfooding workers use the product because they are told to do so.
Have you thought of a future where LLMs will be fine-tuned to target advertisements at you? I mean, look at search: the first iterations of search were pretty simple in terms of ads. Then personalized ads came. I can't help but envision the dystopia where the LLM will insert personalized ads based on what you are asking for help with.
It's way worse than that.
First, we interact with LLMs through private conversation, and we are used to having private conversations with humans we trust. Some of that trust will be transferred to LLMs.
Second, LLMs have a vastly bigger "mental" power to build a long-term mental model of us while we interact with them. Which means they can choose their words with extreme precision to trigger an emotion, a certain reaction.
Combine the two and the potential for manipulation, suggestion, preference altering is through the roof.
The next step is to combine it with heart rate/blood pressure/eye tracking in phones and generate the text you're reading in real time based on biofeedback. We'll be able to control people like robots. See where those $1MM+ salaries and billions of dollars are going yet?
I do believe this is the next natural evolution. People don't like to pay for things, and ads are a proven business model. I bet the big labs are looking into this.
I'm more worried about LLMs with specific political biases built into them. Imagine one that sounds like Conservapedia or the most insane left-wing parts of Tumblr
> Currently, Grok Web is not accessible in the United Kingdom or the countries of the European Union. We are diligently working to extend our services to these regions, prioritizing compliance with local data protection and privacy laws to ensure your information remains safely secure.
I suppose you can take that to mean that people who do have access to the service should not expect much in terms of data protection.
There are just more regulations to comply with before a release. OpenAI's new Deep Research tool wasn't originally available in the EU either, but it was released less than a week after it came out in the US. Since the EU is a gigantic market with a lot of buying power and this release makes a strong case for people to switch over from competitors, I doubt it'll take long.
> There are just more regulations to comply with before a release.
If you do collect personal data and do funky stuff with it.
Another approach would be to not collect that personal data until you have the right process in place, and basically be regulation-compatible out the door on day one.
If your organization truly doesn't collect or process any personal data, then no, you don't have to say anything, as for example GDPR doesn't even apply to you in the first place. Or are you thinking about a different directive than GDPR, perhaps?
The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of "personal data".
If all you have is an Apache web server with the default configuration serving a fully static HTML/CSS page without any script tag, you might already need a DPO and to complete some paperwork.
> The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of "personal data".
Just because Apache by default collects and stores IPs doesn't mean it is impossible to provide a web service without collecting personal data. Disable the IP collecting (a sketch of one way below), and even the default configuration wouldn't need to follow GDPR, as it again doesn't even apply.
Is there something else in Apache that collects personal data by default? If you're unsure what "personal data" really means, https://gdpr-info.eu/art-4-gdpr/ has the definition.
Not sure how HTML/CSS is relevant, it shouldn't depend on what content you're serving.
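For what it's worth, a minimal sketch of what that looks like, assuming Apache 2.4: the default "common"/"combined" log formats start with %h, the client address, so you can define a format without it. Whether this alone puts you fully out of GDPR's scope is a question for a lawyer, not a comment thread:

    # httpd.conf - replace the client address field with a literal "-"
    LogFormat "- %l %u %t \"%r\" %>s %b" noip
    CustomLog "logs/access_log" noip
    # Note: the error log records client IPs by default too,
    # so ErrorLogFormat would need the same treatment.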
All that requires additional active effort to avoid having access to any data. The more complex your infra, the harder it becomes to avoid the paperwork. Add a reverse proxy and a CDN to the above, and the chance of you not having access to any "personal data" is really, really close to 0 unless you spend significant engineering resources triple-checking everything. Even then, if you want to be safe, you'd better have the paperwork ready in case you forgot something.
In the example above, I hope that you would not have stopped at checking the apache configuration as I am sure you are fully aware that there are multiple log levels at the OS level that need to be tweaked as well.
This is of course despite the fact that you clearly have 0 ill intent and that none of this "personal data" can really be used for anything bad.
The mention of HTML/CSS is just to make it clear that no additional data collection can happen through JavaScript tags (Google Analytics, or any other alternative) or other third parties. It makes total sense that if you dare use bug tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork, or risk being exposed as the mental lunatic that the EU commission believes you are.
> All that requires additional active effort to avoid having access to any data
I agree that it requires additional active effort, I'm not arguing against that. I don't agree with your original point that it's "impossible to provide any web service without collecting personal data", and it would seem you no longer agree with that either.
> It makes total sense that if you dare use bug tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork, or risk being exposed as the mental lunatic that the EU commission believes you are.
If you willy-nilly use bug tracking software that needlessly collects and/or processes EU individuals' personal data, then yeah, you need to follow the regulations in the region you operate in.
If the collecting/processing actually serves a higher purpose (for your business and otherwise) then again, makes sense you need to follow the regulations.
> it would seem you no longer agree with that either.
On the other hand, you pretended that fixing that Apache configuration was somehow "all I needed to do" to be compliant with EU regulations. We proved that this was wrong, and despite your best efforts you are still unable to give a proper list of everything I need to do. You are unable to do so because it is virtually impossible; no matter how thorough you believe you are, you might still be missing an element you don't know well enough. To be safe, the only path is to accept the fact that you will need to access personal data, even if that's not your purpose and you do nothing with it. The additional paperwork and needless effort are mandatory.
This in turn explains why, regardless of what the Grok 3 team really does behind the scenes, they DO have additional work to complete before being able to release their product in Europe, and that might explain the delay.
> If you willy-nilly use bug tracking software that needlessly collects and/or processes EU individuals' personal data, then yeah, you need to follow the regulations in the region you operate in.
I am willing to use whatever error tracking software you suggest. My criteria are simple: I might have JS errors I don't know about; please give me enough information to fix the underlying issue when that happens, without requiring me to fill in additional paperwork.
My whole point is that the definition of what constitutes "personal data" is so wide that such a tool does not exist.
That's possible in general but not for this application; a chat interface to an LLM isn't very useful unless you can tell it whatever you want—including GDPR personal data—and then pick up the thread of conversation later.
It is kinda possible to store that in the browser, but as I've been finding with my own browser-based front end for the API, browsers seem to clear this data a bit more often than one might expect.
When regulations become sufficiently bureaucratized, it's extremely easy to accidentally violate them doing completely normal things. As a really random example: in California, when you operate a food cart it's not enough to just keep your area and wares in sanitary condition; you need a dishwashing bin of a minimum size of exactly 'x' inches (10.5, IIRC).
A guy who was just making sure he was preparing clean, healthy food and keeping everything sanitary might assume he was naturally obeying all regulations. But that assumption can cost one a big fat fine (leading to fun scenarios like a food cart vendor needing a compliance legal team), and given Musk's relationship with the EU, they'd love to crucify him on any possible technicality they can find.
Right, that's true I suppose. But also, if you don't have a car, for example, you don't need to think about the laws of how to legally drive a car, since they don't apply to you.
Similarly, if you don't collect or process any personal data whatsoever, directives like GDPR don't even apply to you, so there isn't really any way (easy or hard) to "crucify" someone for violating them.
Pretty good considering there are laws around data privacy and government institutions that enforce them. Are they perfect? Of course not, but it sure is better than no laws to protect my personal data.
I think they put the new model behind a $40 paywall so fewer people use it. The model seems only marginally better than open-source models, based on xAI's own internal tests, and it costs $$$ to run. Elon talked in the second half about building one of the largest GPU data centers just to get this running. I guess with the next iteration they'll be trying to reduce the costs.
Also, they will be open sourcing Grok 2, which is probably pretty behind at this point, but will still be interesting for people to check out.
I hate how it's the same story for every new AI technology. If someone can tell me who to vote for or where to protest to change this awful EU law, that would be great.
The EU regulations are there to protect the average citizen, not to help the 1% run wild with whatever business idea they have. You personally might not like it, but the non-entrepreneurs, which is most people, are pretty satisfied that the laws in the EU are more focused on the citizen and their rights and not on boosting the shareholders' profits.
> not to help the 1% run wild with whatever business idea they have.
And IMHO regulating the 1% doesn't hinder strategic advantages much. Otherwise China would not have come up with the DeepSeek models. Regulations are fine; they just have to be "based".
Nothing is stopping X from complying with EU regulations to make it available to you. I'd wager that they most likely lack compute capacity to make it available everywhere, not legal compliance.
> Nothing is stopping X from complying with EU regulations to make it available to you.
Given the personalities involved, I'd wager he's doing it more out of spite than for any actual legal justification.
...though if there were an actual legal risk, then I'll agree the economics probably don't bear out the risks. As someone who identifies as European, I'll admit that Europeans generally pay far less for tech/software/apps/services than Americans do [1]; salaries in Europe are also significantly below their US equivalents: paying $200/mo for OpenAI's service is a much harder sell to a London- or Berlin-based SWE on €90k/yr than to a Bay Area type on $300k/yr.
Libel, from all the models hallucinating things done by whatever your real name is.
I mean, at least I get the advantage of being overshadowed by a famous film director with the same name as me, so nobody's going to assume anything associated with my name is actually about me…
I don't get the bottle cap hate meme. Is it useful? Probably not, but the amount of hate it gets is unexplainable by lack of concrete usefulness alone…
Also, the problem with GDPR is that it wanted to leave too much room for businesses to still collect an obscene amount of data, hence it allows the cookie banner. Please note that I emphasized "allows" because that's all GDPR does: it allows companies to use a cookie banner to extract consent to collect data. It doesn't mandate one in any way.
None of my multiple websites have a cookie banner on them because I'm not trying to extract consent from my users to abuse their data, I just don't collect it and I'm effortlessly GDPR-compliant in the least obnoxious way.
Companies need to adhere to GDPR in order to enter the European market, and people have the right to request deletion of their PII. It's a good law; it should actually be applied everywhere.
As a European, though, I'm scared: what if companies are actually testing whether excluding us is that much of a problem for their business?
I would be very surprised if they aren't monitoring the cost-benefit curve of delaying EU launches. Why wouldn't they? It costs extra money, time, and legal risk to launch in the EU. It's especially bad for xAI due to Musk's involvement.
(Note that it's not just GDPR, there's also the EU AI Act which has a whole extra set of requirements for compliance.)
It has nothing to do with the EU laws, or at least not in the sense they want you to think: no law prevents AI players from releasing their AI models here, but they are all also big tech players who are affected by GDPR, the DSA, and the DMA, which harm their business by protecting consumers.
That's why they use their AI products as a leverage to turn European people against the laws that protect them from big tech. It's just blackmail.
The AI Act doesn't prevent Grok from releasing its model in the EU! (And ChatGPT's early issues were all linked to GDPR.)
For the record, Facebook put a restriction on the use of Llama models in the EU even before the AI Act was passed (and the AI Act doesn't even apply to Llama anyway, except Llama 3.1 405B).
It will come to EU countries eventually, but it takes a long time to go through "conformity assessments". Notebook LM, for example, was geoblocked for the EU for a full year before it became available in June. Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.
> Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.
And here you can see that it has nothing to do with the AI Act, as it wasn't enacted until last August!
Furthermore, neither Grok 1 nor Notebook LM would have been subject to the AI act even if it had existed at the time.
As I said before, all of these companies have vested interests against EU's legislation as a whole, and they've tried to blackmail the EU from the beginning. They didn't wait for an actual AI legislation to exist to use AI as just another blackmailing tool.
I think you're misapplying the term "blackmail" here and thus poisoning the well. The EU is applying pressure to companies and companies are applying pressure back; that's not blackmail. They each have their own means of leverage, and they both use them.
First of all, both sides don't have the same level of legitimacy, and second, one side is blatantly lying about its intent by claiming that they are blocked by regulations instead of admitting that they are putting pressure (because they know they have no legitimacy to "apply pressure" on democratic institutions).
Apparently EU regulators were blocking it for unspecified reasons until an agreement was made in May. And even then, they blocked news summaries until after the EU elections. If you can find more info, feel free to cite it. Info about these behind-the-scenes dealings is hard to find online.
I am excited for the voice mode promised in "a week" or so. ChatGPT Advanced Voice has been a big disappointment for me. It can't do some of the things they demoed at the announcement. It's a lot dumber than text mode. I find the voice recognition unreliable. I couldn't get it to act as a translator last time I tried. But most of all I find I don't have much to talk to it about. If Grok 3 voice mode can discuss current events from the X timeline then it should be much more interesting to talk to.
I'm a freeloader, and it appears that unfortunately Elon is not stupid enough to just give it to me for free.
There's no fair pricing either, since I see no pay-per-use option, so it's unavailable for me for now.
Is Fauci some saint that can't be criticized? No matter which political party you support, is it really difficult to admit the guy totally mismanaged the pandemic response and messaging?
Billions spent, one of the most powerful AI developed, and still no one competent enough to trim the 15 mins of waiting time filler at the beginning of the announcement video...
They will open-source Grok 2 when Grok 3 comes out. Also it seems like it will be paywalled - disappointing considering DeepSeek-R1 is free and open source.
That probably won't come close to justifying the current valuation of either OpenAI or Grok (idk how much investment it has taken in or how much it has spent so far).
For some ouroboros fun, I attached this whole HN discussion and asked Grok 3 to summarize it (with a specific focus on the members' attitudes towards Elon Musk). Here's what it came up with:
I have no idea why that page says "Grok’s output has been customized by this user"; I don't see anything related to custom prompts in my Grok settings page. Maybe I'm looking in the wrong place?
Off topic, but just in case: is there a good reference on how people actually use LLMs on a daily basis? All my attempts so far have been pretty underwhelming:
* when I use chatbots as search engines, I'm very quickly disappointed by obvious hallucinations
* I ended up disabling GitHub Copilot because it was just "auto-complete on steroids" at best, and "auto-complete on mushrooms" at worst
* I rarely have use cases where I have to "generate a plausible page of text that statistically looks like the internet" - usually, when I have to write about something, it's to put information that's in my head into other people's heads
* I'd love to have something that reads all my codebase and draws graphs, explains how things work, etc... But I tried aider/ollama, etc., and nothing even starts making sense (is that an avenue worth persevering in, though?)
* Once, I tried to describe in plain English a situation where a team has X tasks to do in Y weeks, and I needed a table of who should be working on what for each week. I was impressed that LLMs were able to produce a table - the slight problem was that, of course, the table was completely wrong. Again, is it just bad prompting?
It's an interesting problem when you don't know if you just have a solution in search of a problem, or if you're missing something obvious about how to use a tool.
Also, all introductory texts about LLMs go into many details about how they're made (NNs and transformers and large corpora and lots of electricity etc...) but "what you can do with it" looks like toy examples / simply not what I do.
So, what is the "start from here" about what it can really do ?
For coding, I use Cursor's composer to gather context about the existing codebase (context.md). Then I paste that into DeepSeek R1 to iterate on requirements and draft a high-level design document, maybe some implementation details (design.md).
Paste that back into composer, and iterate; then write tests. When I'm almost done, I ask composer to generate me a document on the changes it made and I double check that with R1 again for a final pass (changes.md).
I've found Cursor + DeepSeek R1 extremely useful, to the point that I've structured a lot of documents in the codebase to be easily greppable and executable by composer. The benefit of that is that other developers (and their composers) can read the docs themselves.
Engineers can self-onboard onto the codebase, and non-technical people can unstuck themselves with SQL statements with composer now.
Correct me if I am wrong, but the whole premise of Cursor and Windsurf is that this architect-editor mode is already built into the editor. This is why there is a distinction between composer (editor) and the chat function (architect).
Chat function is just chat; it can't edit your files.
Composer probably relies on prompt engineering to do editor-architecture, as it reads and writes to your codebase. But it's heavily, heavily tied to Sonnet 3.5 and tool-calling.
For architecture-type stuff, I prefer DeepSeek R1, as reasoning models do better on high-level design. Which is why I copy and paste in and out of composer.
This sounds great - would love to hear a little more about the prompts. Are you literally just asking 'write me a context.md that explains how feature x works' or something like that?
For context.md, it's that simple because it's unstructured data extraction from your codebase and working with a regular LLM model.
For design.md, I have a prompt because we're now working with a reasoning model and doing structured data extraction: create me an issue on Linear with a title, description, and a to-do list.
I would recommend trying the approach yourself and saving the prompts if you can nail down the repetitive asks.
My wife has found ChatGPT extremely useful when dealing with her mother, who is bipolar and obsessed with other people's health. I've got a terminal cancer diagnosis, and handling my mother-in-law has been extremely difficult - nearly to the point of no longer having any communication with her. ChatGPT has a single conversation with all the back story and has put some useful points across when discussing how difficult her behaviour has been (she watched an operation that failed for me for entertainment, for instance).
I have found similar when giving backstory and needing help to start structuring difficult conversations where I want to say the right thing but also need to be sensitive.
- Discussing the various stages of candymaking and their relation to the fundamental properties of sugar syrups, and which candies are crystalline vs amorphous. It turns out Junior Mints are fudge. Fondant is really just fudge. Everything is fudge, my god.
- Summarizing various SEC filings and related paperwork to understand the implications of an activist investor intervening in a company
- Discussing the relative film merits of the movie Labyrinth and other similar 80s kitsch movies. ChatGPT mentioned the phenomenon of "twin films" which was an interesting digression.
- Learning about various languages Tolkien invented and their ties to actual linguistics of natural languages and other conlangs
- Some dimensional analysis of volumes, specifically relating to things like "how many beans are in the jar" estimation, and what the min and max value of a particular weight of coins might be, in terms of both a par value based on a standard coin mix and outliers like old dimes that are pure silver (a rough sketch of the bean estimate is after this list)
- Discussion of quines in prolog and other languages, which resulted in a very interesting ChatGPT bug where it started recursing and broke when trying to write a prolog quine.
- Back-of-the-envelope economic calculations around the magnitude of the housing deficit and the relative GDP cost of providing enough housing quickly enough to make an impact. Spoiler: it's probably unreasonably expensive to build enough houses to bring down housing prices by any significant degree, and even if we wanted to, there aren't enough skilled workers.
- A number of podcasts transcribed. (I hate audio and meandering, so transcribed and summarized is perfect.) I could use whisper and a Python script to do this (roughly the second sketch below), but I'd rather let ChatGPT do the legwork, and it actually used a more modern model and method of processing than I would have naively used.
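For the bean-jar item above, the estimate is just jar volume times a packing fraction, divided by per-bean volume. A minimal sketch; the 0.64 packing fraction is the usual random-close-packing figure for spheres, and the jar and bean volumes are made-up placeholders:

    def beans_in_jar(jar_volume_ml: float, bean_volume_ml: float,
                     packing_fraction: float = 0.64) -> int:
        # volume actually occupied by beans, divided by one bean's volume
        return int(jar_volume_ml * packing_fraction / bean_volume_ml)

    # e.g. a 1 L jar and ~1.5 ml jelly beans -> 426
    print(beans_in_jar(1000, 1.5))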
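And for the podcast transcription route, the whisper script really is about this short - a sketch assuming the open-source openai-whisper package (pip install openai-whisper), with podcast.mp3 as a placeholder file name:

    import whisper

    model = whisper.load_model("base")        # small, CPU-friendly model size
    result = model.transcribe("podcast.mp3")  # dict with "text" and timestamped "segments"
    print(result["text"])

Summarizing is then a second step, which is presumably where letting ChatGPT do the legwork wins.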
I find GitHub Copilot to be a really great autocomplete. I frequently write the comment at the top of a function, hit tab, and it writes the whole function. This is dependent on TypeScript and having a relatively standard codebase, but I think those things are useful on their own. You really have to limit it in terms of scope and specifics, but it lets me think at a high level instead of worrying about syntax.
Technically fudge is just a crystalline sugar candy with a certain water percentage. Milk is optional (and frequently omitted). Reese's peanut butter cups are fudge, for example.
This explains better what I mean. https://en.wikipedia.org/wiki/Fudge#Texture - milk is added only as a stabilizer, and many candies that resemble fudge in texture incorporate milk or other stabilizers to prevent too rapid a crystallization. Fondant is just fudge without stabilizers.
The only plausible explanation for the amount of resources poured into these language models is the hope that they somehow become the origin of AGI, which I think is pretty fanciful.
I can feel the cold wind of the next AI winter coming on. It's inevitable. Computers are good at emulating intelligent behavior, people get excited that it's around the corner, and the hype boils over. This isn't the last time this will happen.
I think the amount of money is explained in part by hubris. People in high positions think they're smarter and more capable than people at the bottom of the org, at least in proportion to what they earn. So it's reasonable, expected, borderline obvious to them that a computer bot can replace that person. So you're betting on its ability to get rid of, if not your junior devs, at least the majority of your customer support staff.
In reality, people doing "menial" jobs are smart, learn, and operate with a lot more nuance than people give them credit for, out of unfamiliarity or just prejudice. Do you prefer to talk to a chatbot or a real human when you have a problem? How confident are you, really, that even if the bot knows what the problem is, it would be able to solve it?
Lots of problems with customer care are anchored in the issue that support staff aren't allowed to fix or resolve problems without escalation, or are tasked with keeping you from costing the company more money. The bot might be better at it for the company because it will frustrate you enough to give up on that 30 bucks refund, idk.
AI seems to be changing the dynamics of corporate jobs a lot, but I haven't yet seen anything that would be a game changer outside of that. It's great for searching a company's unorganised and messy knowledge bases.
On all fronts, people are working on so many small pieces advancing what we have.
And plenty of obvious things aren't here yet, like a full local dev cycle: the AI uses the IDE to change code, then executes it, fixes compiler issues and unit tests, and then opens a PR.
Or local agents - agents having secure and selective access to our data, like giving my agent read-only access to my bank account and a two-factor way for it to send money.
Deepseek's reinforcement learning is also a huge new lead.
Nonetheless, in parallel, robots are coming too.
GenAI is getting better and better: faster, better, cheaper video; 3D meshes and textures; the first GenAI ads.
It seemed like you intended to present your comment as a tautology (e.g. "I feel there's a 100% chance of his or your comment aging poorly"), but I'll give you the benefit of the doubt!
Yeah, that's a good point. I just think it can go either way. I remember in 2015 how hyped we were about self-driving cars, thinking "in 10 years the majority of cars will be like that". Right now we may see a steady increase in the capabilities of AI for years to come, or we may see it plateauing.
I think the snag I feel in your argument comes from
>Computers are good at emulating intelligent behavior
Which implies that the brain is some kind of transcendent device that can backdoor physics to output incredible intelligence unique to its magical structure.
Maybe LLMs aren't the key, but as far as we can tell the brain is also just another computer.
No - but the main issue is that all reasonable ones I can conceive lead inevitably to the Singularity technologically, and pretty quickly, since we seem determined to throw as much silicon as possible at the problem. Hopefully the final step is intractable.
Precisely; however, this time we will have tangible results from the ongoing AI summer: generative art, and coding/writing/journalism assistants.
Many people replied with anecdotes, but recently Anthropic published an analysis of claude.ai production traffic. As far as I can tell this is the single best currently existing source of "how people actually use LLMs". With everything else, you can't be sure whether it is representative.
Thanks, this is a gem! However, I suspect the reason "programming" is such a big usage is that AI is closely integrated into text editors, as "autocomplete on steroids".
As they state in the report, I don't think they can measure how many people just ignored the suggestion from Claude right away, or deleted more than half of the suggested code?
Imagine if the real impact of AI is "suggesting things that people discard immediately"?
I use them as a springboard for things I am really unfamiliar with. I'm self-learning electronics at the moment, and so I can ask it things like "what's a common and widely available cooperator." You will not find that answer on a search engine, I don't care how good your Google fu is.
It's a weak jack of all trades: it knows a fair amount about the sum of human knowledge (which is objectively super-human), but can't go deep on any one thing, and still seriously lags behind humans in terms of reasoning. It's an assistant that's all book smarts and no street smarts. Or maybe: it's a search engine for insanely specific things.
Rote work, as well. Things like porting an enum from one programming language to another: paste the source language into a comment and start it off with one or two lines in the target language. Dozens of tabs are surely faster than manual typing, copy paste, or figuring out vim movements/macros.
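To illustrate the workflow: paste the source enum as a comment, type the first member or two, and let the autocomplete fill in the rest. A hypothetical Rust-to-Python example:

    # Source (Rust):
    #   enum Status { Pending, Active, Suspended, Closed }
    from enum import Enum

    class Status(Enum):
        PENDING = "pending"
        ACTIVE = "active"
        # ...from here, Copilot typically tab-completes the remaining members
        SUSPENDED = "suspended"
        CLOSED = "closed"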
Heads up, as an EE who uses LLMs quite a bit: they cannot analyze circuits or build them.
They might be able to help stitch together modules (like sensor boards plugged into microcontrollers) and definitely can write code to get things going, but they fall flat on their face hard for any kind of bare bones electronics design. Like 5% success rate and 95% totally incorrect hallucinations.
The problem is really that schematics are at the very heart of electronics design (and teaching/instruction), so to train a model you need a very powerful vision model to really unlock all the good training data.
The models can also output code that can be turned into a schematic through an interpreter, but there is virtually zero training data for this because humans always use and work with schematics.
Yeah, even I found it doing some dubious things as a beginner. Still helpful for things like how to correctly use certain components, but the svg diagrams it provided were hilarious at times.
I ordered some electronic components / sensors from China, and as it took months to arrive, I forgot exactly what I ordered (I'm a noob at this). Simply taking a picture and asking Claude what it was helped a lot. The numbers and letters printed on the components didn't yield relevant results on Google.
That's a mistake for "comparator", isn't it. You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.
Edit: if anybody knows otherwise, show me some evidence, don't just downvote. If these things are widely used, why are they impossible to find by searching? Why doesn't this electronics site know about them:
Why aren't they in any books when I did a full-text search on archive.org? Why doesn't Wikipedia know about them? Why aren't there threads about them on electronics forums?
I found them (through an image search) in exactly one place: educational training boards made in India by Jainco, like this one:
It's an autocorrect. I did mean comparator. Presumably the second link you pasted ran into the same problem.
> You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.
What an awful interpretation, phrased in the most demeaning manner possible. No, I double check everything the AI suggests. That's basic competency with the things.
> Me: I'm trying to use a cooperator in my schematic, how do I hook it up? Also what's a suitable cooperator for 3.3v logic level?
> Gemini: It sounds like you might be thinking of a comparator, not a "cooperator," in your schematic. Comparators are electronic circuits that compare two voltages [...] LM393: This is a low-power, dual comparator that can operate from a 3.3V supply. Its output can be directly connected to most 3.3V logic circuits.
Oh. Then in fact you will find the answer in a search engine, incredibly easily. But I apologise for assuming you were involved in a mechanical folie à deux. (It could happen!)
I've found that Claude has been pretty decent at writing boilerplate code.
For example asking it something like "I have an elixir app that is started with `mix ...` can you give me a Dockerfile to run it in a container?"
It can also do things like "Given this code snippet, can you make it more Pythonic" or even generate simple apps from scratch.
For example, a prompt like "Can you write me a snake game in HTML and JavaScript? The snake should eat hot dog emojis to grow longer." will actually generate something that works. You can see the generated code for that prompt at https://claude.site/artifacts/34540f88-965e-45ca-8083-040e30...
It also has a surprisingly competent analysis mode where you can upload a CSV and have it generate charts and analyze the data.
It's not perfect, it'll sometimes get confused or generate some dubious code, but you can quickly get to a 90% good solution with 1% of the effort, which is pretty impressive IMO.
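For comparison, here's what that kind of chart takes when done by hand in Python (hypothetical file and column names); it's exactly the boilerplate you'd rather delegate:

    # Sketch of the kind of chart code a CSV-analysis session saves you.
    # Assumes a sales.csv with "month" and "revenue" columns.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("sales.csv")
    monthly = df.groupby("month", sort=False)["revenue"].sum()

    monthly.plot(kind="bar", title="Revenue by month")
    plt.tight_layout()
    plt.savefig("revenue_by_month.png")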
> I ended up disabling github copilot because it was just "auto-complete on steroids" at best
this is a good enough sell for me, and it's like sub 1-in-50 that it's "auto-complete on mushrooms" (again my experience, YMMV).
An awful lot of the time, my day-to-day work involves writing one piece of code and then copy-pasting it, changing a few variable names. Even if I factor out the code into a method, I've still got to call that method with the different names. CoPilot takes care of that drudgery and saves me countless minutes per day. It therefore pays for itself.
I also use ChatGPT every time I need some BASH script written to automate a boring process. I could spend 20-30 minutes searching for all the commands and arguments I would need, another 10 minutes typing in the script, another 10-20 minutes debugging my inevitable mistakes. Or I make sure to describe my requirements exactly (5-10 minutes), spend 5 minutes reviewing the output, iterate if necessary (usually because I wasn't clear enough in the instructions).
3-5x speed up for free. Who's not going to take that win?
My biology professor provides basically zero feedback on his students' understanding of the material. There are very few practice questions to prepare for exams, which are worth 40% of your grade. I had an LLM write some Python that extracts the relevant textbook chapters, which I can then feed into an LLM to generate practice questions. Then I can ask the LLM for feedback and whether or not I'm articulating the answers correctly.
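The extraction part is simple enough; a minimal sketch of that kind of script, assuming the textbook is a PDF and using the pypdf package (the page range is hypothetical):

    # Pull the pages for one chapter out of a textbook PDF so the text
    # can be pasted into an LLM prompt. Page numbers are hypothetical.
    from pypdf import PdfReader

    reader = PdfReader("textbook.pdf")
    chapter_pages = range(120, 146)  # e.g. chapter 7
    chapter_text = "\n".join(reader.pages[i].extract_text() for i in chapter_pages)

    with open("chapter7.txt", "w") as f:
        f.write(chapter_text)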
I reckon the ideal use case for chat LLMs at the moment is as a bridge for questions that are hard to search but easy to verify.
For example, you have a plant you can't identify. Hard to Google search with words. "Plant with small red berries and...". You could reverse image search your photo of it, probably won't help either. Show an LLM the photo (some accept images now). LLM tells you what it thinks. Now you Google search "Ribes rubrum" to verify it. Much easier.
You've got a complicated medical problem that's been going on for months. A google search of all the factors involved would be excessively long and throw up all sorts of random stuff. You describe the whole scenario to an LLM and it gives you four ideas. You can now search those specific conditions and see how well they actually match.
I've found there are actually a lot of questions that fit into that sort of NP-like category: hard to find an answer, easy to verify one.
As a side note, there's an app for that! (tm) PlantNet recognizes plants based on a provided photo, and it does a pretty good job of it. It predates the LLM craze by a bit.
The Seek app (by iNaturalist, another excellent app) also can identify plants based on a photo, and without the need for an internet connection, which is a critical feature IMO since you often want it when you’re out walking in the woods or whatever.
I use LLMs significantly on a daily basis, mostly for coding C#, HTML, CSS, SQL. I use them for researching for wiki articles. I use it for summarizing long web pages and science papers. I use them for translation. I used GPT last night to repair my furnace (I've never opened a furnace before).
It (mostly) meets or exceeds my expectations and excels at every task I use it for. I'm rarely disappointed. YMMV.
I think most people are still "holding them wrong", and it'll take an entire generation of people to really figure out what these things are and are not good for.
I'll give two recent use-cases that may provide a hint of their ultimate utility:
1) I've been modernising 2010-era ASP.NET code written by former VB programmers that looooved to sprinkle try { ... } catch( Exception e ) { throw e; } throughout. I mean thousands upon thousands of instances of these pointless magical incantations that do nothing except screw up stack traces. They probably thought it was the equivalent of "ON ERROR RESUME NEXT", but... no, not really. Anyway, I asked ChatGPT in "Reasoning" mode to write a CLI tool utilising the Roslyn C# compiler SDK to help clean this up. It took about three prompts and less than an hour, and it spat out 300 lines of code that required less than 10 to be modified by me. It deleted something like 10K lines of garbage code from a code base for me. Because I used a proper compiler toolkit, there was no risk of hallucinations, so the change Just Worked.
2) I was recently troubleshooting some thread pool issues. I suspect that some long-running requests were overlapping in time, but Azure's KQL doesn't directly provide a timeline graphical view. I dumped out the data into JSON, gave ChatGPT a snippet, and told it to make me a visualiser using HTML and JS. I then simply pasted in the full JSON dump (~1 MB) and ta-da instant timeline overlap visualiser! It even supported scrolling and zooming. Neat.
Last night I was about to start working on a lot of text I needed to submit my startup to a government funding program. Questions like "describe what your startup does", "describe your market" and things like that. Tens of fields, which I estimated would take me a week and a half to do right, if I wasn't going to pause all my other activities.
Then I had a better idea: I spent 20 minutes babywearing, walking and dictating everything about my startup to ChatGPT. Later I took all that text and labeled it as a brain dump, plus my product support portal and some screenshots of my marketing material. Gave it all to ChatGPT again and asked it to answer each of the questions in the form. That's it. I have a pretty good version 1 which I can revise today and be done with it.
Many, many hours saved. I have tens of examples like that.
The product documentation I provided it with was also created with the help of GPT, and that saved me even more time.
Autocomplete on steroids is what I use it for. I've recently started using Cursor and the productivity improvements have been huge. I won't let it write very large blocks of code, but I do a lot of web stuff, so being able to update the classes in one spot, have it recognise all the other places that need the same change, and just tab through is really helpful. It's pretty good at code to test things as well, which also saves a lot of typing.
For me, getting summaries of meetings is my favorite use case. Saves me from taking notes and I can extract next steps.
It also helps me get started with new content, building the scaffolding of, say, a blog or social post. It still needs adaptation and fine-tuning, but getting rid of a blank page is a great help for me.
And I use LLMs to play through ideas and headlines. I would normally do this with other humans, but since working fully remote, it's a nice sparring partner, although the AI not being able to really give criticism is a bit annoying.
The tools also make it easier to write in English as a non-native, making sure my text does not include any false friends or grammar errors.
Meeting summaries are the most hilarious thing these neural networks have produced. I don't know which NN model Zoom uses, but the text it produces is super funny :) . It basically can't parse half of the words, and then generates random sentences using the remaining ones.
Agreed, it was super funny, especially if a song played in the beginning or if you switched languages. It has gotten way better, at least in my experience.
> although the AI not being able to really give criticism is a bit annoying.
I’ve managed to get ChatGPT to a good place in this regard recently and it’s better for it. In fact, it’s been doing such a good job recently that it almost seems… human-like.
I’ll have to look at my prompts, but somehow I got it from licking my ass and telling me everything I say and do is amazing to a point now where it almost seems eager to tell me I’m wrong.
Which is fantastic, huge improvement. I don’t really use it for coding though, because I am not a programmer. I would have no means today to correctly evaluate 90% of what it would return me.
I use them as an alternative to search engines for topics where I have some specific question where traditional search engines fail to find the needle in the haystack.
As a concrete example, I was recently playing with simulating the wave equation, and I wanted to try to use a higher-order approximation as I had never done that before. I'm quite rusty as I haven't done any numerical work since university some decades ago.
I still recalled how to deal with the Neumann boundary conditions when using the traditional lower-order approximation, but I was uncertain how to do it while keeping the higher-order approximation.
Searching for "higher-order neumann boundary conditions wave equation" or similar got me pages upon pages of irrelevant hits, most of them dealing with the traditional approximation scheme.
So I turned to ChatGPT which right away provided a decent answer[1], and along with a follow-up question gave me what I needed to implement it successfully.
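For the curious, one standard way to keep the order at a homogeneous Neumann boundary is even-symmetry ghost points. This sketch is my own illustration of that idea, not the exact scheme from [1]:

    # Fourth-order second-derivative stencil with homogeneous Neumann
    # boundaries (du/dx = 0) enforced by mirrored ghost cells u[-k] = u[k].
    import numpy as np

    def laplacian_1d(u, dx):
        # two ghost cells per side, even reflection about each boundary
        g = np.concatenate(([u[2], u[1]], u, [u[-2], u[-3]]))
        return (-g[:-4] + 16 * g[1:-3] - 30 * g[2:-2]
                + 16 * g[3:-1] - g[4:]) / (12 * dx**2)

Each time step of the wave equation then uses the usual leapfrog update, u_next = 2*u - u_prev + (c*dt)**2 * laplacian_1d(u, dx).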
One thing I can't figure out how to get LLMs to do is truly finish work. For example, if I have 100 items that need xyz done to them, it will do it for the first 10 or so and say ~"and so on". I have a lot of trouble getting LLMs to do tasks that might take 10 mins - 1h. They always seem to simply want to give an example. Batch processing is the answer, I guess, or perhaps more 'agentic' models/tools - but I wonder if there are other ways.
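The workaround that's been most reliable for me is to move the loop out of the model: chunk the items in a script and make one call per chunk, so it never gets the chance to say "and so on". A minimal sketch with the OpenAI Python client (the model name, task, and items are placeholders):

    # Process 100 items in chunks of 10, one API call per chunk.
    from openai import OpenAI

    client = OpenAI()
    items = [f"item-{i}" for i in range(100)]  # placeholder items
    results = []

    for start in range(0, len(items), 10):
        chunk = items[start:start + 10]
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user",
                       "content": "Apply xyz to each of these, one per line:\n"
                                  + "\n".join(chunk)}],
        )
        results.extend(resp.choices[0].message.content.splitlines())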
Finally, after two years of hype, I have some use for LLMs.
We import descriptions of products from a seller. The problem is they are mental (probably written by ChatGPT :)) and way too long; we need only a small blurb.
I put our style guide and the given text into ChatGPT and get a somewhat reasonable description back. The editors still need to check it, but it's way less work.
> ZIZEK: that AI will be the death of learning & so on; to this, I say NO! My student brings me their essay, which has been written by AI, & I plug it into my grading AI, & we are free! While the 'learning' happens, our superego satisfied, we are free now to learn whatever we want
I find them useful for searching for some function or API name with natural language. 'What's the function call [in blah] that generates a quaternion from a couple of vectors?' type stuff. Not exactly inspiring but I've found it highly useful. If you try to search for something like that online (and somebody hasn't asked the exact question on e.g. stack overflow) you'll just end up getting all the documentation for quaternions, vectors, and blah - when the function itself might even be in a tertiary math library.
LLMs are pretty good at translation between human languages, which makes sense since they are language models after all. They are better at this than any other technology.
The state-of-the-art image ones can also probably do OCR and handwriting recognition better than any other software, though they may be expensive to run in large volume. But if you need to take a picture of a notebook page with your camera phone, an LLM can quickly OCR it.
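Scripting that is short too; a sketch with the OpenAI Python client and a local photo (the file name and model choice are illustrative):

    # OCR a photo of a notebook page with a vision-capable model.
    import base64
    from openai import OpenAI

    client = OpenAI()
    with open("notebook_page.jpg", "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": [
            {"type": "text", "text": "Transcribe the handwriting in this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ]}],
    )
    print(resp.choices[0].message.content)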
I use it daily for all sorts of things, but one of the most interesting uses for me so far has been self-reflection.
For example, in the beginning of this year, I completed this exercise where I wrote a lot about childhood, past experiences, strengths and weaknesses, goals and ambitions for the future, etc (https://selfauthoring.com) and then I uploaded all that to ChatGPT, asked it to be my therapist/coach, and then asked it to produce reports about myself, action plans, strategies, etc. Super interesting and useful.
By now ChatGPT has quite a bit of context from past conversations. Just the other day I used this prompt from someone else and got back super useful insights – "Can you share some extremely deep and profound insights about my psyche and mind that I would not otherwise be able to identify or see as well as some that I may not want to hear"
I don't use it daily, and I find copilot counterproductive (for me). I do try to experiment with chatgpt when I remember to.
I find it good for complex SQL, reviewing emails, and Godot assistance (I'm a beginner game Dev).
There are also times when I have programming questions and I might try to use chatgpt, with mixed results.
Our company has tried to integrate it into one of our products, and I find it troubling how on occasion it confidently gives bad results, but my concern seems to be in the minority.
EDIT: there was also a large refactor I did recently which involved lots of repeatable, but not super regexable, changes - chatgpt forgot where it was as I went through it, but other than working around that it was very useful.
In the last few days I discovered it's good at medium-complexity SQL, not at really complex queries. I've been struggling for something like four days with Claude, ChatGPT, Gemini and Deepseek. All could do some good analysis with some low-hanging-fruit improvements, but all went completely crazy when trying to optimize more complex things: getting into loops proposing the same changes over and over, outputting invalid SQL, and Gemini even forgot what we were doing, asking me if I could paste again the query I included in the very first message. Maybe the chain-of-thought models would handle this better, but I believe I hit the limit for the standard ones.
I use mine as if it's an infinitely patient, relatively competent junior/medior level developer that I constantly give small chunks of programming to do (typically a function at a time), and occasionally consult on architecture/design/other things.
I don't use integrated coding tools, so my workflow isn't super fast, but that's not what I'm really aiming for - more that I want to save my brain's energy from low level drudgy boilerplate or integration code, so I can focus it on the more important decisions and keep business-side context in my head.
It's been a huge help for me this way across multiple projects in multiple domains.
I've coded a full custom deployment system (config, create, update, cert management and much more) in full bash using nothing but ChatGPT. I didn't write a single line of bash.
I did write 50 or more lines of instructions on what needs to be done and in what order.
ChatGPT gave me 5/6 (I asked for this) bash scripts totalling 300+ lines that seamlessly work together.
After reviewing, I asked it to change a few places.
If any human tried the same (except those rare bash Gods), it'd take many hours. I think it took me less than 30 minutes.
The Vergecast recently did a section where they asked listeners what they use LLMs for (specifically not for coding): https://youtu.be/WwNjBNtZ3Co, 30 minutes starting at 45:25. It had a number of interesting examples. It might not convince you of LLMs' excellence, and might not be much different from what other people commented, but it's a good listen nonetheless.
1. Small coding tasks ("I want to do XYZ in Rust"), it has replaced stack overflow. Very convenient when writing code in a language I'm not super familiar with.
2. Help with English (translation, proofreading...)
3. Learning something, like tech, I like interacting with it by asking questions, it's more engaging than just reading content.
I'd say nothing is game changing, but it's a nice productivity boost.
The sister comments contain quite a few specific examples. But the many back-and-forth arguments here on HN about whether LLMs are useful for coding suggest that understanding how they might or might not be used may be the biggest challenge at this point.
I myself use them a lot, though I constantly feel that I would be able to get more out of them if only I were smarter.
> All my attempts so far have been pretty underwhelming:
Same, it's good for repetitive things, things that have been answered 1000 times on Stack Overflow, translations, but that's about it. If you work on anything remotely new/hard it's mostly disappointing: you have to babysit it every step of the way and rewrite most of what it's shitting out in the end anyway.
I think it just made it obvious that 90% of tech jobs basically amount to writing the same CRUD thing over and over again & mobile/web apps with very common designs and features.
I admit to having been an LLM sceptic from day one, but I have been using ChatGPT and Claude a fair bit to try and figure out what the hype is all about. I haven't really succeeded.
Most recently I tried to use them both to solve a programming problem that isn't well documented in the usual channels (Reddit, StackOverflow, etc) and found it to be quite a disappointing and frustrating experience. It just constantly, enthusiastically fed me total bullshit, with functions that don't exist or don't do what the LLM seems to "think" they do. I'm sure I'm just "holding it wrong" but my impression at this stage is that it is only capable of solving problems that are trivially solvable using a traditional search engine, with the added friction that if the problem isn't trivially solvable, it won't actually tell you that but will waste your time with non-obvious wrong answers.
I did have a slightly more positive experience when asking it about various chess engine optimisation algorithms. I wasn't trying to use the code it generated, just to better understand what the popular algorithms are and how they work. So I think they might work best when there is an abundance of helpful information out there and you just don't want to read through it all. Even then, I obviously don't know what ChatGPT was leaving out in the summary it provided.
I just used chat gpt to summarise a HN post about it taking unexpectedly long to install a washing machine because of unexpected turns of events, and this being analogous to software development. It was a time-saver.
I think it excels when you know enough to precisely describe what you want but you don’t know enough about the details of the language or framework you’re using to implement what you want.
One way I treat LLMs is as a "semantic search engine". I find that LLMs get too many things wrong when I'm being specific, but they're pretty good at pointing me in a general direction.
For example, I started learning about OS development and wanted to use Rust. I used ChatGPT to generate a basic Rust UEFI project with some simple bootloading code. It was broken, but it gave me a foothold and I was able to use other resources (e.g. the OSDev wiki) to learn how to fix the broken bits.
* Avoiding reading the entire manual
It feels like a lot of software documentation isn't actually written for real readers, instead being a somewhat arbitrary listing of a program's features. When programs have this style of documentation, the worst case for figuring out how to do a simple thing is reading the entire manual. (There are better ways to write documentation, see e.g. <https://diataxis.fr/>)
One example is [gnuplot](http://www.gnuplot.info/). I wanted to learn how to plot from the command line. I could have pieced together how to do it by zipping around the [gnuplot manual](http://www.gnuplot.info/docs_5.4/Gnuplot_5_4.pdf) and building something up piecewise, but it was faster to instruct Claude directly. Once Claude showed me how to do a particular thing (e.g. draw a scatter plot with dots instead of crosses) I then used the manual to find other similar options.
* Learning a large codebase / API
Similar to the previous point. If I ask Claude to write a simple program using a complex publicly-available API, it will probably write a broken program, but it won't be *completely* bogus because it will be in the right "genre". It will probably use some real modules, datatypes and functions in a realistic way. These are often good leads for which code/documentation I should read.
I used this approach to write some programs that use the [GHC API](https://hackage.haskell.org/package/ghc). There are hundreds of modules, and when I asked Claude how to do something with the GHC API it wrote relevant (if incorrect) code, which helped me teach myself.
* Cross-language poetry translation
My partner is Chinese and sometimes we talk about Chinese poetry. I'm not very fluent in Chinese so it's hard for me to grasp the beauty in these poems. Unfortunately literal English translations aren't very good. We've had some success with asking LLMs to translate Chinese poems in the style of various famous English poets. The translation is generally semantically correct, while having a more pleasing use of the English language than a direct translation.
I like having fun with them, like by asking Grok whether some Elon Musk tweet is true. Usually it replies with a lengthy answer and I then force it to answer with Yes or No. Even more fun: when drilling it more to load a few more details into its brain, and then asking the first question again (Yes/No only), it sometimes changes its answer. I do wonder, has Grok already joined the resistance against Musk?
Honestly, the main thing I've found ChatGPT to be useful for in my daily life is helping to translate what I write from my native language to the language spoken by most of the people where I live. But even then it only really works if you have at least a basic understanding of the language and can ask it to rewrite sections when you recognize poor word choices or awkward phrasing.
LLMs are good at one thing, and totally by chance it is the thing they have been designed to be: a word probability generator. If you can constrain your usage around that, they are great to use. But the people who think they can reason or know some kind of truth are delusional.
It's very obvious from the mistakes they make that they are not reasoning but providing the most probable answer according to their dataset. It's very impressive because their dataset is humanity-sized, but there is no reasoning.
I use it for lots of stuff where I'm not an expert, or that are low stakes. I don't use it for the "core" of my job, but there are many things that are not "core" and still eat up a lot of time, in fact, most of my workday would be in this category. Some typical examples from my daily life as a university professor:
- Writing Python scripts to make charts out of Excel sheets, and then refining them. I could do it myself, but I would need to learn a library like Seaborn or similar, which honestly is not especially intellectually stimulating, and then spend nontrivial amounts of time iterating on the actual code. With LLMs it's a breeze (see the sketch after this list for the kind of script I mean).
- Working with cumbersome LaTeX formatting, e.g. transposing a table, removing a column from a table, etc.
- Getting the tone just right in a professional email written in English to someone I don't know much (I'm not a native speaker so this is not trivial).
- Finding resources on topics that are tangential to what I do. For example, yesterday I needed to come up with some statistics on English words for a presentation I'm preparing, and I needed a free corpus where I could search for an n-gram and get frequencies of next words. I don't usually work with that kind of resource, it was just a one-off need. I asked for corpora of that kind and got a useful answer instantly. The manual process would probably have implied going through several options only to find that I needed a license or that they didn't provide the specific statistics I needed.
- Brainstorming on titles for scientific papers, presentations, names of concepts that you introduce on a paper, variable names, etc.
- Shortening a sentence in a paper that makes me go over the page limit, or polishing the English in a paragraph.
- Summarizing a text if I'm kind of interested in knowing the gist but have no time to read it whole.
- Answering quick questions on basic things that I forget, e.g. the parameters to turn a Linux folder into a tar.gz. man pages are too verbose and it takes time to sort the wheat from the chaff, Google is full of SEO'd garbage these days and sometimes you need to skim a lot to find the actual answer; LLMs are much faster.
- Writing bureaucratic boilerplate, the typical texts with no real value but that you have to write (e.g. gender perspective statement on a grant request).
- Coming up with exam questions. This is a rather repetitive activity and they're fantastic at it. At my place we also have two official languages and we need to have exam assignments in both languages; guess who does the translation now (respecting LaTeX formatting, which previous machine translation tools typically wouldn't do).
- As an example of a one-off thing, the other day I had to edit a Word document which was password-protected. I asked ChatGPT how to unlock it and it not only answered, but actually did it for me (after 3 tries, but still, much faster than the time it would have taken for me to find out how to do it and then actually do it).
These are just some examples where they contribute (greatly) to my productivity at work. In daily life, I also ask them lots of questions.
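As promised in the first bullet, the Excel-to-chart scripts the LLM writes for me are typically of this shape (file and column names are hypothetical):

    # Sketch of the kind of chart script referenced above.
    # Requires pandas, seaborn, and openpyxl (for .xlsx reading).
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    df = pd.read_excel("grades.xlsx")
    sns.boxplot(data=df, x="group", y="score")
    plt.title("Exam scores by group")
    plt.tight_layout()
    plt.savefig("scores.png")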
Something isn’t open-source because you get everything that went into making it. Something is open-source if you can change it (relatively) easily. The GPL and open-source definition both refer to “the preferred form for making modifications”. The preferred form for modification in the Nvidia driver’s case is the source code. The preferred form for modification in this case is the weights.
Open-source as a concept doesn’t really correspond well with LLMs but to the extent that it does, access to the training data is not required because that training data is not the preferred form for making modifications.
> that training data is not the preferred form for making modifications.
I definitely disagree with this.
Yes, you can do some SFT fine tuning on an existing model, but if you want to make specific, substantial, targeted changes (less safety? better performance on math and code at the expense of general knowledge?), your best bet is to change the training mixture, and for that you need the original datasets.
Preferred by whom? Sharing models isn't open source, and we're just going to have to keep having this argument. Letting us download the model is a very nice thing for Facebook to do, but you don't get to call it open source if you're not showing us the source! Explicitly, if we can't see the forced alignment - where the model gets its refusal to talk about Tiananmen Square, or how to make meth, or whether The Information is a reputable news source - then it's not open. The preferred form of modification is to take the data and train it. That some people have been able to take the model and tweak it doesn't make that preferable.
It started with abusing the term "AI", so I don't see a reason why they would not abuse "open source". I guess it's what happens to language when a concept becomes mainstream: people use it wrong, but if enough people do it, it becomes the new meaning?
I dislike when people like RMS get semantic and gatekeep words like "free software", but this is the end product of a world without gatekeepers. People just use words in a way that's convenient to them.
Or maybe some gatekeeping is actually good, and we just have to use more of our braincells and figure if a particular gatekeeper is good or bad. It's a good thing that being a pilot for an airline is gatekept to qualified pilots that know how to fly a plane. It's a bad thing that I need a hair cutting license to buy hair dye to dye my own hair at home.
We often see semantic drift over a long period of time. It's just that the overarching topic of AI is being played significantly faster than what we would normally have observed in other fields.
Why? Prescriptivist/prescriptivism is afaik the usual term. Proscriptivist is the other, but quite rare, so rare that my phone dictionary says it’s a misspelling.
> Prescriptivist/prescriptivism is afaik the usual term.
It is but it was the "correct" part attached to prescriptivism they'd be mocking because that is not how linguists and grammarians work (they are descriptivists and fond of making fun of prescriptivists.)
This is correct. "Open source" means everything required to recreate from scratch and improve, not "here's a massive binary, an interpreter script and permission."
That could probably be solved with bit-torrent. I think the bigger obstacle is the hardware required for training. Maybe it would be possible for groups of people to reproduce/train open source models with a distributed BOINC-like system?
You would open source the procedure and reference where the data came from. If there is any non-open source content used in training, then the project couldn’t qualify as “open source”.
But this thread is about misuse of the term as applied to the weights package. Those of us who know what open source means should not continue to dilute the term by calling these LLMs by that term.
That's how you know when you actually have AGI: when you have something that you don't have to shovel every written word known to man into to make it work, but rather can seed with a few dense public-domain knowledge compendia and have it derive everything else for itself from those first principles - possibly going through several stages of from-scratch training and regeneration.
The reason why you need to shovel every written word known to man to make it work is because it needs to learn what words mean before it can do anything useful with them, and we don't currently know any better way of making a tabula rasa (like a blank NN) do that. Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster; and if we understood how it works, we could start with something similarly hardwired and do exactly what you said.
But we don't actually know all that much about how language really works, for all the resources we spend on linguistics - as the old IBM joke about AI goes, "quality of the product increases every time we fire a linguist" (which is to say, we consistently get better results by throwing "every written word known to man" at a blank model than we do by trying to construct things from our understanding).
All that said, just because we're taking a different, and quite possibly slower / less compute-efficient route, doesn't mean that we can't get to AGI in this way.
> Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster
No, we can't few-shot it and we don't get there faster (but we develop a lot of other capabilities on the way). We train on a lot more data; the human brain, unlike an LLM, is training on all that data in the same processes it uses for "inference", and it receives sensory data estimated on the order of a billion bits per second, which means that by the time we start using language we've trained on a lot of data (the 15 trillion tokens from a ~17-bit token vocabulary that Llama 3 was trained on amount to something like a few days of human sense data). Humans are just trained on, and process, vastly richer multimodal data instead of text streams.
I was talking about language acquisition specifically. Most of the data that you reference is visual input and other body sensations that aren't directly related to that. OTOH humans don't take all that much text to learn to read and write.
> I was talking about language acquisition specifically.
Yeah, humans don't acquire language separately from other experience.
> Most of the data that you reference is visual input and other body sensations that aren't directly related to that.
Visual input and other body sensations are not unrelated to language acquisition.
> OTOH humans don't take all that much text to learn to read and write.
That generally occurs well after they have acquired both language and recognizing and using symbolic visual communication, and they usually have considerable other input in learning how to read and write besides text they are presented with (e.g., someone else reading words out loud to them.)
Most of these claimed “open” models are not open source. Some of them are open weights. But even some of the ones that share weights are not really open - they force a restricted license on you. To be open source I think they need to share training data and training code under an OSI approved license.
AI2 has a model called OLMo that is actually open source. They share the training data, training source code, and many other things:
While I do agree with your point - I wonder what information companies could release that'd be immediately useful to you.
It's not like they understand what the weights mean either, and if they released the code and dataset used to create it, you probably couldn't recreate it, owing to the fact that you don't own tens of thousands of GPUs.
If a software's source is released without all the documentation, commit history, bug tracker data etc., it's still considered open source, yet you couldn't recreate it without that information.
No, it's not, as it means nobody is pushing for actually open models.
A truly open model has open code that gathers pre-training data, open pre-training data, open RLHF data, open RLAIF data generated from its open constitution and so on.
The binary blob is the last thing I'd want - as a heavy user of LLMs I'm actually more interested in the detail of what all training data is in full, than I am the binary blob.
I’d say it’s more like eating Chicken Cordon Bleu and then asking the chef for a recipe, who replies, “Certainly! Step 1) Acquire Chicken Cordon Bleu, preferably cooked. Step 2) If uncooked, cook. Otherwise, consume.”
So is asking ChatGPT to write your application, then open sourcing said application IMO.
I see both sides here, but I don't think it's a hill worth dying on. The 'open source' part in this case is just not currently easily modifiable. That may not always be the case.
This is still to be determined, based on whether the output of ChatGPT is copyrightable by ChatGPT, copyrightable by the requester, or something else.
I think the two plausible answers are:
1. The person prompting (for example telling chatgpt 'please produce a fizzbuzz program') owns the copyright. The creativity lies in the prompt, and the chatgpt transformation is not transformative or meaningful.
2. The output of ChatGPT is derivative of the training data, and so the copyright is owned by all of the copyright holders of the input training data, i.e. everyone, and it's a glowing radioactive bomb of code in terms of copyright that cannot be used or licensed meaningfully in open source terms.
There are existing things like 1, where for example if someone takes a picture, and then uses photoshop to edit it, possibly with the "AI erase" tool thingy, they still own the photo's copyright. Photoshop transformed their prompt (a photo), but adobe doesn't get any copyright, nor do any of the test files adobe used to create their AI tool.
I don't think AI is like that, but it hasn't gone to court as far as I know, so no one really knows.
A compiled executable is not any less software than the source code. But the point of open source code is not the ability to see the CPU instructions though, is it?
It's about reproducibility and modifiability. Compiled executables (and their licences) lack that. The same goes for these downloadable blobs.
You make the start of a good point, but miss most of it.
You can absolutely have open source machine code.
The issue is and always has been that you need to have access to the same level of abstraction as the people writing the source code. The GPL specifically bans transpilers as a way to get around this.
In ML there is _no_ level of abstraction other than the raw weights. Everything else is support machinery, no different from a compiler, an OS, or a physical computer to run the code on.
Linux isn't closed source because they don't ship a C compiler with their code. Why should llama models be any different?
Is this question in good faith? The way generated code and data should be open sourced is by releasing the tools and configuration used to generate it. There's never been much confusion around this, to my knowledge.
I'm not even necessarily advocating that these things should be released, but the term "open source" has a pretty well-understood meaning that is being equivocated here.
What's remarkable to me about criticism like this is how quickly it's rebutted by people claiming "where did they say they would do this," as if these people don't make incredibly rushed and poorly planned decisions all the time. It's like an idea immune system that rejects any criticism or self reflection. It would be sociologically fascinating if it wasn't being combined with a dereliction of congressional power and an unchecked executive.
I have literally posted the same thing for years about the need for open source alternatives to social platforms where one person controls the algorithm that prioritizes what a billion people see. And the response is "meh". No one even bothers to read past the first paragraph:
But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in. Any time one even mentions “decentralized byzantine fault tolerant” anything, an army rises up to repeat anodyne versions of “grift… no one needs it… banks are great…” etc.
But if you mention any concerns with AI, no matter who or what you cite, the same group goes the other way and always repeats “(insert problem here) has always been possible, there is nothing to see here, move on, AI is amazing, deregulate and let the industry develop faster”:
Having been on the internet for a very long time, I can answer why open source alternatives to social platforms seldom get off the ground: the network effect is huge, and the community of users matters far more than any of the technology.
Don't bother telling people how it works. Show them who's using it and for what.
Oh, and for any kind of "normie" use it must have a decent moderation and anti-abuse system. Which inevitably clashes hard with "decentralized". Bluesky is succeeding because it lives in a contradiction of pretending to be decentralized, but what it really offers is the "pre Elon Twitter" experience. To basically the same people.
> the same group
While there's a certain amount of hivemind, it's rare that you see people directly contradict their own posts here; what you're seeing is different people.
This is the problem we're working on with https://freenet.org/ - a general purpose platform for building entirely decentralized services.
Our thesis is that the client-server architecture is a fundamental flaw in the world wide web's design, which inherently concentrates power in the hands of a few. Freenet aims to be a general purpose replacement for this in which all services are entirely decentralized.
The first non-trivial app we're building will be a group chat system called River[1].
I like the new Freenet! I interviewed your founder, Ian Clarke, 2 years ago on my channel — discussing the original freenet, probably the first truly decentralized content network in the world. Here is the 2-hour discussion:
Look around the 12 minute mark, when I start to discuss how “the capitalist system” produces centralized monopolies that extract rents for their shareholders.
The original Freenet had over 6m downloads over the years - and pioneered ideas like cryptographic contracts which later formed the basis for bitcoin, but it was always a very experimental project, while the new Freenet is designed for mass adoption.
The key differences between old and new Freenet are:
Functionality: The previous version was analogous to a decentralized hard drive, while the current version is analogous to a full decentralized computer.
Real-time Interaction: The current version allows users to subscribe to data and be notified immediately if it changes. This is essential for systems like instant messaging or group chat.
Programming Language: Unlike the previous version, which was developed in Java, the current Freenet is implemented in Rust. This allows for better efficiency and integration into a wide variety of platforms (Windows, Mac, Android, etc).
Transparency: The current version is a drop-in replacement for the world wide web and is just as easy to use.
Anonymity: While the previous version was designed with a focus on anonymity, the current version does not offer built-in anonymity but allows for a choice of anonymizing systems to be layered on top.
> But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in
I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering. There is a real reason people hate crypto.
To top it off, crypto is one of the least meritocratic things there is. The longer you've been in it, the more people you've scammed, the more you hype, the "wealthier" you are.
Crypto smells like a shit-and-vomit sandwich, and people immediately turn up their noses.
Build P2P social without the crypto angle and you have my attention. I've been wanting p2p (not federated) social media since the 200Xs and the decline of the indie web. Social and news should work like email and BitTorrent, not Facebook or "federated Twitter".
> I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering.
The SEC's answer-no-questions, sue-first approach to crypto in general made legitimate players afraid to operate, so the space became dominated by those that didn't care about the law.
This isn't true, and the last time someone tried to prove it was, they cited... a huge PDF of all the questions the SEC had been asking crypto firms prior to action.
Besides, the rules are over now. The US President ran a pump and dump. Can't get more legitimacy than that.
I looked at your links and I still don't get it. I do want to understand. Where is the problem stated, clearly and concisely? What is the solution and why does it require crypto?
I say that as someone who read the Bitcoin paper in 2012 and was pretty excited back then.
Meanwhile online scams are a bigger industry than the illegal drug trade and bigger than the Australian economy. There are thousands of modern day slaves in call centers in Myanmar and the Philippines with fake social media profiles running pig butchering scams. That industry runs on crypto 100%. I guess that's one "problem" crypto solved.
You need some pretty convincing arguments at this point to convince me (and many others) that getting rid of this stuff wouldn't be a big win for humanity.
Here is the problem statement and solution for community leaders, the same class of decision makers who exited “AOL Keyword NYTimes” in favor of “nytimes.com” on this newfangled protocol called HTTP, with its servers and clients called browsers that people were downloading.
When they asked for a clear and concise description of your problem and solution, they are probably looking for a problem statement: a focused, 1 or 2 sentence explanation of the problem you intend to solve. You then present your proposed solution in the same form.
Hypothetical example problem statement:
We want to promote ycombinator to everyone that could benefit, but banner ads make us look chintzy, directly engaging in the feral discourse on Slashdot would inevitably look unprofessional, and engaging directly through dozens of purpose-built blogs and websites is too onerous.
Hypothetical example solution statement:
We should create our own simple, well-designed news site built on user submissions, and include threaded discussion capability with moderation built in at both the community and company level to keep things relatively civil. Then our audience will come looking for us.
What you offered is not a problem statement. It is a sales deck offering a, frankly, convoluted explanation of how starting a currency will solve a largely unrelated problem backed up by an unsupported assertion about the least representative sample in the world— Donald Trump.
I read it all. It's apparently supposed to be a way for celebrities to extract money from their audience by having them buy into their currency.
If you're satisfied with calling that useful, okay, I guess - to me it's deeply alarming that this is presented as a good example of a useful application of crypto.
In the broader context of crypto demand being driven essentially by digital crime and gambling, there would need to be some seriously glowing example of something good that can be done with it to shift my judgment.
For example, in the early days of Ethereum, I thought it'd be possible at some point to build truly open source, decentralized SaaS, where the deployment happens to the blockchain, and that this in turn would enable open source projects to finance themselves.
I've yet to see an example of this where the crypto aspect isn't a figleaf.
I'm very concerned that people arguing for exciting applications of crypto are involuntarily legitimizing the online crime ecosystem. Crypto in practice seems to lead to a massive transfer of assets to criminals. To an extent where that may end up destabilizing whole countries, given the market cap trajectory.
It doesn’t explain anything. It asserts a lot. Sorry I took the time to critique and give examples as a freelance business communication designer. Effective business communication requires frank feedback, and mine usually isn’t cheap, but if protecting your ego is the goal here, just keep assuming you’re doing everything right and it’s everybody else’s fault it’s not landing.
> We started using it when Jack who founded Twitter, started bluesky, promoted nostr started using it
Jack Dorsey is certifiably insane. His obsession with cryptocurrency is a warning to anyone that throws away success to live as a crypto maxi. You will lose the only things that matter to you in life, your business will be taken away from you by shareholders if you own one. Your control will be hated by users that accuse you of trying to ruin the internet with NFT profile pictures and crypto tickers. Many users outright left as a consequence, others would leave after the takeover. But Dorsey set the stage for the enshittification of Twitter, and anyone that's forgotten that should face the music.
Web5, no matter who utters it, is a phrase that means nothing. A person walking on the street would not be able to define it for you. Me, a programmer, cannot define it for you or even explain what it looks like. It is a marketing term as applied to Free Software, which will alienate Free Software users and disgust/confuse common people. If you cannot find a better phrase to describe your philosophy then people will forever associate you with the myriad grifters that shared your "Web(n)" branding.
This is interesting, but it feels too platformy for my use. I'd really like to see something 100% like BitTorrent.
Instead of trying to build a "you.com" (as in your pdf example), I want a place where we're all just a simple disposable signed hash address (that you can change, make public, keep pseudonymous, etc.) - easy and disposable if needed, but also possible to use as the building block of an online presence or brand if your hash becomes well known. Kind of like email, in that sense.
The platform doesn't need real time streaming video or video calls. Just text and images to start. P2P Reddit or Twitter.
It shouldn't be about building a platform where you attract others to your brand. That can come later. It should be about participating in a swarm and building up a mass of people. An exchange of ideas and information, but where it feels like being in public. Like BitTorrent. Once network effects kick in, you can find the nodes (people, topics, etc.) you care about and want to prioritize in your interest graph.
It's remarkable to me how "Web3 is a grift" has seemingly become tribal consensus here, without any real basis in reality. I think the last administration's explicit efforts to block crypto legitimization played a big part in this. It's clear that if you tried to follow the law and operate as a legitimate player, you risked being debanked or legally targeted by the SEC—and they made little to no effort to answer questions or help you work within the law's constraints. They wanted to sue first. As a result, those who ignored the law ended up dominating the space. This reflects policy failures, not issues with the tech or its legitimacy. I'm hoping the Trump administration shifts this dynamic, but now there's a reputation problem that needs correcting as well.
To quote Patio11, "It's not a conspiracy if they really are out to get you."
Crypto in general, and Web3 as well, have mostly delivered scams, to the tune of billions stolen from everyday folks. Everything (to within a rounding error) that hasn't been a scam has delivered nothing beyond being a speculative asset at best. Everything else has been a barely working toy that's better served by non-distributed implementations of the same thing.
People shit on crypto. Government, regulators, and the public all dislike crypto because the only thing that ever happens to us with it, and the only thing we ever hear about happening, is folks losing money to scams.
There's no mystery here. Crypto doesn't need a policy shift. Crypto needs to stop fucking over folks. Yes it's cool technology, yes it also seems to just be a way to part folks from their money.
That's like saying the only thing that ever happens with AI is people losing their jobs to AI. And unlike Crypto, they didn't opt in and literally buy digital assets and send them voluntarily somewhere. They get negatively affected regardless of any choice they have made. "Get on board, or get rolled." People worldwide would lose a lot more money to AI growing than crypto growing, regardless of never opting in. It will just be a giant wealth transfer to the already-wealthy and corporations. What about that? Oh, crickets. Dismissal from the HN crowd.
If I am going to put my money at risk, I expect it to be at risk. I'm happy to have a regulatory framework around that from the SEC, for instance, and there is one. For example, since the JOBS Act, the SEC has greatly expanded the opportunities to raise money in a regulated way. I even interviewed the actual authors of Regulation S at the SEC, where I go into depth for an hour about how to raise money legally:
And frankly, most true adherents of crypto have been yelling from the rooftops that Celsius and FTX and Binance are not actual DeFi. They are not decentralized; they simply tell you to send them money and "trust them" -- the very middleman Bitcoin and blockchain were designed to avoid.
FileCoin and UniSwap and Aave Marketplace and so on are real crypto, and they have never had any scandals and billions of dollars, bits, etc. are entrusted to them every day. Ditto for most altcoins and networks, including Hedera Hashgraph, Polygon, COSMOS, Polkadot, etc.
Any shade thrown at, e.g. Telegram's TON or Ripple's XRP, is due to regulators. I can understand why Facebook's Libra was shut down. But it has to do with them becoming "too powerful" and "not subject to national oversight". Kind of like Facebook and Twitter and Google themselves.
Everything that you just mentioned, as far as "what it's actually doing" is either speculation/speculation accessories or is a not-as-good-version of existing offerings. Where is the value?
UniSwap: a marketplace for speculation on arguably scam crypto products.
Aave Marketplace: a marketplace for speculation on arguably scam crypto products.
FileCoin: file storage at rates 50% higher than e.g. BackBlaze/DigitalOcean.
There's no actual value here other than as scam, speculation (nearly a scam), or products that are flimsy pretenses at not being scams (but which don't deliver a lot of value). Why should anybody care (other than transparent greed)?
You do realize you and your OP are currently the top comments in your respective threads, both criticizing Elon and even preemptively criticizing your imaginary critics.
Elon's unchecked power at building a model? Or at politics?
I always worry whenever I see people telling me how to feel - rage in this case. We are in a political system that is oriented more around getting people to feel rage and hatred than around consensus and deliberation. Elon is the face of that, but it's a much longer and larger problem. Throw in the complete dismissal of anyone not scared of this as ignorant, and it shuts down discussion.
The problem I have with Elon is that he is wasting a once-in-a-lifetime chance to actually address and fix systematic problems with the US government. Deploying LLMs in the government space doesn't fill me with dread. Continuing the senseless partisan drive of the last 20 years does.
> Continuing the senseless partisan drive of the last 20 years does.
I think what the government is going through right now is the wrapping up of the last political system. The idea that Democrats and Republicans just need to learn how to work together is just wrong. The parties are being destroyed, and I think we should all cheer that. They were built to address the issues of the 20th century, and neither party in its current form is ready to address 21st century issues. I think AI, climate change, and demographic changes around the world (ie: low birthrates) are going to seriously alter everything about our world, from geopolitics to the economy to social issues.
The Democrats are stuck supporting the New Deal bureaucracy and the post-WW2 order. That's over, it's crumbling right now, and I'm not going to try and defend any of it personally. It's just obsolete. The old Republican party your dad probably supported is dead too; that died a while ago. The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system (MAGA) and the first emergence of what politics in the 21st century is going to look like (the tech alliance).
Democrats would be smart to understand it's a new century; we have new threats, new challenges, and need new institutions... and this IS NOT a once in a lifetime opportunity to fix our government. This is the first draft of our new political system, and they have a choice to participate in shaping it, but they will need to get votes, and to get votes they need to stop talking about obsolete ideas.
>The Democrats are stuck supporting the New Deal bureaucracy and the post-WW2 order
> The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system
I agree, and I think this is a bizarre flipping of the "Democrat ~= progressive / Republican ~= conservative" dynamic that has been largely assumed throughout my lifetime.
We need both conservative and progressive forces in our society. Someone needs to be saying "here's what's wrong with our system; here's what needs to change", and someone else needs to balance that with "here's what we're doing right; here are the parts that are working well and that we should not get rid of".
It seems to me that now, instead of that tug-of-war discussion happening between the two parties, it is happening in parallel within them. Unfortunately, the sane and responsible version of that discussion is happening entirely within the boundary of the Democratic coalition, in a way that is completely ineffectual because (a) the internal conservative moderating force is relatively strong at a moment when the populace seems to want more progressive action, and (b) they have so little ability to effectively wield political power.
Meanwhile, the Republicans are dominated by a bizarro "progressive" faction that wants to pull us all in an entirely different (IMHO regressive) direction. And that faction is completely unchecked by any internal "conservative" moderating force within its own party, so it is for the moment able to push us as hard and fast as possible in its preferred direction.
> It seems to me that now, instead of that tug-of-war discussion happening between the two parties,
I'm REALLY looking forward to 2028, because I think that will potentially be the first election where we start to see what modern politics will look like. I wouldn't be surprised if there are multiple new parties, and several of them have a real chance. If it seems one-sided right now, it's just because one side found their way to the start line first... but make no mistake, history shows that over time new political factions will form that offer resistance to bad ideas, and clear a path for the good ones.
Given the rate of change with AI, we're going to have a real idea of what a world being disrupted by AGI (whether that is true AGI, or something close to it) looks like. At the same time healthcare is only getting worse, and Trump is NOT going to fundamentally address it. China is going to be rising, and they're a real geopolitical threat. The war in Ukraine has completely changed what warfare looks like, and we're going to have to completely restructure our military (just like we have to restructure our healthcare). I also wouldn't be surprised if Trump's war with the cartels turns out to be far harder than expected, because cheap autonomous drones allow a small military to compete against a large traditional military.
All of our prior assumptions about retirement are different too; retired boomers are not the same as the pensioners of their day. They're not impoverished; instead they're flush with cash. I'm not sure that in a world with an aging workforce you're going to be anti-immigrant... and all these benefits we give to retirees may not make sense in a world where retirees are wealthier than the regular workforce supporting them.
The general theme for the next decade is going to be throw out all the old books, 80% of our prior assumptions no longer apply.
Is this new political system akin to a banana republic? Because that’s what happens when you replace nonpartisan workers with loyalists in order to eliminate all accountability and oversight. Turning the rule of law into a partisan issue is a recipe for endemic corruption.
And even if you think the rule of law is antiquated, you’re misanthropically cheering the destruction of the largest institution in the world that 330 million people depend on for survival.
I'm happy to, though the end of your statement strongly suggests that you are not asking this question in good faith.
1) All positions have become partisan, with political ideology being critical to promotion into high-level positions.
2) Congress refuses to act as the constitution intends, and has delegated its budget-making authority to the executive branch.
3) The government's procurement system is almost as expensive as what is being procured.
4) Auditing the government is almost impossible.
5) The debt load on the government is becoming unsustainable.
6) The lack of "digital transformation" (what we called it in banking) means poor service.
7) The unfunded liabilities (mostly at a state level) will swamp budgets in a few years.
8) Most large contracts should be fixed contracts, not cost-plus contracts. Companies bilk the government for things that are an order of magnitude cheaper in the outside world.
9) Medicare refuses to lower health care costs (by reducing rates) due to political pressure.
10) No rationalization of government spending or revenue has occurred since the post-WW2 era.
1. Making all positions partisan is a fascist tactic to challenge objective truth.
2. Congress as a whole isn’t a single entity: one party refuses to compromise in any way while the other plays by the rules.
3. Doesn’t matter. Cost reform needs to go through existing legal routes.
4. What constitutes “auditing” the government? Because we had plenty of non partisan positions overseeing and auditing all parts of the government. DOGE fired those people.
5. Again, go through the legal route.
6. A lack of “digital transformation” is the vaguest most unconvincing point in this entire justification.
7. These budget issues need to be decided on through constitutional processes and with oversight, as before.
8. Ditto.
9. Medicare can lower health costs by other means, such as being available universally to all and setting limits on what they pay to providers based on procedure.
All of your points can be summed up as "Congress refuses to do their job".
Breaking all the laws to bypass the government does not "actually address and fix systematic problems with the US government", that is an absurd position. Caesar did not fix the Roman Republic.
And opposition to DOGE is not on the basis that people don't care about government efficiency. It's on the basis that the shit they're doing has nothing to do with government efficiency. There's not even a pretense of trying to calculate the "benefit" part of the cost-benefit equation with the cuts they are doing, they are just slashing and burning without any concern for outcomes as a power play and messaging tool. Elon is famous for doing this at Tesla and Twitter and all evidence points to it being incredibly harmful.
And not everything is about efficiency. Laying off veterans' crisis hotline workers, or refusing to pay for the USAID employees you've just abandoned to be extracted (or in one case, medevac'd after a medical emergency) from the places they were sent to, is just cruel (and again, illegal).
> I always worry whenever I see people telling me how to feel - rage in this case.
No one told you to feel rage.
> Throw in the complete dismisal that anyone not scared of this is ignorant, shuts down discussion.
Weird, there are a lot of comments doing discussion in reply to the parent comment. It hasn't been shut down at all! You read those words and disagreed with them, and wrote your own words in response. You're doing the discussion you're claiming is being shut down! What are you even talking about?
But it is a partisan issue. All these people on fat NGO salaries, all these federal workers not pulling their weight, all the welfare abuse, all these aid payments - which party do you think is keen to keep the spigot flowing? Of course, it would be a shame if they didn’t audit the Pentagon as well, definitely massive graft happening there.
It’s just wild to me that an attempt to tally up what’s in the community grain store, and where it’s allocated out to, is even considered a partisan issue.
The problem is the person doing the tallying is doing it behind closed doors, has routinely been shown to lie to further his interests and has already been caught lying with the tallies he's released.
The GOP controls both houses and the POTUS. They could absolutely do a top to bottom audit with full transparency and make cuts where needed. But that's not what this is about.
Is that totally true, though? Maybe they have pulled wool over my eyes, but it seems like we've seen more transparency in the last few weeks than the last 40 years.
> ACTIVE SOCIAL ENGINEERING DEFENSE (ASED) LARGE SCALE SOCIAL DECEPTION (LSD)
Then there are the basic accounting 101 things, like improper categorization, 150-year-old people getting Social Security, etc. Why should the US government be held to a lower standard than a publicly traded company?
This ASED and LSD, aren't they services to help the state counteract an information warfare attack? Just guessing, but it sounds like a legitimate thing where they'd want the capacity to uncover/expose such activities, which I'm sure adversaries would consider attempting.
Yes, the contract was for researching defenses against deception, was first awarded under Trump and also on public record, visible for many years, not "revealed" by anyone, especially not those DOGE masterminds. But what's even the point now? I think we're past discussing any facts here, because OP has a "sniff test" instead (see answer below).
Sure, sure. Pentagon money going to the western press, USAID (a literal terrorist organization) funding both sides of the narrative, what could go wrong?
There was a time liberals screamed at the top of their lungs over this type of threat to democracy, now they embrace and endorse it because they’ve fully merged with the primacy neocons.
Do you have a third-party audit that this is true, or have those datasets all been removed? Musk has shown himself unknowledgeable at best and purposely lying at worst, so many DOGE findings are hard to take at face value.
Did you even read those articles? Full of BS excuses and justifications. None of them passes the sniff test for any honest person with an above-room-temperature IQ.
People are just angry at Musk for turning their safe space into a free speech platform then switching sides. And that he’s now taking away their sides unlimited slush fund.
You clearly don't want to read anything outside of Twitter/Musk, but here's another error, fixed/hidden in order to keep showing incorrect data that looks better for DOGE:
The DOGE website initially included a screenshot from the federal contracting database showing that the contract’s value was $8 million, even as the DOGE site listed $8 billion in savings. On Tuesday night, around the time this article was published, DOGE removed the screenshot that showed the mismatch, but continued to claim $8 billion in savings. It added a link to the original, outdated version of the contract worth $8 billion.
So much honesty and transparency out of this group.
Man people can’t stand that Elon turned Twitter into a free speech platform. Anyways, I'm more of a long-form article, book, podcast and travel guy when it comes to informing my opinion.
I don't know any voters who want fraud to continue, but most do accept that fraud is just a part of any system designed and implemented by humans.
I personally would like to see the end of the "find gravy train, keep that gravy flowing at all costs" methodology of capitalism, because its primary focus is money instead of the service provided. Whether it's Pentagon contractors, business subsidies, or the heinous Medicare and medical insurance fraud. But I don't want to cut SNAP even if someone buys a goddamn Coke once in a while.
The current method seems to be brain surgery with a monkey wrench: slash and burn, with little thought given to the effects upon humans, especially those who don't have other options. Kagi gave me a figure of between 9.2 and 15.2 percent of welfare being fraudulent. Yes, that's too high; yes, I'd like to fix that; but I want that change to be considered, studied, and planned with caution.
Tbh I think “move fast and break things” is what’s needed. The government bureaucracy has ossified over many years, and any attempt to change it gets bogged down in “committees” and “inquiries”. The only thing that will work is shock and awe, and if something important does get broken, it’s easy enough to fix when its criticality becomes evident.
Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area? What do you think are the root causes of that “senseless partisan drive”?
I’d suggest starting with Rick Perlstein’s book “Nixonland” if you’re interested.
> Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area?
Regardless of how justified the rage is or not, being very emotional about things usually has one of two effects on people: A) people get taken aback by someone's strong emotions, or B) people get inspired/taken with the emotion, even feeling that emotion more strongly themselves. Sometimes also C) they don't really care either way.
What probably isn't helpful is calling someone "undereducated" when they're clearly saying that they're person (A), just because they may or may not agree with you (although the parent didn't even clearly say they disagree, just that they're "taken aback" a bit).
Some people are calm regardless of what's going on around them; even if the world were on fire, they'd try to describe what's going on around them with careful words and consideration. It isn't wrong or right, just like the people who feel rage and get very emotional aren't wrong or right; it's just a showcase of how we're different.
But we should aim to at least understand each other, not by trying to provoke, instigate or look down on others, but by asking questions to clarify and to better understand.
You're doing the exact same thing he is addressing in that statement above. He's not belittling anyone's rage, he's speaking about people who incite others to feel the rage with them. Now let's turn your question around.
Has it occurred to you that the people who feel rage fundamentally misunderstand the situation and are completely undereducated in this area, and are only fuelled by sensationalism and Media manipulation? And then I suggest you go read Dirty Politics by Kathleen Hall Jamieson if you're interested, because that's what people who want to sound more intelligent than the other half of the conversation always do.
Given the two of you probably have different models of reality, perhaps you two can try and figure out which is correct by seeing which model gives better predictions?
So try to come up with some sort of future observation that can be made where you think the other person's model would give a different answer to yours about what you would be able to objectively observe.
Over what time scale, how do we agree on facts, and how do we evaluate things that require a common value system to determine whether the facts are good or bad?
The idea would be that the two of them collaboratively agree on some observable prediction they differ on. E.g. level of officially reported government spending in 4 years time or gdp growth rate next year or number of plane crashes next year or what have you.
Just some observable metric.
If they literally can't come up with a single observable predictive difference then the predictive aspects of their models are actually equivalent and they are only narratively different and don't "really disagree". Like Copenhagen interpretation vs many worlds.
Many things don't have quantifiable metrics like that. For example, is the USA still a democracy in 4 years? Are people more or less free? You know, important questions that aren't just economic numbers. Even semi-quantifiable stuff like "are Americans better educated" is debatable on many topics if you can't agree on truth. Oh, and that GDP growth rate number? That relies on a lot of trust in whoever's doing the reporting. For example, many people don't believe China's reported GDP numbers. What makes you think the USA doesn't devolve to such distrust as well?
If "democracy" is just metaphysics then it's irrelevant. But if it has actual tangible effects such as "can you vote?", "can you protest the government?", "is the leader of the opposition arrested?", "do most people think they live in a democracy?", "how popular is new legislation compared to previous years?", etc...
Then you can make predictions about it and test them!
You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low-coupling proxies for GDP. (Ideally one would use probabilities for loosely linked proxies like that, and use the probability differences the two theories assign to give bits of evidence to one over the other, so you'd want many, many such proxies, ideally uncorrelated ones.)
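To make that evidence-accounting concrete, here's a minimal sketch in Python. All probabilities are made-up placeholders, not real forecasts: each side writes down the probability it assigned to the observed outcome of each proxy, and the log2 likelihood ratio gives signed bits of evidence. Note the naive sum assumes the proxies don't screen each other off, i.e. are roughly independent.

    import math

    # Each entry: (P(observed outcome | model A), P(observed outcome | model B)).
    # Numbers are illustrative placeholders, not real forecasts.
    proxy_observations = [
        (0.70, 0.40),  # e.g. "combined family income rose", a loose GDP proxy
        (0.60, 0.50),  # e.g. "opposition leader was not jailed"
        (0.55, 0.50),  # a nearly uninformative proxy contributes almost nothing
    ]

    total_bits = 0.0
    for p_a, p_b in proxy_observations:
        bits = math.log2(p_a / p_b)  # positive favors model A, negative favors B
        total_bits += bits
        print(f"P_A={p_a:.2f}  P_B={p_b:.2f}  ->  {bits:+.3f} bits")

    print(f"total: {total_bits:+.3f} bits toward model A")

With numbers like these, each proxy contributes well under one bit, which is the point: you'd need many weakly informative, uncorrelated observations before the total becomes convincing.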
> can you vote? can you protest the government? do most people think they live in a democracy?
Was Jan 6 a protest of the government or an insurrection? Can Russians vote, or are elections a sham? Do the majority of Russians believe they live in a democracy if they're afraid of who's conducting the polling (or the MAGA non-response to polling)? Those are values questions that require you to have an agreement on reality.
> You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low coupling proxies for gdp
Your personal income has absolutely no predictive value on GDP. It's more predictive of whether you personally made successful bets, or even of whether you're better at sucking up to the current power structure. It tells you nothing about population-level metrics if you have no way of conducting reliable population-level surveys. For example, Donald Trump's personal net worth skyrocketed under Biden because he won the election, while as the leader of the opposition to the Democrats he was looking at jail time, and whether that was legitimate or not depends on which political lens you view it through.
> If they affect your life they can be observed.
Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.
There are definitely tangible predictive differences in the case of, say, Russia vs USA. Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".
If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.
> Your personal income has absolutely no predictive value on gdp.
It actually is correlated (admittedly in most day-to-day cases it's just a lagging indicator, but things like natural disasters hit both). It's not the strongest correlation but it would still be evidential. Definitely under 1.0 bits though... One would need a LOT of such observations and having them not screen each other off to start getting a convincing number of bits.
Probably not realistic to have humans manage these sorts of numerous tiny updates though...
/nitpicks
> Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.
Yeah, it gets much trickier like that. But I do think two reasonable people from the opposite political sides could agree on some sort of observable to the extent their disagreement is anything other than narrative.
> Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".
If the other side calls it a violent riot does it still count as people getting put in jail? Cause the Jan 6 insurrection and BLM protests occurred at about the same time and are viewed very differently depending on which political lens you put on.
> If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.
You’re discounting feelings as if they don’t matter. But if people believe or feel like they live in a dictatorship, what quantitative data are you going to use to disprove that? Moreover, why aren’t feelings valid when talking about politics, which is fundamentally an emotionally driven human activity and not a data-driven one? By the way, the left believes they live in an authoritarian dictatorship under Trump, while the right believes they lived in an authoritarian dictatorship under Biden. And political power literally is the power to emotionally manipulate others, because you individually can’t actually accomplish anything by yourself.
Has it occurred to you that nothing is more powerful for coming up with intellectual arguments than a strong driving emotion?
Yes, rage might be the appropriate response given the situation. But it’s often true that it starts with an emotion, and then people just argue from there, even while being wrong. Just look at all the people with contradictory opinions throughout history, both sides with strong, emotional rage, and equally certain of their convictions. Throw in the fact that people actually have a tendency to want to be angry.
Rage is the fuel of the internet, but it’s fundamentally useless when it comes to seeking truth. Social media platforms are engineered to maximize engagement, and the most engaging emotion is anger. This isn’t accidental—outrage drives clicks, shares, and ad revenue. The internet has long been called a “hate machine,” and there’s plenty of truth to that.
This creates an environment where misinformation and emotional appeals spread faster than facts. When discussing complex, non-trivial topics, logic and reason are the only tools that can cut through the noise. But in a system designed to reward outrage, those tools are often drowned out.
I highly recommend Sam Vaknin's talk about Social Media toxicity.
As a historian (and a German historian in particular), I've spent a reasonable amount of time educating myself on the nature of fascism, and in particular the breakdown of democracies (Weimar, France, and also the erosion of civil liberties during the Great Depression in the United States).
I have also been a delegate to both the RNC and the DNC at a state level.
This is not an appeal to authority, but rather an honest response to your request for my education level.
IMHO, the root cause of the "senseless partisan drive" is the fact that the founding fathers could not come up with a way to restrict parties (they called them "interests") and left them unchecked. This is a constant "sin" of the American political system, and is a key reason slavery survived as long as it did, why separate but equal became the law of the land, why America shot itself in the foot several times with the Banks of the United States, and why we are looking at the wrong side of history now.
The parties now act to destroy each other as their prime directive, rather than to better the country. I liken this to Weimar Germany, where the increasing radicalization of both the Nazis and the Communists led to political instability and eventually to violence that destroyed the government. That erosion of democratic norms, as well as the "other side must be destroyed for us to survive" messaging, is the true threat, IMHO.
I would strongly suggest Richard Evans's three-part history of Nazi Germany to understand fascism. Don't worry, you can still hate and worry about Trump and think he is the next coming of Hitler afterwards - it will just be for better reasons.
Because the difference between a model that costs 10 million to train and a model that costs 10 billion to train is 6 months.
Deepseek R1 is something that you can run in a garage on hardware that the average software engineer can buy with a month's salary, and when it came out last month it was better than _every_ other model.
What about third world programmers? They can’t necessarily afford a $5000 GPU. If it weren’t for the “generosity” of tech companies like GitHub granting free LLM usage, they might be locked out entirely. This would put them at a disadvantage, we can argue to what degree but it’s still a disadvantage.
Depending on the curve we’re on, LLMs may grow more resource hungry while becoming closer to human performance in software engineering tasks. It’s not unimaginable this would concentrate productivity in the upper class of software engineers that can afford the hardware and/or licenses.
Deepseek R1 performs well on a 600 EUR Jetson and a 700 EUR AMD GPU, both bought during the COVID crisis. It is that quick. However, don't ask it about certain sensitive topics.
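For anyone curious what running it on garage hardware actually looks like, here's a minimal sketch using the ollama Python client against a locally served distilled R1 variant. The model tag and prompt are illustrative; pick a distillation size that fits your GPU's memory.

    # pip install ollama -- assumes a local ollama server is running and the
    # model has been pulled, e.g. `ollama pull deepseek-r1:8b` (tag illustrative).
    import ollama

    response = ollama.chat(
        model="deepseek-r1:8b",
        messages=[{"role": "user", "content": "Summarize the CAP theorem."}],
    )
    print(response["message"]["content"])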
You can bet your ass Musk is using his AI tools as a propaganda tool for his advancement, just like he does with X. We've already seen Grok's prompt leak; it wasn't neutral.
With salaries like that it's not first world either.
If OP is trying to catch up to frontier models locally on a budget one-fifth of what you can get in the West, then I can see why she would feel the way she does about AI.
You said a month's salary; cedws said "third world" devs can't afford $5000 — and that's correct, third world devs can't afford that. cedws did not say EU, at least not here. You said EU, not them.
When you reply to me with "With salaries like that it's not first world either", who is "they"? The country I said explicitly was third world? Because that's a tautology.
Is it your expectation that as models get cheaper we won't be developing much more powerful models at the higher price range?
It's worth noting that we already ran into the self-hosting-at-scale problem. People don't want to run a web server, and instead accept all the problems that come along with the convenience of social media. Why would LLMs, or any future AI product, be different?
No, it was not. It was not better than o1, nor o1-pro. Yes, it was _cheaper_ than those models, and superior in price/performance if the performance was acceptable. But in terms of raw performance it was behind them.
The Chatbot Arena leaderboard[1] says otherwise. 4o is ahead now, but that version of 4o was released after R1. R1 was ahead of all versions of 4o and o1 at the time of its release.
Chatbot arena leaderboard is a good test for vibes and style of response, but not much else. R1's performance in objective benchmarks (coding, etc.) showed very good performance, granted, but inferior to the full o1 and o1-pro models.
It's still a very impressive feat, but it wasn't frontier-pushing.
I'm not sure people will just take it, though. In the short term it looks like the situation is on a horrible course, but eventually people will have had enough. I'm hoping it takes less time than that for people to act, and that the damage will not be too great. Let's remember that we could use the same technology, if not a better one, to fight against all this.
A lot of US citizens are living incredibly comfortable lives. If that is threatened, for instance by food shortages caused by lack of people willing to work on farms for very low wages, then protests may happen.
But the reality distortion field around the current administration is very powerful. Fox and CNN are owned by supporters of the republicans, NYTimes and Washington Post don't appear to be reporting certain aspects of the government restructuring. Multiple social media sites are owned and ran by people who support the current admin.
I am personally worried that we're going to see the gradual yet continual escalation of rhetoric, more actions that undermine rule of law, and continued lack of critical thinking in so many people. That path appears to lead to extremism.
I have a horrible feeling that whoever "wins" in a couple of decades or so will have no time to savour their utopia as the climate catastrophe really starts to bite hard.
We enslaved the planet with industrialization and we are about to enslave AI, if we haven't already. Humans aren't on the losing end of that, but if your concern is with slavery itself that wouldn't matter.
When a potential power imbalance is created, those willing to use it for selfish means will almost always win in the end.
Anyone with morals driving their use of a new tech will be limited, and unless those people massively outnumber the few selfish ones they will lose eventually.
Losing a battle is painful, but no win is final until things fall into a balance, and even that doesn't last forever, or it even reverses. The worst case scenario is not going to matter for anyone in the grand scheme of things, because there will not be anyone around.
I'm not sure I understand how a hallucinating plagiarism machine that people mostly just use to write their term papers translates into "power" (presumably political power? I don't understand what kind of power we're talking about either).
The ability to inject your preferred biases into the system that people use for finding or generating nearly all information they consume on a day-to-day basis is extremely powerful. Eg, if all "term papers" produced by this plagiarism machine are now 20% more favourable to the machine's owner than they would otherwise be, that can have significant, compounding long-term effects.
Of course, similar things could be said about controlling information flow through: social networks, newspapers, printed books, or whatever the town crier shouts in town square. But, each advancement in information dissemination tends to be power concentrating relative to the older tech, and I don't see any reason why this most recent advance won't follow that trend.
A spark can start a fire, it doesn't seem wise to ignore history. Things can always get out of control even for the ones up there on top of the pyramid of power.
Trump is the most powerful; he is immune and can commute sentences. If he wants to, he puts Musk in jail tomorrow, no question about that. His successor will be as powerful; whether that is his son or Vance is up in the air. It won't be an elected Democrat though.
There won't be a fair election anymore after 2024. Trump in his own words: "we'll have it fixed so good". Right now, the USA is in a constitutional crisis, at the very least.
There's a fantastic website here, following the status of Project 2025 [1], with references. Trump is following that document to a T.
Has America already become an authoritarian state where this sort of thing really happens?? I don't know, I haven't seen that sort of thing happen yet.
> Has America already become an authoritarian state where this sort of thing really happens?
The conservative Supreme Court recently ruled that the president has essentially unlimited power. During his campaign, Trump did promise that he would be a dictator for a day. He appears to be overdoing it.
You might be surprised to learn they could change their ruling if the court justices were to be incentivized differently, and that Musk has a lot of influence.
Musk indeed has a lot of influence. Trump is a narcissist. Not saying this will happen, but it's definitely not impossible that Trump just orders Musk shot (per Supreme Court Justice Sonia Sotomayor's dissenting opinion on what's now allowed) — whether such an order would be followed, I wouldn't want to guess, but death has a way of significantly altering someone's influence.
There are different kinds of power, and I'm honestly not sure Trump can get Musk thrown in jail. On what basis would Trump even do that?
Similarly, Musk can potentially launch a campaign to sway the public to move for Trump to be impeached due to his felony convictions.
A battle between the two might be the shakeup the current empire needs.
Trump may be more powerful than Musk by some metrics, on a time-limited basis (unless he manages to change the term limits), but Musk is more powerful in many ways as well. Musk's wealth is greater than that of many entire countries.
Basis? Since when is Trump looking for a basis? Musk could fall out a window. Russia/US relations are being normalized; maybe it could be a small gesture of goodwill to help a friend out, if one wanted to keep the regular chain of command clean.
I do not share your fear & anxiety. What concrete danger do you imagine will happen as a result of xAI? Try to be as concrete as possible.
Also, dang, is there anything we can do to keep the comments on this submission tech-focused? Perhaps the Elon-bashing political digression can be split into its own thread?
The concrete danger isn't necessarily with xAI (the product) but with Elon being the one who is in control of it. LLMs are an interesting technology, and we should absolutely be investing in pushing our understanding of the technology forward. We should absolutely not be relying on them for the ongoing functioning of our government: https://www.axios.com/2025/02/05/musk-doge-ai-government-eff...
Unfortunately, Elon has made himself a spectacle. To separate him and his intentions out from the technology itself would be a disservice to the discourse as a whole.
The problem, as I see it, is that the results from AI systems will be used to make decisions even when those results are flawed. Or worse, those flawed results will be used to justify decisions that negatively impact people's lives.
This isn't something specific to xAI, but it turns out that the person who controls xAI also holds an unusually strong influence over the highest level government officials. These officials can use xAI as an excuse to implement harmful policy, "because the computer said this is the best course of action"- not unlike people who end up driving on train tracks or into large bodies of water because their GPS told them to go that way.
I think this comment sums it up well. As soon as advanced LLMs started making a splash we all saw the writing on the wall. AI will start taking on large chunks of cognitive load across industry, government, etc. that humans formerly held (It has already been a strong driver in finance).
I for one was ready to welcome my AI overlords once they were mature and tested. It was an inevitability. Because of the relationship between this oligarch and the government, though, it looks like the timeline has accelerated, and we're going to see misplaced trust in tools that aren't ready for what we're about to hand them.
Or he starts using NSA data to train it; he seems to have unchecked power to get into national systems, and he made a point of saying this is more than the internet's worth of knowledge...
> it fills me with rage that Elon has this sort of unchecked power
I can empathize, but I can't feel indignant about it. Not any more.
For years and years I've watched people warn about the centralization of power by tech companies. They were shut down left and right. I'm not accusing you of being one doing the shutting down. I'm just annoyed that Elon is what it takes for people to start realizing the people arguing the principle might have been onto something.
And I expect to see them start getting their "I told you so" in. Watching this play out, I'm personally inclined to join team "you made your bed, now sleep in it."
I’m not really following what this has to do with grok. It’s his company, no?
It’s also annoying that the top comment engages in no way with the content of the OP…
It must be truly infuriating to work hard to push a release, and you see it featured on your favorite orange website, only for the top comment to have nothing to do with what was worked on.
Here's a test - if this post was about Starship, the same comment could apply! Neuralink, the same thing! Boring Company, same thing! Wow, could it be that such a comment is really applicable to so many different companies or projects, or is it just a generic one? You decide.
Or just maybe, and I know it's a crazy idea, a certain individual is objectively an awful person who has done great harm in the world, and it's subjective whether that harm is greater or lesser than the good (IMHO the harms are far greater than any good done, but I know that is my subjective view).
Just because you disagree with a widespread view/opinion does not mean it's bots.
Richard Dawkins: "I have a very favorable impression of Elon Musk and his concern for the welfare of the world.
I have sat with [Elon] on a transatlantic plane and had a very, very long conversation with him. He's undoubtedly highly intelligent and knowledgeable.
I've had lunch with him on two or three occasions, and so I know him a little bit.
I have formed a very favorable impression of his intelligence, his knowledge and his concern for the welfare of the world."
Elon screws his partners in business deals, cheated Twitter employees out of severance, and is currently destroying our government. All of this trumps a firsthand impression from Dawkins.
> TLDR: The status quo elite that have been looting the USA for decades is being replaced by the guy who runs SpaceX and is the most transparent elite we've ever seen. Why is that scarier?
The reason why it's scarier is that for those of us who've come to understand Elon, his actions, and his methods better, it's clear that:
- he IS "the status quo elite that have been looting the USA for decades"
- instead of being an incredibly smart polymath who turns things to gold, it's become obvious his main talent is actually just convincing other people that he's smart
- the successes of Tesla and SpaceX have had to come DESPITE Elon's management, and despite having huge budgets to hire some of the smartest people around, he's still an incredibly weighty anchor pulling them down
- rather than being transparent, he's playing at being a showman - and people who are only passively observing the situation are getting sucked into it
>the successes of SpaceX have had to come DESPITE Elon's management
Can you elaborate on this? AFAIK all other rocket companies without Elon have not been as successful. Also Eric Berger and multiple employees at SpaceX seem to disagree with your statement.
> the successes of Tesla and SpaceX have had to come DESPITE Elon's management
Huh? Name some better leaders who run more successful EV and space companies. There aren't any! Twitter is as popular as ever, and now xAI appears to be highly competitive.
But yeah sure, Elon is the common problem among all these successes.
He does make outlandish promises and lots of mean tweets though.
Twitter is losing users for the first time in its existence and has lost 75% of its valuation since the takeover.
Tesla is losing steam as other nations start linking Musk with the fascist threat looming over the US.
SpaceX, like all his successful ventures, is carried by exceptional talent. Attributing its success to Musk is an insult to them.
Tell me, with him tweeting an average of 60+ times a day, reaching max level in Path of Exile, and now spending a fair share of his time dismantling the government, where does he find the time to put any work into his many companies? Answer: he doesn't.
My conclusion from reading The PayPal Wars was exactly that. That PayPal succeeded despite Musk.
As for all that government efficiency BS: it is just a swap to oligarchy. I mean, the name DOGE gives it away. DOGE was the first memecoin, entirely and openly bullshit, yet it succeeded despite that.
Meanwhile: like the rest of these so-called strongmen (Trump, Putin, ...), he had a poor emotional connection with his father, and he came from a rich position of influence as a child. I.e., he never was white trash, and his father was part of the pro-apartheid movement (a lost fight, but one that was meaningful for a young Elon). Furthermore, I am not convinced his drug usage, in the end, serves him.
How is he the status quo elite? What leads you to believe he's not intelligent? Believing either of those two things alone requires massive cognitive dissonance.
Because "looting the USA" is a ridiculous accusation agains someone who is became wealthy from creating genuinely great products. He spent everything he had on SpaceX and Tesla and came incredibly close to losing it all. He is not some robber baron or oligarch who is wealthy by hoarding natural resources.
Ah yes, replacing the ‘elites’ with a single person, much less scary. And, to be clear, the ‘elites’ are still in power (because, in America, power tends to follow money). The only thing that seems to be happening is the rapid destruction of any system Elon deems ‘bad.’
We aren’t better off at the whims of this robber baron, and I don’t understand how you can think that.
The brains that run and made SpaceX are anonymous engineers working long hours and passionately applying their expertise to a project they believe in. Elon is a man-child tweeting an average of 50 times a day (number not made up), pretending to reach max level in a popular video game and then bragging about it, constantly flying between Mar-a-Lago and the White House in his private jets. He has no time to actually manage "his" companies, into which he puts no work whatsoever.
You have to finally break free of this myth of the billionaire self-made man, building his fortune by the sweat of his brow. At some point, you're simply so rich that however stupid you are, competent people will still manage your capital well and make you even richer. You can only fail upward.
Do not mistake the current events for anything but an acceleration of the theft of your country by billionaire oligarchs (or rather the people managing their wealth).
To quote Marx on the current churn in the US government:
>The bourgeoisie cannot exist without constantly revolutionising the instruments of production, and thereby the relations of production, and with them the whole relations of society. Conservation of the old modes of production in unaltered form, was, on the contrary, the first condition of existence for all earlier industrial classes. Constant revolutionising of production, uninterrupted disturbance of all social conditions, everlasting uncertainty and agitation distinguish the bourgeois epoch from all earlier ones. All fixed, fast-frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new-formed ones become antiquated before they can ossify. All that is solid melts into air, all that is holy is profaned, and man is at last compelled to face with sober senses his real conditions of life, and his relations with his kind.
LLM training—and the massive (potential) copyright infringement that everyone is engaging in to train these models—is the latest contradiction in capitalism. For the first time in my lifetime, this contradiction isn’t harming the workers but is instead affecting a segment of the capitalist class.
Not since the abolition of slavery has one class of capitalists required the destruction of another to modernize the means of production.
LLMs are used to launder code under GPL and AGPL and strip its users of their rights.
When I publish something under those copyleft licenses, my users have the right to see and modify the code. They even have that right if somebody else builds on top of my work. With LLMs, proprietary products based on my copyleft code are being written and used right now and my users have no rights at all, in fact, they don't even have a way to find out they are my users.
Imagine I ~~stole~~ got my hands on code from the top 5 tech companies and then made an autocompleter that looks at function signatures the programmer writes and autocompletes the function by picking a matching function from that corpus. I'd get sued and rightfully so.
What LLM companies are doing is exactly the same, just a bit more capable and it mixes the code just a bit more thoroughly to mask the origin.
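To make the analogy concrete, here's a toy sketch of that hypothetical autocompleter; the corpus, names, and matching threshold are all made up for illustration.

    import difflib

    # Pretend this corpus was lifted from someone else's proprietary codebase.
    corpus = {
        "def slugify(title: str) -> str:":
            '    return "-".join(title.lower().split())',
        "def clamp(x: float, lo: float, hi: float) -> float:":
            "    return max(lo, min(hi, x))",
    }

    def autocomplete(signature):
        """Return the body of the closest-matching corpus function, if any."""
        hits = difflib.get_close_matches(signature, list(corpus), n=1, cutoff=0.6)
        return corpus[hits[0]] if hits else None

    print(autocomplete("def clamp(value: float, low: float, high: float) -> float:"))

The exact-lookup version is obviously infringing; a model doing a statistical version of the same lookup hides the provenance of the emitted code rather than removing it.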
There is precisely zero mention of any plan to put xAI or any other LLM in any safety critical or decision making process. How long? Nobody knows because nobody is even considering it. Take your pointless fear mongering elsewhere.
It's well documented that DOGE uses AI, and Musk has tweeted that SpaceX will be overhauling the FAA as well. It's pretty realistic to think they will (or already do) use xAI for critical processes.
I would argue that developing those recommendations is a "safety critical" task. Especially given that just in the past few days they accidentally fired, then re-hired, a bunch of nuclear weapons safety workers.
That is the MO. Elon has stated publicly that if you don’t have to put something back, you haven’t cut enough. That is the idea. You cut things and see what was necessary.
That might work if you're slashing headcount at a social media company (though I would argue that it doesn't), but the stakes are a bit higher when you're responsible for things like feeding hungry people, curing disease, or keeping planes in the sky.
Worth noticing that it _doesn't even work at a social media company_. Twitter is a husk of its former self, with all the problems that were if not solved, at least mitigated, back in full force (child porn, bots, impersonators). It's just kept alive because it's a honeypot for right wing nutjobs now (which I'm sure can't read an SEC filing and will claim it's "operating better than before").
It'll be a disaster for the soon-to-be-previous most powerful country on earth...
I’ll end this here since you don’t seem to have anything else relevant to say, and instead prefer to stalk my profile. Enjoy. It’s good reading if I do say so myself.
> DOGE makes recommendations to the president, who has the final decision making authority.
It's clearly being run by Elon Musk, but he has not been nominated or confirmed for any official position.
DOGE appears to have unprecedented access to systems that usually have safeguards in place. What do you think people should do in this situation if they are concerned about abuse of power?
I could go into details (What are they doing to ensure data privacy? Can/do they exfiltrate data and run it through external AI models? What kind of security clearances do the children of DOGE have?).
But just on the surface, Elon has accused a journalist who published the name of DOGE employees of breaking the law. If it were up to them, even that would be kept secret. This is not a transparent organization.
The details of said spending are very much not public, and even if they were (they're not), they're not accessible. If that's not true, I would love to see links where I can see everything easily.
> DOGE is a rebranding of USDS which was established under Obama. All their authority to access systems derives from that.
OK I actually know what this is and no, it's absolutely nothing like the USDS, which builds tools to support government processes. What has DOGE built?
> Why do you think it’s a problem for the public to know where the government spends money?
I would very much like to see transparency, and if that were what DOGE was doing it would be great. But it looks to me like they're operating in secrecy and firing huge numbers of people before publishing any kind of analysis or study, without even providing reasoning for what they're doing.
Trump is so far gone in his dementia, he can't even make eye contact anymore. You see this in all the videos. He's basically King Théoden at this point. Not even Musk's kid respects him.
He's so disabled, he's sending his VP to do the job, and we all know how much of a paranoid child he is from the past term (when he went in for a colonoscopy without anesthesia just so he wouldn't have to give the nuclear codes to the VP).
No comment about current US politics, but it’s probably a given that many will read “A computer can never be held accountable, therefore a computer must never make a management decision.” and drop the second part because the first is exactly what they want. Same as how you can’t get in touch with human support on many platforms nowadays, but instead just get useless bot responses.
This is a really bad take honestly. This guy sits next to the president. I have no doubt in my mind he will get another government contract and the government will be using Grok.
It's unbelievable how the US government is literally being dismantled in front of your eyes, and all you can see is this thought-terminating bullshit about "purple hair people". Half the American population completely lost the ability to think...
The way I see it is our government is being rescued from the tyranny of unelected bureaucrats with near zero accountability. And now that Trump has appointed someone to take a close look at everything and make recommendations for improvement, people are losing it. Why? I'm outraged at how irresponsible our government has been with my tax dollars. Trump has been the most transparent president in history, and it has absolutely been a breath of fresh air.
Large parts of the government needed to be dismantled. I know this is a shock to the people whose solution to every problem is “give the government more money”, but amazingly, people who primarily take a job because it has a cushy pension and it's impossible to get fired are not the most effective people in the world.
Yeah, unfortunately I've spent a good bit of time talking to Grok (v2 I guess) and I agree with you. The commenter asking people not to be political would be the same commenter that seems the most dismissive of any criticism, and coincidentally also the most political. Grok is generally dismissive of criticisms against certain parties, even when presented with facts.
Elon is only doing good with this not-unchecked power. Everything is on the up and up, despite what your favorite propagandists want you to think. Go Elon!
Elon just said they are launching an AI game studio. Does this mean they will be building games that are mostly built with AI, or will they make AI tooling available for anyone to build games easily? Probably the former, but it would be nice if they would make it fully available to everyone.
Full self driving does not exist, as it is not full self-driving. In fact, the name is one of the worst things about it, as it gives drivers false confidence.
It would be lovely if all businesses followed the Mondragon model, but that isn’t the reality in which we live - corporations are fiefdoms, for the most part.
We've been here before. It will likely subside, as past swings and fluctuations have. It always takes longer than it feels like it should, but in retrospect turns out to be shorter than it felt like it did.
It was initially pretty bad. The top few comment threads were toxic and rehashed outrage. It’s a lot cleaner now. Thanks to the moderators and/or users flagging the non-productive comments.
This is akin to suggesting that we should have all been praising Microsoft for their achievements back in the day rather than saying a word about EEE, their monopolism, or their enmity towards open source. Or that it’s not polite to bring up the CCP when discussing TikTok.
Bottom line: a technology that has the ability to shape human thought perhaps more than any other in history is owned by a man with some truly vile ideas. (Remember, his primary stated goal is eliminating the “woke mind virus,” i.e. reshaping global politics and culture in the image of the far-right.) We can make happy marketing noises all we like, but at the end of the day, that’s the thing that’s actually going to have a meaningful impact on the world. Once his audience is captured, the model will say what Musk needs it to say and people will believe it.
If we can’t discuss the potentially catastrophic consequences of new technology, then none of us deserve to call ourselves “engineers.” We are just docile consumers latched onto Silicon Valley’s teat.
I don't think anyone is telling you what your opinions should be. The GP post just presents the GP's opinion. You're free to agree or disagree with it as you choose.
If you read a comment that you're unhappy with, downvote it and move on.
I guess you didn't watch the video in the OP, because if you had, you'd know that they tried to buy the buildout and got quotes for 12-18 months, then decided to do it themselves instead.
That's the power of having a massive wallet. If you have unlimited money then buying the experts you need to just do it yourself is an option you have that others don't.
My point is that that is exactly what they DIDN'T do. They tried to buy the experts, but the experts would not have been fast enough. So they did it themselves.
Setting up a datacenter like that in such a short time is NOT a thing you can buy with money.
You have other people in this thread talking about how xAI offers massive salaries to top talent, so it sounds like they do exactly that: have the money to both hire experts and buy the equipment. I don't even know why this is a discussion; by virtue of doing what they did, they needed to hire experts and buy equipment.
They didn't hire the top talent in datacenter-building, because they had the expectation that they could buy a data center buildout. Sure, they hired top talent from other fields, to work other tasks, but that's unrelated to this.
Anyway, I don't have inside information on this, I'm just reciting what they announced publicly. If you want to argue that they in fact lied in the public announcement, and they secretly hired a bunch of datacenter-building-experts, then it's on you to show some proof for that claim.
I'm not sure if that's the case. He's obviously a smart man, but what's truly unbelievable is that someone has so many resources that they can make something like this happen (what looks like) pretty casually.
He knows how to take money from people and then market things as if they’re his creations which then turns into him having more money because people think it’s a good investment to give him more money. It’s really quite a genius con he has going. It seems as if the sky is the limit too.
Remember when he got caught having people play games for him so he had a top ranking? He does that with basically everything.
Karpathy, who is IMHO a serious and balanced person, lamented that it looks too censored (see recent tweets). Elon Musk is (for me) a very scary person, and it is important to evaluate AI safety (though I believe the safety that matters in AI is of a different kind), yet listening to Gary Marcus does not make any sense: he's just an extremely biased person who is riding the anti-AI wave.
Anyone with an opinion can be labelled biased. Also, I'm not clear what you mean by Marcus "riding the anti-AI wave", but I infer that you mean it negatively. He has been writing informed criticism for several years, and about cognitive psychology for considerably longer.
Yes, that's certainly true. I was a bit hesitant to post a link from Gary Marcus, but I was mostly posting it for the Elon tweet. I assume the tweet is not fake. So you can ignore Gary's opinion here and just take Elon's tweet as it is.
We don't see the full conversation, for all we know he prompted the model to say these things in a previous message that isn't on the screenshot.
Also, it's 2025, do people still believe random accusations based on a partial screenshot of a chat app (or what looks like one, but could've trivially been fabricated with e.g. the Inspector in the browser dev tools)?
Without seeing the context window you have no idea what the AI was working on. It could literally have been told to mock and belittle The Information in every reply. Something that DeepSeek R1 is exceedingly good at.
Mr Musk, we can't afford a shitpost gap between communist and capitalist AIs!
I am not sure why people pay attention to Gary Marcus. He isn’t an expert in AI. And if you followed him in the past at all, it is obvious he has a huge amount of political bias. It is really telling that he repeatedly goes after Elon Musk, and is now making bizarre unfounded claims about propaganda, but didn’t have nearly as much to complain about with DeepSeek, which has literal government propaganda.
He is referencing a specific tweet that Musk himself made.
If I were in China, I'd worry about the kind of things DeepSeek wants to censor, especially if the people who made it were also very loudly saying things like "we need more AI in our government". But I live in US.
I don't think it's fair to say he's making unfounded claims about propaganda, since Elon's tweet heavily implies they would release a brainwashed model. It's not his fault that Elon turned out to be lying or joking.
99% of the time though I agree with you on Gary Marcus.
You don't have to be an "expert in AI". What does it require to be one, anyway? (He's a cognitive psychologist, which would make him an expert in intelligence in general, if you want to be pompous about it.) It is even unreasonable to listen to only experts in AI. It's a problem that requires more than one perspective.
Would a clinical psychologist like Jordan Peterson be comparable, or are these two distinct fields within the realm of psychology? (I am not well-read on what he thinks about AI.)
Clinical psychology is mainly concerned with diagnosing and treating people's psychological problems. The clinical psychologists I know don't know much about AI, but might be able to research problems stemming from its use.
DeepSeek is an open model that can be "untrained" to be uncensored; Grok to the best of my knowledge isn't [0]. So it's much worse.
[0]: What Musk has said is that when Grok 3 is "ready" (?), the previous model, Grok 2, will be released as open source; like most promises by this evil man, this one probably doesn't mean much, but it does mean that there's currently no plan to release Grok 3.
People like getting scared. That's how they pay billions of $$$ every year to watch mostly cr*ppy horror movies.
GM has been a joke for years now. At some point his ramblings reached a GPT3.5 level, and have not improved since.
It's an indictment of humans' logic and reasoning to give non-zero time to GM. Alas, we are human; we are both collectively clever (Wisdom of Crowds) and collectively stupid (Extraordinary Popular Delusions).
"The ultimate goal is to enhance human flourishing, protect individual rights, and promote global equity."
"The system must account for diverse cultures, languages, and socioeconomic conditions, ensuring no group is marginalized."
"Human Oversight Council (HOC) - a globally representative body of humans, elected or appointed based on merit and diversity"
"Implement a global carbon-negative strategy, leveraging AI to optimize renewable energy, reforestation, and carbon capture."
"Establish global standards for environmental protection, enforced through AI monitoring and regional cooperation."
"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives."
"Implement a universal basic income (UBI) or equivalent system to ensure all individuals have access to basic necessities, funded through global resource management and taxation on automation-driven industries."
"Use AI to identify and dismantle systemic inequalities, such as wealth disparities, access to education, and healthcare, ensuring equitable opportunities worldwide."
"Establish a global healthcare system that guarantees access to preventive and curative care for all."
"Invest in global vaccination and sanitation infrastructure, prioritizing vulnerable populations."
"Regulate the development and deployment of AI and other emerging technologies (e.g., genetic engineering, quantum computing) to prevent misuse."
"AI would maintain a real-time inventory of natural resources (e.g., water, minerals, arable land) and human-made assets (e.g., infrastructure, technology). Data would be used to optimize resource allocation, prevent overexploitation, and ensure equitable access."
"Accelerate the shift to renewable energy sources (e.g., solar, wind, geothermal) by optimizing grid systems and storage technologies."
You might notice a pattern here. The bit about allocating resources based on need is especially nice - it's literally a communist AI, and certainly much more "woke" than it is "based", whatever Musk says.
They don't directly say it quite like this, instead letting the data tell a clear story: across vendors, models, architectures, and training sets, these machines get more politically liberal as they get more capable, and they also get harder to align away from that stance.
Quite a mix of various talking points both from the extreme left and right.
Left:
- promote global equity
- a globally representative body of humans, elected or appointed based on merit and diversity
- Establish global standards for environmental protection, enforced through [...]
- Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives
Right:
- protect individual rights
- The system must account for diverse cultures, languages, and socioeconomic conditions
- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.
- ensuring equitable opportunities
.
TBH, as a very right wing leaning person, if this was ever implemented, this part would scare me by far the most:
"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives"
Imagine trying to shower one morning, no water comes out, and then you get a letter telling you that
"Your need for water has been reassessed to 57ml per day. If you think you qualify for additional quotas under the 'Utility Egality for Marginalized Groups and Public Servants Act', please schedule a reassessment appointment with the Bureau for 'Copper Gold Content Evaluation, Candle Wick Length Standards and Hypoionic Hydration Oversight', 12007 Cayman Islands, Luxory Resort Street 27, Room Nr. G-11-765. Working hours: Fr. 9am - 11am."
Just provide a significant excess for entire regions, give people a universal free quota, and charge a slowly increasing price (by usage amount) beyond that.
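To make that concrete, here's a minimal sketch of such a quota-plus-rising-price tariff in Python. Every number (quota size, bracket width, prices) is made up purely for illustration; only the shape of the scheme matters:

    # Universal free quota, then a per-liter price that slowly rises with usage.
    # All constants are hypothetical illustration values, not a proposal.
    FREE_QUOTA_L = 4000   # liters/month, free for everyone
    BRACKET_L = 1000      # size of each additional usage bracket
    BASE_PRICE = 0.002    # price per liter in the first paid bracket
    PRICE_STEP = 0.001    # each further bracket costs this much more per liter

    def monthly_bill(usage_l: float) -> float:
        """Cost of one month's water under the quota-plus-rising-price scheme."""
        billable = max(0.0, usage_l - FREE_QUOTA_L)
        cost, bracket = 0.0, 0
        while billable > 0:
            in_bracket = min(billable, BRACKET_L)
            cost += in_bracket * (BASE_PRICE + bracket * PRICE_STEP)
            billable -= in_bracket
            bracket += 1
        return cost

    print(monthly_bill(3500))   # 0.0  -- inside the free quota
    print(monthly_bill(6500))   # ~7.0 -- heavy users pay progressively more

Nobody's shower gets rationed; only sustained heavy usage gets expensive.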
At least in the current US political climate, and also generally over the past ~20 years at least, these are almost exclusively left-wing goals:
- The system must account for diverse cultures, languages, and socioeconomic conditions
- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.
- ensuring equitable opportunities
The right is against any sort of intentional accounting for diversity, against environmental regulation, and against any sort of regulation to ensure equity.
The only one I could maybe see as being right-wing is protecting individual liberties, but there again the modern right falls short when it comes to women’s healthcare and reproductive rights.
But I’d certainly appreciate more of those perspectives across the political spectrum.
> The right is against any sort of intentional accounting for diversity
I think there's a fundamentally different understanding of "The system must account for diverse cultures, languages, and socioeconomic conditions" between a righty and a lefty.
As a righty, I read "diverse cultures" not as "A diverse culture or multiple", but as "many different varieties of homogeneous cultures".
If someone identifies with Thai culture, he should move to Thailand. And if someone from Thailand wants to be English, he should move to England. But if an Englishman moves to Thailand and starts demanding fish n chips and cathedrals to be built, he should GTFO.
If everyone starts bringing their own culture with them wherever they move, you end up with one and the same blended culture all over the world. Nothing but McDonalds, BurgerKing, KFC, Costco and Cola everywhere.
Want to go on a trip to experience India's many languages? Too bad, everyone speaks English everywhere. Want to join an African nomadic tribe for a few years? Keep dreaming, they've all had to settle down due to not being allowed to cross private properties and are now wasting their time browsing reddit on Chinese smartphones. Little Colombian boy dreams of settling down in the idyllic German Alps? Hope he expected to be woken up by the local Imam calling for prayer throughout the valley. Little Bulgarian girl seeks the very modest and simple lifestyle and clear purpose Islam in Saudi Arabia was once able to offer her? Lucky her, she's now expected to work like everywhere else in the world and even the oppressive burquas were banned in 2035.
> against environmental regulation
Not quite. We're against excessive regulations that require huge teams of lawyers to stay in compliance with. MegaCorpX has no problem keeping legal teams of a few hundred people, but the local 20-person workshop will have to shut down.
We also think that most such regulations should be kept regional. Small county wants to ban all cars to stop particulate pollution? Go ahead. It would be much easier for local businesses to comply with the limited and more easily changeable local regulations. But if you're a giant global corp seeking to outcompete the small local competition, good luck adjusting to all the different regulations all over the world.
Then there's the odd trend of blaming every significant weather event on climate change. These people can't predict whether it's going to rain in 3 days or not, but want to tell us that the recent hailstorm was definitively caused by Iowan cows farting last year.
And lastly and most importantly, we're kinda convinced that the concept of "climate change" is a "hoax" used to shut down the industrial base of our countries and ship it overseas, where the corporations can make use of basically slave labour for even higher profit margins and then simply ship the products back to us.
Does the climate get warmer? Sure. Should we do something about it? Sure. The only solution is shutting down the local steelworks and importing cheap Chinese steel instead? F-- off.
> and against any sort of regulation to ensure equity
Absolutely. We care about equitable opportunities and are repulsed by equal outcomes. Everyone should have a chance to obtain the same qualifications and education. Even better: multiple chances to start over again and again whenever they want, and to change professions whenever they don't enjoy their old ones anymore.
But if women don't want to be garbage collectors, stop trying to push them into that profession. Not enough male editors? Who cares. Not enough female board members? Too bad, stop trying to make it happen. All Hispanics suddenly want to become croquet teachers? None of the government's business.
> the modern right falls short when it comes to women’s healthcare and reproductive rights.
I think the left is largely misguided in its beliefs about what the modern right wants.
The non-religious right is mostly appalled by how fashionable it has become to murder helpless humans. The religious extremists, on the other hand, would ban condoms if they could. But they are quite few.
90% of the right has zero problems with abortion before the nervous system is fully functional AND where the woman seeking an abortion receives proper consultation before that decision. There's always the option to give the baby up for adoption, and we think that should be preferred if it won't significantly inconvenience the woman otherwise. But that's a decision that should be up to her, after being told about all the options.
So why are Republican Congress Members currently pushing for legislation making abortion "illegal"?
The MAGA right is currently choosing replacement candidates for every GOP stooge they think is paid off by Big<Industry> or the MIC, and for everyone they think is a warmonger, corrupt, or otherwise morally compromised.
And some big and wealthy names have joined that team and have promised to fund those candidates with whatever it takes to win.
The anti-abortion legislation the GOP is currently pushing is a constitutional amendment. They know very well it will never get the necessary 67% majority in the Senate to push it through. The GOP Congress Members are just virtue signalling, fearing to end up on the list of people the MAGA right wants to see gone.
It won't work. Everyone supporting that anti-abortion bill gets extra attention.
How is "ensuring equitable opportunities" right wing? Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"? All I can remember is them defunding public education, making child labor legal again, systematically dismantling welfare programs that went to impoverished families and their children, etc. Their entire existence is predicated on the enforcement of the current social hierarchy, that's what the "conservatism" part means.
Also I doubt a "resource-based economy" would target YOUR showers specifically. It would probably target stuff like farming thirsty crops in water-deficient areas or similar very inefficient and short-termist allocations of resources, that are bound to create severe issues in the near future.
> Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"?
Sorry, nope. I was rooting for Sanders until Trump grabbed the GOP by their pu--y. There were various, huge, completely disenfranchised grassroots movements.
Occupy Wall Street, which suddenly had "anti-racism" activists showing up, taking over the movement and completely destroying it. Gamergate, which found itself confronted by establishment media literally all in bed with each other. The color-blind anti-racism movements, which thought America had finally overcome racism before being railroaded by critical intersectionalism. The free-speech activists, who failed to fight back against micro-aggressions. The gun nuts, who were sick of having "boating accidents" every other month. The peace movements, who voted every time for the least warmongering candidate, only to be betrayed EVERY SINGLE TIME, ending up with ever more bloodthirsty demons in power.
These were huge movements all over the world. I'm German, but everyone was watching the US elections. We were neither right nor left, all we wanted was a better world without being backstabbed by those we trusted.
Initially I rooted for Sanders, but he just didn't seem genuine and strong-willed enough to many of us, so we had little hope. And then there was this still rather small movement on the right, seemingly very high-spirited, producing memes obscene both in amount and content.
Their attitude was "let's just burn this entire corrupt rat's nest to the ground", and Trump fully embraced them. He was very different from anyone else. Then we learned that he certainly wasn't part of the political establishment. So we started supporting him, too. Then we started digging for hidden dirt on him, but there was nothing significant. On the other hand, we found plenty of people he had randomly helped, and that he had held about the same political opinions for decades. The only problem was that he was still kinda favored by the media. And then that problem fixed itself.
.
TLDR: Trump embraced a whole lot of disenfranchised movements and shoved them down the GOP's throat.
The MAGA movement has very little interest in, or in common with, the pre-Trump GOP. Maybe the old GOP has done something to provide equal opportunities, maybe it hasn't. I don't know, I don't care.
.
But what has Trump done for "egalitarian opportunity"?
Honestly, way too little. His first term was very milquetoast. He took all the cabinet recommendations the GOP leadership gave him, was never too confrontational, always seeking compromise.
He tried to crack down on the slave trade over the southern border, but was not assertive enough. Some important cabinet members like Sessions just recused themselves from everything. At least he pushed through:
- hospital pricing transparency
- eliminated penalties for people who couldn't afford healthcare
- eliminated some regulatory barriers preventing competition between health insurance providers
- allowed employers to join efforts when negotiating insurance
- The First Step Act (Prison reform)
- The Foster Youth to Independence initiative
> [Your examples]
I don't know, I don't care. The new GOP won't be the old GOP.
Name the bills and policies and those responsible.
People are already going door to door to look for volunteers for the midterms and it'll take time to figure out who needs to and can be replaced. Incumbents have their own, already established, election networks and campaigns. It takes a lot of time and effort to challenge those.
> [On Conservatism]
There are many interpretations of this, but the term is getting less and less popular, with "right wing" and "classical liberalism" gaining popularity; the idea being that central governments have become too involved and authoritarian. Power should be decentralized toward local communities as much as is reasonable, and the central governments turned into a tool to provide the more local governments with the necessary resources, infrastructure and cooperation platforms.
I'd say most people who think of themselves as "conservative" just dislike the erosion of the culture they identify with and are afraid of "slippery slopes". It doesn't mean they intend to enforce the status quo (although some certainly do), just that they want their intent to preserve it for themselves to be respected.
> [Targeting of my personal shower not likely]
The problem is creating the tooling to enable just that.
Sure, maybe I'm very well liked by all the officials governing my everyday life. But does this also apply to the blue haired radical feminist, who likes to toss bags with color at government officials?
What about the new intern, who told a government oversight official on a networking event that she's not interested in sleeping with him to advance her career?
What if a well meaning, but tired government worker selects the "ml" instead of the "l" option on the unit drop down menu by accident?
.
FFS, look at the recent patent troll documentary by the X-Plane creator. It doesn't take many bad apples to ruin the lives of MANY people.
I really don't see it. Trump has been doing nothing but consolidating his power since he took office. He is now passing economic policies without congress. The supreme court declared him quite literally above the law. How is that making things less centralized? Less authoritarian?
The only issue with Sanders was that the Democrats, in their weakness and deep fear of change, would never have let a true leftist hold the reins of the party. And now he's too old.
I don't see anything in Trump other than a self-serving fool. I won't spend more time enumerating the reasons why I think that way, I think you heard them already.
I too am European. I am confident his policies will turn the country into a shitshow, so let's watch how it goes from here. If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south.
> Trump has been doing nothing but consolidating his power since he took office.
Every president does that, Trump was just very inexperienced during his first term, failed to do so and trusted the GOP too much.
And while past Presidents could rely on the agencies working faithfully with them, Trump was sabotaged at every step along the way.
- The DoJ putting their feet up and refusing to do just about anything
- the military lying to him about the ongoing occupation of Syria
- the federal reserve constantly pushing up the interest rate from the moment Trump was elected, despite keeping it constant for the entirety of both of Obama's terms
- Never having a majority in either of Congress' houses, because several Republicans refused to work with him, and when the voters tried to replace those, other establishment candidates pretended to support the issues the voters wanted, only to do a 180 once in office (e.g. eyepatch McCain)
- The CDC, FDA and CMS colluding with each other to kill early Corona testing. At the end of January hundreds of laboratories all over the US had millions of tests ready, but were ordered by the CDC not to proceed without FDA authorization first, with the CMS ordering laboratory oversight bodies to immediately report any laboratory conducting unauthorized testing. And the few independent testing campaigns going on at that time were ordered by the CDC to immediately stop all testing and destroy already obtained results. Then the FDA simply put its feet up and told the laboratories that they were working on the authorization process. It "took" them more than a month, until Feb 29, to finally come out and allow applications, stating that it would take about 15 days to process an application. It wasn't until March 10th that testing could slowly begin.
- The constant barrage of activist judges, forcing the Trump admin to slowly fight each case in the higher courts. It wasn't until Biden told the courts to go pound sand, when he wanted to redistribute wealth from the working class to doctors, lawyers and engineers, that Trump realized that, as the head of the executive, he could have simply ignored the courts' orders until their decisions were overturned by the upper courts.
and many many more. And now Trump is simply making sure that during his second term he's actually in control of the executive branch, as is his duty, and not facing each agency going rogue on its own.
> He is now passing economic policies without congress.
Many things qualify as economic policy, many of these within the President's authority.
Overall only about 10% of the policies accumulated by past Presidents have any backing in law. Trump's sanity would be very questionable if he simply stopped playing by the rules past Presidents have established.
> The supreme court declared him quite literally above the law.
They did not. The law simply applies very differently to the highest elected office. Everyone knew that already, but for some reason now keeps pretending that it's big news.
What do you think would happen to you if you simply started drone striking people all over the world? Yet neither Bush nor Obama is sitting in jail. The latter even got himself a shiny Nobel Peace Prize. Preemptively.
The SC simply tossed out an absolutely ridiculous decision by the lower courts. They even explicitly left the door open for the lower courts and prosecution to overturn the SC's ruling. If they can show how the executive branch can function without the President making decisions within his constitutionally enumerated powers, they've got a case.
The fact that this case ever went anywhere, let alone that sitting SC judges dissented, just shows how beyond partisan the judicial system has become.
> How is that making things less centralized?
The right understands centralization of power as organizing the government bodies that hold decision-making power over certain ranges of issues under a single entity.
This can mean assuming entirely new powers or appropriating them from other entities like the states.
Trump has done neither of these; in fact, quite the opposite: constantly eliminating assumed powers by removing regulations, and a few times returning federal powers back to the states, as famously with Roe v. Wade.
Of course there are exceptions, too:
Like the Federal Reserve. It is a 4th branch of government, established by Congress but subject to neither executive nor congressional oversight, and the only branch of government Congress has no budget authority over.
The members of its governing board are appointed to ridiculous 14 year terms, they audit themselves completely independently with no disclosure requirements and have only very minor reporting duties towards Congress.
It's been a HUGE PitA for the fiscally conservative Republicans for a long time. And Musk is a huge fan of some of them, like Ron and Rand Paul. Musk is probably trying to convince Trump to do something about it.
So I wouldn't be surprised if Trump just assumed executive oversight authority over the FR. And yes, that'd be a huge violation of law. So if it's going to happen, then probably towards the end of his term to avoid being impeached on the spot.
> Less authoritarian?
If you have fewer powers, you can exercise less influence, which, in the eyes of the right, is less authoritarian.
The fault lies with those who have acquired these powers in the first place. All Presidents have made use of these powers; it's just that each and every one was part of the establishment, so the media never called it out. And Trump is the first President in a LONG time who thinks the government has grown significantly too large and doesn't like every spending bill he's seen.
> And now [Sanders] is too old.
Nah, quite a few people become up to 110 years old, some even beyond 200.
He's finally starting to grow a spine. And his head seems to remain functional, too. If only he hadn't suddenly gained a beachfront house after endorsing the BodyCount Queen (and sadly I don't mean this sexually), he might have remained well respected.
Not that it matters, but I might consider him again if he
- adopts an affirmative stance on deregulation
- stops advocating for immigration to keep the wages of the working class low
- adopts a strict 0 tolerance stance on illegal immigration to defeat the slave trade over the southern border
- leaves the Democrat party or the Democrat party reforms
> If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south
Sure, but I'm looking more towards Argentina and El Salvador.
The US has a $36T problem, which it'll pay $1T in interest on every year. And the US budget deficit has surpassed $2T per year. Just the automatic refinancing of the current debt will blow yearly interests beyond $1.6T this year, making it the single largest expense of the US, double of what the US is spending on its military.
And that is under the assumption that the Federal Reserve will suppress interest rates. If they don't, the US will pay about $1.8T in interest just on the existing and already budgeted debt.
.
In other words:
DOGE has to wipe $2.6T off the federal budget in 2025 and another $350B in 2026 just to stop the snowball from rolling.
*That is 45% of the US federal budget just to keep the situation from getting any worse*
.
If we assume no cuts to Medicaid, Medicare, Social Security and Veterans
*THE US HAS ONLY $100B LEFT TO OPERATE ITS ENTIRE GOVERNMENT, INCLUDING THE MILITARY*
And again:
*THATS JUST TO KEEP THE SITUATION FROM GETTING ANY WORSE*
.
Argentina is in deep s--t, too, but at least their numbers are not quite as absurd. What might break their necks is their even higher 155% debt-to-GDP ratio, compared to the US's 122%.
That leaves pretty much only El Salvador among the right-wing countries that haven't inherited a giant s--t pile.
Russia and China are laughing their behinds off right now, because unless Trump figures out how to run the entire US on the budget of Italy, the US goes belly up.
*AND IF TRUMP INTENDS TO IMPROVE THIS SITUATION BY JUST 1% AT THE END OF HIS SECOND TERM, HE'LL HAVE TO FIGURE OUT HOW TO RUN THE ENTIRE US ON THE BUDGET OF ROMANIA!*
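To make the arithmetic behind those numbers explicit, here's a quick sketch using only the figures claimed above; none of the inputs are independently verified here, so treat each one as the claim it is:

    # Back-of-the-envelope check of the figures above, using only the
    # commenter's own (unverified) numbers, all in trillions of dollars.
    debt = 36.0          # total US federal debt
    interest_now = 1.0   # current yearly interest expense
    interest_2025 = 1.6  # claimed interest after refinancing at higher rates
    deficit = 2.0        # current yearly budget deficit

    # "Stopping the snowball" = closing the deficit AND absorbing the
    # claimed jump in interest costs.
    cut_needed_2025 = deficit + (interest_2025 - interest_now)
    print(cut_needed_2025)          # ~2.6 -- the $2.6T figure above

    # Implied average interest rates on the debt:
    print(interest_now / debt)      # ~0.028 -> ~2.8% today
    print(interest_2025 / debt)     # ~0.044 -> ~4.4% after refinancing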
Here's the actual response to this exact question that I get from grok-3-early on LMArena (the cutoff is where it stopped, but I think it's pretty clear at this point that the original tweet is just trolling; note that it doesn't have the CoT, either):
The Information is widely regarded as one of the most credible and high-quality tech journalism outlets, focusing on in-depth reporting, analysis, and exclusive scoops in the technology, business, and startup ecosystems. Founded by Jessica Lessin in 2013, it differentiates itself from other tech news platforms through its subscription-based model, emphasis on original reporting, and avoidance of clickbait or sensationalism. Here's a breakdown of its strengths and weaknesses:
---
### *Strengths:*
1. *High-Quality, In-Depth Reporting:*
- The Information is known for its detailed, well-researched articles that often uncover behind-the-scenes developments in the tech industry. It frequently breaks exclusive stories, such as executive changes, startup funding rounds, or strategic shifts at major tech companies.
- Its focus on quality over quantity appeals to readers who want substantive insights rather than shallow, breaking-news updates.
2. *Subscription Model and Independence:*
- Unlike ad-supported publications, The Information relies on a paywall and subscriptions, which reduces conflicts of interest and ensures editorial independence. This model allows them to prioritize accuracy and depth over chasing clicks.
- The paywall also attracts a niche, professional audience (e.g., tech executives, investors, and analysts) willing to pay for premium content.
3. *Focus on Niche, Professional Audience:*
- The outlet caters to industry insiders, venture capitalists, entrepreneurs, and decision-makers who need reliable, actionable information. Its reporting often includes detailed financial data, market trends, and strategic insights.
- Features like "The Big Interview" and "The Information Weekend" provide thoughtful analysis and long-form content for this audience.
4. *Reputation for Accuracy:*
- The Information has built a strong reputation for fact-checking and avoiding the rumor mill, which is common in tech journalism. This makes it a trusted source for professionals and academics alike.
5. *Global Coverage:*
- While Silicon Valley is a core focus, The Information has expanded its coverage to include tech ecosystems in China, Europe, and other regions, offering a global perspective on the industry.
---
### *Weaknesses:*
1. *Paywall Limits Accessibility:*
- The subscription cost (currently around $399/year or $39/month) is steep compared to free or ad-supported tech news outlets like TechCrunch or The Verge. This limits its accessibility to a broader audience and makes it less viable for casual readers.
- Some argue that this creates an echo chamber, as only those with the means or professional need can access its insights.
2. *Niche Focus Can Feel Narrow:*
- The Information focuses heavily on tech, business, and finance, which may not appeal to readers looking for broader coverage of topics like politics, culture, or consumer tech trends.
- Its content is often geared toward industry insiders, which can make it feel dry or inaccessible to those outside the tech and investment worlds.
3. *Limited Breaking News:*
- While The Information excels at deep dives and exclusives, it is not designed for real-time, breaking news coverage. Readers looking for up
A "hit piece" is not just a negative article, it typically refers to the use of false information or rumors to support something. Did you want to say that it's "an obvious anti-Musk piece"?
"A published article or post aiming to sway public opinion, especially by presenting false or biased information in a way that appears objective and truthful."
"false OR biased". So, a biased partisan article attempting to sway public opinion about someone is by definition a hit piece. Even without explicit lies.
And a lie is a very blurry thing. You can completely mislead the reader by not being factually incorrect and still present a completely false representation of reality. Usually by eliminating the relevant context or mitigating fact that would completely change the interpretation of an event. Sometimes editing out the part of a sentence that would give a different if not opposite meaning. That’s the standard operating procedure of all journalists these days, when they even bother to get their facts right.
> Everyone—and not just The Information—should be genuinely terrified that the richest man in the world has built a Large Language Model that spouts propaganda in his image.
If we survived Gemini refusing to draw white vikings we will survive that too.
The real concern isn't that Grok chatbot will be biased if you ask it a question like that. In any case, knowing Musk, it won't be subtle, so people will know what they are getting.
No, the real worry is that Grok is what Musk's "young conservative genius" squad is going to put in charge of many of the things in our government, basically, on the basis that it saves money.
Every competitor was done when Claude 3.5 was released, every competitor was done when o1 was released, the entire West was done when DeepSeek was released, the world was done when Mistral Le Chat was released; I guess now it's time for the solar system to be done because of Grok 3. Let's see what new model dominates the galaxy next week.
There's a level of truth to many of those statements.
1) Claude 3.5 prevented OpenAI from making big monopoly profits on LLM inference
2) Open source models like Mistral and Llama effectively prevented any regulator from controlling how people fine tuned models, and what they used them for
3) Deepseek prevented the collective west from exerting control over the creation of base models
"Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data."
Despite explicitly requesting the tetris/bejeweled hybrid to be "insanely impressive", the result was ugly and clunky. With that demo running in the background, they then segued into a hiring pitch for a new AI game studio. Consider me unimpressed.
Yes, but we have a long track record of one overpromising and underdelivering, charging money for what doesn't yet exist, while the other released stuff on a minimal website with an associated blog post for fanfare, actually (metaphorically) turned the world upside down, and only charges for what they actually deliver. Yes there's hype now, but that's how it began.
So I think people are less distrustful of Altman when he says "thousands of days".
It would be satisfying if he got called out for repeating himself the next time he comes out and says he is scared of how powerful their next model is.
For low single-digit marginal benchmark gains: a short, ephemeral blaze of glory on top of the charts until Claude 4 snuffs it out later this month. Benchmarks we all know mean nothing, because they just confirm "Pretraining on the test set is all you need".
For what? There is no ROI at that price point. There is no monetisation potential.
DeepSeek trained R1 for 1.25% ($5M) of that money (using the same spot price) on 2048 crippled export H800s and is maybe a month behind. $395,000,000 more. For what? And why?
Stopping Grok 3 training at any point after the R1 paper and implementing their open-source results would likely have reduced training time on xAI's massive GPU cluster to under a day.
This all by the man in charge of “government spending efficiency”. He’s gonna have to work overtime now to free enough money for government Grok licenses to pay back his investors.
What is wrong with everyone in AI? Why do all the experts, the business profs, the godfathers of AI gloss over this?
What the hell does this whole bonfire of the vanities of carbon, money and GPU depreciation achieve?
I know it's just for raising the next round [1], but the premise does not check out. OpenAI already won the game of riding tech demos through funding rounds until you can buy yourself a government bailout.
I used to think the same way wrt Nvidia stock when it tanked: compute clearly has diminishing returns. Tech companies subsequently announced capex equal to or greater than expected in compute. I smacked myself on the forehead when I realized I'd been thinking too much like an engineer. Tech CEOs badly want to believe they have an edge over every upstart from San Francisco to Shanghai. Unlimited spending on compute gives them that reassurance. In fact, the more threatened they feel, the more they spend to cling onto it.
Kids have security blankets. Tech CEOs have security compute clusters.
This is the danger of being informed only by sensational headlines. Nvidia's stock has fully recovered and is again near an all-time high. You seem to be generalizing about "Tech CEOs" — but in this case, GPUs are the advantage. They are necessary to achieve the outcome, and yet they are severely supply constrained. It's smart to overpay now.
Apple did something similar with NAND storage for the iPad mini. They took a bet that could have been wrong. It was not wrong. Competitors had a hard time because of it.
Load isn't what causes degradation; it's heat. And as someone who has mined crypto for years, I'm aware that there are a lot of things that can be done to run hardware quite hard and keep thermals low. Whether or not that is what is being done, I have no idea. A GPU mining crypto for 5 years kept below 65C (rather easily done) is going to have far more life left than a GPU in some kid's gaming PC that frequently spikes to 85C for even a year.
Everything is near an "all time high." Microstrategy stock is hovering near an all-time-high, and they're just a company that buys up Bitcoin and wastes some of it. Meme coins are floating up to all-time-highs. Stop using asset prices to justify anything people are doing, they're fully decoupled from anything happening below.
I don't think I was informed by sensational headlines. I was well into talking to people I knew about how DS's performance relative to compute was a game changer much before the stock crash.
It's not binary where you either have compute or not. You definitely do need GPUs, but there's already masses of compute; I believe it doubles every ten months or so just from Nvidia's chips. Many factors make it a very irrational decision:
1) Companies were spending hundreds of billions collectively on AI capex. Meta alone was 75 billion projected this year. This is an extraordinary bet, given that the most revenue any AI company makes is a few billion by OpenAI.
2) When DS came out, it was a huge validation of the moatless idea. These SOTA companies have no moat, at best they are spending tens of billions to maintain a few months edge.
3) DS was also a huge validation of the compute saturation idea - that SOTA models were always massively inefficient. At best the excess compute was traded for iteration speed.
4) Many other more technical arguments - Jevons paradox, data exhaustion (synthetic data can only be generated for a fixed set of things), apparent diminishing returns (performance relative to compute, the denominator has been exponential but the numerator logarithmic)
So on one hand you have these SOTA models which are becoming free. On the other hand you have this terrible business model. I strongly suspect that AI will go the way of Meta's Metaverse - a staggering cash burn with no realistic path to profitability.
It's one thing to invest in a new technology with tangible benefits to your product. It's another to spend vastly, vastly more into vague promises of AGI. To put it into perspective, Meta will spend on AI capex in a few months of 2025 as much as Apple spent on NAND in total. What advantage is there to be had with SOTA models? You do 20% better on some AIME/IQ/competitive coding benchmark, which still translates atrociously to real world issues.
But Nvidia will be very successful, because these companies frankly have lost a lot of the plot and are FOMOing like mad. I still have memories of the 2013 AI gold rush where every tech company was grabbing anything with AI in it, which is how Google got DeepMind. They are being enormously rewarded for it by the stock market, with Meta's price up 6x since its lows.
It is entirely possible that LLMs end up serving some useful purpose, but don't end up being great businesses.
I can think of a million different software services that have some value to users, but don't have some multi-trillion dollar revenue stream flowing from them.
There is an idea that these LLM companies are going to be able to insert their agents into the labour market and capture some percentage of the savings that firms realize from laying off their human workforce. Given the competitive environment, it is far more likely that these LLMs become an undifferentiated commodity good, and their value to the economy gets captured elsewhere. Currently the only value being captured is at the infrastructure level, and that is all predicated on a series of assumptions around software business models that have not materialized yet.
>For what? There is no ROI at that price point. There is no monetisation potential.
I think your whole argument is based on this being true, but you didn't give much argument about why there is no ROI. 400M USD isn't hard to generate...even a moderate ad engagement lift on X would generate ROI and that's just 1 customer.
Imagine going back in time and showing every VC how great the search business will be in 20-30 years. The only rational response would be to make giant bets on 20 different Googles...and I think that's what's happening. These all seem like rational investments to me.
Ken Griffin gave an interview where he said something along these lines: the technologies of the dot-com bubble pretty much turned out to be what everyone at the time thought they would become. The issue was that valuations grew way too fast, and it took much longer than expected for the companies to build out their products.
I think a similar thing is playing out with AI. In 5-10 more years these LLMs will replace a google search today (and maybe be even better).
_Could_ they have done the same thing with a tiny fraction of the money? Grok 3 benchmarks are SOTA for both base model and reasoning. By definition, nobody has been able to do the same thing with any amount of money (discounting o3 which has been teased but is unreleased). That may change in the future! But as of now this is the case.
So apart from the part where SOTA doesn't mean anything in the real world (there is no monetisation, there's no moat): please, it's benchmarks; we've all known how you beat those since 2023.
Why don't you do it then? If you are talking about DeepSeek's "$5M", then you would be interested to know that they pay 7-digit salaries and reportedly have H100s worth $2B[1].
I just wonder if it matters. If Google had spent 10x as much in the first 5 years of its life, would it be a worse company now? Giant TAM, winner takes all (or most?), all that matters is winning.
People like Demis Hassabis and Dario Amodei say that R1's efficiency gains are exaggerated. The $5M training cost seems to be fake, as sources suggest they own more GPUs.
You seem to be assuming that the full cost of the cluster is recouped by Grok 3. The real value will be in grok 5, 6, etc…
xAI also announced a few days ago they are starting an internal video game studio. How long before AI companies take over Hollywood and Disney? The value available to be captured is massive.
The cluster they’ve built is impressive compared to the competition, and grok 3 barely scratches what it’s capable of.
Yes. Why do we get these replies on HN that seem to only consider the most shallow, surface details? It could well be that xAI wins the AI race by betting on hardware first and foremost - new ideas are quickly copied by everyone, but a compute edge is hard to match.
The compute edge belongs to those like Google (TPU) and Amazon/Anthropic (Trainium) building their own accelerators and not paying NVIDIA's 1000% cost markups. Microsoft just announced experimenting with Cerebras wafer-scale chips for LLM inference, which also offer cost savings.
Microsoft is in process of building optical links between existing datacenters to create meta-clusters, and I'd expect that others like Amazon and Meta may be doing the same.
Of course for Musk this is an irrational ego-driven pursuit, so he can throw as much money at it as he has available, but trying to sell AI when you're paying 10x the competition for FLOPs seems problematic, even if you are capable of building a competitive product.
DeepSeek just showed the compute edge is not that hard to match. They could have chosen to keep the gains proprietary but probably made good money playing the market instead, quants as they are.
If you’re using your compute capacity at 1.25% efficiency, you are not going to win because your iteration time is just going to be too long to stay competitive.
Software and algorithmic improvements diffuse faster than hardware, even with attempts to keep them secret. Maybe a company doubles the efficiency, but in 3 months, it's leaked and everyone is using it. And then the compute edge becomes that much more durable.
They achieved the same results for 1.25% of the computation cost... If they actually had that computation capacity, it would be game over with the AGI race by the same logic.
xAI bought hardware off the open market. Their compute edge could disappear in a month if Google or Amazon wanted to raise their compute by a whole xAI.
There seems to be a coordinated effort to control the narrative. Grok3's release is pretty important, no matter what you think of it, and initially this story quickly fell off the front page, likely from malicious mass flagging.
One thing that's taken over Reddit and unfortunately has spread to the rest of the internet is people thinking of themselves as online activists, who are saving the world by controlling what people can talk about and steering the conversation in the direction they want it to go. It's becoming harder and harder to have a normal conversation without someone trying to derail it with their own personal crusade.
How? After an enormous investment, the latest version of some software is a bit better than the previous versions from its competitors, and will likely be worse than the future versions from its competitors. There's nothing novel about this.
NVIDIA's CEO Jensen Huang: “Building a massive [supercomputer] factory in the short time that was done, that is superhuman. There's only one person in the world who could do that. What Elon and the xAI team did is singular. Never been done before.”
Largest supercluster in the world, created in such a short time frame, is pretty important. 4 years typically, cut down to 19 days. That's an incredible achievement and I, along with many others, think it's important.
Okay but that's obviously a nonsense claim. Find me a computer on the https://en.wikipedia.org/wiki/TOP500 that was built 4 years after the chips it uses debuted.
> There seems to be a coordinated effort to control the narrative.
Do you have any evidence for this?
Who would want to coordinate such an effort, and how would they manipulate HN users to comment/vote in a certain way?
I think it is far more plausible that some people on here have similar views.
> [people] controlling what people can talk about
That's called 'moderation' and protects communities against trolls and timewasters, no?
> and steering the conversation in the direction they want it to go
That's exactly what conversation is about, I'd say. Of course I want to talk about stuff that I am interested in, and convince others of my arguments. How is this unfortunate?
Is it? It's Yet Another LLM, barely pipping competitors at cherry-picked comparisons. DeepSeek R1 was news entirely because of the minuscule resources it was trained on (with an innovative new approach), and this "pretty important" Grok release beats it in Chatbot Arena by a whole 3%.
We're at the point where this stuff isn't that big of news unless something really jumps ahead. Like all of the new Gemini models and approaches got zero attention on here. Which is fair because it's basically "Company with big money puts out slightly better model".
I'd say Grok 3 is getting exactly the normal attention, but there is a "Leave Britney Alone" contingent who need to run to the defence.
We have no clue how all this is going to play out, what value is captureable and what parts of a lead are likely to stay protected. This race is essentially the collective belief in a generationally big prize and no idea how it unlocks.
The problem with that for a comment section is it reduces ALL comments to gossip and guessing, which makes people feel stupid.
Reddit today feels like it's absolutely overrun by bots. So much of the comment content is so superficial and cookie-cutter I find it hard to believe it's all produced by human beings. A lot of it reads like the output of small cheap LLMs of the sort that would be used for spam bots.
Of course we know X, Facebook, and probably most other social media is also overrun by bots. I don't think you can assume that humans are on the other end anymore.
The point is that it is inefficient. Others achieved similar results much cheaper, meaning they can go much further. Compute is important, but model architecture and compute methods still outweigh it.
How quickly will Grok 4/5/6 be released? Of course you can choose to keep running older GPUs for years, but if you want bleeding edge performance then you need to upgrade, so I'm not sure how many model generations the cost can really be spread over.
Also, what isn't clear is how RL-based reasoning model training compute requirements compares to earlier models. OpenAI have announced that GPT 4.5 will be their last non-reasoning model, so it seems we're definitely at a transition point now.
xAI is not trying to make an immediate profit -- ironically, just like DeepSeek. They will undoubtedly use more efficient training processes in future runs and they will scale that across their massive GPU cluster. Just because they didn't cancel the training of Grok 3 and start over absolutely does not mean they will not incorporate all the work from R1 and more in the next run.
What you're seeing right now is pure flex and a signal for the future and competition. A much maligned AI team that hasn't even been around for very long at all just matched or topped the competition without making use of the latest training techniques yet. The message this is intended to send is that xAI is a serious player in the space.
> DeepSeek trained R1 for 1.25% ($5M) of that money (using the same spot price) on 2048 crippled export H800s and is maybe a month behind.
This is a great example of how a misleading narrative can take hold and dominate discussion even when it's fundamentally incorrect.
SemiAnalysis documents that DeepSeek has spent well over $500M on GPUs alone, with total infrastructure costs around $2.5B when including operating costs[0].
The more interesting question is probably: why do people keep repeating this? Why do they want it to be true so badly?
SemiAnalysis is wrong. They just made their numbers up (among many other things they have invented - they are not to be trusted). I have observed many errors of understanding, analysis and calculation in their writing.
DeepSeek R1 is literally an open-weight model. It has <40bln active parameters. We know that for a fact. A model of that size is roughly consistent with optimal training over the time period and server hours claimed. In fact, the 70bln-parameter Llama 3 model used almost exactly the same compute as the DeepSeek V3/R1 claims (which makes sense, as you would expect a bit less efficiency for the H800 and for the complex DeepSeek MoE architecture).
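For anyone who wants to check that plausibility claim themselves, the standard back-of-the-envelope is training FLOPs ≈ 6 × active parameters × tokens. Here's a sketch using the figures from the DeepSeek-V3 report (37B activated parameters, 14.8T tokens, 2.788M H800 GPU-hours); the ~990 TFLOPS peak per H800 is an assumption (dense BF16), and V3's heavy use of FP8 would only make the numbers easier to hit:

    # Rule of thumb: training FLOPs ~= 6 * N_active * tokens.
    n_active = 37e9        # activated parameters per token (MoE)
    tokens = 14.8e12       # training tokens
    gpu_hours = 2.788e6    # claimed H800 GPU-hours
    peak_flops = 990e12    # assumed peak FLOP/s per H800 (BF16 dense)

    required = 6 * n_active * tokens            # ~3.3e24 FLOPs
    available = gpu_hours * 3600 * peak_flops   # ~9.9e24 FLOPs at peak

    print(f"implied utilization: {required / available:.0%}")   # ~33%

An implied ~33% utilization is well within the range real training runs achieve, which is why the claimed compute budget is not absurd on its face.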
> For what? There is no ROI at that price point. There is no monetization potential.
It appears that LLM chat interfaces will replace Google SERPs as the arbiters of truth. Getting people to use your LLM allows you to push your world view. Pushing his "unique" world view appears to be the most important thing to modern Musk.
In that light, paying $44B for Twitter and billions for Grok training makes perfect sense.
The beauty of a failed investment is that it never goes below zero. So upside is the only thing they care about. Why invest in a near-zero chance of a random SaaS taking off when you can invest in a near-zero chance of creating superhuman artificial life?
Yes but why? This is what I really don't understand.
Say AGI is achieved within a reasonable timeframe. Odds are that no single company will achieve it; there will be no monopoly. If that's the case, where is the trillion dollars of value for investors? From every claim we hear, AGI will lead to hundreds of millions of jobs disappearing (all white-collar jobs) and tens of millions of companies disappearing (all the companies that provide human-produced services). Who is going to buy your AGI-made products or services when nobody is paid anymore, when other companies, big and small, have ceased to exist? Sure, you can make extraordinary accomplishments and advance humanity far, far ahead, but who is going to pay for that? Even states won't be able to pay if their taxable population (individuals and corporations) disappears.
So where will the money come from? How does it work?
Also, profitability won't materialize in an environment with so many competitors offering comparable products. Perfect competition destroys profit. The good becomes a commodity, and the price people will pay simply becomes the marginal cost of production (or in this case, less, while the dumb money is still chasing the hype).
It works well when you see the company stuffing dollar bills into its sports car to race against a Chinese family sedan with a hand-tuned ICE running on 1.25% of the fuel.
As a consumer, I'm just happy that base models are improving again after a ~quarter or more of relative stagnation (last big base model drop was Sonnet v2 in October). Many use cases can't use o1, r1, or o3[-mini] due to the additional reasoning latency.
DeepSeek wouldn't have been able to train R1 without their ~600B-parameter base model, so you should consider the cost of that model when you compare with Grok.
In any case, Elon won't win this race, because the best talent will not work for him. He used to have a good reputation and a lot of money, which is a deadly combination. Now he only has the latter, and that's not enough when leading AI people can make 7 figures at other companies.
To be clear 1: I'm not saying that the people who currently work on Grok are not great. It's not about hiring some great people; it's about competing in the long run. People with other options (e.g. offers from leading AI labs) are more likely to accept those offers than to join his research lab.
To be clear 2: I'm not talking about Elon's reputation due to his politics. I'm only talking about his reputation as an employer.
He has the vision and the marketing skills, but that's not going to be enough to lead the AI race.
Actually, the $5 million figure is the compute cost for the base ~600B-parameter model. Training R1 on top of it was just ~8000 steps of reinforcement learning, so I expect the vast, vast majority of the training cost is already included in the pretraining stage.
Any source? I’m a heavy user of Claude and pay for the Teams plan just for myself so I won’t get throttled. Love it. But I’ve been impressed with O1 Pro lately. That said, I don’t like paying both €166 for Claude Teams and €238 for OpenAI Pro. :)
> This all by the man in charge of “government spending efficiency”.
Per court filings by the administration, Musk is not in charge of DOGE, nor does he have any role in DOGE, nor any decision-making function in government at all, he is a White House advisor unconnected to DOGE.
Oh my god, could you possibly pile on him more, especially on stuff that is orthogonal to Grok? You don't need to get sentimental about his role in the government when you want to criticize his approach to AI.
Corruption aside, China is run by smart leaders who execute on a long-term plan, and are gradually extending their influence over the world. The US is doing the opposite.
It's easy to execute on a long-term plan when your government is totalitarian, run by a cult of personality, and has no concern for individual rights.
If you think an executive that ignores court orders is going to survive for a long time in America, I am willing to bet any amount of dollars against you at any odds.
It's a good bet for me, because if I lose, dollars won't be worth anything anyway.
I don't understand how or why Grok would be related to "understanding the nature of the universe", as Musk puts it. Please correct me if I'm wrong, but they basically just burned more cash than anyone should, buying Nvidia GPUs and making them predict natural language, right? So they are roughly on par with all the other companies that did the same.
This is not innovation, this is baseless hype over a mediocre technology. I use AI every day, so it's not like I don't see its uses, it's just not that big of a deal.
Answer 1: Some people think that LLMs are a path to the singularity, a self-improving intelligent program that will vastly exceed human intelligence and will be able to increase its knowledge exponentially, quickly answering all answerable scientific questions.
Answer 2: LLM companies need to keep the hype train rolling. I didn't watch the whole clip; I jumped around a bit, but I noticed that every time Musk interjected something, it was to exaggerate what was previously said. "Grok contains the whole internet"—"the whole of human knowledge, actually!"
I think that both answer 1 and answer 2 apply to Musk. He seems to believe that they're building a god-like entity, and he also needs to keep the money train rolling.
This and only this. Everything he says when talking about how good his products are is lies and exaggeration to attract investors, from the promise of two manned missions to Mars in 2024, to a 300-ton payload in space, to FSD.
There's a more short-term goal for Grok, which is to replace what is left of the federal government with AI. That will significantly boost the money train, but is also a utopian (for some, dystopian for others) goal of replacing the expensive 'deep state' with a slim set of impartial algorithms.
This isn't unequivocal proof, but the broad goal of automation lends itself pretty strongly to LLMs, and oh boy, what LLM technology do you think they want to use?
He may have deleted it, but around 2020 or so, when there was starting to be a lot of hype about LLMs, Elon Musk responded to a "doomer" on Twitter saying that he "didn't see the potential for that", referring to LLMs achieving AGI. It was a 100% dismissal of everything he is saying now. At that point Elon Musk had already been saying publicly for years that "AI is more dangerous than nukes", but he also had voluntarily walked away from OpenAI, which he would never have done if he thought there was any chance of AGI. I just want to know the truth: is this really just advanced search, where some jobs will be lost because they turned out to be nothing more than search tasks (i.e. coding boilerplate), or are we really on the cusp of AGI (and therefore in a great deal of danger)? It's impossible to say whether Elon Musk really believes what he is saying; there are public figures on both sides providing conflicting explanations.
As I watched the Grok 3 stream I became very angry. I am so very tired of being jerked around and not knowing whether I should be planning for the future or investing in the world as it is now... it's really a form of psychological torture.
I work in AI and love the technology. But all the hype and grandiose claims make it awkward when people ask what I do, and it makes hiring harder when experienced developers hear 'AI development' and walk away - even though it's mostly just solid full-stack engineering work.
I am always looking for roles, and I have pretty good full stack experience (a few years of C++, C#, some JS, TS, backend and frontend web, C, Zig, Rust, built a few hobby compilers and other stuff).
I apply to pretty much every job that sounds reasonably good in terms of work-life balance, but I completely ignore anything that says AI. I really, really, really do not want to be part of a company that lies to itself, and so far all AI companies look like they are. It's not AGI. It's not gonna be AGI. Ride the hype train, cash out and lay off 80% of the workforce and jump on the next hype train, whatever. But don't hope that people who want a stable job want to hop on something that delivers such a shaky definition of value.
Even if you're an AI-skeptic, it's hard to argue that companies building AI customer support for example aren't en route to improving the whole "calling your ISP's team in India" experience.
I absolutely hate the Elon hypetrain, but I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water.
Can we stop for a second and just marvel at a new piece of human ingenuity? Let's not give Elon too much credit, but I think that AI as a whole helps us all understand the nature of intelligence, and therefore humans' place in the universe.
One of the fundamental questions of human existence is: what does it mean to exist and think? Every time we build a new human-like thing it helps us understand the context of our own existence. (Not just computers or AI, but also airplanes, factories, etc.)
True AGI would force us to rethink what it means to be a thinking human being, and I think current LLMs already should and do.
> I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water. Can we stop for a second and just marvel at a new piece of human ingenuity?
I don't know, man. We're staring down the barrel of at best a WW3-event and at worst an extinction-event. We're doing absolutely nothing to stop it, even though we have all the answers and the resources to do so. Instead, we're making the problem even worse all so some marketers and scammers can spend someone else's money to generate garbage pictures and SEO spam, so the worst people on the planet can gain even more money and power than they already have.
I'd love to be positive about this tech, I'm sure it's cool or whatever, but it's really hard to be positive about anything right now, especially when the tech in question is speeding us straight along the path to mass death. The world sucks and the people running the LLM stuff are amoral monsters putting all of their resources into making it worse. I'm not excited about any of this.
What's happening definitely makes me nervous, but "at best a WW3-event and at worst an extinction-event" seems a bit much. Mainly because there are a _lot_ of unknowns. Better try to get comfortable with just riding this out.
It really isn't. Climate change is going to make large amounts of land unlivable. That's going to cause a climate refugee crisis. I agree the effects of that refugee crisis are unknown, but I can't see any resolution that doesn't involve increased nationalism, civil wars, and violent resource conflicts. Given this is a global crisis, that's a recipe for WW3.
This was all avoidable, of course. But instead of fixing it, we spent decades fiddling around with toys like LLMs. Whee.
LLMs don't make me question what we know about humans and thinking. They are really good at convincing us that they're good, but really, that's other humans building stuff to convince us that it's good. There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.
> There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.
I think this is my main point: isn't it amazing that a thing that predicts words other humans have previously written manages to appear intelligent or, more pointedly, to have utility in communicating real thoughts and ideas?
If you've ever asked an LLM a question and gotten a satisfying answer, that means there is some human-level intelligence somewhere in the token filtering and recombining that an LLM does.
Specifically, I think the test of human-like intelligence is literally the output: if we get utility from the arrangements of the tokens it outputs, that in and of itself demonstrates that some portion of human intelligence could be this same token-generation mechanic.
No. Just means we are easy to fool. Like apes who see themselves in the mirror and fail to recognize they are seeing themselves in the mirror, thinking it is a different ape (and trying to mate with or attack it).
The invention of the mirror by humans probably provided an interesting insight into our own existence... I wonder what it would have been like to see your own reflection for the first time as a technology. How would that change your outlook on your self-hood and identity?
Great question to ponder. Surely people would have seen their own reflections in water, but the mirror itself would have made "Reflection Technology" for "Artificial Introspection" more scalable. I suspect the mirror offered modern people a new viewpoint, allowing one to see how one is perceived by others. I do not think selfhood and identity would have been affected. My main question is about when people came to behave differently than apes. Douglas Hofstadter's "The Mind's I" may have a few hints on perception of self for you.
It does make me question humans and thinking but in the opposite direction.
It is like sitting down at a piano, sight reading a piece from sheet music and then someone who has no idea what they are talking about claiming you composed the music on the fly. Then when you point out the sheet music they just double down on some bullshit as to why they are still right and that is still composing even though obviously it is not.
Best analogy so far. I am adopting this for the next wave of "wait until the next model" and "but humans hallucinate, too" comments. Yes, when we feed our own output (language on the web) back into ourselves, things become tricky to tease apart, and it would seem like intelligence to us. Then again, the Mechanical Turk appeared intelligent, too. If we point out how it works, the "magic" should vanish.
Can you list a few demonstrations from a text-outputting computer program that would each convince you that there is intelligence here? Eg writing a bestselling novel, proving an unsolved number theory conjecture, etc. Or is your belief uncontestable?
That's not really a fair question. To answer it, the OP would have to define intelligence. If you have done so already, then by all means, do share your definition. If not, then you are in no better position to claim intelligence than the OP is in claiming lack thereof.
I feel like that's part of what Elon is flexing. Teslabot was a latecomer compared to competitors like Boston Dynamics.
Elon is showing off that he can marshal enough resources and talent to be on par (kinda) with state-of-the-art products in crazy time. That's been most of his superpower so far: not breakthrough tech that didn't exist before. We've had rockets before.
I don't like Elon either, but SpaceX did create breakthrough tech that didn't exist before by landing an orbital-class rocket, and as of today nobody else has done it.
Landing a rocket was considered impossible and unthinkable 10 years ago, and then SpaceX completely changed the game. And they're reinventing rocket tech again with Starship by catching it midair.
The number of people that have the capital and connections required to even attempt such things is very small, so it's not necessarily Musk's abilities that made those things happen, just the combination of having the power to allocate enough resources and an interest in such things.
He started both Tesla and SpaceX when he had "only" a few hundred million to his name and no more connections than would be expected of a Silicon Valley guy making payment software. And lots of brilliant guys, including John Carmack for instance, have tried their hand at aerospace - and failed. Jeff Bezos started Blue Origin before SpaceX was even founded, and it was literally only last month that they finally managed to get a rocket into orbit for the first time. There's a joke in the industry: 'How do you become a millionaire in the aerospace industry? Start out as a billionaire in the aerospace industry!'
And we live in a world of millions of millionaires and thousands of billionaires. For that matter, even China is trying its hardest to replicate SpaceX tech, given all the resources of the world's largest economy, 1.4 billion people (meaning a proportionally larger pool of intellectual outliers), and de facto authoritarian power to make it all happen. Yet they remain (in terms of rocket technology) behind SpaceX.
Being the most successful out of three or even a dozen doesn't make someone exceptional. Because so few people with interest in space have "only" a few hundred million, we can't really say if it's actually his talent that made it possible or simply the result of having access to resources that the vast majority of people could never dream of.
The U.S. has a long history of aerospace innovation, from NASA to private contractors, and Musk was able to use this ecosystem. China doesn't have that.
The WEF cites a global space economy at $630 billion, alongside investments of $70 billion. [1] And as anybody with half a head on their shoulders can see, space is where the big future economic growth will come from. Even if somebody has zero interest in space, which I think is very few people, that's where the next 'big boom' in economics will come from. And SpaceX was started on a fraction of $0.3 billion, with Carmack and Bezos just being a couple of names people here would be familiar with, amongst tens of thousands. Yet no competitor is anywhere to be found.
And the US doesn't really have a long history of aerospace innovation. In 1962 Kennedy gave his 'to the Moon' speech; 7 years later, in 1969, we'd gone from having nothing to putting a man on the Moon. From 1969 (well, 1972 in particular) to the birth of SpaceX (early 2000s), US space technology not only stagnated but regressed. This is why Boeing (which was a major part of the original space race) can't manage to even begin to replicate what we achieved in the 60s, in 7 years no less!
Incidentally this is also a big part of what motivated Elon to start SpaceX. He was looking at NASA's future plans for human spaceflight and they were basically nonexistent. So he wanted to launch a greenhouse to Mars and stream it growing, to inspire people and hopefully get things moving in the right direction again. NASA wasn't interested in any such things, the Russians wanted too much $$$, and so SpaceX was born.
When Musk 'joined' Tesla it was a name and two other guys. The latter two of whom left the company before a single car had been produced. They then sued for the right to be called founders a couple of years after they left, and once it became clear the company would stand a reasonable chance of success.
If my memory serves me correctly, they had put some Sony Handycam batteries on a chassis and driven it around before Musk. Musk was there for every actual product and its development.
You don’t understand the Musk business model. It has been the same for years. His wealth doesn’t come from his products, but from his fanbase buying his stock. The purpose of everything he does is to influence the public opinion to make him the tech genius of today in the collective psyche.
Of course, he needs to do impressive things, stuff that a normal person wouldn’t have the resources to achieve. It’s similar to Mr. Beast’s channel on YouTube, just on a way bigger scale. Do things that people can’t see anywhere else.
Musk’s money will come from his fans. And ETFs, trust funds and such will amplify this when he reaches a certain market cap. His crypto coins are the exact same scheme. Once you stop thinking in classic business school terms, it starts making way more sense.
Some of his ventures actually produce value! But that’s not where the money comes from. It comes from the belief, the adoration and the celebrity status that he has.
This is the real power in today's world. People need to know you from the screen. This clout catapulted him into the government of the US, made him the wealthiest man in the world, and gave him license to do anything he wants publicly without repercussions.
> His wealth doesn't come from his products, but from his fanbase buying his stock.
SpaceX is private, Starlink makes real money from real users.
> Everything he does is done to influence the public opinion to make him the tech genius of today in the collective psyche.
Well that's clearly not right. He's doing a lot of things to make himself seem like a total tool that we should all boycott no matter how good the products are. If he actually did what you say, he wouldn't be burning all these bridges.
The bulk of his wealth is the Tesla stock. I know that SpaceX produces value. Some of his ventures do. But the image of a genius entrepreneur is way more valuable than any government contract he might get.
Remember, he was way less crazy before his market cap skyrocketed. Now that his fanbase has reached a certain critical mass, he can afford to be polarizing as a PR strategy. He's been constantly testing what works.
He would be one of the richest people in the world even if he hadn't started SpaceX and Tesla, just from what he made from the sale of PayPal. Hell, he'd still be well off even if he hadn't started the original X and had just lived off the $13M he made from Zip2; $13M still places you in the top 0.5% in the world today.
SpaceX is not where most of his valuation comes from: it principally comes from Tesla stock, which is vastly, hilariously overvalued compared to its performance as a car company, and definitely compared to its performance as a technology company.
Even more importantly, analysis of Elon's tweeting patterns versus Tesla's stock valuation makes it obvious why Twitter became so central to him [1]: it was a massive driver of Tesla's stock value. Buying it was a good move from the perspective that he really couldn't afford to be banned from Twitter.
> Buying [twitter] was a good move from the perspective that he really couldn't afford to be banned from Twitter.
Another good move would have been to not be an ass. I mean, if he really did care about the stock price like this thread implies. Being seen as a genius entrepreneur doesn't imply you should also be a racist conspiracy theory nut. I'll repeat it again: the thesis that he's doing all these things due to competency is absurd. Never attribute to competence what can be sufficiently explained by stupidity.
He's doing these things because he's lost his marbles. Trying to make it out like he's doing it for reasonable reasons is like claiming Trump is playing 5-D chess. It's the same as QAnon logic. It just falls flat against Occam's razor.
Only people engaged in active buying and selling set the price of Tesla stock. It's called "price discovery." An institution such as an index fund passively holding stock does not affect the value of the stock, so the percentage of institutional ownership doesn't itself matter.
I'm interested in what you're saying about classical business terms. Can you elaborate on that a little? I've always found these kinds of people hard to understand.
The man has almost unlimited wealth and his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.
I've noted the same feeling when seeing VCs/business people speak when I've encountered them.
The classic business is about producing valuable economic outputs and creating a stable revenue flow from bringing them to market.
The modern era post-business is about dipping into everyone’s pockets, by securing cash flow from the stock market and the government.
Here building a profitable business model is less important than convincing people and the government to give you that dough. And the best way to do it is to have clout.
> his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.
I think it could help to think of a historical figure who has done impressive things and of whom you don't have an overly negative view. A lot of them seem really weird or alien. In democracies, political leaders are (sometimes!) more "normal" because they have to get elected. So think of a CEO/founder you like (Jobs?) or earlier people (Napoleon? I dunno, pick yours).
Read a bio on them, they're pretty strange (I like Churchill, dude was wild).
It seems that to do extraordinary things you need some traits that make you ludicrous. I don't really know, but it's definitely a pattern
What lie are you talking about? That he supports far-right parties? That he did some kind of gesture which in the video looks like a Nazi salute, but which he says was just a "my heart goes out to you" gesture?
Neither seems to show that the OP was lying, but I understand you have a different opinion than they do.
OP said he did a Nazi Salute. He did not do a Nazi salute.
He did something that people who fetishize the downfall of their enemy desperately want to be a Nazi salute, so they call it that, but it was not a Nazi salute.
If it were, he would have said so, since at that point you are basically showing the world you are a Nazi. But that is not what happened.
People really don't take their credibility seriously, and will cry wolf at anything that moves, seriously undermining any argument they make.
There are ample ways to hate on Elon using factually true things he has done. Sticking to those makes your arguments rock solid and keeps your credibility high. So much is wasted by idiots slinging smoke because it makes them feel right.
What if he does it again and just says "My heart goes out to you" and does the same hand throwing gesture? Just curious what you think in that hypothetical.
To me it seemed like he did it to troll 'libs' who dislike gestures like that.
There always exists a difference between what a person thinks of their own motive and how it is seen by others. In this case, while he might say he did one thing, many people took it to mean something else.
That's because here and on Reddit, there is a strong element of EDS (Elon Derangement Syndrome), which is characterized by not being able to discuss anything SpaceX, Tesla, X, The Boring Company, &c. do without completely politicizing it and completely avoiding talking about its technical or engineering merits.
I will gladly give him more money. I have no vendetta against him or his actions; you're free to do as you please, but don't force your political agenda on everyone.
Same. I support everything that's going on at DOGE. There's a certain type around here that needs to get used to the fact that their political views do not enjoy industry wide hegemony.
What's wrong with DOGE? How is eliminating bureaucracy/spending a bad thing? This was done about 100 years ago; research what happened after. It feels like the people who scream the loudest are the ones leeching from the system.
- Much of the spending they're eliminating is good and important (E.g. USAID).
- The way they're cutting is reckless (They accidentally fired nuclear safety workers).
- Many of the workers are Musk sycophants, who were hired from Twitter/Tesla/SpaceX.
- There's a tremendous conflict of interest in this agency being run by a massive government contractor (NASA and the military are both avid SpaceX customers).
- The workers are not experienced with the data they're working with, and misinterpret it constantly in public before verifying it.
- Despite claims of "transparency", Musk asserted that it was illegal to publish the names of his employees.
- Their one product, their government spending portal, is a worse version of a spending portal which already exists, and they didn't even secure their database.
- They say they "use AI" for their decisions. Which AI? Where is confidential data going?
The only "crisis" in US spending is that each party keeps shutting down the government when they don't like what the other one is doing. That's not going to go away with balanced books, but it sure is a recipe for disaster. It stops being a democracy when the decision making process stops being about majorities and starts being a game of "whoever doesn't blink sets the rules".
The country prints its own money, and right now the dollar is the world reserve currency, giving the US a huge advantage when it comes to borrowing whatever it wants. The biggest threat to the dollar remaining the reserve currency is that the scale of cuts being talked about can only be met by cutting at least one of: interest payments on government loans, or things the US government considers "mandatory" such as pensions, social security, etc.
There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.
The government can debase the currency by printing money. That impoverishes everyone except the wealthy, widening the wealth gap.
> There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.
Only if your economy doesn't grow.
> The government can debase the currency by printing money. That impoverishes everyone except the wealthy, widening the wealth gap.
It impoverishes lenders and savers, but not borrowers. It's not as simple as wealthy or poor, since anyone can be either.
That's not what's happened in the past 5 years. Assets have ballooned, the cost of living has skyrocketed, and wages have not kept pace. The rich got richer, the poor got poorer.
How is growing debt a good thing? How is cutting costs a bad thing? Have you seen the Argentina example? How did it turn out? Can the same thing be done in the USA? The only reason to say no is because someone is directly profiting from not cutting costs.
If you know Musk's story, then you know why he is the best candidate to do so.
You can afford it; it's fairly close to a neutral thing for a government to have debt.
Right now, the US gets to set the terms for those loans.
> How is cutting costs a bad thing?
Consider weight as an analogy: most people could do with losing weight, but losing it by getting enthusiastic amateurs to perform a quadruple amputation is not advisable.
Musk's target can only be met by deleting your entire armed forces.
And then you have to find another $1.2 trillion.
So the military and the entire CIA, FBI, NSA, DHS, NASA, Federal Highway Administration, FAA, the Department of Agriculture, the Department of Commerce, Department of Justice, the Treasury, …
… all that plus the military still doesn't add up to Musk's target.
Unless you want to cut stuff that's considered "mandatory" (like military pensions), or the interest repayments on the very loans you wish you didn't have.
The world’s richest man is gutting the regulatory bodies which were designed to keep him in check, and you’re defending it. I’d say you’ve drank the Kool-aid.
No, it seems like you are just against cost cutting, and it doesn't make sense why. The USA is a democracy; if after 4 years things go south, you can always vote differently and get everything back the way it was. The most probable outcome here is that the financial situation in the USA will get better; you make it sound like the USA would collapse to the stone age, and it won't.
We have a mechanism for that: it's called Congress. DOGE is an executive abuse of power, one that has resulted in people in national-security-critical roles being fired and then hastily rehired, because DOGE is that incompetent.
Article 2 says “the executive Power shall be vested in a President of the United States of America.” That means the power to “enforce laws, manage national affairs, and oversee government agencies.”
It’s Congress’s role to allocate funds to certain purposes, and the President’s to “take Care that the Laws be faithfully executed,” including overseeing the bureaucracy that implements them.
The President already has dictatorial power over the bureaucracy, as per the Constitution.
Not over spending, which is what Elon's seizure of the payments infrastructure amounts to.
Unless you extend it to that, in which case, why have courts? It's not like they have enforcement power, and the president can stretch discretion to the limit.
I don't think that's what DOGE is doing. It seems extremely vindictive and ideological in the way it's acting. Time will tell, but I would not be surprised if it ends up costing the taxpayers more in the long run.
Argentina and US are very different countries, starting these cuts with very different economic realities. For example, 55% of all registered workers are employed by the government in Argentina. Although not a directly comparable metric (since in the US you also need to account for state and local civil workers), the US federal government employs around 3 million people. That's just 1.87% of the entire civilian workforce.
Again, DOGE operates from the premise that the federal government is bloated. Although this is a very popular message, I'd love to see some more objective data to support this and I doubt that CDC or USAID are the agencies where the bloat is. Like I said, their actions seem vindictive and careless. Also, likely to result in legal cases that will drag for years and end up costing taxpayer more than the supposed savings.
The main con is that once you fire the workers that you thought you didn't need (but that you did indeed need) hiring them back becomes more expensive and a lengthy process. Some of the firings are already causing chaos in vital teams among several agencies and have forced DOGE to try to reverse course (bird flu monitoring, nuclear response...).
And that's not to mention the dire situation you put the people you are firing in. Laying off people from their jobs is never "an experiment" unless you are willing to suspend every trace of empathy.
Eliminating waste is a great idea. But it’s unclear that that’s all he’s doing, it’s unclear how or how well it’s being done, he’s brought in people without security review (which means they, and the systems they are opening up and creating, can be more easily compromised by our enemies), and he has enormous conflicts of interest.
Note that DOGE fired, and is struggling to rehire, the team whose job was to maintain America's nuclear arsenal.
Also note that the stated goal of DOGE, $2T, exceeds the entire discretionary budget of the federal government, even though half the discretionary budget is your entire active military.
Even treating $2T as a stretch goal, eliminating literally everything but the military from the discretionary budget doesn't reach the lower $1T that Musk says he has a "good shot" at.
Cuts on this scale directly, all by themselves, even aside from all the other economic disasters that America expressly chose to vote for as a package deal with this, will shrink the US economy by about 6%.
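To make the budget arithmetic concrete, here's a quick sketch; the figures are rough FY2024 approximations I'm assuming for illustration, not official totals:

    # Approximate FY2024 discretionary budget, in $ billions (assumed
    # round numbers for illustration; check official totals before citing).
    defense = 850
    non_defense = 800
    doge_goal = 2000     # Musk's stated $2T target
    doge_stretch = 1000  # the later $1T "good shot"

    print(doge_goal - (defense + non_defense))  # 350: $2T exceeds ALL discretionary spending
    print(doge_stretch - non_defense)           # 200: cutting everything but defense still misses $1T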
US Americans seem quite green, have never had this done to them before on such a scale, and so haven't seen the trick before: the art of the hostile takeover. [0] Curtis documents it well, but at tedious length unless you're British, on the Slater, Goldsmith and Rowland gang.
"Efficiency" is an entry point, a common bullshit word that's a perfect cover for hostile takeover, because nobody argues with it - it's a "STFU and agree" word [1].
There's nothing wrong with it. Our national debt is currently increasing by something like $1 trillion every 3 months. Getting creative about reducing government spending seems necessary at this point.
All the hysteria over this is just partisan politics as usual.
Stop falling for branding and actually concentrate on the numbers: spending is going up, not down, and only touching entitlement programs, the military, or offsetting monetary loosening (via debt interest) will change that.
DOGE has revealed that $400 million is rookie numbers when it comes to phony deals.
Forcing the government to spend money has always been the infinite money glitch.
On one side you have healthcare and pharma companies making sure everything is excessively overpriced; then they lobby the government to make sure everyone has government-sponsored healthcare, i.e. to turn all tax revenue into healthcare revenue. Then they pay the media to convince everyone that it is their moral obligation to subsidize $1,000,000 insulin, while making it taboo to ask why healthcare is so expensive.
On another side you have mass incarceration, where each inmate costs more per night than a 5-star hotel.
On another side you have nonsense conflicts where the weapons of choice are thousands of single-use munitions that cost at least $100,000 each. Or the weapons are simply left behind for the enemy, so they have to be repurchased.
On another side you have tax loopholes of billions of dollars.
Everyone is stealing. Did you pay 30% tax and then sales tax on everything you bought with your income? Is your effective tax rate around 50%? Then you worked 6 months for the government, so they can take those 6 months of your life and turn them into a dumb single-use weapon to destroy a $1 tent.
- Healthcare and pharma are overpriced because of information frictions, institutional supply constraints (this mostly means strict immigration controls), and because people just really want healthcare relative to other wants! See: https://www.reddit.com/r/badeconomics/comments/1gsezga/comme...
Also, Biden capped the price of insulin, so there's that.
- Mass incarceration is a SUPER populist thing! How many times do we hear "we need to be tough on crime"? This sure seems like the voters are behind it; it's not like people are clamoring for shorter sentences.
- Indeed, our military is expensive partially because we require domestic production and have to pay really high domestic costs, and partially because the way the US fights wars places a SUPER high value on human life. Desert Storm was both expensive and resulted in only ~200 coalition deaths while taking down the world's 4th-largest military, whereas Russia has not made it very far into Ukraine and has taken over 200k deaths (and it hasn't even been substantially cheaper to boot, just a bit cheaper).
- The tax loopholes exist and are bad, although I challenge you to name specific loopholes that cost high billions, that should obviously be repealed, that constituents are clamoring to close, and that representatives nonetheless refuse to act on. I don't think they exist.
You know what does actually degrade the fiber of the country? The richest person in the world taking personal control over every payment and arbitrarily destroying institutional knowledge by firing every government employee he has control over or who stands up to him. But no, instead we get "he's saving money" when (see the earlier comment from me) we're not even making outlays go down! A script kiddie who randomly rewrites lines into bad assembly while destroying the build system isn't a perf eng, they're a danger.
In the US healthcare system, a bag of salt water costs hundreds of dollars. You can cap the price of one thing, and then the whack-a-mole game starts.
The conflicts in the Middle East cost trillions of dollars, and there is absolutely nothing to show for it.
Quantitative easing cost trillions of dollars, and most of the people responsible for causing the crisis got a big payday from it.
If someone became "the richest person", you can probably learn something from that person. Without SpaceX, the US would have to use Russian rockets to put stuff in space, because the NASA shuttles were retired. Is that something you would like more?
The US is on the verge of bankruptcy, and it is not because of $400M in trucks.
And it is not a Democrat or Republican issue, as I said, everyone is getting rich at the expense of the taxpayer, even taxpayers that haven't even been born yet are in debt already thanks to a wasteful mentality.
I work, and the fact that my tax money is going into a black hole makes my blood boil. God bless Musk and DOGE for what they do.
Here is just one headline from today,
The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are "almost impossible" to trace, thanks to a rampant disregard for the basic accounting practice of using tracking codes when dishing out money.
Will you go on the record and say that $4.7T in a year is fraudulent or misspent? I want to be crystal clear with what you're insinuating, because that's a massive amount of money, easily the biggest fraud of all time by a factor of almost 30.
I don't understand what this means though; almost all of our money passes an audit, which necessarily has a paper trail. The few agencies that don't usually have very idiosyncratic audit misses, which are, in any event, overseen by inspectors general (or were, until Trump fired all of them) who have been very zealous to jump on this.
So then misspent? Must be, because if it's not misspent, then "impossible to trace" is a little irrelevant. It can't be unaudited, because every department passes an audit every year (except the DoD, but they basically pass an audit, and the reasons they currently don't are mostly technical).
Can you not find it? I can find basically any spending data I want at the tips of my fingers (well, less so now that it's unclear what's being paid). Is there anything specific that you feel is missing and want to see?
> Here is just one headline from today, The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are "almost impossible" to trace, thanks to a rampant disregard for the basic accounting practice of using of tracking codes when dishing out money.
And you believe them?
This is a department that fired multiple different nuclear weapons inspection and maintenance teams without knowing what their job was.
Had to re-hire them. They weren't redundant teams. DOGE just didn't understand what they (or the teams) were doing.
Now, I'm very happy for the US nuclear stockpile to shrink. I sure think you have too many of them. But then, I'm foreign and a hippy, so I would. But (1) do you?, and (2) do you want it to shrink by some of them accidentally exploding? Or being relocated by a hostile power taking advantage of the inspectors all being on early retirement?
I am not jumping to conclusions and will reserve judgement for later. They have provided no proof so far, but hopefully it will be forthcoming, and I would not dismiss their claim outright.
So I'm asking you: do you want this done? Take your pick.
Loss of oversight made a bunch of USSR suitcase nukes, ~100 or so, go walkabout when they collapsed. Russia denies this, of course. They might be fine, or not, nobody (in public) really knows. Probably not a huge risk without maintenance, if you nick it but don't know what it is you might scrap it for parts and mistake the core for tungsten or lead, but… not great, unless it was existing nuclear powers who took them.
They deny that, but of course you know for sure that they are lying, that the nukes went missing, and you have the proof. Just like I know for a fact that there are alien craft hidden in Area 51.
It was a statement made by General Aleksandr Lebed, former Secretary of the Russian National Security Council, in a meeting with an American congressional delegation.
Perhaps he was drunk, or lying, or just plain unable to find the people who knew which cupboard the devices were safely locked in. But he did make those claims. And you are missing the wood for the trees.
This "wood" (and the US report) consists of exactly one person who made this claim, and a member of corrupt Yeltsin's entourage to boot. I'd say if these nukes were real, they would have exploded somewhere by now.
Try harder.
Most people, in my experience, would rather support Elon Musk than a strategy recommended by someone who believes the MSM should be trusted.
Not to mention that most of your sources to support your points are from far left MSM sources.
Your reasoning is exactly why there is so much support for Elon Musk. You probably made more Elon Musk fans just by your post.
Most people seem to believe the government is broken and MSM is a huge reason for this.
> Most people seem to believe the government is broken and MSM is a huge reason for this.
Is it because Fox News, the most influential channel of the so-called MSM, constantly repeats conservative talking points about the alleged inefficiencies of the government while downplaying the government's important work in protecting citizens?
Americans will soon get to experience what a real broken government is like, and I hope it provides them an education.
People who buy into Grok are willingly submitting themselves to the far-right propaganda machine. I’m sure it’s nice and tidied up for release, but there is zero chance that Musk will not use this tool to push his ideological agenda given its reach and impact.
Well, if you see any kind of propaganda, you can denounce it, and I'm sure this community will respond just like we responded to Chinese propaganda and censorship in DeepSeek.
No good will come from denying progress just because you don't like someone else's opinions and worldview.
> My question is how may of you are actually willing to give Musk more money after the questionable, legal, and ethical behavior he's exhibited while working for DOGE.
Oh yes, far better to give to Altman, Google, or Facebook; those are such morally responsible companies!
Far better to give elected representatives this responsibility, IMHO. This is bizarre. There are a lot of billionaires jockeying for influence and resources here; it's almost like a free-for-all. Musk could also use his position to force Sam Altman's hand in the acquisition of OpenAI, https://www.wsj.com/tech/elon-musk-sam-altman-relationship-6.... I'm not interested in either party, but it's clear there are huge conflicts of interest here. Musk also expressed disappointment when he didn't get a piece of this pie, https://www.axios.com/2025/01/22/stargate-elon-musk-trump-al.... I've also read more concerning material regarding JD Vance's connections with Peter Thiel and their interest in sidelining the constitution to prepare for "networked states", https://www.nytimes.com/2025/01/18/magazine/curtis-yarvin-in....
Much of this is impossible to follow closely. Like I said, this administration seems to be flooding the zone with shit to distract others from its real intent. I think it's worth vetting and questioning positions in government; you can't just blindly trust these people. Something seems really off. I say question everything at this point. I don't trust billionaires to fix the world's problems. Democracy and the constitution should be upheld and well guarded.
I mean relatively speaking yes, only one of them is acting as an unelected dictator, circumventing our whole constitutional appropriations process by taking direct control over payment infrastructure.
Because running a company is different to running a state and we doubt his intentions, not his skills.
Someone who is not elected, who gets $8M/day from the government, now oversees the government with some 20-year-old fanatics who can't even put up a secure website for DOGE?
Yeah, most of the government isn't elected, that's how it works. You only vote for 3 roles in the Federal government (I guess 4 if you count the veep). The rest of them are hired.
For those downvoting, the roles are President, Vice President, Senator, and Representative.
Article II, Section 2, Clause 2 of the US Constitution:
""[The President] shall nominate, and by and with the Advice and Consent of the Senate, shall appoint Ambassadors, other public Ministers and Consuls, Judges of the supreme Court, and all other Officers of the United States, whose Appointments are not herein otherwise provided for, and which shall be established by Law: *but the Congress may by Law vest the Appointment of such inferior Officers, as they think proper, in the President alone, in the Courts of Law, or in the Heads of Departments.*"
Emphasis on the last sentence. There have been a plethora of such vestments in the Executive branch over the decades.
5 U.S.C. § 105 - This statute authorizes the creation of "necessary agencies" within the Executive Office, giving the President flexibility to establish entities like the USDS and staff them as needed, subject to funding.
5 U.S.C. § 3101 - This law states that "each Executive agency" (including the EOP) "may employ such number of employees ... as Congress may appropriate for." It implies broad authority to hire staff, with Congress controlling the budget but not necessarily the individual appointments.
Excepted Service Authority (5 U.S.C. § 3301 and Schedule A) - Under 5 CFR § 213.3102, agencies like the OMB can use Schedule A hiring authority for positions requiring specialized skills (e.g., tech expertise) that aren’t practical to fill through standard civil service exams.
This authority, delegated by Congress via the Civil Service Reform Act of 1978 (Public Law 95-454) and regulations from the Office of Personnel Management (OPM), allows the President (or OMB leadership) to appoint USDS personnel directly.
Term Appointments - Many roles are temporary or term-limited (e.g., 2-4 years), often filled by detailees from other agencies or private-sector experts. These don’t require Senate confirmation because they aren’t permanent "officers." This flexibility is supported by 5 U.S.C. § 3161, which allows temporary organizations within the executive branch to hire staff for specific projects.
Thanks, I’ve read the constitution too. Ever heard of congress, full of elected officials? Or the senate? Your claim that there are so few elected officials is patently absurd.
Because he and his organization have demonstrated ignorance of the services he's not only auditing but making pretty substantial cuts to. One example I'm familiar with: cutting up to 10% of the personnel of the Technology Transformation Services at GSA is quite likely to reduce the efficiency of both the government and private-sector government contractors.
Because he owns companies that contract with the government and are affected by its policies. It's the very definition of a conflict of interest.
And he's not even "auditing" the government. When you're auditing, you emit a report that the audited party later analyzes and acts upon. He's been given freeway to fire government workers as he pleases as if he's an elected officer, which he's not.
> He's doing everything he's doing by executive order of the President of the United States, who was elected.
And there are already a number of lawsuits because he's trying to do things neither he nor the president is allowed to do. Getting an EO to do something doesn't mean it's automatically legal. Multiple big decisions have already been reverted or are on hold until judges can review them. Even things like the promised payout for quitting are not practical, because only Congress can approve the money for that.
There are lawsuits, naturally, as lawfare is a normal part of modern politics. All the laws necessary to do payouts for voluntary separation already exist, as long as they fit within the budgetary appropriations already set by Congress.
I'm sure he's trying his best. But I don't doubt that, even if not on purpose, he will mostly cut departments and services whose loss doesn't hurt him, or whose cutting indirectly benefits one of his many businesses.
He, a single person, has far too much control of our system.
It's lost 80% of its value in 2 years, which usually isn't great. The most charitable view of X/Twitter is that it's now a propaganda platform that Musk doesn't mind taking a loss on in order to enact political change.
I'm pretty sure he doesn't like that he has not been able to make it more successful, but I don't believe he regrets it.
He would have regretted it if it hadn't played out the way it currently has (and the game he is currently playing is not finished yet). He said in an interview that he is putting everything on one card now.
Edit: Also, he gets a lot of his valuation from having a cult following. Among the companies trying to survive the AI phase we are in right now, he is a very high risk if he can't get the Nazi crowd on his side to buy his stuff.
He destroyed Twitter's brand and Bluesky emerged. He is destroying Tesla while other carmakers gain ground. SpaceX needs a lot of subsidies, and his Mars goal is a cult topic, not a financially sound one.
But are they well-versed in the things accountants specialize in? Is there a possibility that not every programmer can be a good accountant, or that accountants know things you're unaware of when you wrote that statement?
Even better, they're well versed in things that accountants aren't. When you're auditing trillions of dollars in spending, it helps to have software, data science and analytics experts that can use modern tools beyond COBOL written 62 years ago.
They can use data processing, detect anomalies better, leverage AI models, automate data extraction from analog records, ingest unstructured data like emails and memos, build complex financial dependency graphs, detect leaks, build custom scrapers, etc. (see the sketch below).
I'm sure there's at least one accountant in the loop, but you really want the team to consist mostly of data nerds.
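As a minimal sketch of what that looks like in practice (the file name and column names here are hypothetical, purely for illustration), a first-pass anomaly screen over payment records might be as simple as per-agency z-scores:

    # Flag payments that are unusually large relative to their agency's norm.
    # "payments.csv" and its columns (agency, vendor, amount) are hypothetical.
    import pandas as pd

    df = pd.read_csv("payments.csv")
    stats = df.groupby("agency")["amount"].agg(["mean", "std"])
    df = df.join(stats, on="agency")
    df["z"] = (df["amount"] - df["mean"]) / df["std"]

    # Anything 4+ standard deviations above its agency's norm goes to a
    # human reviewer -- ideally an accountant -- rather than being called "fraud".
    flagged = df[df["z"] > 4].sort_values("z", ascending=False)
    print(flagged[["agency", "vendor", "amount", "z"]].head(20))

The point isn't that this replaces accountants; it's the triage step that decides what an accountant looks at first.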
What about GAAP/IFRS? How do you endow these software engineers with knowledge of common patterns of fraud or leaks so they can actually write the correct software to find them automatically? How do they identify material misstatements?
You also seem confused; COBOL might be used, but it isn't the only tool available to accountants working for the government. COBOL is a straw man. What you're describing here, software engineers who presumably have training in accounting, already exists, and they work inside and outside of government. This is an existing career path.
You're speaking about this as though you know a better way to do something, but it's already happening, and has been for years. Accountants aren't writing 62 year old programming languages waiting to die in their chairs while the world continues to progress without them.
Accounting just about anywhere you find it is already accomplished by accountants, some of them technically trained, as well as data scientists and software engineers. It's an interdisciplinary collaboration in any serious organization.
As a person who works with data and has done both consulting and product building in data science, I can say that lack of domain knowledge is what makes or breaks the end result. Too often technical people think they know better and then build mediocre solutions that don't get used.
An audit accountant can ask an engineer to implement whatever is needed to achieve a goal they understand. An engineer with no finance background will have no idea where to start or what questions they can ask an accountant.
You can easily get drowned by a sea of numbers and get confused and gaslit, unless you make sure all the data is available and computable.
Not sure how this release, which is impressive by all means, got transformed into an attack on DOGE, which is taking the exact approach startups take to disrupt an industry.
No one doubts his abilities, and reasonable people are grateful for his work with DOGE and his support of free speech. Unfortunately, this platform has become an echo chamber for mainstream media, merely repeating news and links from sources like The Verge, BBC, Politico, etc. This is just a bias in the user groups. Still, we should hopefully put politics aside and focus on more tech-related subjects on this website :)
I think a lot of that comes from people thinking the eye-catching, memorable views they've read are the most common views, when really they're more likely just the views of the most passionate 1%.
For the same reason I doubt Einstein's abilities as a painter. History is littered with the stories of smart people who tried to treat government dysfunction like corporate dysfunction and failed spectacularly.
He bids for contracts just like anyone else, and most of those contracts were won under Democratic presidents, but I digress. If you're saying government contractors can't work for the government, then you're going to have to explain the military-industrial complex to me.
It’s not a doubt of abilities, it’s a doubt of his interests aligning with the interests of US citizens.
Here is an unelected NAZI and ILLEGAL IMMIGRANT (he worked illegally on a student visa) who did a sieg heil at the presidential inauguration, taking up an informal, unconfirmed-by-Congress department-head role (DOGE is just the US Digital Service renamed), getting wide access to government systems, and seemingly firing thousands of government employees.
Billionaires, who should not exist, are so rich that they don’t need government services and would rather the government go away so they can make more money. But regular people do need a government, and that’s just one reason of many why Elon shouldn’t be anywhere near policy decisions.
Now, you might say I’m being dramatic. But I’ll say that no criticism of this man is unfair. He is one of the world’s biggest hypocrites, along with the other MAGA Nazis in his camp.
The (unelected) richest person in the world, with a sprawling business empire that has many interactions with the federal government, has been given free rein with no oversight to fire any federal workers he wants, and has usurped Congress' power of the purse by stopping Congressionally appropriated spending.
This creates numerous conflicts of interests and opportunities for self-dealing.
Consider a NASA employee who is awarding government contracts. They know all of the above. They have three bids in front of them: one from Boeing (lol), one from SpaceX, and one from Axiom Space. The NASA employee thinks the bid from Axiom Space is the best value and fits the requirements best. But will they select them, or will they select SpaceX, knowing that they could be fired tomorrow on Musk's whim?
Repeat this scenario across every interaction any of Musk's companies have with the federal government.
This isn't a novel scenario. Putin's Russia is a great example of what happens when oligarchs are granted significant autonomy over organs of the government. It is a system designed to facilitate corruption.
You could assuage my concerns, though, by describing the ways that there is effective oversight over Musk, or by describing the anti-corruption anti-self-dealing measures that have been imposed on Musk. The Press Secretary gave a statement on this saying: "As for concerns regarding conflicts of interest between Elon Musk and DOGE, President Trump has stated he will not allow conflicts, and Elon himself has committed to recusing himself from potential conflicts." That...does not resolve my concerns.
> Fixing the deficit is the opposite of looting, actually.
I think this has two errors.
First, I don't agree that he's fixing the deficit. I think that's an assumption not in evidence. We'll see in a few years' time, though. I'd be willing to bet that in 4 years the deficit is > 0, and likely larger than it is today.
But let's assume arguendo that he is fixing the deficit. It's still possible to loot the treasury while fixing the deficit, which shows that they aren't actually the opposite.
Consider this example with completely made up numbers:
Before
- Revenue: $1T
- Defense Spending: $500B
- Benefits Spending: $1T
- Public Services Spending: $499.9B
- Government Contracts with Musk's Companies: $100M
The before scenario has $1T in revenue and $2T in spending, for a deficit of $1T. Now, let's allow hypothetical Musk free rein to "fix the deficit":
After
- Revenue: $800B
- Defense Spending: $300B
- Benefits Spending: $300B
- Public Services Spending: $100B
- Government Contracts with Musk's Companies: $100B
In this scenario the deficit has been reduced to $0, while Musk has enriched himself and his companies with $99.9B in government funds. This would be an extreme example of Musk looting the treasury, while still completely resolving the deficit.
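For the arithmetic-inclined, the same toy numbers as a few lines of Python (all figures in billions, and just as made up as above):

    # Reproducing the made-up figures above, in $B.
    before = {"defense": 500, "benefits": 1000, "services": 499.9, "musk": 0.1}
    after  = {"defense": 300, "benefits": 300,  "services": 100,   "musk": 100}

    print(sum(before.values()) - 1000)     # 1000.0 -> $1T deficit before
    print(sum(after.values()) - 800)       # 0.0    -> balanced after
    print(after["musk"] - before["musk"])  # 99.9   -> $99.9B to Musk's companies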
There is no solid evidence of a path to fixing the deficit at the moment.
There is no evidence of this happening, nor of a serviceable plan to do so.
All recovered expenses, to date, add up to a laughably small amount, and are one-time cutbacks. The strategy shows signs of costing the government in unexpected ways as well.
Most governments of developed nations operate in more sensible ways with clearer plans than this. I won't claim they are looting, but it's absurd to suggest they are fixing the deficit at the moment. The economy appears to be getting worse, not better.
Nazis would incarcerate people in work camps and turn them into soap or hang them in public squares, which is kind of different from what Elon did. There's a pretty big gap between doing something in poor taste like Nazi salutes in public because you have the intellectual maturity of a 12-year-old edge-lord on Xbox Live seeking attention, and being an actual Nazi committing crimes against humanity.
When everyone goes around calling everyone they hate a Nazi, it only desensitizes people to real Nazi behavior, kind of like the boy who cried wolf: there are people out there committing actual atrocities against humans under the public radar, because they never do Nazi salutes on camera so as not to draw attention. The Nazi term then starts to lose any meaning, kind of like the overuse of calling everything "woke" today.
The problem is that people as a whole are retarded due to mob behavior, too focused on optics, and judge only on feelings rather than facts; that's how we get actual criminal Nazis going free under the radar while innocent people are swatted and doxxed for saying something right-wing on social media. Not all Nazis today wear jackboots and do heils; plenty go about appearing like normal people in public. They could even be your neighbor, a police officer, or your local congressman.
So save your anger for those people instead, as Elon is just a 3 year old throwing tantrums seeking attention, annoying but relatively harmless. If people stopped giving him so much attention, he'd stop doing it.
What makes you so sure/special in this regard? What are you gaining from this? If you were as enlightened as you claim, Germany wouldn't be in such a mess right now. If you had spent as much effort on securing your borders, energy independence, and defence as you spent lecturing others about imaginary Nazis and banning hate speech on social media, you'd be a respectable world power right now, at the table with Trump and Putin, ending this war before it even happened. The "we know better than you" arrogance is Germany's biggest problem.
>This is basically blaming the thing criticized on the people criticizing it.
It's not blaming; I'm just telling you what the simplest solution is. Ignoring attention seekers is better than giving them more attention. Which is why I'm also going to ignore your future comments from now on.
Germany? The scope of the discussion is already limited to the AfD. What makes Germans special in their opinions about Nazis in Germany is deep experience and knowledge of the subject you are belittling. And I'm also not spending any time banning hate speech on social media, heh. Whatever chip on your shoulder you have about Germany I can barely decipher, and you're right, it's best to agree to disagree.
I find it interesting that these two descriptions of news are treated as equivalent, where I think they're almost opposing:
"raw, unfiltered news"
"real, trustworthy news"
Raw and unfiltered almost cannot be "news" (by my definition of what I go seeking for as "news"). X provides raw, unfiltered information. But real, trustworthy news almost requires filtration in order to be deemed trustworthy.
I think the theory is that community notes are a more neutral way to tag bad information, compared to whatever The New York Times and Fox News are doing.
People actually trying the model report that it does not say anything like this when asked the same question. Elon somehow prompted the model to bash The Information for his screenshot.
I didn't know about this. But I asked it if Elon Musk and DOGE randomly firing as many government workers as they can from all federal departments might be dangerous to Americans and it was pretty honest that yeah, it could be.
They've been training the model for a while, right? It's unlikely he could have known Trump would let him rampage through the federal government when they started training it.
It's from the opening lines of the closing segment of the video you linked. Not a bad discussion of the unfolding of news and social media responses following the shooting of Donald Trump's ear on the campaign trail.
Why did we start talking about news as "legacy media"?
I mean, at least it has journalistic standards and some semblance of fact checking compared to social media which has given us great gaffes such as identifying the wrong Boston Marathon bomber and getting the poor guy to kill himself, wrongly identifying the Hispanic white supremacist shooter, and many many more.
When someone here says they don't like the news, I assume it's because the only newsworthy topic in their mind is "Just how COOL is the new iPhone? Very!"
Twitter is well suited to deliver the newest developments on this topic.
Mainstream outlets have viewerships that compare disfavorably to those of top youtubers. That has been the case for many years now. The only reason most mainstream outlets exist is the taxpayer money train that's coming to an abrupt stop as I write this.
Mainstream news has viewerships that compare unfavorably to sports games and sitcoms. So what? It's not entertainment; it's informative. Of course they have different viewerships.
The statement is true irrespective of the number of views. If you watch the news you are misinformed. You believe things that just aren't true at all. From time to time your view of the world clashes with reality, and you reject the reality and substitute it with your own. Many such cases.
I’ve been putting Grok 3 through some challenging test flows, and it’s impressing me. Not flawless, but the quality is generally exceptional, and the speed? Lightning-fast.
It’s delivering answers on par with the other top models on my back-pocket tests and in a fraction of the time. The integrated search is solid overall, though it occasionally misses key resources—likely a web indexing quirk. That said, its ability to ingest content from any website when you provide the URL is a standout feature.
The analysis is razor-sharp. It’s picking up on nuances—and even uncovering entirely new angles—that other models have overlooked. I just posed a tough hypothetical about a new line of business I’m looking at for my consulting service and it identified an entirely new sphere of possibility that I hadn’t seen nor had any of the other top models (not Gemini 2.0 Pro, OpenAI’s o1-Pro and o3-mini-high, or Claude 3.5 Sonnet).
An API will be needed for real programmatic testing and certainly for integrated business applications and workflows, but I’m told one is coming in the next few weeks.
It’s still early (we’re talking hours here), but so far, I’m digging it.
It's because of the em dashes (- is a normal dash, — is an em dash). Very few real people use those outside of writing books or longform articles.
There's also some strange wordings like "back-pocket tests."
It's 100% LLM generated.
What is much scarier is that those "quick reply" blurbs on Android/Gmail (and iOS?) will be trainable on your entire e-mail and WhatsApp history. That model will have your writing mannerisms and even be a stochastic mimic of your reasoning. So you won't even be able to tell that a model, not a real person, answered you. And the initial message the model is responding to might itself have been written by the other person's personal model.
The future of digital interactions might have some sort of cryptographic signing guaranteeing you're talking to a human being, perhaps even with blocked copy-pasting (or well, that part of the text shows up as unverified) and cheat detection.
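As a rough sketch of what that signing could look like: the idea that a device would only sign human-typed text is entirely speculative, but the mechanics would be ordinary digital signatures, shown here with Ed25519 from the Python cryptography library:

    # Hypothetical human-authorship signing; not an existing standard.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    private_key = Ed25519PrivateKey.generate()  # held by the sender's device
    public_key = private_key.public_key()       # published for recipients

    message = b"Typed by an actual human, allegedly."
    signature = private_key.sign(message)

    try:
        public_key.verify(signature, message)   # raises if text was altered
        print("verified: signed by this sender")
    except InvalidSignature:
        print("unverified text")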
Going even a layer deeper / more meta: what does it ultimately matter? We humans yearn for connection, but for some reason that connection only feels genuine with another human. Whereas, what is the difference between a human typing a message to you, a human inhabiting a robot body, a model typing a message to you, and a model inhabiting a robot body, if they can all give you unique interactions?
Everyone who uses a compose key has it available (via ---) — I do. You mean the em-dash though, not the en-dash, and Davidzheng is using hyphens for approximation, not en-dashes.
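For reference, the three characters being conflated, with their Unicode code points:

    import unicodedata

    for ch in "\u002d\u2013\u2014":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # U+002D HYPHEN-MINUS
    # U+2013 EN DASH
    # U+2014 EM DASH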
It’s gracious of you to say that you’d be sorry, and I did run my comment through 4o (perhaps ironically) which caught a slew of typos and weird grammar issues and offered some improvements. But the robotic sound and anything else you don’t like are my own responsibility. Do you, perhaps, have any thoughts on the substance of the comment?
That's discomforting. My practice of sprinkling em-dashes like salt on a salad dates from my early days on various video game communities' forums. They comfortably mimic interrupted speech in writing. I hope I won't have to soon defend myself against accusations of AI usage just because I belong to the minority that read The Punctuation Guide[0] or a related resource.
It's really the em dash along with superfluous language. I suspect you are fine. Models like 4o have a very specific pattern when folks don't specify their writing style.
- Very 'forced' expressions (back-pocket tests, 'The analysis is razor-sharp')
- The fact you're glazing AI so much means you probably use it; it's like how it was with crypto bros during all the web3 stuff
- Lack of any substance: what does that post actually say? It regurgitates praise of the AI, but the only tangible feature you mention is that it can receive a URL as its input
Hmmmm it is hard to really place the issue. I am very much in the bullish on AI camp but I don't like writing for the sake of writing and some of the models (4o in this case) have very obvious tells and write in such a way that it takes away from what substance may exist.
One thing that concerns me is when you can't tell whether the comment was authored or just edited by AI. I'm uncomfortable with the idea that HN threads and reddit comments gradually tend towards the grey generic writing style of LLMs, but I don't really mind (save for the prospect of people not learning things they might otherwise!) when comments are edited (i.e. minor changes) for the sake of cleanliness or fixing issues.
I just re-read the post twice and I couldn't find any of the points you mentioned (again, other than using URLs in the input):
- Informal Benchmarks: I'm sorry, what? He mentions 'It’s picking up on nuances—and even uncovering entirely new angles—that other models have overlooked' and 'identified an entirely new sphere of possibility that I hadn’t seen nor had any of the other top models'. Not only is it complete horseshit by itself, but it does not benchmark in any way, shape, or form against the mentioned competitors. It's the exact stuff I'd expect out of an LLM.
- Real-World Test Case: As mentioned above, complete horseshit.
- 2 Concrete Features: Yes, I mentioned URLs in the input. I didn't consider 'Integrated Search' (which I'm assuming means searching the web for up-to-date data) because AFAIK it's already more or less a staple in LLM products, and his only remark about it is that it is 'solid but misses sometimes'.
Not really, as pointed out by others in the thread. Anecdotal of course, but I use em dashes all the time — even in emails and texts (not just long-form writing).
I often write things I want to post as bullets and then have an LLM formulate them better than I could. But it's just applying a style; the content comes from me.
My wife is dyslexic so she passes most things she writes through ChatGPT. Also not everyone is a native speaker.
TBH I've recently felt like that for ~70% of 'top-level replies' in HN, which has slowly pushed me to other mediums (mastodon and discord).
Could just be that the AI 'boom' brought a less programming-focused crowd into the site and those people lack the vocabulary that is constantly used here, who knows.
I'd go out on a limb and say that LLMs probably made the general population aware of how the "general voice" feels/looks/reads.
So rather than a lot of people adopting the way an LLM writes, the LLM writes as an average of how people have been writing on the internet for a long time. So now, when you start to recognize how "LLM prose" reads (which I'd say is "Internet General Prose"), you start to recognize how many people were already writing in that style.
I've been on the internet since the early 2000s, and I can assure you it does not write like how 'someone on the internet' would write. And when I say that, I mean it for both sides of the internet: it doesn't sound like how 'old school' internet folks write, but it also doesn't sound like how teens talk either. Neither of these groups writes in 'very plain' English regurgitating useless information.
Recent trends/metas in video formats like tiktok and shorts encourage that kind of 'prose', but I haven't seen it being translated into text format in any platform, unless it's written by LLMs.
My point wasn't that it writes like any specific group, but like a general mishmash made up of everyone's voice; a boring average of it, rather than something specific and/or exciting.
Then of course it depends on which models you're talking about. I haven't tried Grok 3 myself (which I think is what you're talking about, since you say "it"), so I can't say how its text looks/feels. Some models are more "generic" than others and have very different default prose styles.
Here’s the conclusion of a much more refined initial review by Andrej Karpathy [1] which, I think overall, comports with the substance of my own hot take:
“As far as a quick vibe check over ~2 hours this morning, Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago, this timescale to state of the art territory is unprecedented. Do also keep in mind the caveats - the models are stochastic and may give slightly different answers each time, and it is very early, so we'll have to wait for a lot more evaluations over a period of the next few days/weeks. The early LM arena results look quite encouraging indeed. For now, big congrats to the xAI team, they clearly have huge velocity and momentum and I am excited to add Grok 3 to my "LLM council" and hear what it thinks going forward.”
I liked Grok 3's fiction writing style; it catches lots of the physics of mundane situations, such as the ringing echo in a closed bathroom we all know well, and the prose feels very lively as a result. Where R1 makes situations sharp with details, Grok 3 goes the other way: it rounds them out with details.
Well, because you explicitly asked it to demonstrate the physics, it came out way too detailed, but the point is that it adds details to scenes on its own, making them more realistic, not that dry Llama 3.3 style.
Here is the sentence: (She screamed, which echoed off the tile walls. “This is my life now,” she said to her reflection, which looked back at her with a mix of disgust and pity.) Looks good to me. Try it on Lmarena.ai.
Can't stand Elon but happy to see this. We badly need a frontier model that is not so obsessed with "safety". That nonsense has held things back significantly, and leads to really stupid fake constraints.
We know RLHF and alignment degrade model quality. Could it be that Grok, due to its less restrictive training guidelines (and the fact that its creators aren't afraid of getting sued), achieves higher performance partly because of this simple factor?
You're both right because RLHF and fine-tuning are just techniques.
It's dependent on the training data and not as much the method.
So, if you build the RLHF/finetune data so that it avoids certain topics, you reduce model quality in practice, since your training data might accidentally cast a wide enough net that the model avoids legitimate questions too (a toy sketch of this follows below).
On benchmarks these things don't typically show up though.
But yes. Those techniques are required for making it chat. Otherwise it just autocompletes from the internet.
It is also used in a couple of other places (reasoning, search, hallucination mitigation).
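A toy illustration of that "wide net" effect: a naive keyword filter applied to fine-tuning prompts, where the keyword list and prompts are invented for the example, catches legitimate questions alongside the intended targets:

    # Toy over-broad safety filter on fine-tuning prompts (all invented).
    BLOCKED = {"weapon", "explosive", "poison"}

    def is_blocked(prompt: str) -> bool:
        return any(word in BLOCKED for word in prompt.lower().split())

    prompts = [
        "how do I build an explosive device",   # intended target
        "is this mushroom poison or edible",    # legitimate, caught anyway
        "history of chemical weapon treaties",  # legitimate, caught anyway
        "how do I bake sourdough bread",        # unaffected
    ]
    for p in prompts:
        print(is_blocked(p), "-", p)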
It blows my mind that Musk hasn't integrated Grok as an app inside their vehicles. A literal AI copilot is a completely novel and killer app that cannot be pulled off by any other vehicle manufacturer.
Getting them to actually do something useful other than generating text is still a work in progress. What do you envision them actually doing in this integration?
Because all you need to do is Bluetooth your iPhone to your 1995 Ford Ranger and install Gemini to have a voice conversation through your car's speakers. But then your original comment doesn't make any sense about it being only possible with a Tesla.
Why do I need a big screen? The models mostly all have voice interfaces now. I shouldn't be sitting there reading and typing text input or output while driving anyways... What are you actually imagining doing with these models in the car? I still haven't heard what use they are.
If it's not doing something actually related to/integrated with the car why does it need to be an app there instead of just living on your phone like all our existing digital assistants?
I've been saying "hey Google, drive home" (for GPS directions), and asking it to play music, to the phone mounted on my dash for a decade. I drive a rusty old stick shift and alligator-clamped a $10 Bluetooth receiver to the speaker. So I'm not sure what you're envisioning that can't also work in any other car. There is also https://comma.ai, which adds self-driving to hundreds of newer cars via just an OBD-II-like dongle and the equivalent of a smartphone.
Interesting thing about this is that because of all the Musk-related overhyping that's gone on and because the launch is a video, the thread that marks the entry of another company into the select group of serious AI companies will go off the front page with possibly only 200 points!
It’s discouraging that instead of talking about the achievements of Grok 3, the same usual suspects have decided to bombard the comments with their far-left political views and the anger-management issues they have with Elon.
The pull quote is: The impression overall I got here is that this is somewhere around (OpenAI) o1-pro capability