
“You want to know how to paint a perfect painting? It's easy. Make yourself perfect and then just paint naturally.” - Robert M. Pirsig

The Musk reasoning here is stupid, but smart. If he makes a superhuman intelligence, he can just ask it "What is dark matter?" and it might figure it out.

I have some big problems with this idea, but it isn't 100% stupid. Just 98% stupid.


This thing continues to reinforce my skepticism about AI scaling laws and the broad AI semiconductor capex spending.

1- OpenAI is still working on GPT-4-level models. More than 14 months after the launch of GPT-4 and after more than $10B in capital raised.

2- The rate at which token prices are collapsing is bizarre. Now a (bit) better model for 50% of the price. How can people seriously expect these foundational model companies to make substantial revenue? Token volume needs to double just for revenue to stand still. Since GPT-4 launch, token prices are falling 84% per year!! Good for mankind, but crazy for these companies. (Rough math sketched below.)

3- Maybe I am an asshole, but where are my agents? I mean, good for the consumer use case. Let's hope the rumors that Apple is deploying ChatGPT with Siri are true; these features will help a lot. But I wanted agents!

4- This drop in costs is good for the environment! No reason to expect it to stop here.
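To spell out the arithmetic behind the "double" and the 84%/yr figure, here's a minimal sketch. The only inputs are the two numbers quoted above; nothing else here is data.

    # Back-of-envelope: if per-token prices halve, volume must double for flat
    # revenue; at an 84%/yr price decline, volume must grow ~6.25x per year.
    price_drop_now = 0.5        # new price as a fraction of the old price (50% cheaper)
    annual_decline = 0.84       # quoted yearly fall in token prices

    volume_growth_for_flat_revenue = 1 / price_drop_now      # 2.0x, the "double"
    yearly_volume_growth_needed = 1 / (1 - annual_decline)   # ~6.25x per year

    print(volume_growth_for_flat_revenue, round(yearly_volume_growth_needed, 2))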


I'm ceaselessly amazed at people's capacity for impatience. I mean, when GPT 4 came out, I was like "holy f, this is magic!!" How quickly we get used to that magic and demand more.

Especially since this demo is extremely impressive given the voice capabilities, yet still the reaction is, essentially, "But what about AGI??!!" Seriously, take a breather. Never before in my entire career have I seen technology advance at such a breakneck speed - don't forget transformers were only invented 7 years ago. So yes, there will be some ups and downs, but I couldn't help but laugh at the thought that "14 months" is seen as a long time...


Over a year they have provided order-of-magnitude improvements in latency, context length, and cost, while meaningfully improving performance and adding several input and output modalities.


Your order of magnitude claim is off by almost an order of magnitude. It's more like half again as good on a couple of items and the same on the rest. 10X improvement claims are a joke, and people making claims like that ought to be dismissed as jokes too.


$30 / million tokens to $5 / million tokens since GPT-4 original release = 6X improvement

4000 token context to 128k token context = 32X improvement

5.4 second voice mode latency to 320 milliseconds = 16X improvement.

I guess I got a bit excited by including cost but that's close enough to an order of magnitude for me. That's ignoring the fact that it's now literally free in ChatGPT.
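For what it's worth, here are those three ratios computed from the same figures quoted above (no new numbers introduced):

    # The three improvement ratios quoted in this thread, recomputed.
    improvements = {
        "price ($ per 1M tokens, 30 -> 5)": 30 / 5,             # 6x cheaper
        "context length (4k -> 128k tokens)": 128_000 / 4_000,  # 32x longer
        "voice latency (5.4 s -> 0.32 s)": 5.4 / 0.32,           # ~16.9x faster
    }
    for name, ratio in improvements.items():
        print(f"{name}: {ratio:.1f}x")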


Thanks so much for posting this. The increased token length alone (obviously not just with OpenAI's models but the other big ones as well) has opened up a huge number of new use cases that I've seen tons of people and other startups pounce on.


All while not addressing the rampant confabulation at all. Which is the main pain point, to me at least. Not being able to trust a single word that it says...


I am just talking about scaling laws and the level of capex that big tech companies are doing. One hundred billion dollars is being invested this year to pursue AI scaling laws.

You can be excited, as I am, while also being bearish, as I am.


If you look at the history of big technological breakthroughs, there is always an explosion of companies and money invested in the "new hotness" before things shake out and settle. Usually the vast majority of these companies go bankrupt, but that infrastructure spend sets up the ecosystem for growth going forward. Some examples:

1. Railroad companies in the second half of the 19th century.

2. Car companies in the early 20th century.

3. Telecom companies and investment in the 90s and early 2000s.


Comments like yours contribute to the negative perception of Hacker News as a place where launching anything, no matter how great, innovative, smart, informative, usable, or admirable, is met with unreasonable criticism. Finding an angle to voice your critique doesn't automatically make it insightful.


I am sure that people at OpenAI, particularly former YC CEO Sam Altman, will be fine, even if they read the bad stuff MP_1729 says around here.


It’s reasonable criticism, and more useful than all the hype.


What is unreasonable about that comment?


Moving the goalposts directly after someone scores a goal.


Well, I for one am excited about this update, and skeptical about the AI scaling, and agree with everything said in the top comment.

I saw the update, was a little like “meh,” and was relieved to see that some people had the same reaction as me.

OP raised some pretty good points without directly criticizing the update. It's a good balance to the top comments (calling this *absolutely magic and stunning*) and to all of Twitter.

I wish more feedback on HN was like OP's.


People's "capacity for impatience" is literally the reason these things move so quickly. These are not feelings at odds with each other; they're the same thing. It's magical; now it's boring; where's the magic; let's create more magic.

Be impatient. It's a positive feeling, not a negative one. Be disappointed with the current progress; it's the biggest thing keeping progress moving forward. It also, if nothing else, helps communicate to OpenAI whether they're moving in the right direction.


> Be disappointed with the current progress; it's the biggest thing keeping progress moving forward.

No it isn't - excitement for the future is the biggest thing keeping progress moving forward. We didn't go to the moon because people were frustrated by the lack of progress in getting off of our planet, nor did we get electric cars because people were disappointed with ICE vehicles.

Complacency regarding the current state of things can certainly slow or block progress, but impatience isn't what drives forward the things that matter.


Tesla's corporate motto is literally "accelerating the world's transition to sustainable energy". Unhappy with the world's previous progress and velocity, they aimed to move faster.


It's pretty bizarre how these demos bring out keyboard warriors and cereal bowl yellers like crazy. Huge breakthroughs in natural cadence, tone and interaction, as well as realtime multimodal capability, and all the people on HN can rant about is token price collapse.

It's like the people in this community all suffer from a complete disconnect from society and normal human needs/wants/demands.


People fume and fret about startups wasting capital like it was their own money.

GPT and all the other chatbots are still absolutely magic. The idea that I can get a computer to create a fully functional app is insane.

Will this app make me millions and run a business? Probably not. Does it do what I want it to do? Mostly yes.


We're just logarithmic creatures


I’d say we are derivative creatures. ;)


Chair in the sky again...


Hah, was thinking of that exact bit when I wrote my comment. My version of "chair in the sky" is "But you are talking ... to a computer!!" Like remember stuff that was pure Star Trek fantasy until very recently? I'm sitting here with my mind blown, while at the same time reading comments along the lines of "How lame, I asked it some insanely esoteric question about one of the characters in Dwarf Fortress and it totally got it wrong!!"


The AI doesn’t behave like the computer in Star Trek, however. The way in which it is a different thing is what people don’t like.


They should have used superior Klingon Technology...


There are well-known cons to shipping so fast, but on the bright side, when everyone is demanding more, more, more, it pushes cost down and demands innovation, right?


Reminds me of the Louis CK bit about the internet:

https://youtube.com/watch?v=me4BZBsHwZs


Sounds like the Jeopardy answer for "What is a novelty?"


> How quickly we get used to that magic and demand more.

Humanity in a nutshell.


IMO, at the risk of being labeled a hype boy, this is absolutely a sign of the impending singularity. We are taking an ever-accelerating frame of cultural reference as a given, and our expectation is that exponential improvement is not just here but that you're already behind once you've released.

I spent the last two years dismayed by the reaction, but I've just recently begun to realize this is a feature, not a flaw. This is latent demand for the next iteration, expressed as impatient dissatisfaction with the current rate of change, inducing a faster rate of change. Welcome to the future you were promised.


I would disagree. I remember iPhones getting similarly criticized on here. And not iPhone 13 to 14, it was iPhone to iPhone 3G!

The only time people weren't displeased was when internet speeds increased from 15 Mbps to 100 Mbps.

You will keep being dismayed! People only like good things, not good things that potentially make them obsolete


Sorry we disagree. But I think we agree!!


You must be new here?


> Token volume needs to double just for revenue to stand still

I'm pretty skeptical about the whole LLM/AI hype, but I also believe that the market is still relatively untapped. I'm sure Apple switching Siri to an LLM would ~double token usage.

A few products rushed out thin wrappers on top of ChatGPT, developing pretty uninspiring chatbots of limited use. I think there's still huge potential for this LLM technology to be 'just' an implementation detail of other features, just running in the background doing its thing.

That said, I don't think OpenAI has much of a moat here. They were first, but there's plenty of others with closed or open models.


>Token volume needs to double just for revenue to stand still

Profits are the real metric. Token volume doesn't need to double for profits to stand still if operational costs go down.
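A toy sketch of that point, with entirely made-up numbers (nothing here is any provider's actual cost structure): flat volume plus a price cut can still mean higher profit if serving cost falls faster.

    # Hypothetical: token volume unchanged, price cut 50%, serving cost cut ~68%.
    volume_m = 1_000                      # million tokens served (unchanged)
    price_old, price_new = 30.0, 15.0     # $ per million tokens
    cost_old,  cost_new  = 25.0,  8.0     # hypothetical serving cost per million tokens

    revenue_old, revenue_new = volume_m * price_old, volume_m * price_new  # 30,000 -> 15,000
    profit_old = volume_m * (price_old - cost_old)   # $5,000
    profit_new = volume_m * (price_new - cost_new)   # $7,000
    print(revenue_old, revenue_new, profit_old, profit_new)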


This is why I think Meta has been so shrewd in their "open" model approach. I can run Llama3-70B on my local workstation with an A6000, which, after the up-front cost of the card, is just my electricity bill.

So despite all the effort and cost that goes into these models, you still have to compete against a “free” offering.

Meta doesn’t sell an API, but they can make it harder for everybody else to make money on it.
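To put a very rough number on "just my electricity bill", here's a back-of-envelope sketch. The wattage, throughput, and power price below are assumptions, not measurements of an actual A6000 setup.

    # Electricity-only cost of local inference, all inputs are guesses:
    # ~300 W draw under load, ~10 tokens/s for a quantized 70B model, $0.15/kWh.
    watts, tokens_per_s, usd_per_kwh = 300, 10, 0.15

    seconds_per_m_tokens = 1_000_000 / tokens_per_s                # ~27.8 hours
    kwh_per_m_tokens = watts / 1000 * seconds_per_m_tokens / 3600  # ~8.3 kWh
    usd_per_m_tokens = kwh_per_m_tokens * usd_per_kwh              # ~$1.25 per 1M tokens
    print(round(usd_per_m_tokens, 2))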


LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.

Whether or not that's actually enforceable[0], and whether or not other companies will actually challenge Facebook legal over it, is a different question.

[0] AI might not be copyrightable. Under US law, copyright only accrues in creative works. The weights of an AI model are a compressed representation of training data. Compressing something isn't a creative process so it creates no additional copyright; so the only way one can gain ownership of the model weights is to own the training data that gets put into them. And most if not all AI companies are not making their own training data...


> LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.

No, the license prohibits usage by Licensees who already had >700m MAUs on the day of Llama 3's release [0]. There's no hook to stop a company from growing into that size using Llama 3 as a base.

[0] https://llama.meta.com/llama3/license/


The whole point is that the license specifically targets their competitors while allowing everyone else so that their model gets a bunch of free contributions from the open source community. They gave a set date so that they knew exactly who the license was going to affect indefinitely. They don't care about future companies because by the time the next generation releases, they can adjust the license again.


Yes, I agree with everything you just said. That also contradicts what OP said:

> LLaMA still has an "IP hook" - the license for LLaMA forbids usage on applications with large numbers of daily active users, so presumably at that point Facebook can start asking for money to use the model.

The license does not forbid usage on applications with large numbers of daily active users. It forbids usage by companies that were operating at a scale to compete with Facebook at the time of the model's release.

> They don't care about future companies because by the time the next generation releases, they can adjust the license again.

Yes, but I'm skeptical that that's something a regular business needs to worry about. If you use Llama 3/4/5 to get to that scale then you are in a place where you can train your own instead of using Llama 4/5/6. Not a bad deal given that 700 million users per month is completely unachievable for most companies.


>How can people seriously expect these foundational model companies to make substantial revenue?

My take on this common question is that we haven't even begun to realize the immense scale at which we will need AI in all sorts of products, from consumer to enterprise. We will look back on the cost of tokens now (even at 50% of the price a year or so ago) with the same bewilderment as "having a computer in your pocket" compared to mainframes from 50 years ago.

For AI to be truly useful at the consumer level, we'll need specialized mobile hardware that operates on a far greater scale of tokens and speed than anything we're seeing/trying now.

Think "always-on AI" rather than "on-demand".


Sam Altman gave the impression that foundation models would be a commodity in his appearance on the All-In Podcast, at least in my read of what he said.

The revenue will likely come from application layer and platform services. ChatGPT is still much better tuned for conversation than anything else in my subjective experience, and I'm paying a premium because of that.

Alternatively it could be like search - where between having a slightly better model and getting Apple to make you the default, there’s an ad market to be tapped.


>This thing continues to reinforce my skepticism about AI scaling laws and the broad AI semiconductor capex spending.

Imagine you are in the 1970s saying computers suck, they are expensive, there aren't that many use cases... Fast forward to the 90s and you are using Windows 95 with a GUI and a chip astronomically more powerful than what we had in the 70s, and you can use productivity apps, play video games, and surf the Internet.

Give AI time, it will fulfill its true potential sooner or later.


That's the opposite of what I am saying.

What I am saying is that computers are SO GOOD that AI is getting VERY CHEAP and the amount of computing capex being done is excessive.

It's more like you are in 1999, people are spending $100B on fiber, while a lot of computer scientists are working on compression, multiplexing, etc.


>It's more like you are in 1999, people are spending $100B on fiber, while a lot of computer scientists are working on compression, multiplexing, etc.

But nobody knows what's around the corner or what the future brings... For example, back in the day Excite didn't want to buy Google for $1M because they thought that was a lot of money. You need to spend money to make money, and yeah, sometimes you need to spend a lot of money on "crazy" projects because it can pay off big time.


Was there ever a time when betting that computer scientists would not make better algorithms was a good idea?


Which of those investments are you saying would have been a poor choice in 1999?


All of them, without exception. Just recently, Sprint sold their fiber business for $1 lmfao. Or WorldCom. Or NetRail, Allied Riser, PSINet, FNSI, Firstmark, Carrier 1, UFO Group, Global Access, Aleron Broadband, Verio...

All the fiber companies went bust because, despite the internet's huge increase in traffic, the number of packets per fiber increased by several orders of magnitude.


But you’re saying investing in multiplexing and compression was also dumb?


Nope, I'm not


Then your overarching thesis is not very clear. Is it simply ‘don’t invest in hardware capital, software always makes it worthless’?


It's more like: don't invest $100B in capital when there are still orders-of-magnitude improvements in software to be made.


Did we ever get confirmation that GPT-4 was a fresh training run vs. increasingly complex training on more tokens on the base GPT-3 models?


gpt-4 was indeed trained on gpt-3 instruct series (davinci, specifically). gpt-4 was never a newly trained model


what are you talking about? you are wrong, for the record


They have pretty much admitted that GPT4 is a bunch of 3.5s in a trenchcoat.


They have not. You probably read "MoE" and some pop article about what that means without having any clue.


If you know better it would be nice of you to provide the correct information, and not just refute things.


gpt-4 is a sparse MoE model with ~1.2T params. this is all public knowledge and immediately precludes the two previous commenters' assertions


Where I work, in the hoary fringes of high-end tech, we can't secure enough token processing for our use cases. Token price decreases mean capacity opens up, but we immediately hit the boundaries of what we can acquire. We can't keep up with the use cases, and more than that, we can't develop tooling to harness things fast enough; the tooling we are creating is a quick hack.

I don't fear for the revenue of base model providers. But I think in the end the person selling the tools makes the most, and in this case I think it will continue to be the cloud providers. I think in a very real way OpenAI and Anthropic are commercialized charities driving change and rapidly commoditizing their own products, and it'll be infrastructure providers who win the high-end model game. I don't think this is a problem; I think this is in fact in line with their original charters, just a different path than how most people view nonprofit work. A much more capitalist and accelerated take.

Where they might make future businesses is in the tooling. My understanding from friends within these companies is their tooling is remarkably advanced vs generally available tech. But base models aren’t the future of revenues (to be clear tho they make considerable revenue today but at some point their efficiency will cannibalize demand and the residual business will be tools)


I'm curious now. Can you give color on what you're doing that you keep hitting boundaries? I suppose it isn't limited by human-attention.


Yes it’s limited by human attention. It has humans in the loop but a lot of LLM use cases come from complex language oriented information space challenges. It’s a lot of classification challenges as well as summarization and agent based dispatch / choose your own adventure with humans in the loop in complex decision spaces at a major finserv.


Tbf gpt4 level seems useful and better than almost everything else (or close if not). The more important barriers for use in applications have been cost, throughput and latency. Oh and modalities, which have expanded hugely.


> Since GPT-4 launch, token prices are falling 84% per year!! Good for mankind, but crazy for these companies

The message to competitor investors is that they will not make their money back.

OpenAI has the lead, in market and mindshare; it just has to keep it.

Competitors should realize they're better served by working with OpenAI than by trying to replace it; hence the Apple deal.

Soon model construction itself will not be about public architectures or access to CPUs, but a kind of proprietary black magic. No one will pay for an upstart's 97% when they can get a reliable 98% at the same price, so OpenAI's position will be secure.


Now, a bit of a shameless plug, but if you need an AI to take over your emails, then my https://getgabrielai.com should cover most use cases.

* Summarisation
* Smart filtering
* Smart automatic drafting of replies

Very much in beta, and summarisation is still behind feature flag, but feel free to give it a try.

By summarisation here I mean getting one email with all your unread emails summarised.



what do you actually expect from an "agent"?


Ask stuff like "Check whether there's some correlation between the major economies' fiscal primary deficits and GDP growth in the post-pandemic era" and get an answer.
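For concreteness, the analysis hiding behind that one-sentence request looks something like the sketch below; the figures are placeholders, and a real agent would have to fetch the actual series itself.

    # Sketch of what such an agent would need to run: correlate primary fiscal
    # deficit with real GDP growth across major economies. Placeholder numbers only.
    import pandas as pd

    df = pd.DataFrame({
        "primary_deficit_pct_gdp": [5.4, 3.1, 7.8, 2.2, 6.0],  # placeholder values
        "real_gdp_growth_pct":     [2.1, 1.4, 2.9, 0.8, 2.5],  # placeholder values
    }, index=["US", "DE", "JP", "UK", "FR"])

    print(df["primary_deficit_pct_gdp"].corr(df["real_gdp_growth_pct"]))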


GPT-2: February 2019

GPT-3: June 2020

GPT-3.5: November 2022

GPT-4: March 2023

There were 3 years between GPT-3 and GPT-4!


Obviously, I know these timetables.

But there's a night and day difference between post-Nov '22 and before. Both in the AI race it sparked, and in the funding all AI labs have.

If you're expecting GPT-5 by 2026, that's ok. Just very weird to me.


hardly anybody you are talking to even knows what gpt3 is, the time between 3.5 and 4 is what is relevant


It doesn't make any sense to look at it that way. Apparently the GPT-4 base model finished training in like late summer 2022, which is before the release of GPT-3.5. I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era. The advancement from GPT-3 to GPT-4 is what counts and it took 3 years.


I fully don't agree.

> I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era

Compute of the "GPT-3 era" vs the "GPT-3.5 era" is identical, this is not a distinguishing factor. The architecture is also roughly identical, both are dense transformers. The only significant difference between 3.5 and 3 is the size of the model and whether it uses RLHF.


Yes you're right about the compute. Let me try to make my point differently: GPT-3 and GPT-4 were models which, when they were released, represented the best that OpenAI could do, while GPT-3.5 was an intentionally smaller (than they could train) model. I'm seeing it as GPT-3.5 = GPT-4-70b. So to estimate when the next "best we can do" model might be released, we should look at the difference between the release of GPT-3 and GPT-4, not GPT-4-70b and GPT-4. That's my understanding, dunno.


GPT-4 only started training roughly at the same time/after the release of GPT-3.5, so I'm not sure where you're getting the "intentionally smaller".


Ah I misremembered GPT-3.5 as being released around the time of ChatGPT.


oh you remembered correctly, those are the same thing

actually i was wrong about when gpt-4 started training, the time i gave was roughly when they finished


"OpenAI is still working in GPT-4-level models."

This may or may not be true - just because we haven't seen GPT-5-level capabilities does not mean they don't exist yet. It is highly unlikely that what they ship is actually the full capability of what they have access to.


they literally launched a GPT-4 model TODAY!


Yeah I'm also getting suspicious. Also, all of the models (Opus, Llama 3, GPT-4, Gemini Pro) are converging to similar levels of performance. If the scaling hypothesis were true, we would see a greater divergence in model performance.


Plot model performance over the last 10 years and show me where the convergence is.

The graph looks like an exponential and is still increasing.

Every exponential is a sigmoid in disguise, but I don’t think there has been enough time to say the curve has flattened.


Two pushbacks.

1- The mania only started post-Nov '22, and the huge investments since then haven't meant substantial progress since the GPT-4 launch in March '23.

2- We are running out of high-quality tokens in 2024. (per Epoch AI)


GPT-4 launch was barely 1 year ago. Give the investments a few years to pay off.

I've heard multiple reports that training runs costing ~$1 billion are in the works at the major labs, and that the results will come in the next year or so. Let's see what that brings.

As for the tokens, they will find more quality tokens. It's like oil or other raw resources. There are more sources out there if you keep searching.


imho gpt4 is definitely [proto-]agi and the reason i cancelled my openai sub and am sad to miss out on talking to gpt4o is, openai thinks it's illegal, harmful, or abusive to use their model output to develop models that compete with openai. which means if you use openai then whatever comes out of it is toxic waste due to an arguably illegal smidgen of legal bullshit.

for another adjacent example, every piece of code github copilot ever wrote, for example, is microsoft ai output, which you "can't use to develop / otherwise improve ai," some nonsense like that.

the sum total of these various prohibitions is a data provenance nightmare of extreme proportion we cannot afford to ignore because you could say something to an AI and they parrot it right back to you and suddenly the megacorporation can say that's AI output you can't use in competition with them, and they do everything, so what can you do?

answer: cancel your openai sub and shred everything you ever got from them, even if it was awesome or revolutionary, that's the truth here, you don't want their stuff and you don't want them to have your stuff. think about the multi-decade economics of it all and realize "customer noncompete" is never gonna be OK in the long run (highway to corpo hell imho)


Simons is one of the greatest people and a true inspiration as a mathematician, even though my career drifted from academia. He and Andrew Wiles are the reason why I always say I am a mathematician, even though I work elsewhere.

RIP


Why do you admire Wiles so much?


I read Simon Singh during high school, and it was such a beautiful story of perseverance that I decided to do math.


Beware Goodheart's Law: "when a measure becomes a target, it ceases to be a good measure". If your goal is to stop wasting time fixing bugs, I'm sure you're going to be able to do that.

You should have an important counter-metric to make sure you're not messing up the software. It could be the number of reported bugs, crashes in production, etc.


Then it becomes the Challenger scenario. Various pieces are failing, but the whole mission succeeds, so everyone ignores the known risks because management is interested in their return on investment. That works right up until the rocket explodes, and suddenly there are lots of external people asking serious questions. Boeing is going through the same thing, having optimised for ROI as well, and its planes are now falling apart on a daily basis.

Who always gets in trouble for this? More often than not the developers and operators who in a high pressure environment optimised what they were told to optimise and gamed the metrics a little so they weren't fired or held back in their careers.


Naming it "muda" helps push it that way, too: If any of those higher-ups decide to look up the word, they'll see that you're calling bugfixing "pointless work".


Professional athletes have a lot of telemetry on them. But some of that telemetry makes sense during training, and maybe makes more sense for a brief period of time while they work on technique.

You focus on something intensely for a little while, get used to how it feels, then you work on something else. If your other numbers look wrong, or if it's been too long, we look at it again.


Nitpick: It's Goodhart (without the “e”).


Financial market applications of "transformers", not LLMs


It seems hard for people to take the nuanced position that GPT-4-level models already have the potential to improve many people's lives and corporations' bottom lines, while still being cautiously pessimistic about the next generation of models.

GPT-4, Claude 3 & Co. are simply too useful for certain coding tasks or to review a contract. Obviously, you need to understand you're dealing with a probabilistic being, and for many tasks it isn't the correct tool, but I use ChatGPT ~5 times a day and the $20 is super well spent. Now, there's an ocean between me liking to use ChatGPT and the promises from Silicon Valley and Microsoft.


Mark said in a podcast they are currently at MMLU 85, but it's still improving.


OpenAI is not the worst off; ChatGPT is used by 100M people weekly, so it's sort of insulated from benchmarks. The best of the rest, Anthropic, should be really scared.


Definitely not the case. Shareholders got back what was theirs by right.

There are dozens of CEOs, including CEOs of much larger companies like Satya Nadella and Tim Cook, and none of them got $2.2B in stock options that ended up being worth $55B. I want to stress that no one, ever, got a salary of $2.2B. Satya Nadella has only just now crossed the $1B compensation mark.

The market is full of CEOs that 10x-ed their companies' share prices and all they get is a decent salary.


Which other CEOs of companies as large as Tesla have 10x-ed their company’s share price in just five years?


Perhaps share price is not a good metric? And it would actually explain a lot of Musk's behavior in manipulating the stock price.


Given that the discussion is about whether or not the shareholder’s interests were violated, share price is a pretty good metric in this case.


Shareholder interests go beyond pump and dumping a stock.


Share price wasn't the metric used to determine the compensation though. It was revenue growth.


Nvidia is more than twice the size of Tesla currently and 17xed over the last 5 years.


Looking at Form 4s, Jensen got about $25 billion worth of shares between 2018 and 2023.

