Ilya Sutskever to leave OpenAI (twitter.com/ilyasut)
1124 points by wavelander 16 days ago | 782 comments



Interesting, both Karpathy and Sutskever are gone from OpenAI now. Looks like it is now the Sam Altman and Greg Brockman show.

I have to admit, of the four, Karpathy and Sutskever were the two I was most impressed with. I hope he goes on to do something great.


The top 6 science guys are long gone. OpenAI is now run by marketing, business, software, and productization people.

When the next wave of new deep learning innovations sweeps the world, Microsoft eats what's left of them. They make lots of money, but they don't have a future unless they replace what they lost.


AI has now evolved beyond just the science and its biggest issue is in the productization. Finding use cases for what's already available ALONG with new models will be where success lies.

ChatGPT is the number 1 brand in AI and as such needs to learn what it's selling, not how its technology works. It always sucks when mission and vision don't align with the nerds' ideas, but I think it's probably the best move for both parties.


> AI has now evolved beyond just the science

Pretty weak take there, bud. If we just look at the Gartner Hype Cycle that marketing and business people love so much, it would seem to me that we are at the peak, just before the downfall.

They are hyping hard to sell more, when they should be prepping for the coming dip, building their tech and research side more to come out the other side.

Regardless, a tech company without the inventors is doomed to fail.


I'm siding with you here. The same is happening at Google, but they definitely have momentum from past decades, so even if they go "full Boeing", there's a long way to fall.

Meanwhile, OpenAI (and the rest of the folks riding the hype train) will soon enter the trough. They're not diversified and I'm not sure that they can keep running at a loss in this post-ZIRP world.


You are on point, my friend. OpenAI might keep going on with its current momentum for several years without a doubt. But they have already lost the long-term game. The true essence of the OpenAI empire is not its showmen or PR guys, it's scientists like Ilya. What a shame to see Ilya leaving. I hope whatever he creates is a rival AI product that stops this OpenAI monopoly and puts up some interesting competition.


OpenAI launched GPT-3 on June 11, 2020. That's probably the biggest lead they'll ever have over the competition. Over the past several months, I've gotten to the point where OpenAI could vanish tomorrow and I honestly wouldn't miss it. Claude 3 Opus, DBRX, and Llama 3 are at least as good at the tasks that I spend most of my time doing. And if Google can get itself figured out, Gemini Pro has a lot of potential.

>ChatGPT is the number 1 brand in AI and as such needs to learn what it's selling, not how its technology works.

I'm not as in tune as some people here so: don't they need both? With the rate at which things are moving, how can it be otherwise?


They do need both, but it seems like they have enough engineering talent to keep improving. Time will tell now that Ilya is out, but I expect they have enough cultural cache to attract excellent engineers even if they aren't as famous as Ilya and Karpathy.

They have a strong focus on making the existing models fast and cheap without sacrificing capability which is music to the ears of those looking to build with them.


I love the way you used "cultural cache" here Lol. In any case I do hope whatever Ilya is building is some sort of AI competition to stop this OpenAI monopoly.


> With the rate at which things are moving

Things have been moving fast because we had a bunch of top notch scientists in companies paired with top notch salesmen/hype machines. But you need both in combination.

Hypemen make promises that can't be kept, but get absurd amounts of funding for doing so. Scientists fill in as many of the gaps as possible, but also get crazy resources due to the aforementioned funding. Obviously this train can't go on forever, but I think you might understand that one of these groups is a bit more important than the other, while the other group is more of a catalyst (it makes things happen faster) for the first.


I guess their point is that you already have a lot out there to create new products, and you can still read papers; you just won't be writing them.


“Its biggest issue is in the productization.”

That's not true at all. The biggest issue is that it doesn't work. You can't actually trust AI systems, and that's not a product issue.


> That’s not true at all. The biggest issue is that it doesn’t work. You can’t actually trust ai systems and that’s not a product issue.

I don't know about that, it seems to work just fine at creating spam and clone websites.


Ilya is one of the founders of the original nonprofit. This is also an issue. It does look like he was not a founder of, or in any control of, the for-profit venture.


> You can't actually trust AI systems

For a lot of (very profitable) use cases, hallucinations and 80/20 are actually more than good enough. Especially when they are replacing solutions that are even worse.


What use cases? This kind of thing is stated all the time, but never with any examples.


Any use case where you treat the output like the work of a junior person and check it. Coding, law, writing. Pretty much anywhere that you can replace a junior employee with an LLM.

Google or Meta (don't remember which) just put out a report about how many human-hours they saved last year using transformers for coding.


All the use cases we see. Take a look at Perplexity optimising short internet research. If it gets this mostly right, it's fine enough; it saved me 30 minutes of mindless clicking and reading - even if some errors are there.


You make it sound like LLMs just make a few small mistakes when in reality they can hallucinate on a large scale.


What are examples of these (very profitable) use cases?

Producing spam has some margin on it, but is it really very profitable? And what else?


It works fine for some things. You just need a clearly defined task where LLM + human reviewer is on average faster (ie cheaper) than a human doing the same task themselves without that assistance.


Given the fact that you need to review, research, and usually correct every detail of AI output, how can that be faster than just doing it right yourself in the first place? Do you have some examples of such tasks?


Yes. There's the one that $employer built a POC app for and found did in fact save time. There's also GitHub Copilot, which apparently a large chunk of people find saves time for them (and which $employer is reviewing to figure out if they can identify which people / job functions benefit precisely enough to pay for group licensing).


If the AI is the product, and the product isn't trustworthy, isn't that a product issue?


It’s a core technology issue.

The AI isn’t the product, e.g. the ChatGPT interface is the main product that is layered above the core AI tech.

The issue is trustworthiness isn’t solvable by applying standard product management techniques on a predictable schedule. It requires scientific research.


To this end, OpenAI is already off track. Their “GPT marketplace” or whatever they’re calling it is just misguided flailing from a product perspective.


Ain't there a pattern where innovation comes in waves, and the companies of the first wave most often just die, but the second and third waves build upon their artifacts and can be successful over the longer run?

I see this coming for sure for OpenAI, and I do my part by just writing this comment on HN.


Yes, "the pioneers get all the arrows".

From a business point of view, you don't want to be first to market. You want to be the second or third.


Or they were experimenting with people defining agentic A.I. slightly before it became more widely popular.


> ChatGPT is the number 1 brand in AI

Not for long. They have no moat. Folks who did the science are now doing science for some other company, and will blow the pants off OpenAI.


I think you massively underestimate the power of viral media coverage and the role it plays in building a “brand.” You’ll never replicate the Musk/Altman/Satya soap opera again. ChatGPT will forever be in the history books as the Kleenex of LLM AI.


“You are a bad user, I am a good bing!”


Eh maybe from a company point of view.

But this race to add 'AI' into everything is producing a lot of nonsense. I'd rather go full steam ahead on the science and the new models, because that is what will actually get us something decent, rather than milking what we already have.


Agree in general. While issues remain in making/using AI, there is plenty of utility that doesn't require new science, just maturation of deployment. For those who say it's junk, I can only speak for myself and disagree.


If we look at the history of innovation and invention, it's very typical that the original discovery and final productization are done by different people. For many reasons, but a lot of them are universal, I would say.

E.g. Oppenheimer’s team created the bomb, then following experts finetuned the subsequent weapon systems and payload designs. Etc.


> If we look at the history of innovation and invention, it's very typical that the original discovery and final productization are done by different people.

You don't really need to look at history, that's basically science vs engineering in a nutshell.

Maybe history could tell us whether that's an accident or a division that arose 'naturally', but I suppose it's a question for an economist or psychologist or sociologist how natural that could really be anyway, or whether it's biased by, e.g., academics not being financially motivated because there happens to be no money there; so they don't care about productising, leaving it for others who are so motivated.


With A-bombs, for weapons systems design they needed people who just got huge kicks out of explosions (not kidding here). I guess it's partially about personal internal motivations, and it might be more a matter of chance whether the thing you are intrinsically motivated to do falls under engineering or science (in both cases you get the feeling the greats did stuff they wanted to do regardless of the categorizations applied to their discipline - you get more capital affinity in engineering ofc).


> that's basically science vs engineering in a nutshell

Right, because those are two very different things. Science is about figuring out truths of how reality works. Engineering is about taking those truths and using them to make useful things.

People often talk in a way that conflates the two, but they are completely different activities.


Except OpenAI hasn’t yet finished discovery on its true goal: AGI. I wonder if they risk plateauing at a local maximum.


How do you know that is actually still their true goal? It appears to me that the goal has shifted, and the people who cared about the original goal are leaving.


I’m pretty sure Altman still cares about full AGI, if only because of the power inherent in achieving that goal.


I don't think that their goal is achievable (at least not within any of our lifetimes), so I think that "plateauing" is inevitable.


I'm genuinely curious: what do you expect an "AGI" system to be able to do that we can't do with today's technology?


An AGI could replace human experts at tasks that don't require physical embodiment, like diagnosing patients, drafting contracts, doing your taxes, etc. If you still do those manually and don't just offload all of it to ChatGPT, then you would greatly benefit from a real AGI that could do those tasks on its own.

And no, using ChatGPT like you use a search engine isn't ChatGPT solving your problem; that is you solving your problem. ChatGPT solving your problem would mean it drives you, not you driving it like it works today. When I hired people to help me do taxes, they told me what papers they needed and then they did my taxes correctly without me having to look it all through and correct them. An AGI would work like that for most tasks; it means you no longer need to think or learn to solve problems, since the AGI solves them for you.


> An AGI could replace human experts at tasks that don't require physical embodiment, like diagnosing patients, drafting contracts, doing your taxes, etc.

How come the goal posts for AGI are always the best of what people can do?

I can't diagnose anyone, yet I have GI.

Reminds me of:

> Will Smith: Can a robot write a symphony? Can a robot take a blank canvas and turn it into a masterpiece?

> I Robot: Can you?


> How come the goal posts for AGI are always the best of what people can do?

Not the best, I just want it to be able to do what average professionals can do because average humans can become average professionals in most fields.

> I can't diagnose anyone, yet I have GI.

You can learn to; an AGI system should be able to learn to as well. And since we can copy AGI learning, if it hasn't learned to diagnose people yet then it probably isn't an AGI, because an AGI should be able to learn that without humans changing its code. And once it has learned it once, we copy it forever, and now the entire AGI knows how to do it.

So, the AGI should be able to do all the things you could do if we include all versions of you that learned different fields. If the AGI can't do that then you are more intelligent than it in those areas, even if the singular you isn't better at those things than it is.

For these reasons it makes more sense to compare an AGI to humanity rather than individual humans, because for an AGI there is no such thing as "individuals", at least not the way we make AI today.


People with severe Alzheimer's cannot learn, but still have general intelligence.


If they can't learn then they don't have general intelligence; without learning there are many problems you won't be able to solve that average (or even very dumb) people can solve.

Learning is a core part to general intelligence, as general intelligence implies you can learn about new problems so you can solve those. Take away that and you are no longer a general problem solver.


That's a really good point. I want to define what I think of intelligence as being so we are on the same page: it is the combination of knowledge and reason. An example of a system with high knowledge and low reason is Wikipedia. An example of a system with high reason and low knowledge is a scientific calculator. A highly intelligent system exhibits aspects of both.

A rule based expert intelligence system can be highly intelligent, but it is not general, and maybe no arrangement of rules could make one that is general. A general intelligence system must be able to learn and adapt to foreign problems, parameters, and goals dynamically.


Yes, I think that makes sense; you can be intelligent without being generally intelligent. By some definitions the person with Alzheimer's can be more intelligent than someone without, but the person without is more generally intelligent thanks to the ability to learn.

The classic example of a generally intelligent task is to get the rules for a new game and then play it adequately; there are AI contests for that. That is easy for humans to do - games are enjoyed even by dumb people - but we have yet to make an AI that can play arbitrary games as well as even dumb humans.

Note that LLMs are more general than previous AIs thanks to in-context learning, so we are making progress, but still far from as general as humans are.


Let's take a step back from LLMs. Could you accept the network of all interconnected computers as a generally intelligent system? The key part here that drives me to ask this is:

> ChatGPT solving your problem would mean it drives you, not you driving it like it works today.

I had a very bad Reddit addiction in the past. It took me years of consciously trying to quit in order to break the habit. I think I could make a reasonable argument that Reddit was using me to solve its problems, rather than myself using it to solve mine. I think this is also true of a lot of systems - Facebook, TikTok, YouTube, etc.

It's hard to pin down all computers as an "agent" in the way we like to think about that word and assign some degree of intelligence to, but I think it is at least an interesting exercise to try.


Companies are general intelligences and they use people, yes. But that depends on humans interpreting the data Reddit users generate and updating their models, code, and algorithms to adapt to that data; the computer systems alone aren't general intelligences if you remove the humans.

An AGI could run such a company without humans anywhere in the loop, just like humans can run such a company without an AGI helping them.

I'd say a strong signal that AGI has happened would be large, fully automated companies without a single human decision-maker in the company, no CEO, etc. Until that has happened I'd say AGI isn't here. If it does happen it could be AGI, but I can also imagine a good enough script doing it for some simple thing.


The simplest answer, without adding any extraordinary capabilities to the AGI that veer into magical intelligence, is to have AI assistants that can seamlessly interact with technology the way a human assistant would.

So, if you want to meet with someone, instead of opening your calendar app and looking for an opening, you'd ask your AGI assistant to talk to their AGI assistant and set up a 1h meeting soon. Or, instead of going on Google to find plane tickets, you'd ask your AGI assistant to find the most reasonable tickets for a certain date range.

This would not require any special intelligence more advanced than a human's, but it does require a very general understanding of the human world that is miles beyond what LLMs can achieve today.

Going only slightly further with assumptions about how smart an AGI would be, it could revolutionize education, at any level, by acting as a true personalized tutor for a single student, or even for a small group of students. The single biggest problem in education is that it's impossible to scale the highest quality education - and an AGI with capabilities similar to a college professor would entirely solve that.


The examples you're providing seem to have been thoroughly solved already.

I'm at the European AI Conference for our startup tomorrow, and they use a platform that just booked me 3 meetings automatically with other people there based on our availability... It's not rocket science.

And you don't even need those narrow tools. You could easily ask GPT-4o (or lesser versions) something along the lines of:

> "you're going to interact with another AI assistant to book meetings for me: [here would be the details about the meeting]. Come up with a protocol that you'll send to the other assistant so it can understand what the meetings are about, communicate you their availability, etc. I want you to come up with the entire protocol, send it, and communicate with the other assistant end-to-end. I won't be available to provide any more context ; I just want the meeting to be booked. Go."


GPT-4(o) lacks the ability to connect to any of the tools needed to achieve what I'm describing. Sure, it maybe could give instructions about how this could be done, but it can't actually do it. It can't send an email to your email account, and it can't check your incoming emails to see if any arrived asking for a meeting. It can't then check your calendar, and propose another email, or book a time if the time is available. It doesn't know that you normally take your lunch at some time, so that even though the spot is free, you wouldn't want a meeting at that time. And even if you did take the considerable amount of effort to hook it up with all of these systems, its failure rate is still far too high to rely on it for such a thing.

And getting it to actually buy stuff like plane tickets on your behalf would be entirely crazy.

Sure, it can be made to do some parts of this for very narrowly defined scenarios, like the specific platform of a single three day conference. But it's nowhere near good enough for dealing with the general case of the messy general world.


Here's what's strange about your argument.

I had a (human) assistant in my previous business, super-smart MBA type, and by your definition she wasn't a general intelligence on the day of onboarding:

- she didn't have access to my email account or calendar

- she didn't know my usual lunch time hours

- she didn't have a company card yet.

All of those points you're raising are logistics, not intelligence.

Intelligence is "When trying to achieve a goal, can you conceive of a plan to get there despite adverse conditions, by understanding them and charting/reviewing a sequence of actions".

You can definitely be an intelligent entity without hands or tools.


I'm pretty certain your assistant learned to do all of those things more or less on her own. Of course, you shared your schedule and email with them, and similarly, you'd have to share your schedule and email with an AGI.

But you certainly didn't have to write a special program for your assistant to integrate with your inbox, they just used an existing email/calendar client and looked at their screen.

GPT-4 is nowhere near able to interact with, say, the Gmail web page at this level. And even if you created the proper integrations, it's nowhere near the level that it could read all incoming email and intelligently decide, with high accuracy, which emails necessitate updates to your calendar, which don't, and which necessitate back-and-forth discussions to negotiate a better date for you.

Sure, your assistant didn't know all of this on day one, but they learned how to do it on their own, presumably with a few dozen examples at most. That is the mark of a general intelligence.


I think we're disagreeing on the current capacity of models, as much as we're disagreeing about the definition of AGI.

I'm pretty sure, from previous interactions with GPT-4o and from their demos, that if you used their desktop app (which enables screensharing) and asked it to tell you where to click, step-by-step, in the Gmail web page, it would be able to do a pretty good job of navigating through it.

Let's remember that the Gmail UI is one of the most heavily documented (in blogs, FAQs, support pages, etc) in the world. I can't see GPT-4o having any difficulty locating elements in there.


I think the intelligence part is to think of any potential logistical obstacles and figure out ways to deal with them with minimal disruption, except when disruption is necessary because of potential conflicts with other goals.

> Sure, it maybe could give instructions about how this could be done [...]

If you were in a room with no computer, would you consider yourself to be not intelligent enough to send an email? Does the tooling you have access to change your level of intelligence?


This is definitely an interesting way to look at it. My initial reaction is to consider that I can enhance the capabilities of a system without increasing its intelligence. For example, if I give a monkey a hammer, it can do more than it could do when it didn't have the hammer, but it is not more intelligent (though it could probably learn things by interacting with the world with the hammer). That leads me to think: can we enhance the capabilities of what we call "AI systems" to do these things, without increasing their intelligence? It seems like you can glue GPT-4o to some calendar APIs to do exactly this. This seems more like an issue of tooling rather than an issue of intelligence to me.

I guess the issue here is: can a system be "generally intelligent" if it doesn't have access to general tools to act on that intelligence? I think so, but I also can see how the line is very fuzzy between an AI system and the tools it can leverage, as really they both do information processing of some sort.

Thanks for the insight.


I'm sure some aspects of this can be achieved by manually programming GPT-4 links to other specific services. And obviously, some interaction tools would have to be written manually even for an AGI.

The difference though is the amount of work. Today if you wanted GPT-4 to work as I describe, you would have to write an integration for Gmail, another one for Office365, another one for Proton etc. You would probably have to create a management interface to give access to your auth tokens for each of these to OpenAI so they can activate these interactions. The person you want to sync with would have to do the same.

In contrast, an AGI that only has average human intelligence, or even below, would just need access to, say, Firefox APIs, and should easily be able to achieve all of this. And it would work regardless of whether the other side is a different AGI using a different provider, or just a regular human assistant.


What if you ask GPT-4 to write the integration between its API and an email provider? You're not really "manually" creating the integration then.


You can try that. I don't think it will be as reliable as you'd want for something like this.


> The single biggest problem in education is that it's impossible to scale the highest quality education

Do you work in education? Because I don't think many who do would agree with this take.

Where I live, the single biggest problem in education is that we can't scale staffing without increasing property taxes, and people don't want to pay higher property taxes. And no, AGI does not fix this problem, because you need staff to be physically present in schools to deal with children.

Even if we had an AGI that could do actual presentation of coursework and grading, you need a human being in there to make sure they behave and to meet the physical needs of the students. Humans aren't software to program around.


Having individual tutors for each child is not often discussed because it is self-evidently impossible at any cost whatsoever - it would require far too high a percentage of a country's workforce to be dedicated to education. But it is the thing most responsible for the difference between the education the elites get, especially the elites of the past, and general education.

Sure, this doesn't mean you could just fire all teachers and dissolve all schools. You still need people to physically be there and interact with the children in various ways. But if you could separate the actual teaching from the child care part, and if you could design individualized courses for each child with something approaching the skill of the best teachers in the whole world, you would get an inconceivably better educational system for the entire population.

And I don't need to work in education for much of this. Like everyone else, I was intimately acquainted with the educational system (in my country) for 16 years of my life through direct experience, and much more since then through increasingly less direct experience. I have very good and very direct experience of the variance between teachers and the impact that has on how well students understand and interact with the material.


That's like claiming you know how to run a restaurant because you like to eat out. Or worse actually, since you're extrapolating your individual experience from a small set of educational systems to education as a whole.

If you're looking for insight into the problems faced in education, speak to educators. I really doubt they would tell you that the quality of individual instructors is their biggest problem.


Educators don't like to discuss the performance of other educators, as most professionals don't like to diss their colleagues, especially not in front of their customers. But the quality of educators is absolutely a huge problem, so huge that there are even well-worn sayings about it (those who can, do; those who can't, teach). So huge that one of the most well-known rock anthems of all time is about the poor quality of educators (Pink Floyd's Another Brick in the Wall Part II).

Educators are the best people to ask about how to make their jobs easier. They are not necessarily the best people to ask about how to make children's education better.

Edit:

> That's like claiming you know how to run a restaurant because you like to eat out.

No, it's like claiming you know some things about the problems of restaurants, and about the difference between good and bad restaurants, after spending 8+ hours a day almost every day, for 16 years, eating out at restaurants. Which I think would be a decent claim.


> This would not require any special intelligence more advanced than a human's, but it does require a very general understanding of the human world that is miles beyond what LLMs can achieve today.

Does it? I am quite certain those things are achievable right now without anything like AI in the sense being discussed here.


Show me one product that can offer me an AI assistant that can set up a meeting with you at a time that doesn't contradict any of our plans, given only my and your email address.


I've never looked into actual products as this isn't something I'm interested in. I'm just saying that accomplishing this can be done without involving AI of the sort being discussed here. I'm not sure what such AI would bring to the table for this sort of task.

> given only my and your email address.

AI or not, such an application would need more than just email addresses. It would need access to our schedules.


My point is that an AGI would give you this use case for free. Currently this kind of product, AI or not, simply doesn't exist. It's in principle doable, but the number of integrations required makes it uneconomical. An AGI assistant could use the same messy interfaces we use, and thus it would be compatible with every email provider and client ever created.

> AI or not, such an application would need more than just email addresses. It would need access to our schedules.

It needs access to my schedule, yes, but it only needs your email address. It can then ask you (or your own AGI assistant) if a particular date and time is convenient. If you then propose another time, it can negotiate appropriately.


A working memory that can preserve information indefinitely outside a particular context window and which can engage in multi-step reasoning that doesn't show up in its outputs.

GPT-4o's context window is 128k tokens, which is somewhere on the order of 128 kB. Your brain's context window, all the subliminal activations from the nerves in your gut and the parts of your visual field you aren't necessarily paying attention to, is on the order of 2 MB. So a similar order of magnitude, though GPT has a sliding window and your brain has more of an exponential decay in activations. That LLMs can accomplish everything they do just with what seems analogous to human reflex rather than human reasoning is astounding and more than a bit scary.


I'm curious what resources led you to calculate a 2MB context window, I'd like to learn more.


Looking up an estimate of the brain's input bandwidth at 10 million bits per second and multiplying by the second or two a subliminal stimulus can continue to affect a person's behavior. This is a very crude estimate and probably an order of magnitude off, but I don't think many orders of magnitude off.
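If anyone wants to sanity-check that figure, here's the back-of-envelope arithmetic; both inputs are the numbers quoted above, not measurements:

  # ~2 MB "context window" estimate from the figures above
  bits_per_second = 10_000_000   # assumed brain input bandwidth
  seconds = 2                    # how long a subliminal stimulus keeps acting
  megabytes = bits_per_second * seconds / 8 / 1_000_000
  print(megabytes)               # 2.5, i.e. on the order of 2 MB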


Some first ideas coming to mind:

Engineering Level:

  Solve CO2 Levels
  End sickness/death
  Enhance cognition by integrating with willing minds.
  Safe and efficient interplanetary travel.
  Harness vastly higher levels of energy (solar, nuclear) for global benefit.
Science:

  Uncover deeper insights into the laws of nature.
  Explore fundamental mysteries like the simulation hypothesis, Riemann hypothesis, multiverse theory, and the existence of white holes.
  Effective SETI
 
Misc:

  End of violent conflicts
  Fair yet liberal resource allocation (if still needed), "from scarcity to abundance"


The problem with CO2 levels is that no one likes the solution, not that we don't have one. I highly doubt adding AGI to the mix is going to magically make things better. If anything we'll just burn more CO2 providing all the compute resources it needs.

People want their suburban lifestyle with their red meat and their pick-up truck or SUV. They drive fuel inefficient vehicles long-distances to urban work environments and they seem to have very limited interest in changing that. People who like detached homes aren't suddenly affording the rare instances of that closer to their work. We burn lots of oil because we drive fuel inefficient vehicles long distances. This is a problem of changing human preferences which you just aren't going to solve with an AGI.


Assuming embedded AI in every piece of robotics - sometimes directly, sometimes connected to a central server (this is doable even today) - it'll revolutionize industries: human-less mining, processing, manufacturing, services, and transportation. These factories would eventually produce and install enough solar power or build sufficient nuclear plants and energy infrastructure, making energy clean and free.

With abundant electric cars (at this future point in time) and clean electricity powering heating, transportation, and manufacturing, some AIs could be repurposed for CO2 capture.

It sounds deceptively easy, but from an engineering standpoint, it likely holds up. With free energy and AGI handling labor and thinking, we can achieve what a civilization could do and more (because no individual incentives come into play).

However, human factors could be a problem: protests (luddites), wireheading, misuse of AI, and AI-induced catastrophes (alignment).


Having more energy is intrinsically dangerous, though, because it's indiscriminate: more energy cannot enable bigger solutions without also enabling bigger problems. Energy is the limiting factor to how much damage we can do. If we have way more of it, all bets are off. For instance, the current issue may be that we are indirectly cooking the planet through CO2 emissions, so capturing that sounds like a good idea. But even with clean energy, there is a point where we would cook the planet directly via waste heat of AI and gizmos and factories and whatever unforeseen crap we'll conjure just because we can. And given our track record I'm far from confident that we wouldn't do precisely that.


This exactly. Every self-replicating organism will eventually use all the energy available to it; there will never be an abundance. From the dawn of time, mankind has similarly used every bit of energy it generates. From the perspective of a subsistence farmer in the 1600s, if you told them how much energy would be available in 400 years, they would think we surely must live in paradise with no labor. Here we are, still metaphorically tilling the land.


The incentives aren't structured properly for these things to happen, it has always been a sci-fi fairy tale that AGI would achieve these things.


Do you believe the average human has general intelligence, and do you believe the average human can intellectually achieve these things in ways existing technology cannot?


Yes, considering that AI operates differently from human minds, there are several advantages:

  AI does not experience fatigue or distractions => consistent performance.
  AI can scale its processing power significantly, despite the challenges associated with it (I understand the challenges)
  AI can ingest and process new information at an extraordinary speed.
  AIs can rewrite themselves
  AIs can be replicated (solving scarcity of intelligence in manufacturing)
  Once achieving AGI, progress could compound rapidly, for better or worse, due to the above points.


The first AGI will probably take way too much compute to have a significant effect. Unless there is a revolution in architecture that gets us fast and cheap AGI at once, the AGI revolution will be very slow and gradual.

A model that is as good as an average human but costs $10,000 per effective man-hour to run is not very useful, but it is still an AGI.


> A model that is as good as an average human but costs $10,000 per effective man-hour to run is not very useful, but it is still an AGI.

Geohot (https://geohot.github.io/blog/) estimates that a human brain equivalent requires 20 PFLOPS. Current top-of-the-line GPUs are around 2 PFLOPS and consume up to 500W. Scaling that linearly results in 5kW, which translates to approximately 3 EUR per hour if I calculate correctly.
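The arithmetic behind that, with the implied electricity price spelled out as an assumption (roughly European retail pricing; everything else is the figure quoted above):

  brain_pflops = 20        # Geohot's estimate for a human-brain equivalent
  gpu_pflops = 2           # top-of-the-line GPU
  gpu_watts = 500          # power draw per GPU
  eur_per_kwh = 0.60       # assumed electricity price

  gpus = brain_pflops / gpu_pflops          # 10 GPUs
  kilowatts = gpus * gpu_watts / 1000       # 5 kW
  print(kilowatts * eur_per_kwh)            # ~3 EUR per hour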


That is if the first model we make is as efficient as a human.


now do magical flying unicorn ponies, which I understand there is also considerable demand for


Prove the Riemann Hypothesis


Can you do that, are you not generally intelligent, or is that a bad metric for general intelligence? At least one of these is true.


self consciousness?


And how do you test for that?


Probably the best thing we can do at the moment is compile a list of ways in which we shouldn't test for intelligence.


That's probably a good thing


I don't feel that OpenAI has a huge moat against, say, Anthropic. And I don't know that OpenAI needs Microsoft nearly as much as Microsoft needs OpenAI.


But is it even clear what the next big leap after LLMs is? I have the feeling many tend to extrapolate the progress of AI from the last 2 years to the next 30 years, but research doesn't always work like that (though improvements in computing power did).


Extrapolating 2 years might give you a wrong idea, but extrapolating the last year suggests that making another leap like GPT-3 or GPT-4 is much, much more difficult. The only considerable breakthrough I can think of is Google's huge context window, which I hope will be the norm one day, but in terms of actual results they're not mind-blowing yet. We see little improvements every day and for sure there will be some leaps, but I wouldn't count on a revolution.


Unlike AI in the past, there are now massive amounts of money going into AI. And the number of things humans still do significantly better than AI is going down continuously now.

If something like Q* is provided organically with GPT-5 (which may have a different name), and allows proper planning, error correction, and direct interaction with tools, that gap gets really close to 0.


AI in the past (adjusted for 1980s) was pretty well funded. It's just that fundamental scientific discovery bears little relationship to the pallets of cash.


Funding in the 1980s was sometimes very good. My company bought me an expensive Lisp Machine in 1982 and after that, even in “AI winters” it mostly seemed that money was available.

AI has a certain mystique that helps get money. In the 1980s I was on a DARPA neural network tools advisory panel, and I concurrently wrote a commercial product that included the 12 most common network architectures. That allowed me to step in when a project was failing (a bomb detector we developed for the FAA) that used a linear model, with mediocre results. It was a one day internal consult to provide software for a simple one hidden layer backprop model. During that time I was getting mediocre results using symbolic AI for NLP, but the one success provided runway internally in my company to keep going.


That funding may have felt good at the time compared to some other academic fields.

But compared to the 100s of billions (possibly trillions, globally) that is currently being plowed into AI, that's peanuts.

I think the closest recent analogy to the current spending on AI, was the nuclear arms race during the cold war.

If China is able to field ASI before the US even has full AGI, nukes may not matter much.


You are right about funding levels, even taking inflation into account. Some of the infrastructure, like Connection Machines and Butterfly Machines seemed really expensive at the time though.


They only seem expensive because they're not expected to generate a lot of value (or military/strategic benefit).

Compare that to the 6+ trillion dollars that were spent in the US alone on nuclear weapons, and then consider: what is of greater strategic importance, ASI or nukes?


> AI in the past (adjusted for 1980s) was pretty well funded.

A tiny fraction of the current funding. 2-4 orders of magnitude less.

> It's just that fundamental scientific discovery bears little relationship to the pallets of cash

Heavy funding may not automatically lead to breakthroughs such as Special Relativity or Quantum Mechanics (though it helps there too). But once the most basic ideas are in place, massive funding is what causes the breakthroughs, like in the Manhattan Project and Apollo Program.

And it's not only the money itself. It's the attention and all the talent that is pulled in due to that.

And in this case, there is also the fear that the competition will reach AGI first, whether the competition is a company or a foreign government.

It's certainly possible that the ability to monetize the investments may lead to some kind of slowdown at some point (like if there is a recession).

But it seems to me that such a recession will have no more impact on the development of AGI than the dotcom bust had for the importance of the internet.


> A tiny fraction of the current funding. 2-4 orders of magnitude less.

Operational costs were correspondingly lower, as they didn't need to pay electricity and compute bills for tens of millions concurrent users.

> But once the most basic ideas are in place, massive funding is what causes the breakthroughs, like in the Manhattan Project and Apollo Program.

There is no reason to think that the ideas are in place. It could be that a local optimum has been reached, as has happened in many other technology advances before. The current model is mass-scale and data-driven; the Internet has been sucked dry for data and there's not much more coming. This may well require a substantial change in approach, and so far there are no indications of that.

From this pov monetization is irrelevant, as except for a few dozen researchers the rest of the crowd are expensive career tech grunts.


> There is no reason to think that the ideas are in place.

That depends on what you mean when you say "ideas". If you consider ideas at the level of transformers, well then I would consider those ideas of the same magnitude as many of the ideas the Manhattan Project or Apollo Program had to figure out on the way.

If you mean ideas like going from expert systems to neural networks with backprop, then that's more fundamental and I would agree.

It's certainly still conceivable that Penrose is right in that "true" AGI requires something like microtubules to be built. If so, that would be on the level of going from expert systems to NNs. I believe this is considered extremely exotic in the field, though. Even LeCun probably doesn't believe that. Btw, this is the only case where I would agree that funding is more or less irrelevant.

If we require 1-2 more breakthroughs on par with Transformers, then those could take anything from 2-15 years to be discovered.

For now, though, those who have predicted that AI development will mostly be limited by network size and the compute to train it (like Sutskever or implicitly Kurzweil) have been the ones most accurate in the expected rate of progress. If they're right, then AGI some time between 2025-2030 seems most likely.

Those AGI's may be very large, though, and not economical to run for a wider audience until some time in the 30's.

So, to summarize: Unless something completely fundamental is needed (like microtubules), which happens to be a fringe position, AGI some time between 2025 and 2040 seems likely. The "pessimists" (or optimists, in term of extinction risk) may think it's closer to 2040, while the optimists seem to think it's arriving very soon.


IMO their next big leap will be to get it cheap enough and integrated with enough real time sources to become the default search engine.

You can really flip the entire ad-supported industry upside down if you integrate with a bunch of publishers and offer them a deal where they are paid every time an article from their website is returned. If they make this good enough, people will pay $15-20 a month for an ad-free search engine.


I don't think we're even close to exhausting the potential of transformer architectures. GPT-4o shows that a huge amount can be gained by implementing work done on understanding other media modalities. There's a lot of audio that they can continue to train on still, and the voice interactions they collect will go into further fine-tuning.

Even after that plays out there will be video to integrate next, and thanks to physics simulations and 3D rendering there is a potentially endless and readily generated license-free supply of it, at least for the simpler examples. For more complex real-world video they could just set up webcams in public areas around the world where consent isn't required by law and collect masses of data every second.

Given that audio seems to have enabled emotional understanding and possibly even humour, I can't imagine what all might fall out of video. At the least it's going to improve reasoning, since it will involve predicting cause and effect. There are probably a lot of other modalities you could add, though we don't have large datasets for them.


Not saying it’s going to be the same, but I’m sure computing progress looked pretty unimpressive from, say, 1975 to 1990 for the uninitiated.

By the 90s they were still mainly used as fancy typewriters by “normal” people (my parents, school, etc) although the ridiculous potential was clear from day one.

It just took a looong time to go from pong to ping and then to living online. I’m still convinced even this stage is temporary and only a milestone on the way to bigger and better things. Computing and computational thought still has to percolate into all corners of society.

Again not saying “LLM’s” are the same, but AI in general will probably walk a similar path. It just takes a long time, think decades, not years.

Edit: wanted to mention The Mother of All Demos by Engelbart (1968), which to me looks like it captures all essential aspects of what distributed online computing can do. In a “low resolution”, of course.


Computing progress from 78 to 90 was mind-blowing.

1978: the Apple ][. 1 MHz 8-bit microprocessor, 4 KB of RAM, monochrome all-caps display.

1990: Mac IIci, 25 MHz 32-bit CPU, 4 MB RAM, 640x480 color graphics, and an easy-to-use GUI.

Ask any of us who used both of these at the time: it was really amazing.


They were amazing, and the progress was incredible, but both of those computers - while equally exciting and delightful to people who saw the potential - were met with ‘but what can I actually use it for?’ from the vast majority of the population.

By 1990 home computer use was still a niche interest. They were still toys, mainly. DTP, word processing and spreadsheets were a thing, but most people had little use for them - I had access to a Mac IIci with an ImageWriter dot matrix around that time and I remember nervously asking a teacher whether I would be allowed to submit a printed typed essay for a homework project - the idea that you could do all schoolwork on a computer was crazy talk. By then, tools like Mathematica existed but as a curiosity not an essential tool like modern maths workbooks are.

The internet is what changed everything.


A big obstacle was that everything was on paper. We still had to do massive amounts of data entry.

For some strange reason HTML forms are an incredibly impotent technology. Pretty standard things are missing, like radio buttons with an "other" text input. 5000+ years ago the form labels aligned perfectly with the values.

I can picture it already, ancient Mesopotamia, the clay tablet needs name and address fields for the user to put their name and address behind. They pull out a stamp or a roller.

Of course, if you have a computer you can have stamps with localized name and address formatting, complete with validation, as a basic building block of the form. Then you have a single clay file with all the information neatly wrapped together. You know, a bit like that e-card no one uses, only without half the data mysteriously hidden from the record by some ignorant clerk saboteur.

We've also failed to hook up devices to computers. We went from the beautiful serial port to IoT hell with subscriptions for everything. One could go on all day like that: payments, arithmetic, identification, etc. Much work still remains. I'm unsure what kind of revolution would follow.

Talking thinking machines will no doubt change everything. That people believe it is possible is probably the biggest driver. You get more people involved, more implementations, more experiments, more papers, improved hardware, more investments.


> The internet is what changed everything.

Broadband. Dial-up was still too much of an annoyance, too expensive.

Once broadband was ubiquitous in the US and Europe, that's when the real explosion of computer usage happened.


Honestly mobile totally outstrips this.

One day at work about 10-15 years ago I looked at my daily schedule and found that on that day my team were responsible for delivering a 128kb build of Tetris and a 4GB build of Real Racing.


I agree. Likewise, early AI models to GPT-4 is breathtaking progress.

Regular people shrug and say, yeah sure, but what can I do with it. They still do to this day.


It was only 11 years from pong to ping.


You and your family and friends were online in 1983? That’s quite remarkable.


No, but that’s when “ping” was written, which is what you said.

(And, irrelevant, but my parents were in fact both posting to Usenet in 1983.)


Kind of missing the forest for the trees, but TIL the actual application called ping was written in 1983.


mobile internet and smartphones were the real gamechanger here, which were definitely not linear.

They became viable in the 2000s, let's say 2007 with the iPhone, and by the late 2010s everyone was living online, so "decades" is a stretch.


To make the 2000s possible, decades of relatively uninteresting progress was made. It quickly takes off from there.


I don't think it particularly matters right now (practically speaking). It's going to take years for businesses and product companies to commoditize applications of LLMs, so while it's valuable for the Ilyas & Andrejs of the world to continue the good work of hard research, it's the startups, hyperscalers, and SaaS companies who are creating business applications for LLMs that are going to be the near-term focus.


In just a couple of generations each training cycle will cost close to $10 billion. That's a lot of cheddar that you have to show ROI on.


The majority of the developers may know what LLMs are in an abstract sense, but I meet very few that really realize what these are. These LLMs are an exponential leap in computational capability. The next revolution is going to be when people realize what we have already, because it is extremely clear the majority do not. RAG? Chatbots? Those applications are toys compared to what LLMS can do right now, yet everyone is dicking around making lusty chatbots or naked celebrities in private.


> The next revolution is going to be when people realize what we have already

Enlighten us


It is both subtle and obvious, yet many are missing this: if you want/need a deep subject matter expert in virtually any subject, write a narrative biography describing your expert using the same language that expert would use to describe themselves; this generates a context within the LLM carrying that subject matter expertise, and now significantly higher quality responses are generated. Duplicate this process for several instances of your LLM, creating a home brewed collection of experts, and have them collectively respond to one's prompts as a group privately, and then present their best solution. Now there is a method of generating higher reliability responses. Now turn to the fact that the LLMs are trained on an Internet corpus of data that contains the documentation and support forums for every major software application; using the building blocks described so far, it is not difficult at all to create agents that sit between the user and pretty much every popular software application and act as co-authors with the user helping them use that application.
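To make that concrete, here is a minimal sketch of the persona-priming idea using the OpenAI Python SDK; the model name, persona text, and question are illustrative assumptions, not the actual setup described here:

  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  # A narrative biography written in the voice the expert would use themselves.
  tax_attorney = (
      "You are a senior tax attorney with 20 years of experience advising "
      "small businesses on entity structure, deductions, and audits. You cite "
      "the rules you rely on and flag anything needing human review."
  )

  def ask_expert(persona, question):
      """Run one question through a persona-primed 'expert' context."""
      response = client.chat.completions.create(
          model="gpt-4o",
          messages=[
              {"role": "system", "content": persona},
              {"role": "user", "content": question},
          ],
      )
      return response.choices[0].message.content

  # Several differently primed experts can answer the same prompt, and a
  # final call can compare or merge their answers into one response.
  print(ask_expert(tax_attorney, "Can I deduct a home office?"))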

I have integrated 6 independent, specialized "AI attorneys" into a project management system where they are collaborating with "AI web developers", "AI creative writers", "AI spreadsheet gurus", "AI negotiators", "AI financial analysts" and an "AI educational psychologist" that looks at the user, the nature and quality of their requests, and makes a determination of how much help the user really needs, modulating how much help the other agents provide.

I've got a separate implementation that is all home solar do-it-yourself, that can guide someone from nothing all the way to their own self made home solar setup.

Currently working on a new version that exposes my agent creation UI with a boatload of documentation, aimed at general consumers. If one can write well, as in write quality prose, that person can completely master using these LLMs and get superior results.


>I have integrated 6 independent, specialized "AI attorneys" into a project management system where they are collaborating with "AI web developers", "AI creative writers", "AI spreadsheet gurus", "AI negotiators", "AI financial analysts" and an "AI educational psychologist" that looks at the user, the nature and quality of their requests, and makes a determination of how much help the user really needs, modulating how much help the other agents provide.

Ah yes, "it's so obvious no one sees it but me". Until you show people your work, and have real experts examining the results, I'm going to remain skeptical and assume you have LLMs talking nonsense to each each other.


The point is these characters are not doing the work for people; they co-author the work with them. It's just like working with someone highly educated but with next to no experience - they're a great help, but ya gotta look at their work to verify they are on track. This is the same, but with a collection of inexperienced PhDs. The LLMs really are idiot savants, and when you treat them like that they respond to expectations better.


I'm at a law firm; this is in use with attorneys to great success. And no, none of them are so dumb that they don't verify the LLM's outputs.


How can no one see what we have today? You only need six instances of an LLM running at the same time, with a system to coordinate between them, and then you have to verify the results manually anyway. Sign me up!


If a certain percent of the work is completed through research synthesis and multiple perspective alignment, why is said novel approach not worth lauding?

I've created a version of one of the resume GPTs that analyses my resume's fit to a position when fed the job description along with a lookup of said company. I then have a streamlined manner in which it points out what needs to be further highlighted or omitted in my resume. It then helps me craft a cover letter based on a template I put together. Should I stop using it just because I can't feed it 50 job roles and have it automatically select which ones to apply to and then create all necessary changes to documents and then apply?


Epic… and likely not a single one of these "experts" can solve even a basic goat problem https://x.com/svpino/status/1790624957380342151


But at some point, probably in the near future, they will. And then this system I have will already be in place, and that added capability will just arrive and integrate into all the LLM-integrated systems I've made, and they'll just improve.


I agree with OP; I think we still have no idea yet what dreams may come of the LLMs we have today. So no one will be able to "enlighten us" - perhaps not until we're looking in the rear-view mirror.

I would say instead, stay tuned.


LLM is all you need

Attention and scale is all you need

Anything else you do will be overtaken by LLM when it builds its internal structures

Well, LLM and MCTS

The rest is old news. Like Cyc


There are no moats in deep learning, everything changes so fast.

They have the next iteration of GPT that Sutskever helped finalize. OpenAI has lost its future unless they find new people of the same caliber.


> They have the next iteration of GPT that Sutskever helped finalize

How do you know that they have the next GPT?

How do you know what Sutskever contributed? (There was talk that the most valuable contributions came from the less well-known researchers, not from him.)


sha256:e33135417f7f5b8f4a1c98c28cf26330bea4cc6b120765f59f5d518ea0ce80e5


What is this supposed to mean?


Isn't access to massive datasets and computation the moat? If you and your very talented friends wanted to build something like GPT-4, could you?

It's going to get orders of magnitude less expensive, but for now, the capital requirements feel like a pretty deep moat.


How do you know massive datasets are required? Just because that’s how current LLMs operate, doesn’t mean it’s necessarily the only solution.


Then the resources needed to discover an alternative to brute-forcing a large model are a huge barrier.

I think academia and startups are currently better suited to optimizing TinyML and edge AI hardware/compilers/frameworks, etc.


I don't know. Being able to get Azure credits has paid out really well for OpenAI, as a business in constant need of compute.


Which is a very short-term advantage. And Anthropic gets AWS credits; which would you rather have?


Never discount the value of short term advantages.

Being first at the start (i.e. first mover advantage) is huge.


Given Amazon's no show in the AI space? Azure. By a mile.


Except if you're Anthropic or OpenAI you don't care about what your compute provider has done in the AI space - you care about the compute power they can give you.


That's exactly what I'm talking about: https://i.imgur.com/sZ3tniY.jpeg


But how many of those are ordered specifically for OpenAI, and are on order as a result of them to begin with? Do you think that if we were in a parallel universe where OpenAI had ended up partnering with Google or Amazon instead, the GPU shipments would look the same? I think they would show a pretty similar lion's share going to wherever OpenAI ended up doing all their compute.

Your claim was that people should care about compute based on what the provider has done in the AI space, but Microsoft was pretty far behind on that side until OpenAI - Google was really the only player in town. Should they have wanted GCP credits instead? Do you care about their AI results or the ex post facto GPU shipments?

Or, if what you actually want to argue is that Anthropic would be able to get more GPUs with Azure than AWS or GCP then this is a different argument which is going to require different evidence than raw GPU shipments.


The claim being implied was that Anthropic was in a better position because they had partnered with AWS rather than Azure and thus would have more access to GPUs.

That isn't the case, at all. All I'm stating is what the chart clearly shows - Azure has invested deeply in this technology, and at a rate that far exceeds AWS.


They seem to have a huge "money moat" now. Partnerships with Apple and MS mean they have a LOT of money to try a lot of things I guess.

Before the Apple partnership, maybe it seemed like the moat was shrinking, but I'm not so sure now.

Likely they have access to a LOT of data now too.


OpenAI most definitely needs the compute from MSFT. It could certainly swap out to another provider, but given that Microsoft invested via credits, that would be problematic. They have enmeshed their futures.


How important are top science guys, though? OpenAI has a thousand employees and almost unlimited money, and LLMs are better understood now; I would guess continuous development will beat singular genius heroes?


> OpenAI has a thousand employees and almost unlimited money

You could say the same about Google - and yet they missed the consequences of their own discovery and got behind instead of being leaders. So you need specific talent to pull this off even if in theory you can hire anybody.


I am just curious how this happened to Google. Who were the product managers or others who didn't see an opportunity here, exactly where the whole thing was invented, when they already had huge amounts of data: basically the whole web, plus an amount of video that no one else can ever hope to have?


I’m 100% positive lots of people at Google were chomping at the bit to productize LLMs early on.

But the reality is, LLMs are a cannibalization threat to Search. And the Search Monopoly is the core money making engine of the entire company.

Classic innovator's dilemma. No fat-and-happy corporate executive would ever say yes to putting lots of resources behind something risky that might also kill the golden goose.

The only time that happens at a big established company, is when driven by some iconoclastic founder. And Google’s founders have been MIA for over a decade.


The golden goose is already being hoisted onto a spit, and your company is not even going to get the drippings of the fat. I am surprised by the short-sightedness of execs.


I don’t work there, I’ve just worked for lots of big orgs — they are all the same. Any claimed uniqueness in “Organizational structure” and “culture” are just window dressing around good ol’ human nature.

It’s not short sightedness, it’s rational self-interest. The rewards for taking risk as employee #20,768 in a large company are minimal, whereas the downside can be catastrophic for your career & personal life.


I think the power of LLMs was almost stumbled upon at OpenAI; they certainly didn't set out with the goal of creating them. AFAIK they had one person doing a project to create a language model from Amazon review text data, and only off the back of playing around with that did they realise its potential.


Data volume isn't that important; that's becoming clearer now. What OpenAI did was pay for a bunch of good labelled data. I'm convinced that's basically the differentiator. It's not an academic or fundamental thing to do, which is why Google didn't do it; it's a purely practical product thing.


Well for one, Ilya was poached from Google to work for OpenAI to eventually help build SOTA models.

Fast forward to today and we a discussing the implications of him leaving OpenAI on this very thread.

Evidence to support the notion that you can’t just throw mountains of cash and engineers at a problem to do something truly trailblazing.


A lot of it was the unwillingness to take risk. LLMs were, and still are, hard to control, in terms of making sure they give correct and reliable answers, making sure they don't say inappropriate things that hurt your brand. When you're the stable leader you don't want to tank your reputation, which makes LLMs difficult to put out there. It's almost good for Google that OpenAI broke this ground for them and made people accepting of this imperfect technology.


It's hard to invest millions in employees who are likely to leave to a competitor later. That's very risky, aka venture.


So the alternative is to...?


Difficult to quantify but as an example the 2017 scientific paper “Attention is all you need” changed the entire AI field dramatically. Without these landmark achievements delivered by highly skilled scientists, OpenAI wouldn’t exist or only be severely limited.


And ironically, even the authors did not fully grasp the paper's importance at the time. Reminds me of when Larry Page and Sergey Brin tried to sell Google for $1 million ...


It depends on your views on LLMs

If your view is that LLMs only need minor improvements to their core technology and that the major engineering focus should be placed on productizing them, then losing a bunch of scientists might not be seen as that big of a deal.

But if your view is that they still need to overcome significant milestones to really unlock their value... then this is a pretty big loss.

I suppose there's a third view, which is: LLMs still need to overcome significant hurdles, but solutions to those hurdles are a decade or more away. So it's best to productize now, establish some positive cashflow and then re-engage with R&D when it becomes cheaper in the future and/or just wait for other people to solve the hard problems.

I would guess the dominant view of the industry right now is #1 or #3.


They definitely need them to find the kinds of new approaches you won't otherwise stumble upon.


Talk to Microsoft


Most of every large business isn't science but getting organized, costs controlled, products made, risk managed, and so forth.


OpenAI has less than 800 employees


Agreed - it's good to have some far-thinking innovation, but that can be acquired as needed, so you really just need a few people with their pulse on innovation, and there will always be more of those outside a given company than within it.

Right now it's all about reducing transaction costs, small-i innovating, onboarding integrations, maintaining customer and stakeholder trust, getting content, managing stakeholders, and selling.


What an absurd thing to say.

John Schulman is still at OpenAI. As are many others.


Jakub Pachocki is taking over as chief scientist. https://analyticsindiamag.com/meet-jakub-pachocki-openais-ne...


> Open AI is run by marketing, business, software and productization people.

AKA 'the four horsemen of enshittification'.


> They make lots of money

Will they, though? Last I heard OpenAI isn't profitable, and I don't know if it's safe to assume they ever will be.

People keep saying that LLMs are an existential threat to search, but I'm not so sure. I did a quick search (didn't verify in any way if this is a feasible number) to find that Google on average makes about 30 cents in revenue per query. They make a good profit on that because processing the query costs them almost nothing.

But if processing a query takes multiple seconds on a high-end GPU, is that still a profitable model? How can they increase revenue per query? A subscription model can do that, but I'd argue that a paywalled service immediately means they're not a threat to traditional ad-supported search engines.
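A back-of-envelope Python sketch of that comparison (every input below is an assumption picked only to illustrate the calculation, not a reported figure):

  # Hypothetical unit economics for one LLM-served query.
  # All values are assumed for illustration; real serving costs also involve
  # multiple GPUs per model, batching, idle capacity, and free-tier traffic.
  gpu_hour_usd = 2.50          # assumed rental price of one high-end GPU
  seconds_per_query = 5.0      # assumed GPU-seconds to generate one answer
  gpu_cost_per_query = gpu_hour_usd / 3600.0 * seconds_per_query

  search_revenue_per_query = 0.30   # the unverified per-query figure cited above

  print(f"GPU cost per query:       ${gpu_cost_per_query:.4f}")
  print(f"Search revenue per query: ${search_revenue_per_query:.2f}")

Even under these made-up numbers the open question stays the same: the raw GPU time per answer may be modest, but an answer-style interface has no obvious way to earn anything like the per-query ad revenue of a results page, and a paywall takes it out of competition with free, ad-supported search.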


I honestly think that is the best course of action for humanity. Even less chance of seeing AGI anytime soon if he leaves.


"Productization". You mean "enshitification".


Depends on who is controlling the product.


> When the next wave of new deep learning innovations sweeps the world,

that won't happen, the next scam will be different

It was crypto until FTX collapsed; then the usual suspects, led by a16z, leaned on OpenAI to rush whatever they had to market, hence the odd naming of ChatGPT 3.5.

When the hype is finally realized to be just mass printing bullshit -- relevant bullshit, yes, which sometimes can be useful but not billions of dollars of useful -- there will be something else.

Same old, same old. The only difference is there are no new catchy tunes. Yet? https://youtu.be/I6IQ_FOCE6I https://locusmag.com/2023/12/commentary-cory-doctorow-what-k...


Cryptocurrencies have the potential to grow the world economy by about 1-3%, as banking fees go down. Other uses of crypto may double or triple that, but that's really speculative.

AI, on the other hand, has a near infinite potential. It's conceivable that it will grow the global economy by 2% OR MORE per MONTH for decades or more.

AI is going to be much more impactful than the internet. Probably more than internal combustion, the steam engine and electricity combined.

The question is about the timescale. It could take 2 years before it really starts to generate profits, or it could take 10 or even more.


>Cryptocurrencies have the potential to grow the world economy by about 1-3%, as banking fees go down.

Bank fees don't disappear into the ether when they're collected, so I doubt they have this much effect.

Oh, I made my very first retail purchase with Bitcoin the other day. While the process was pretty slick and easy, the network charged $15.00 in fees. Long way to go until "free".


> Bank fees don't disappear into the ether when they're collected, so I doubt they have this much effect.

1-3% was intended as a ceiling for what cryptocurrency could bring to the economy, after adjusting for the reduction in inflation once those costs are gone.


He's saying the fees aren't burned like they are with mining, so they don't hurt the economy by the full amount of the fee: the profit portion goes into other investments. The fees hinder parts of the economy by making some transactions nonviable, but they don't fully translate to "friction" producing waste heat; it's something more adiabatic that goes back into the system. It's largely an extraneous spring in the system, not a damper.


That's why I stated that 1-3% was the ceiling


I agree with you on the AI point, but with crypto not everything is as it seems.

Yes, you may see short-term growth, but that is solely due to there being less regulation.

Despite what many people think, regulation is a good thing, put in place to avoid the excesses that lead to lost livelihoods. It stops whales from exploiting the poor and gives central banks tools to try to avoid depressions.

Cost-wise, banks acting as trust authorities can actually, in theory, be cheaper too.


Well, I agree with all that. The 1-3% was meant to come off as a tiny, one-time gain, and an optimistic estimate of that. Not at all worth the hype.

Basically, crypto is more like a gold rush than a tech breakthrough. And gold rushes rarely lead to much more than increased inflation.


Absolutely


Do you have a source for any of these numbers, or is this just your speculation? I haven't seen any estimates from well-known institutions that reference any of the numbers you are pointing to.


All crypto"currencies" with a transaction fee are negative-sum games and, as such, a scam. It's been nine years since the Washington Post, admittedly somewhat clumsily, drew attention to this, and people still insist it's something other than a scam. Despite heady articles about how it's going to solve world hunger, it's just a scam.

This round of AI is only capable of producing bullshit. Relevant bullshit, but bullshit. That can be useful https://hachyderm.io/@inthehands/112006855076082650 but it doesn't mean it's more impactful than the Internet.


I agree, 1-3% was a best case. While I agree it's a net zero, even those who argue for it really don't claim much more than a couple of percent.

I actually expected objections in the opposite direction. But then, this is not Twitter/X.

The point is that something that can easily generate 20%-100% growth per year (AGI/ASI) is so much more important that the best case prediction for crypto's effect on the economy are not even noticeable.

That's why comparing the crypto bubble to AI is so meaningless. Crypto was NEVER* going to be something hugely important, while AI is potentially almost limitless.

*If crypto had anything to offer at all, it would be ways to avoid fees, taxes, and transaction tracing.

The thing is, if crypto at any point seriously threatens to replace traditional currencies as stores of value in the US or EU, it will be banned instantly. Simply because it would make it impossible for governments to run budget deficits, prevent tax evasion, and do several other things that governments care about.


LLM is not AGI and there's no way to AGI from LLM. Put down the kool-aid.


I never claimed LLMs are AGI. Not all neural nets are LLMs.


AI? Yes.

LLMs pretending to be AI? No.


What you call "AI" is generally named AGI. LLM's are alredy a kind of AI, not just generic enough to fully replace all humans.

We don't know if full AGI can be built using just current technology (like transformers) given enough scale, or if 1 or more fundamental breakthroughs are needed beyond just the scale.

My hypothesis has always been that AGI will arrive roughly when compute power and model size match the human brain. That means models of about 100 trillion params, roughly the commonly cited estimate for the number of synapses in the brain, which is not that far away now.


> We don't know if full AGI can be built using just current technology (like transformers) given enough scale,

We absolutely do and the answer is such a resounding no it's not even funny.


Actually, we really don't. When GPT-3.5 was released, it was a massive surprise to many, exactly because they didn't believe simply scaling up transformers would end up with something like that.

Now, using transformers doesn't mean they have to be assembled like LLMs. There are other ways to stitch them together to solve a lot of other problems.

We may very well have the basic types of Lego pieces needed to build AGI. We won't know until we try to build all the brain's capacities into a model with a few hundred trillion parameters.

And if we actually lack some types of pieces, they may even be available by then.


Karpathy is still a mountain in the area of ML/AI, one of the few people worth following closely on Twitter/X.


I don’t think people give Dario enough credit


Yeah, I think his leaving was a huge blow to OpenAI that they have maybe not yet recovered from. Clearly there is no moat in transformer-based LLM development (other than money), but in terms of pace of development (insight as to what is important) I think Anthropic have the edge, although Reka are also storming ahead at an impressive pace.


I love Karpathy. He's like a classical polymath, a scholar and a teacher.


Jakub Pachocki is still at OpenAI, though.


Greg Brockman is a very good engineer. And that's maybe even more important in the current situation.



The scenario I have in my head is that they had to override the safety team's objections to ship their new models before Google IO happened.


The "safety" team can go eat grass.

I don't believe in AI "safety measures" any more than I do in kitchen cleaver safety measures.

That is, nothing beyond "keep out of kids' reach" and "don't use it like an idiot" but let the cleaver be a damn cleaver.


> That is, nothing beyond "keep out of kids' reach" and "don't use it like an idiot" but let the cleaver be a damn cleaver.

Neither of which will be enforced with AI


Exactly, just like I can't "enforce" another person to not be an idiot about anything.


A cleaver isn't going to try to kill you without someone holding it....


I genuinely don't get your point, you mean as opposed to an LLM ... ?


There goes the so called superalignment:

Ilya

Jan Leike

William Saunders

Leopold Aschenbrenner

All gone


Resignations lead to more resignations....unless mgmt. can get on top of it and remedy it quickly, which rarely happens. I've seen it happen way too many times working 25 years in tech.


This might not be bad from the perspective of the remaining employees, it might be that the annoying people are leaving the room.


Or that just the aggressive snakes are left.

I have no idea; I’m just saying I’ve seen that happen in companies.


In my experience, the good ones leave first, followed by those who enjoyed working with them and/or those who are no longer able to get work done.


You need to think about OpenAI specifically - Ilya basically attempted a coup last year and failed, stayed at the company for months afterwards, according to rumours had limited contributions to the research breakthroughs, and was assigned to lead the most wishy-washy project, superalignment.

I’m not seeing “the good ones” leaving in this case.


So Satya Nadella paid $13 billion to have....Sam Altman :-))


Perhaps Altman will fail upwards once again to become CEO of Microsoft


Are you suggesting that the OAI investment was not a good investment for MS?


Have they earned a return on it yet?

Seriously asking; I've purchased a GitHub Copilot subscription, but I don't know what their sales numbers are doing on AI in general. It remains to be seen whether it can be made more cost-efficient to deliver to consumers.


Checking MSFT price, seems like the market thinks they made the right move and the shareholders are for sure seeing a return.


The market thinks Tesla is worth more than all other automakers combined, that GameStop is a reasonable investment, and laying off engineers is great!


Until one day it doesn’t. It’s very fickle.


But the market is always right.


Because Tesla is. Unlike the traditional automakers, which have no room for growth and are in perpetual stagnation, Tesla has potential, being partly an automotive and partly a tech company. They could even have their own mobile phones if they wanted to. Or robots and stuff.

What can Mercedes, Porsche, and Audi do aside from continuing to produce cars over and over again until they are overtaken by somebody else? Hell, both the EU and the USA need tariffs to compete with Chinese automakers.


Not quite. Tesla has a high valuation mostly because traditional auto carries an enormous amount of debt on its balance sheets. I think Tesla is one economic downturn in a high-interest-rate environment away from meeting the same fate. Once an auto company is loaded with debt, it gets stuck in a low-margin cycle where the little profit it makes has to go into new debt for retooling and factories. Tesla is very much still coasting on the zero-interest-rate, free-VC-money times.


An increased company valuation does indeed reflect the expectation of future profits, but until those profits hit the balance sheet they are unrealized.


What % of stock movements do you attribute to OAI, vs the cash-generation behemoth that is Windows/Office/Azure?


Nah, they paid for the brand.


And, you know, a company with $2B of revenue


That for sure loses money on every prompt...


I guess if they really thought we had something to worry about, they would've stayed just to steer things in the right direction.

Doesn't seem like they felt it was required.

Edit: I'd love to know why the downvotes; it's an opinion, not a political statement. This community is quite off lately.

Is this a highly controversial statement? Are people truly worried about the future, and this is just an anxiety-based reaction?


Doesn't the whole Altman sacking thing show that they had no power to do any steering, and in fact Altman steers?


Daniel “Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI”

“I think AGI will probably be here by 2029, and could indeed arrive this year”

Kokotajlo too.

We are so fucked


I am sorry, but there must be some hidden tech, some completely different approach, behind this talk about AGI.

I really, really doubt that transformers will become AGI. Maybe I am wrong, I am no expert in this field, but I would love to understand the reasoning behind this "could arrive this year", because it reminds me of cold fusion :X

edit: maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it. It's not that I have soaked up so many books that I can just use a probabilistic function to "guess" which word should come next.


Don't worry, this is the "keep the bridge intact" kind of talk from people leaving a supposedly glorious workplace. I have worked at several places, and when people left (usually the best-paid ones), they posted LinkedIn/Twitter posts to say kudos and to suggest that the business will be at the forefront of its particular niche this year or soon, and that they would be proud to have ever been part of it.

Also, when they speak about AGI, it raises their (the departing person's) market value, since others already know they were brilliant enough to have worked on something cool, and they might also know some secret sauce which could be acquired at lower cost by hiring them immediately[1]. I have seen this kind of talk play out too many times. Last January, one of the senior engineers at my current workplace in aviation left, hinting at something super secret coming this year or soon, and they were immediately hired by a competitor with generous pay to work on that said topic.


> Also, when they speak about AGI, it raises their(person leaving) marketing value

Why yes, of course Jan Leike just impromptu resigned and Daniel Kokotajlo just gave up 85% of his wealth in order not to sign a resignation NDA to do what you're describing...


While he'll be giving up a lot of wealth, it's unlikely that any meaningful NDA will be applied here. Maybe for products, but definitely not for their research.

There's very few people who can lead in frontier AI research domains - maybe a few dozen worldwide - and there are many active research niches. Applying an NDA to a very senior researcher would be such a massive net-negative for the industry, that it'd be a net-negative for the applying organisation too.

I could see some kind of product-based NDA, like "don't discuss the target release dates for the new models", but "stop working on your field of research" isn't going to happen.


Kokotajlo: “To clarify: I did sign something when I joined the company, so I'm still not completely free to speak (still under confidentiality obligations). But I didn't take on any additional obligations when I left.

Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.

Basically I wanted to retain my ability to criticize the company in the future.“

> but "stop working on your field of research" isn't going to happen.

We’re talking about an NDA; obviously non-competes aren’t legal in CA.

https://www.lesswrong.com/posts/kovCotfpTFWFXaxwi/?commentId...


> Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least.

Percentages are nice, but with money and wealth, absolute numbers matter too. You can live a very, very good life even if you lose 85%, provided the remaining 15% is USD $1M. And not signing that NDA may help you land another richly paying job, plus the freedom to say whatever you feel is important to say.


> truly understanding… when I explain something, I have understood it

When you have that feeling of understanding, it is important to recognize that it is a feeling.

We hope it’s correlated with some kind of ability to reason, but at the end of the day, you can have the ability to reason about things without realising it, and you can feel that you understand something and be wrong.

It’s not clear to me why this feeling would be necessary for superhuman-level general performance. Nor is it clear to me that a feeling of understanding isn’t what being an excellent token predictor feels like from the inside.

If it walks and talks like an AGI, at some point, don’t we have to concede it may be an AGI?


I would say understanding usually means the ability to connect the dots and see the implications … not a feeling.


Okay, what if I put it like this: there is understanding (ability to reason about things), and there is knowing that you understand something.

In people, these are correlated, but one does not necessitate the other.


No I’m with you on this. Next token prediction does lead to impressive emergent phenomena. But what makes people people is an internal drive to attend to our needs, and an LLM exists without that.

A real AGI should be something you can drop in to a humanoid robot and it would basically live as an individual, learning from every moment and every day, growing and changing with time.

LLMs can’t even count the number of letters in a sentence.


>LLMs can’t even count the number of letters in a sentence.

It's a consequence of tokenization. They "see" the world through tokens, and the tokenization rules depend on the specific tokenizer you're using. It's like making someone blind and then claiming they are not intelligent because they can't tell red from green. That's just how they perceive the world, and it tells you nothing about their intelligence.
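For what it's worth, you can see the token view directly with a tokenizer library. A minimal Python sketch using tiktoken (assuming it's installed; "cl100k_base" is just one commonly used encoding, and the example sentence is mine):

  # Illustrates why letter-counting is awkward for a model that sees tokens, not characters.
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")   # one common BPE encoding
  sentence = "strawberry has three r's"

  token_ids = enc.encode(sentence)
  pieces = [enc.decode([t]) for t in token_ids]

  print(token_ids)   # integer token ids; one id often spans several characters
  print(pieces)      # the text chunks the model actually "sees"
  print(len(sentence), "characters vs", len(token_ids), "tokens")

The character count and the token count generally differ, which is exactly the mismatch described above.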


But it limits them; they cannot be AGI then, because a child who can count could do it :)


You seem generally intelligent. Can you tell how many letters are in the following sentence?

"هذا دليل سريع على أنه حتى البشر الأذكياء لا يمكنهم قراءة ”الرموز“ أو ”الحروف“ من لغة لم يتعلموها."


I counted very quickly, but 78? I learned Arabic in kindergarten; I'm not sure what your point was. There are Arabic spelling bees and an alphabet song, just like in English.

The comment you replied to was saying LLMs trained on English can't count letters in English.


LLMs aren't trained in English with the same granularity that you and I are.

So my analogy here stands : OP was trained in "reading human language" with Roman letters as the basis of his understanding, and it would be a significant challenge (fairly unrelated to intelligence level) for OP to be able to parse an Arabic sentence of the same meaning.

Or:

You learned Arabic, great (it's the next language I want to learn so I'm envious!). But from the LLM point of view, should you be considered intelligent if you can count Arabic letters but not Arabic tokens in that sentence?


Is this even a fair comparison? Are we asking an LLM to count letters in an alphabet it never saw?


Yes, it sees tokens. Asking it to count letters is a little bit like asking that of someone who never learned to read/write and only learned language through speech.


From that AGI definition, AGI is probably quite possible and reachable - but also something pointless which there are no good reasons to "use", and many good reasons not to.


LLMs could count the number of letters in a sentence if you stopped tokenizing them first.


Tokenization is not the issue - these LLMs can all break a word into letters if you ask them.


This paper and other similar works changed my opinion on that quite a bit. It shows that to perform text prediction, LLMs build complex internal models.

https://news.ycombinator.com/item?id=38893456


> maybe the term has changed again. AGI to me means truly understanding, maybe even some kind of consciousness, but not just probability... when I explain something, I have understood it.

The term, and indeed each initial, means different things to different people.

To me, even InstructGPT manages to be a "general" AI, so it counts as AGI — much to the confusion and upset of many like you who think the term requires consciousness, and others who want it to be superhuman in quality.

I would also absolutely agree LLMs are not at all human-like. I don't know if they do or don't need the various missing parts in order to change the world into a jobless (u/dys)topia.

I also don't have any reason to be for or against any claim about consciousness, given that word also has a broad range of definitions to choose between.

I expect at least one more breakthrough architecture on the scale of Transformers before we get all the missing bits from human cognition, even without "consciousness".

What do you mean by "truly understanding"?


> when I explain something, I have understood it.

Yeah, that's the part I don't understand though - do I understand it? Or do I just think I understand it. How do I know that I am not probabilistic also?

Synthesis is the only thing that comes to mind as a differentiator between me and an LLM.


I think what's missing:

- A possibility to fact-check the text, for example via the Wolfram math engine or by giving it internet access

- Something like an instinct to fight for life (seems dangerous)

- some more subsystems: let's have a look at the brain: there's the amygdala, the cerebellum, the hippocampus, and so on, and there must be some evolutionary need for these parts


AGI can’t be defined as autocomplete with a fact checker and an instinct to survive; there’s so, so much more hidden in that “subsystems” point. At least if we go by Bostrom’s definition…


As something of a (biased) expert: yes, it’s a big deal, and yes, this seemingly dumb breakthrough was the last missing piece. It takes a few dozen hours of philosophy to show why your brain is also composed of recursive structures of probabilistic machines, so forget that, it’s not necessary; instead, take a glance at these two links:

1. Alan Turing on why we should never ever perform a Turing test: https://redirect.cs.umbc.edu/courses/471/papers/turing.pdf

2. Marvin Minsky on the “Frame Problem” that led to one or two previous AI winters, and what an intuitive algorithm might look like: https://ojs.aaai.org/aimagazine/index.php/aimagazine/article...


> Alan Turing on why we should never ever perform a Turing test

Can you cite specifically what in the paper you're basing that on? I skimmed it as well as the Wikipedia summary but I didn't see anywhere that Turing said that the imitation game should not be played.


Sorry I missed this, for posterity:

I was definitely being a bit facetious for emphasis, but he says a few times that the original question — “Can machines think?” - is meaningless, and the imitation game question is solved in its very posing. As a computer scientist he was of course worried about theoretical limits, and he intended the game in that vein. In that context he sees the answer as trivial: yes, a good enough computer will be able to mimic human behavior.

The essay’s structure is as follows:

1. Propose theoretical question about computer behavior.

2. Describe computers as formal automata.

3. Assert that automata are obviously general enough to satisfy the theoretical question — with good enough programming and enough power.

4. Dismiss objections, of which “humans might be telepathic” was somewhat absurdly the only one left standing.

It’s not a very clearly organized paper IMO, and the fun description of the game leads people to think he’s proposing that. That’s just the premise, and the pressing conclusion he derives from it is simple: spending energy on this question is meaningless, because it’s either intractable or solved depending on your approach (logical and empirical, respectively).

TL;DR: the whole essay revolves around this quote, judge for yourself:

  We may now consider the ground to have been cleared and we are ready to proceed to the debate on our question, "Can machines think?" and the variant of it quoted at the end of the last section… ["Are there discrete-state machines which would do in the Imitation Game?"]

  It will simplify matters for the reader if I explain first my own beliefs in the matter.

  Consider first the more accurate form of the question. I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.

  The original question, "Can machines think?" I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.

Relying on specific people was never a good strategy; people change. But this will be a good test of their crazy governance structure. I think of it similarly to political systems: if it can't withstand someone fully malicious getting into power, then it's not a good system.


Does the same apply to Sam Altman as well? The whole thing felt like a cult when he was forced out and everyone threatened to resign.


The guy with the "Bad universal priors and notions of optimality", which did to Hutter's MIRI program what Gödel did to Hilbert's program.


Any chance you can eli5? I'm familiar with the Godel/Hilbert side but not the relationship to these developments.


Oops, I thought there was something odd, I got my rationality acronyms mixed up. Hutter's program was called AIXI (MIRI was the research lab).

Here is Leike's paper, coauthored with Hutter:

https://arxiv.org/abs/1510.04931

They can probably sum it up in their own paper better than I can, but AIXI was supposed to be a formalized, objective model of rationality. They knew from the start that it was uncomputable, but I think they hoped to use it as a sort of gold standard that you could approach.

But then it turned out that the choice of Turing machine, which can be (mostly) ignored for Kolmogorov complexity, can not be ignored in AIXI at all.

