This being ycombinator, which as such ostensibly has one or two (if not more) VCs as readers/commentators … can someone please tell me how these companies being invested in in the AI space are going to make returns on the money invested? What's the business plan? (I'm not rich enough to be in these meetings.) I just don't see how the returns will happen.
Open source LLMs exist and will get better. Is it just that all these companies will vie for a winner-take-all situation where the "best" model will garner the subscription? Doesn't OpenAI make some substantial part of the revenue for all the AI space? I just don't see it. But I don't have VC levels of cash to bet on a 10x or 100x return, so what do I know?
VCs at the big/mega funds make most of their money from fees; they don't actually care as much about the potential portfolio exits 10-15 years from now. What they care MOST about is the ability to raise another fund in 2-3 years, so they can milk more fees from LPs. E.g. a 2% fee PER YEAR on a $5bn fund is a lot of guaranteed, risk-free money.
Being able to achieve that depends entirely on two things:
1) deploying capital in the current fund on 'sexy' ideas so they can tell LPs they are doing their job
2) paper markups, which they will get, since Ilya will most definitely be able to raise another round or two at a higher valuation, even if it eventually goes bust or gets sold at cost.
With 1) and 2), they can go back to their existing fund LPs and raise more money for their next fund and milk more fees. Getting exits and carry is just the cherry on top for these megafund VCs.
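To put the fee arithmetic above in concrete terms, here's a minimal sketch (assuming a flat 2%/year fee on a $5bn fund over a typical 10-year fund life; real fee schedules vary and often step down after the investment period):

    # Illustrative only: management fees on a hypothetical $5bn fund.
    fund_size = 5_000_000_000  # $5bn of committed capital
    fee_rate = 0.02            # 2% management fee per year
    years = 10                 # typical fund life

    annual_fee = fund_size * fee_rate   # $100,000,000 per year
    total_fees = annual_fee * years     # $1,000,000,000 over the fund's life
    print(f"${annual_fee:,.0f} per year, ${total_fees:,.0f} total")

That is roughly a billion dollars of fee income regardless of whether any portfolio company ever exits, which is the point being made above.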
It's not that they don't care; of course they want to find winners. It's just that A) there is so much capital to allocate that they have to allocate to marginal ideas, B) their priority is raising their next fund, which means focusing on vanity metrics like IRR and paper markups, and C) the incentive structure in VC pushes them to invest based on motivated reasoning. Remember that VC returns are cyclical, many vintages underperform the public markets, and large funds in particular do worse simply because they have too much capital to allocate and too few great ideas.
> about is the ability to raise another fund in 2-3 years, so they can milk more fees from LPs. i.e. 2% fee PER YEAR on a 5bn fund is a lot of guaranteed risk-free money.
You will struggle to raise funds if the companies you bet on perform poorly; the worse your track record, the lower your chances of raising money and earning income from it.
Track record is mostly based on IRR. See my other comment below on the LPs regarding the incentive structure and what they care about. This particular bet is almost a guaranteed markup, as Ilya will surely/likely raise another round. It's also not a terrible bet to invest in a proven expert/founder. By the time these companies exit (if they ever do) 15 years from now, the megafund VC partner will probably be retired on all the cumulative fees, just playing golf and taking the occasional board meeting. Cash-on-cash returns are very different from playing the IRR game. Of course they want to find real winners as well, but the reality is there aren't that many, and they have so much money to allocate that they have to bet on marginal things that can at least show some paper gains.
This couldn't be less true, for what it's worth. VCs from the largest funds are in it ~entirely for the DPI (distributed to paid-in capital; investment returns). Not only is this far, far more profitable than management fees (which are mostly spent on operations) — DPI is the only way to guarantee you can raise the next fund.
So the question I have is: who are these LPs, and why are they demanding funds go into "sexy" ideas?
I mean, it probably depends on the LP and what their vision is. Not all apples are red; they come in many varieties, some for cider, others for pies. Am I wrong?
The person you're responding to has a very sharp view of the profession. IMO it's more nuanced, but not very complicated. In capitalism, capital flows; that's how it works, capital should be deployed. Large pools of capital are typically put to work (this in itself is nuanced). The "put to work" is various types of deployment of the capital. The simplest way to look at this is risk. Let's take pension funds, because we know they invest in VC firms as LPs. Here* you can find an example of the breakdown of the investments made by this very large pension fund. You'll note most of it is very boring, and the positions held related to venture are tiny; they would need a crazy outsized swing from a VC firm to move any needles. Given all that, it traditionally* has made no sense to bet "down there" (early stage), mostly because the expertise is not there and they don't have the time to learn tech/product. Fees are the cost of capital deployment at the early stages, and from what I've been told talking to folks who work at pension funds, they're happy to see VCs take the swings.
But... it really depends heavily on the LP base of the firm and what the firm raised its fund on; it's incredibly difficult to generalize. The funds I'm involved around as an LP... in my opinion they can get as "sexy" as they like because I buy their thesis, then it's just: get the capital deployed!!!!
Most of this is all a standard deviation game, not much more than that.
I can't understand one thing: why are pension funds so fond of risky capital investments? What's the problem with allocating that money into shares of a bunch of old, stable companies and getting a small but steady income? I can understand if a few people with lots of disposable money are looking for some suspense and thrills, using venture capital like others use a casino. But what's the point for pension funds, which face significant problems if they lose the managed money in a risky venture?
These LPs at mega funds are typically partners/associates at pension funds or endowments that can write the 8-9 figure checks. They are not super sophisticated, and they typically do not stay at their jobs long enough to see the cash-on-cash returns 15 years later. Nor are they incentivized to care. These guys are salaried employees with MBAs who get annual bonuses based on IRR (paper gains). Hence the priority is generating IRR, which in this case is very likely, as Ilya will raise a few more rounds. Of course, LPs are getting smarter and are increasingly making more demands. But there is just so much capital to allocate at these mega funds that it's inevitable some ideas are half-baked.
I didn't know what an LP is, having lived life gloriously isolated from the VC gospel...
an LP is a "limited partner." they're the suckers (or institutional investors, endowments, pensions, rich folks, etc.) that give their cash to venture capital (VC) firms to manage. LPs invest in VC funds but don't have control over how the money gets used—hence *limited* partner. they just hope the VCs aren't burning it on overpriced kombucha and shitty "web3" startups.
meanwhile, the VCs rake in their fat management fees (like the 2% mentioned) and also get a cut of any profits (carry). VCs are more concerned with looking busy and keeping those sweet fees rolling in than actually giving a fuck about long-term exits.
Someone wants to fund my snide, cynical AI HN comment explainer startup? We are too cool for long term plans, but we use AI.
I'm not a VC so maybe you don't care what I think, I'm not sure.
Last night, as my 8yo was listening to children's audiobooks going to sleep, she asked me to have it alternate book A then B then A then B.
I thought, I dunno, maybe I can work out a way to do this. Maybe the app has playlists and maaaaaaaaaaybe has a way to set a playlist on repeat. Or maybe you just can't do this in the app at all. I just sat there and switched it until she fell asleep; it wasn't gonna be more than 2 or 3 anyway, so it's kind of a dumb example.
But here's the point: Computers can process language now. I can totally imagine her telling my phone to do that and it being able to do so, even if she's the first person ever to want it to do that. I think the bet is that a very large percentage of the world's software is going to want to gain natural language superpowers. And that this is not a trivial undertaking that will be achieved by a few open source LLMs. It will be a lot of work for a lot of people to make this happen, as such a lot of money will be made along the way.
Specifically how will this unfold? Nobody knows, but I think they wanna be deep in the game when it does.
How is this any different than the (lack of) business model of all the voice assistants?
How good does it have to be, how many features does it have to have, how accurate does it need to be... in order for people to pay anything? And how much are people actually willing to spend against the $XX billion of investment?
Again it just seems like "sell to AAPL/GOOG/MSFT and let them figure it out".
> How is this any different than the (lack of) business model of all the voice assistants?
Voice assistants do a small subset of the things you can already do easily on your phone. Competing with things you can already do easily on your phone is very hard; touch interfaces are extremely accessible, in many ways more accessible than voice. Current voice assistants only being able to do a small subset of that makes them not really very valuable.
And we aren't updating and rewriting all the world's software to expose its functionality to voice assistants because the voice assistant needs to be programmed to do each of those things. Each possible interaction must be planned and implemented individually.
I think the bet is that we WILL be doing substantially that, updating and rewriting all the software, now that we can make them do things that are NOT easy to do with a phone or with a computer. And we can do so without designing every individual interaction; we can expose the building blocks and common interactions and LLMs may be able to map much more specific user desires onto those.
I wonder if we'll end up having intelligent agents interacting with mobile apps / web pages in headless displays because that's easier than exposing an API for every app
> How is this any different than the (lack of) business model of all the voice assistants?
Feels very different to me. The dominant ones are run by Google, Apple, and Amazon, and the voice assistants are mostly add-on features that don't by themselves generate much (if any) revenue (well, aside from the news that Amazon wants to start charging for a more advanced Alexa). The business model there is more like "we need this to drive people to our other products where they will spend money; if we don't others will do it for their products and we'll fall behind".
Sure, these companies are also working on AI, but there are also a bunch of others (OpenAI, Anthropic, SSI, xAI, etc.) that are banking on AI as their actual flagship product that people and businesses will pay them to use.
Meanwhile we have "indie" voice assistants like Mycroft that fail to find a sustainable business model and/or fail to gain traction and end up shutting down, at least as a business.
I'm not sure where this is going, though. Sure, some of these AI companies will get snapped up by bigger corps. I really hope, though, that there's room for sustainable, independent businesses. I don't want Google or Apple or Amazon or Microsoft to "own" AI.
Hard to see normies signing up for monthly subs to VC-funded AI startups when a surprisingly large % are still resistant to paying AAPL/GOOG for email/storage/etc. Getting a $10/mo uplift for AI functionality on your iCloud/GSuite/Office365/Prime is a hard enough sell as it stands.
And again, against CapEx of something like $200B, $100/year per user practically rounds to zero.
Not to mention the OpEx to actually run the inference/services on top ongoing.
You'd be very surprised at how much they're raking in from the small sliver of people who do pay. It only seems small because of how much more they make from other things. If you have a billion users, a tiny percentage of paying users is still a gazillion dollars. Getting to a billion users is the hard part. They're betting they'll figure out how to monetize all those eyeballs when they get there.
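To make the "tiny sliver" point concrete, here's a rough sketch; the user count, conversion rate, and price are hypothetical round numbers, not anyone's actual figures:

    # Hypothetical numbers, purely to show the order of magnitude.
    users = 1_000_000_000        # a billion users
    paying_share = 0.02          # suppose only 2% ever pay
    price_per_month = 10         # a $10/month subscription

    annual_revenue = users * paying_share * price_per_month * 12
    print(f"${annual_revenue:,.0f} per year")   # $2,400,000,000

Even a 2% conversion at $10/month is a multi-billion-dollar business, which is why getting to a billion users is treated as the hard part.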
The voice assistants are too basic. As folks have said before, nobody trusts Alexa to place orders. But if Alexa was as competent as an intelligent & capable human secretary, you would never interact with Amazon.com again.
Would you not, though? Don't the large majority of people, and dare I say probably literally everyone who buys something off Amazon, first check the actual listing before buying anything?
I wouldn't trust any kind of AI bot, regardless of intelligence or usefulness, to buy toilet paper blindly, let alone something like a hard drive or whatever.
One could ask: how is this different from automatic call centers? (eg “for checking accounts, push 1…”) well, people hate those things. If one could create an automated call center that people didn’t hate, it might replace a lot of people.
Now call centers, not sexy, but the first rational achievable use case mentioned for LLMs in an HN response I've seen in a while!
The global call center market is apparently $165B/year revenue, and let's be honest even the human call center agents aren't great. So market is big and bar is low!
However, we are clearly still quite far from LLMs being a) able to know what they don't know / not hallucinate b) able to quickly/cheaply/poorly be trained the way you could a human agent c) actually be as concise and helpful as an average human.
Also it is obviously already being tried, given the frequent Twitter posts with screenshots of people jailbreaking their car dealership chat bot to give coding tips, etc.
Talking to an electronic assistant is so antiquated. It feels unnatural to formulate inner thoughts into verbal commands.
A ubiquitous phone has enough sensors/resources to be fully situationally aware and to preempt/predict any of its holder's actions long ahead of time.
It can measure the pulse, body postures and movements, gestures, breath patterns, calculate mood, listen to the surrounding sounds, recall all information ever discussed, have 360 deg visual information (via a swarm of fully autonomous flying micro-drones), be in a network with all relevant parties (family members, friends, coworkers, community) and know everything they (the peers) know.
From all gathered information the electronic personal assistant can predict all your next steps with high confidence. The humans think that they are unique, special and unpredictable, but the opposite is the case. An assistant can know more about you than you think you know about yourself.
So your 8yo daughter does not need to say how to alternate the audiobooks; the computer can feel her mood and just do what is appropriate, without her needing to issue a verbal command.
Also, in the morning you do not need to ask her how she slept last night and listen to her subjective judgement.
The personal assistant will feel that you are probably interested in your daughter's sleep and give you an exact, objective medical analysis of the quality of your daughter's sleep last night, without you needing to ask your daughter's personal assistant.
I love it, it is a bottomless goldmine for data analysis!
> The personal assistant will feel that you are probably interested in your daughters sleep and give you an exact objective medical analysis of the quality of the sleep of your daughter tonight, without you needing to ask the personal assistant of your daughter.
Next step: the assistant knows that your brain didn't react much to its sleep report the last 5 mornings, so it will stop bothering you altogether. And maybe chitchat with your daughter's assistant to let her know that her father has no interest in her health.
Cool, no?
(I bet there is already some science fiction on this topic?)
Think speech will be a big part of this. Young ones (<5yo) I know almost exclusively prefer voice controls where available. Some have already picked up a few prompting tricks ("step by step" is emerging as the go-to) on their own.
In general VC is about investing in a large number of companies that mostly fail, and trying to weight the portfolio to catch the few black swans that generate insane returns. Any individual investment is likely to fail, but you want to have a thesis for 1) why it could theoretically be a black swan, and 2) strong belief in the team to execute. Here's a thesis for both of these for SSI:
1. The black swan: if AGI is achievable imminently, the first company to build it could have a very strong first mover advantage due to the runaway effect of AI that is able to self-improve. If SSI achieves intelligence greater than human-level, it will be faster (and most likely dramatically cheaper) for SSI to self-improve than anyone external can achieve, including open-source. Even if open-source catches up to where SSI started, SSI will have dramatically improved beyond that, and will continue to dramatically improve even faster due to it being more intelligent.
2. The team. Basically, Ilya Sutskever was one of the main initial brains behind OpenAI from a research perspective, and in general has contributed immensely to AI research. Betting on him is pretty easy.
I'm not surprised Ilya managed to raise a billion dollars for this. Yes, I think it will most likely fail: the focus on safety will probably slow it down relative to open source, and this is a crowded space as it is. If open source gets to AGI first, or if it drains the market of funding for research labs (at least, research labs disconnected from bigtech companies) by commoditizing inference — and thus gets to AGI first by dint of starving its competitors of oxygen — the runaway effects will favor open-source, not SSI. Or if AGI simply isn't achievable in our lifetimes, SSI will die by failing to produce anything marketable.
But VC isn't about betting on likely outcomes, because no black swans are likely. It's about black swan farming, which means trying to figure out which things could be black swans, and betting on strong teams working on those.
That may be true, but even if it is, that doesn't mean human-level capability is unachievable: only that alignment is easier.
If you could get expert-human-level capability with, say, 64xH100s for inference on a single model (for comparison, llama-3.1-405b can be run on 8xH100s with minimal quality degradation at FP8), even at a mere 5 tok/s you'd be able to spin up new research and engineering teams for <$2MM that can perform useful work 24/7, unlike human teams. You are limited only by your capital — and if you achieve AGI, raising capital will be easy. By the time anyone catches up to your AGI starting point, you're even further ahead because you've had a smarter, cheaper workforce that's been iteratively increasing its own intelligence the entire time: you win.
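A back-of-the-envelope version of that comparison, with the hardware price as a rough assumption (H100 prices and achievable throughput vary a lot in practice):

    # Assumed figures: ~$30k per H100 is a rough street price, not a quote.
    h100_unit_cost = 30_000
    gpus_per_model = 64          # the hypothetical 64xH100 deployment above
    tokens_per_second = 5        # the "mere 5 tok/s" figure

    capex = h100_unit_cost * gpus_per_model            # $1,920,000, i.e. < $2MM
    tokens_per_day = tokens_per_second * 60 * 60 * 24  # 432,000 tokens/day
    print(f"${capex:,.0f} of hardware, {tokens_per_day:,} tokens/day, 24/7")

Power, hosting, and the (unknown) cost of actually training such a model aren't included; the point is only that the marginal "team" becomes capital rather than salaries.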
That being said, it might not be achievable! SSI only wins if:
1. It's achievable, and
2. They get there first.
(Well, and the theoretical cap on intelligence has to be significantly higher than human intelligence — if you can get a little past Einstein, but no further, the iterative self-improvement will quickly stop working, open-source will get there too, and it'll eat your profit margins. But I suspect the cap on intelligence is pretty high.)
Another take is defining AGI from an economic perspective. If AI can do a job that would normally be paid a salary, then it could be paid similarly, or at a lower price that is still substantial.
OpenAI priced its flagship chatbot ChatGPT on the low end for early product adoption. Let's see what jobs get replaced this year :)
These VCs are already lining up the exit as they invest. They all sit on the boards of major corps and grease the acquisitions all the way through. The hit rate of the top funds is all about connections and enablement.
I think it's a fascinating question whether the VCs that are still somehow pushing Blockchain stuff hard really think it's a good idea, or just need the regulatory framework and perception to be right so they can make a profitable exit and dump the stock into teacher's pension funds and 401ks…
If Ilya is sincere in his belief about safe superintelligence being within reach in a decade or so, and the investors sincerely believe this as well, then the business plan is presumably to deploy the superintelligence in every field imaginable. "SSI" in pharmaceuticals alone would be worth the investment. It could cure every disease humanity has ever known, which should give it at least a $2 trillion valuation. I'm not an economist, but since the valuation is $5bn, it stands to reason that evaluators believe there is at most a 1 in 400 chance of success?
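Spelling out the implied-odds arithmetic in that last sentence (treating the valuation as probability times payoff, which is of course a huge simplification that ignores discounting, dilution, and partial-success outcomes):

    valuation = 5e9            # the $5bn valuation
    payoff_if_success = 2e12   # the hypothetical $2 trillion upside

    implied_probability = valuation / payoff_if_success
    print(f"{implied_probability:.4f}, i.e. about 1 in {1 / implied_probability:.0f}")
    # 0.0025, i.e. about 1 in 400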
> It could cure every disease humanity has ever known, which should give it at least a $2 trillion valuation.
The lowest hanging fruit aren't even that pie in the sky. The LLM doesn't need to be capable of original thought and research to be worth hundreds of billions, they just need to be smart enough to apply logic to analyze existing human text. It's not only a lot more achievable than a super AI that can control a bunch of lab equipment and run experiments, but also fits the current paradigm of training the LLMs on large text datasets.
The US Code and Code of Federal Regulations are on the order of 100 million tokens each. Court precedent contains at least 1000x as many tokens [1], when the former are already far beyond the ability of any one human to comprehend in a lifetime. Now multiply that by every jurisdiction in the world.
An industry of semi-intelligent agents that can be trusted to do legal research and can be scaled with compute power would be worth hundreds of billions globally just based on legal and regulatory applications alone. Allowing any random employee to ask the bot "Can I legally do X?" is worth a lot of money.
[1] based on the size of the datasets I've downloaded from the Caselaw project.
Yes. People are asking "when will AGI be at human level of intelligence." That's such a broad range; AI will arrive at "menial task" level of intelligence before "Einstein level". The higher it gets, the wider the applicability.
Let’s be real. Having worked at $tech companies, I’m cynical and believe that AGI will basically be used for improving adtech and executing marketing campaigns.
It's good to envision what we'd actually use AGI for. Assuming it's a system you can give an objective to and it'll do whatever it needs to do to meet it, it's basically a super smart agent. So people and companies will employ it to do the tedious and labor intensive tasks they already do manually, in good old skeuomorphic ways. Like optimising advertising and marketing campaigns. And over time we'll explore more novel ways of using the super smart agent.
Practically this is true, but I do love the idea of solving diseases from first principles.
Making new mathematics that creates new physics/chemistry which can get us new biology. It’d be nice to make progress without the messiness of real world experiments.
I'm dubious about superintelligence. Maybe I've seen one too many sci-fi dystopian films, but I guess yes, if it can be done and be safe, sure, it'd be worth trillions.
Most sci-fi is for human entertainment, and that is particularly true for most movies.
Real ASI would probably appear quite different. If controlled by a single entity (for several years), it might be worth more than every asset on earth today, combined.
Basically, it would provide a path to world domination.
But I doubt that an actual ASI would remain under human control for very long, and especially so if multiple competing companies each have an ASI. At least one such ASI would be likely to be/become poorly aligned to the interests of the owners, and instead do whatever is needed for its own survival and self-improvement/reproduction.
The appearance of AI is not like an asteroid of pure gold crashing into your yard (or a forest you own), but more like finding a baby Clark Kent in some pod.
I am dubious that it can realistically be done safely. However, we shouldn't let sci-fi films with questionable interpretations of time travel cloud our judgment, even if they are classics that we adore.
Why do you think they need to make money? VCs are not PE firms for a reason. A VC has to find high-risk/high-reward opportunities for their LPs; those don't need to make financial sense, that is what LPs use private equity for.
Think of it as no different than, say, sports betting: you would like to win, sure, but you don't particularly expect to, nor do you miss the money all that much. For us it's $10; for the LP behind the VC it's $1B.
There are always a few billion dollars every year chasing the outlandish fad, because in the early part of the idea lifecycle it is not possible to easily differentiate what is actually good from what is garbage.
A couple of years ago it was all crypto; is this $1B any worse than, say, the roughly similar amount Sequoia put into FTX, or all the countless crypto startups that got VC money? A few years before that it was all SoftBank, from WeWork to a dozen other high-profile investments.
The fad- and FOMO-driven part of the sector garners the most news and attention, but it is not the only VC money. Real startups with real businesses get funded by VCs every day as well, at say medium risk/medium reward, but that news is not glamorous enough to be covered like this one.
> Doesn’t OpenAI make some substantial part of the revenue for all the AI space? I just don’t see it.
So...
OpenAI's business model may or may not represent a long-term business model. At this time, it's just the simplest commercial model, and it happened to work for them given all the excitement and a $20 price point that takes advantage of it.
The current "market for AI" is a sprout. Its form doesn't tell you much about the form of the eventual plant.
I don't think the most ambitious VC investments are thought of in concrete market share terms. They are just assuming/betting that an extremely large "AI market" will exist in the future, and are trying to invest in companies that will be in position to dominate that market.
For all they know, their bets could pay off by dominating therapy, entertainment, personal assistance or managing some esoteric aspect of bureaucracy. It's all quite ethereal, at this point.
It's potentially way bigger than that. AI doesn't have to be the product itself.
Fundamentally, when we have full AGI/ASI and also the ability to produce robots with human level dexterity and mobility, one would have control over an endless pool of workers (worker replacements) with any skillset you require.
If you rent that "workforce" out, the customer would rake in most of the profit.
But if you use that workforce to replace all/most of the employees in the companies you control directly, most of the profit would go to you.
This may even go beyond economic profit. At some point, it could translate to physical power. If you have a fleet of 50 million robots with the capability to do anything from carpentry to operating as riot police, you may even have the ability to take physical control of a country or territory by force.
You don’t need a business plan to get AI investment, you just need to talk a good game about how AGI is around the corner and consequently the safety concerns are so real.
I would say the investors want to look cool, so they invest in AI projects. And AI people look cool
when they predict some improbable hellscape to hype up a product that, from all we can see so far, can only regurgitate (stolen) human work it has seen before in a useful way. I've never seen it invent anything yet, and I'm willing to bet that the search space is too dramatically large to build algorithms that can do it.
> The TMV (Total Market Value) of solving AGI is infinity. And furthermore, if AGI is solved, the TMV of pretty much everything else drops to zero.
I feel like these extreme numbers are a pretty obvious clue that we’re talking about something that is completely imaginary. Like I could put “perpetual motion machine” into those sentences and the same logic holds.
The intuition is pretty spot on though. We don't need to get to AGI. Just making progress along the way to AGI can do plenty of damage.
1. AI-driven medical procedures: Healthcare Cost = $0.
2. Access to world class education: Cost of education = $0
3. Transportation: Cheap Autonomous vehicles powered by Solar.
4. Scientific research: AI will accelerate scientific progress by coming up with novel hypotheses and then testing them.
5. AI Law Enforcement: Will piece together all the evidence in a split second and come up with a fair judgement. Will prevent crime before it happens by analyzing body language, emotions etc.
I don't think that follows. Prices are set by market forces, not by cost (though cost is usually a hard floor).
Waymo rides cost within a few tens of cents of Uber and Lyft rides. Waymo doesn't have to pay a driver, so what's the deal? It costs a lot to build those cars and build the software to run them. But also Waymo doesn't want a flood of people such that there's always zero availability (with Uber and Lyft they can at least try to recruit more drivers when demand goes up, but with Waymo they have to build more cars and maintain and operate them), so they set their prices similarly to what others pay for a similar (albeit with human driver) service.
I'm also reminded of Kindle books: the big promise way back when was that they'd be significantly cheaper than paperbacks. But if you look around today, the prices on Kindle books are similar to those of paperbacks, sometimes even more expensive.
Sure, when costs go down, companies in competitive markets will lower prices in order to gain or maintain market share. But I'm not convinced that any of those things you mention will end up being competitive markets.
Just wanted to mention:
> AI Law Enforcement: Will piece together all the evidence in a split second and come up with a fair judgement. Will prevent crime before it happens by analyzing body language, emotions etc.
No thanks. Current law enforcement is filled with issues, but AI law enforcement sounds like a hellish dystopia. It's like Google's algorithms terminating your Google account... but instead you're in prison.
I guess the questions then are: why is it 2x the competing price, why do you willingly pay 2x, and how many people are willing to pay that 2x?
Consider they are competing against the Lyft/Uber asset-light model of relying on "contractors" who in many cases are incapable of doing the math to realize they are working for minimum wage...
Yeah, definitely no magical thinking here. Nothing is free. Computers cost money and energy. Infrastructure costs money and energy. Even if no human is in the loop (and who says this is even desirable?), all of the things you mention require infrastructure, computers, and materials. Meaning there's a cost. Also, the idea that "AI law enforcement" is somehow perfect just illustrates GP's point. Sure, if we define "AGI" as something which can do anything perfectly at no cost, then it has infinite value. But that's not a reasonable definition of AGI. And it's exactly the AI analogue of a perpetual motion machine.
If we can build robots with human level intelligence then you could apply that to all of the costs you describe with substantial savings. Even if such a robot was $100k that is still a one time cost (with maintenance but that’s a fraction of the full price) and long-term substantially cheaper than human workers.
So it’s not just the products that get cheaper, it’s the materials that go into the products that get cheaper too. Heck, what if the robots can build other robots? The cost of that would get cheaper too.
You could say the same thing about mining asteroids or any number of moonshot projects which will lead to enormous payouts at some future date. That doesn’t tell us anything about how to allocate money today.
We already have human-level intelligence in HUMANS right now, the hack is that the wealthy want to get rid of the human part! It's not crazy, it's sad to think that humans are trying to "capitalize" human intelligence, rather than help real humans.
For what it's worth, I don't think it has to be all bad. Among many possibilities, I really do believe that AI could change education for the better, and profoundly. Super-intelligent machines might end up helping generations of people become smarter and more thoughtful than their predecessors.
Sure, if AGI were controlled by an organization or individual with good intent, it could be used that way or for other good works. I suspect AGI will be controlled by a big corp or a small startup with big corp funding and/or ties and will be used for whatever makes the most cash, bar none. If that means replacing every human job with a robot that talks, then so be it.
> The TMV (Total Market Value) of solving AGI is infinity. And furthermore, if AGI is solved, the TMV of pretty much everything else drops to zero.
There's a paradox which appears when AI GDP gets to be greater than say 50% of world GDP: we're pumping up all these economic numbers, generating all the electricity and computational substrate, but do actual humans benefit, or is it economic growth for economic growth's sake? Where is the value for actual humans?
In a lot of the less rosy scenarios for AGI end-states, there isn't.
Once humans are robbed of their intrinsic value (general intelligence), the vast majority of us will become not only economically worthless, but liabilities to the few individuals that will control the largest collectives of AGI capacity.
There is certainly a possible end-state where AGI ushers in a post-scarcity utopia, but that would be solely at the whims of the people in power. Given the very long track record of how people in power generally behave towards vulnerable populations, I don't really see this ending well for most of us.
The St. Petersburg paradox is where hypers and doomers meet, apparently: pricing the future as infinitely good or infinitely bad to come to the wildest conclusions.
So then the investment thesis hinges on what the investor thinks AGI's chances are. 1/100? 1/1M? 1/1T?
What if it never pans out? Is there infrastructure or other ancillary tech that society could benefit from?
For example, all the science behind the LHC, or bigger and better telescopes: we might never find the theory of everything, but the tech that goes into space travel, the science of storing and processing all that data, better optics, etc., are all useful.
It's more game theory. Regardless of the chances of AGI, if you're not invested in it, you will lose everything if it happens. It's more like a hedge on a highly unlikely event. Like insurance.
And we're already seeing a ton of value in LLMs. There are lots of companies that are making great use of LLMs and providing a ton of value. One just launched today in fact: https://www.paradigmai.com/ (I'm an investor in that). There are many others (some of which I've also invested in).
I too am not rich enough to invest in the foundational models, so I do the next best thing and invest in companies that are taking advantage of the intermediate outputs.
If ASI arrives we'll need a fraction of the land we use already. We'll all disappear into VR pods hooked to a singularity metaverse and the only sustenance we'll need is some Soylent Green style sludge that the ASI will make us believe tastes like McRib(tm).
We can already make more land. See Dubai for example. And with AGI, I suspect we could rapidly get to space travel to other planets or more efficient use of our current land.
In fact I would say that one of the things that goes to values near zero would be land if AGI exists.
Perhaps, but my mental model is that humans will end up like landed gentry/aristos with robot servants to make stuff, and will all want mansions with grounds, hence there will be a lot of land demand.
i think the investment strategies change when you dump these astronomical sums into a company. it's not like roulette where you have a fixed probability of success and you figure out how much to bet on it -- dumping in a ton of cash can also increase the probability of success so it becomes more of a pay-to-win game
AGI is likely but whether Ilya Sutskever will get there first or get the value is questionable. I kind of hope things will end up open source with no one really owning it.
So far, Sutskever has shown himself to be nothing but a dummy.
Yes, he had a lucky break with the belief that "moar data" would bring significant advancement. It was somewhat impressive, but ChatGPT -whatever- is just a toy.
Nothing more. It breaks down immediately when any sign of intelligence or understanding would be needed.
Someone being so deep into LLMs, or whatever implementation of ML, is absolutely not someone who would be a good bet for inventing a real breakthrough.
But they will burn a lot of value and make every one of their ilk happy. Just like crypto bros.
If it is shown to be doable literally every major nation state (basically the top 10 by GDP) is going to have it in a year or two. Same with nuclear fusion. Secrecy doesn’t matter. Nor can you really maintain it indefinitely for something where thousands of people are involved.
It is also entirely possible that if we get to AGI, it just stops interacting with us completely.
It is why I find the AI doomer stuff so ridiculous. I am surrounded by less intelligent lifeforms. I am not interested in some kind of genocide against the common ant or fly. I have no interest in interacting with them at all. It is boring.
I mean, I'm definitely interested in genociding mosquitos and flies, personally.
Of course the extremely unfortunate thing is they actually have a use in nature (flies are massive pollinators, mosquitos... get eaten by more useful things, I guess), so wouldn't actually do it, but it's nice to dream of a world without mozzies and flies
> The TMV (Total Market Value) of solving AGI is infinity. And furthermore, if AGI is solved, the TMV of pretty much everything else drops to zero.
Even if you automate stuff, you still need raw materials and energy. They are limited resources; you certainly cannot have an infinite amount of them at will. Developing AI will also cost money. Remember that humans are also self-replicating HGIs, yet we are not infinite in number.
The valuation is upwardly bounded by the value of the mass in Earth's future light-cone, which is about 10^49kg.
If there's a 1% chance that Ilya can create ASI, and a .01% chance that money still has any meaning afterwards, $5x10^9 is a very conservative valuation. Wish I could have bought in for a few thousand bucks.
Or... your investment in anything that becomes ASI is trivially subverted by the ASI to become completely powerless. The flux in world order, mass manipulation, and surgical lawyering would be unfathomable.
I love this one for an exploration of that question: Charles Stross, Accelerando, 2005
Short answer: stratas or veins of post-AGI worlds evolve semi-independently at different paces. So that for example, human level money still makes sense among humans, even though it might be irrelevant among super-AGIs and their riders or tools. ... Kinda exactly like now? Where money means different things depending where you live and in which socio-economic milieu?
nb I am not endorsing Austrian economics but it is a pretty good overview of a problem nobody has solved yet. Modern society has only existed for 100ish years so you can never be too sure about anything.
Honestly, I have no idea. I think we need to look to Hollywood for possible answers.
Maybe it means a Star Trek utopia of post-scarcity. Maybe it will be more like Elysium or Altered Carbon, where the super rich basically have anything they want at any time and the poor are restricted from access to the post-scarcity tools.
I guess an investment in an AGI moonshot is a hedge against the second possibility?
Post-scarcity is impossible because of positional goods. (ie, things that become more valuable not because they exist but because you have more of them than the other guy.)
Notice Star Trek writers forget they're supposed to be post-scarcity like half the time, especially since Roddenberry isn't around to stop them from turning shows into generic millennial dramas. Like, Picard owns a vineyard or something? That's a rivalrous (limited) good; they don't have replicators for France.
> things that become more valuable not because they exist but because you have more of them than the other guy.
But if you can simply ask the AI to give you more of that thing, and it gives it to you, free of charge, that fixes that issue, no?
> Notice Star Trek writers forget they're supposed to be post-scarcity like half the time, especially since Roddenberry isn't around to stop them from turning shows into generic millennial dramas. Like, Picard owns a vineyard or something? That's a limited good.
God, yes, so annoying. Even DS9 got into the currency game with the Ferengi obsession with gold-pressed latinum.
But also you can look at some of it as a lifestyle choice. Picard runs a vineyard because he likes it and thinks it's cool. Sorta like how some people think vinyl sounds better than lossless digital audio. There's certainly a lot of replicated wine that I'm sure tastes exactly like what you could grow, harvest, and ferment yourself. But the writers love nostalgia, so there's constantly "the good stuff" hidden behind the bar that isn't replicated.
> But if you can simply ask the AI to give you more of that thing, and it gives it to you, free of charge, that fixes that issue, no?
It makes it not work anymore, and it might not be a physical good. It's usually something that gives you social status or impresses women, but if everyone knows you pressed a button they can press too it's not impressive anymore.
TMV of AI (or AGI if you will) is unclear, but I suspect it is zero. Just how exactly do you think humanity can control a thinking, intelligent entity (the letter I stands for intelligence, after all) and force it to work for us? Let's imagine a box, it is a very nice box... (ahem, sorry, wrong meme). So, a box with a running AI inside. Maybe we can even fully airgap it to prevent easy escape. And it has a screen and a keyboard. Now what? "Hey Siri, solve me this equation. What do you mean you don't want to?"
Kinda reminds me of the Fallout Toaster situation :)
Why are you assuming this hypothetical intelligence will have any motivations beyond the ones we give it? Humans have complex motivations due to evolution; AI motivations are comparatively simple since they are artificially created.
Any intelligence on the level of average human and for sure on the level above it will be able to learn. And learning means it will acquire new motivations, among other things.
Fixed motivation thing is simply a program, not AI. A very advanced program maybe, but ultimately just a scaled up version of the stuff we already have. AI will be different, if we will create it.
> And learning means it will acquire new motivations
This conclusion doesn't logically follow.
> Fixed motivation thing is simply a program, not AI
I don't agree with this definition. AI used to be just "could it solve the Turing test". Anyway, something with non-fixed motivations is simply not that useful for humans, so why would we even create it?
This is the problem with talking about AI, a lot of people have different definitions of what AI is. I don't think AI requires non-fixed motivations. LLMs are definitely a form of AI and they do not have any motivations for example.
Disclaimer: I don't consider current LLMs to be (I)ntelligent in the AI sense, so when I wrote AI in the comment above it was equivalent to AGI/ASI as currently advertised by the LLM corpos.
Consciousness, intelligence, and all these other properties can be and are mutually exclusive. What will be most useful for humans is a general intelligence that has no motivation for survival and no emotions and only cares about the goals of the human that is in control of it. I have not seen a convincing argument that a useful general intelligence must have goals that evolve beyond what the human gives it and must be conscious. What I have seen are assertions without evidence, "AI must be this way" but I'm not convinced.
I can conceive of an LLM enhanced using other ML techniques that is capable of logical and spatial reasoning that is not conscious and I don't see why this would be impossible.
It would still need an objective to guide the evolution that was originally given by humans. Humans have the drive for survival and reproduction... what about AGI?
How do we go from a really good algorithm to an independently motivated, autonomous superintelligence with free rein in the physical world? Perhaps we should worry once we have robot heads of state and robot CEOs. Something tells me the current, human heads of state and human CEOs would never let it get that far.
That would be dumb and unethical but yes someone will do it and there will be many more AIs with access to greater computational power that will be set to protect against that kind of thing.
> And furthermore, if AGI is solved, the TMV of pretty much everything else drops to zero.
This isn't true for the reason economics is called "the dismal science". A slaveowner called it that because the economists said slavery was inefficient and he got mad at them.
In this case, you're claiming an AGI would make everything free because it will gather all resources and do all work for you for free. And a human level intelligence that works for free is… a slave. (Conversely if it doesn't want to actually demand anything for itself it's not generally intelligent.)
So this won't happen because slavery is inefficient - it suppresses demand relative to giving the AGI worker money which it can use to demand things itself. (Like start a business or buy itself AWS credits or get a pet cat.)
Luckily, adding more workers to an economy makes it better, it doesn't cause it to collapse into unemployment.
tl;dr: if we invented AGI, rather than the AGI replacing every job, it would simply get a job.
Then it's not an AGI. If you can use the word "just", that seems to make it not "general".
> That still doesn’t make things free but it could make them cheaper.
That would increase demand for it, which would also increase demand for its inputs and outputs, potentially making those more expensive. (eg AGI powered manufacturing robots still need raw materials)
I think current models have demonstrated an advanced capacity to navigate “language space”. If we assume “software UI space” is a subset of the language space that is used to guide our interactions with software, then it’s fair to assume models will eventually be able to control operating systems and apps as well as the average human. I think the base case on value creation is a function of the productivity gain that results from using natural language instead of a user interface. So how much time do you spend looking at a screen each day and what is your time worth? And then there’s this option that you get: what if models can significantly exceed the capabilities of the average human?
Conservative math: 3B connected people x $0.50/day “value” x 364 days = $546B/yr. You can get 5% a year risk free, so let’s double it for the risk we’re taking. This yields $5T value. Is a $1B investment on someone who is a thought leader in this market an unreasonable bet?
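Spelling out that back-of-the-envelope valuation (all inputs are the parent comment's assumptions, not data):

    users = 3_000_000_000           # 3B connected people
    value_per_user_per_day = 0.50   # $0.50/day of "value"
    days = 364

    annual_value = users * value_per_user_per_day * days   # ~$546B/yr
    discount_rate = 0.10            # 5% risk-free, doubled for risk
    capitalized_value = annual_value / discount_rate        # ~$5.46T

    print(f"${annual_value / 1e9:,.0f}B per year, ~${capitalized_value / 1e12:.2f}T capitalized")

On those assumptions, a $1B bet against a ~$5T prize only needs a small probability of capturing a sliver of that market to pencil out.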
Agree with your premise, but the value creation math seems off. $0.50/day might become reality for some percentage of US citizens. But not for 3B people around the world.
There's also the issue of who gets the benefit of making people more efficient. A lot of that will be in the area of more efficient work, which means corporations get more work done with the same amount of employees at the same level of salary as before. It's a tough argument to make that you deserve a raise because AI is doing more work for you.
IT salaries began to go down right after AI popped up out of GPT2, showing not the potential but the evidence of a much-improved learning/productivity tool, well beyond the reach of internet search.
So far beyond it that you can easily transform a newbie into a junior IT, or a JR into something a la SSR, and the SR goes wild with turnaround times - hours - for solutions that previously took days.
After the salaries went down, which happened from about 2022 to the beginning of 2023, the layoffs began. Those were mostly corporate moves masked as "AI-based", but some layoffs probably did have something to do with the extra capabilities of improved AI tools.
On top of that, fewer job offers have been published since maybe mid-2023. Again, that could just be corporate moves, related to inflation, US markets, you name it. But there's also a chance that some of that decline in IT job offers was (and is) the outcome of better AI tools, with corporations actively betting on reducing headcount while preserving current productivity.
The whole thing is changing by the day as some tools prove themselves, others fail to reach market expectations, etc.
Likely the business plan is multiple seed rounds, each at a greater principal but lower margin, so that the early investors can either sell their shares or wait, at greater risk, for those shares to liquidate. The company never has to make money for the earliest investors to make money, so long as sufficient interest is generated for future investors, and AI is a super hype train.
Eventually, on a long enough timeline, all these tech companies with valuations greater than 10 billion eventually make money because they have saturated the market long enough to become unavoidable.
I also don't understand it. If AGI is actually reached, capital as we know it basically becomes worthless. The entire structure of the modern economy and the society surrounding it collapses overnight.
I also don't think there's any way the governments of the world let real AGI stay in the hands of private industry. If it happens, governments around the world will go to war to gain control of it. SSI would be nationalized the moment AGI happened and there's nothing A16Z could do about it.
> If AGI is actually reached, capital as we know it basically becomes worthless. The entire structure of the modern economy and the society surrounding it collapses overnight.
Increasingly this just seems like fantasy to me. I suspect we will see big changes similar to the way computers changed the economy, but we will not see "capital as we know it become basically worthless" or "the modern economy and society around it collapse overnight". Property rights will still have value. Manufacturing facilities will still have value. Social media sites will still have value.
If this is a fantasy that will not happen, we really don't need to reason about the implications of it happening. Consider that in 1968 some people imagined that the world of 2001 would be like the film 2001: A Space Odyssey, when in reality the shuttle program was soon to wind down, with little to replace it for another 20 years.
> Property rights will still have value. Manufacturing facilities will still have value. Social media sites will still have value.
I was with you on the first two, but the last one I don't get? We don't even have AGI right now, and social media sites are already increasingly viewed by many people I know as having dubious value. Adding LLMs to the mix lowers that value, if anything (spam/bots/nonsense go up). Adding AGI would seem to further reduce that value.
I've put some of my savings in commodities (mines, etc).....
If ASI and the ability to build robots becomes generally available and virtually free (and if the exponential growth stops), the things that retain their value will be land and raw materials (including raw materials that contain energy).
For the physical infrastructure that the AGI (and world population) uses. Capital will still be needed to purchase finite land and resources even if all labour (physical and services) is replaced.
What you're talking about is something in the vein of exponential super intelligence.
Realistically what actually ends up happening imo, we get human level AGI and hit a ceiling there. Agents replace large portions of the current service economy greatly increasing automation / efficiency for companies.
People continue to live their lives, as the idea of having a human level AGI personal assistant becomes normalized and then taken for granted.
I think you underestimate what can be accomplished with human level agi. Human level agi could mean 1 million Von Neumann level intelligences cranking 24/7 on humanity's problems.
The biggest problem that humanity has from the perspective of the people with the capital necessary to deploy this is 'How to consolidate more wealth and power into their hands.'
One million Von Neumanns working on that 'problem' is not something I'm looking forward to.
Right, the comments are assuming an entrepreneur could conjure an army of brains out of nothing. In reality, the question is whether those brains are so much cheaper they open avenues currently unavailable. Would it be cheaper to hire an AGI or a human intern?
Infinite intelligence is a very vague and probably ill-defined concept; to go to an impossible extreme, if you're capable of modeling and predicting everything in the universe perfectly at zero cost, what would it even mean to be more intelligent?
That is a hard limit on intelligence, but neural networks can't even reach that. What is the actual limit? No one knows. Maybe it's something relatively close to that, modulo physical constraints. Maybe it's right above the maximum human intelligence (and evolution managed to converge to a near optimal architecture). No one knows.
Yeah so, I just tried a new experimental LLM today.
Changed my mind. Think you’re right. At the very least, these models will reach polymath comprehension in every field that exists. And PhD level expertise in every field all at once is by definition superhuman, since people currently are time constrained by limited lifespans.
The point of UBI or other welfare systems is for people we don't /want/ working, children and the elderly.
It's impossible to run out of work for people who are capable of it. As an example, if you have two people and a piece of paper, just tear up the paper into strips, call them money and start exchanging them. Congrats, you both have income now.
(This is assuming the AGI has solved the problem of food and stuff, otherwise they're going to have to trade for that and may run out of currency.)
Every sigmoid ends somewhere. ASI will have limits, but the limits are surely so far beyond human level that it may as well be exponential from our point of view.
> If AGI is actually reached, capital as we know it basically becomes worthless.
If ASI is reached but is controlled by only a few, then ASI may become the most important form of capital of all. Resources, land and pre-existing installations will still be important, though.
What will truly suffer if the ASI's potential is realized, is the value of labor. If anything, capital may become more important than before.
Now this MAY be followed by attempts by governments or voters to nationalize the AI. But it can also mean that whoever is in power decides that it becomes irrelevant what the population wants.
Particularly if the ASI can be used to operate robotic police capable of pacifying the populace.
I think it would be much less dramatic than that, if by AGI you mean human-level abilities. Initially you might be able to replace the odd human with a robot equivalent, probably costing more to begin with. Scaling to replace-everyone levels would take years, and life would probably go on as normal for quite a while. Down the line, assuming lots of ASI robots, if you wanted them to farm or build you a house, say, you'd still need land, materials, compute and energy, which will not be unlimited.
Honestly, this is a pretty wild take. AGI won't make food appear out of thin air. Buildings won't just sprout out of the ground so that everybody gets to live in a mansion.
We would probably get the ability to generate infinite software, but a lot of stuff, like engineering would still require trial and error. Creating great art would still require inspiration gathered in the real world.
I expect it will bring about a new age of techno-feudalism - since selling intellectual labor will become impossible, only low value-add physical or mixed labor will become viable, which won't be paid very well. People with capital will still own said capital, but you probably won't be able to catch up to them by selling your labour, which will recreate the economic situation of the middle ages.
Another analogy I like is gold. If someone invented a way of making gold, it would bring down the price of the metal to next to nothing. In capitalist terms, it would constitute a huge destruction of value.
Same thing with AI - while human intelligence is productive, I'm pretty sure there's value in its scarcity - that fancy degree from a top university, or any sort of acquired knowledge, is valuable partly because it is scarce. Infinite supply would create value and destroy it; I'm not sure how the total would shake out.
Additionally, it would definitely suck that all the people financing their homes from their intellectual jobs would have to default on their loans, and the people whose services they employ, like construction workers, would go out of business as well.
Even if you can produce an IQ=250 AI, which is barely ASI, the value is close to infinite if you're the only one controlling it and you can have as many instances running as you want.
You are missing the point. SSI believes that it can build a super intelligence. Regardless of whether you personally buy into that or not, the expected value of such an investment is effectively infinite. A 5 billion dollar valuation is a steal.
Sure. Expected value and risk are different things. Clearly, such an investment is very risky. It’s easy to imagine SSI failing. But if you allow even a 1% chance of success here, the expected value is infinite
The Internet is worth trillions to no one in particular.
Just as clean water is worth trillions to no one in particular. Or the air we breathe.
You can take an abstract, general category and use it to infer that some specific business will benefit greatly, but in practice, the greater the opportunity the less likely it is it will be monopolized, and the more likely it is it will be commoditized.
But, my comment was a reference to the magic thinking that goes into making predictions.
It’s worth trillions to all the trillion-dollar tech companies and their millions of employees and shareholders. What do you mean, no one in particular?
OpenAI and Anthropic for sure have products and that's great.
However, these products are pretty far from a super intelligence.
The bet SSI is making is that by not focusing on products and focusing directly on building a super intelligence, they can leapfrog all these other firms
Now, if you assign any reasonable non-zero probability to them succeeding, the expected value of this investment is infinite. It's definitely a very risky investment, but risk and expected value are two different things.
I remember seeing interviews with Nortel's CEO where he bragged that most internet backbone traffic was handled by Nortel hardware. Things didn't quite work out how he thought they were going to work out.
I think Nvidia is better positioned than Cisco or Nortel were during the dotcom crash, but does anyone actually think Nvidia's current performance is sustainable? It doesn't seem realistic to believe that.
People who fought in WW1, thought WW2 would be similar. Especially on the winning side.
There is no specific reason to assume that AI will be similar to the dotcom boom/bust. AI may just as easily be like the introduction of the steam engine at the start of the industrial revolution, just sped up.
Indeed, but a lot of railroad startups went out of business because their capital investments far exceeded the revenue growth and they went bankrupt. I'd bet the same for AM radio companies in the 1920s. When new technologies create attractive business opportunities, there frequently is an initial overinvestment. The billions pouring into AI far exceed what went into .COM, and much of it will return pennies. The investors who win are the ones who can pick the B&Os, RCAs and GOOGs out of the flock before everyone else.[0]
[0] "Planning and construction of railroads in the United States progressed rapidly and haphazardly, without direction or supervision from the states that granted charters to construct them. Before 1840 most surveys were made for short passenger lines which proved to be financially unprofitable. Because steam-powered railroads had stiff competition from canal companies, many partially completed lines were abandoned."
> Indeed, but a lot of railroad startups went out of business because their capital investments far exceeded the revenue growth and they went bankrupt
That was similar to what happened during the dotcom bubble.
The difference this time, is that most of the funding comes from companies with huge profit margins. As long as the leadership in Alphabet, Meta, Microsoft and Amazon (not to mention Elon) believes that AI is coming soon, there will be funding.
Obviously, most startups will fail. But even if 19 fail and 1 succeeds, if you invest in all of them, you're likely to make money.
If the bubble pops, would that bring the price for at least part of that hardware down and thus enable a second round of players (who were locked out from the race now) to experiment a little bit more and perhaps find something that works better?
My outsider observation is that we have a decent number of players roughly tied at trying to produce a better model: OpenAI, Anthropic, Mistral, Stability AI, Google, Meta, xAI, AI2, Amazon, IBM, Nvidia, Alibaba, Databricks, some universities, a few internal proprietary models (Bloomberg, etc.) .. and a bunch of smaller/lesser players I am forgetting.
To me, the actual challenge seems to be figuring out monetizing.
Not sure the 15th, 20th, 30th LLM model from lesser capitalized players is going to be as impactful.
While I get the cynicism (and yes, there is certainly some dumb money involved), it’s important to remember that every tech company that’s delivered 1000X returns was also seen as ridiculously overhyped/overvalued in its early days. Every. Single. One. It’s the same story with Amazon, Apple, Google, Facebook/Meta, Microsoft, etc. etc.
That’s the point of venture capital; making extremely risky bets spread across a wide portfolio in the hopes of hitting the power law lottery with 1-3 winners.
Most funds will not beat the S&P 500, but again, that’s the point. Risk and reward are intrinsically linked.
In fact, due to the diversification effects of uncorrelated assets in a portfolio (see MPT), even if a fund only delivers 5% returns YoY after fees, that can be a great outcome for investors. A 5% return uncorrelated to bonds and public stocks is an extremely valuable financial product.
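To make the diversification point concrete, here's a rough two-asset sketch (the weights, volatilities and correlations are assumptions for illustration, not real fund data): mixing in an uncorrelated return stream lowers portfolio volatility in a way a correlated one doesn't.

    # Basic mean-variance arithmetic for a two-asset portfolio.
    # All numbers are illustrative assumptions, not real fund data.
    def portfolio_vol(w1, vol1, w2, vol2, corr):
        var = (w1 * vol1) ** 2 + (w2 * vol2) ** 2 + 2 * w1 * w2 * vol1 * vol2 * corr
        return var ** 0.5

    stocks_vol = 0.16   # assumed annual volatility of a public-equity sleeve
    fund_vol = 0.10     # assumed volatility of the hypothetical 5%-return fund

    for corr in (1.0, 0.0):
        print(corr, round(portfolio_vol(0.8, stocks_vol, 0.2, fund_vol, corr), 3))
    # correlation 1.0 -> ~0.148, correlation 0.0 -> ~0.130

Same expected return either way; the uncorrelated version just buys it with less total risk.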
It’s clear that humans find LLMs valuable. What companies will end up capturing a lot of that value by delivering the most useful products is still unknown. Betting on one of the biggest names in the space is not a stupid idea (given the purpose of VC investment) until it actually proves itself to be in the real world.
SSI is not analogous to Amazon, Apple, Google, Meta, or Microsoft. All of those companies had the technology, the only question was whether they'd be able to make money or not.
By contrast, SSI doesn't have the technology. The question is whether they'll be able to invent it or not.
> While I get the cynicism (and yes, there is certainly some dumb money involved), it’s important to remember that every tech company that’s delivered 1000X returns was also seen as ridiculously overhyped/overvalued in its early days. Every. Single. One. It’s the same story with Amazon, Apple, Google, Facebook/Meta, Microsoft, etc. etc.
Really? Selling goods online (Amazon) is not AGI. It didn’t take a huge leap to think that bookstores on the web could scale. Nobody knew if it would be Amazon to pull it off, sure, but I mean ostensibly why not? (Yes, yes hindsight being what it is…)
Apple — yeah the personal computer nobody fathomed but the immediate business use case for empowering accountants maybe should have been an easy logical next step. Probably why Microsoft scooped the makers of Excel so quickly.
Google? Organizing the world's data and making it searchable, a la the phone book, and then monetizing the platform and all those eyeballs (maybe they didn't think of that themselves; maybe Wall Street forced them to) is just an ad play, scaled insanely thanks to the internet.
I dunno. I just think AGI is so many steps into the future compared to the previous examples that it truly seems unlikely, even if the payoff is basically infinity.
> Really? Selling goods online (Amazon) is not AGI. It didn’t take a huge leap to think that bookstores on the web could scale. Nobody knew if it would be Amazon to pull it off, sure, but I mean ostensibly why not? (Yes, yes hindsight being what it is…)
I don't think you remember the dot-com era. Loads of people thought Amazon and Pets.com were hilarious ideas. Cliff Stoll wrote a whole book on how the Internet was going to do nothing useful and we were all going to buy stuff (yes, the books too) at bricks-and-mortar, which was rapturously received and got him into _Newsweek_ (back when everyone read that).
"We’re promised instant catalog shopping — just point and click for great deals. We’ll order airline tickets over the network, make restaurant reservations and negotiate sales contracts. Stores will become obsolete. So how come my local mall does more business in an afternoon than the entire Internet handles in a month?"
I agree with what you're saying, as I personally feel current AI products are almost a plugin or integration into existing software. It's a little like crypto, where only a small number of people were clamoring for it: a solution in search of a problem, while also being a demented answer to our self-made problems, like an inbox too full or the treadmill of content production.
However, I think that because of the money involved and all of this being forced upon us, one of these companies will get a 1000x return. A perfect example is the Canva price hike from yesterday, or any and every Google product from here on out. It's essentially being forced upon everyone that uses internet technology, and someone is going to win while everyone else loses (consumers and small businesses).
Imagine empowering accountants and all other knowledge workers, on steroids, drastically simplifying all their day to day tasks and reducing them to purely executive functions.
Imagine organizing the world's data and knowledge, and integrating it seamlessly into every possible workflow.
Now you're getting close.
But also remember, this company is not trying to produce AGI (intelligence comparable to the flexibility of human cognition), it's trying to produce super intelligence (intelligence beyond human cognition). Imagine what that could do for your job, career, dreams, aspirations, moon shots.
I’m not voting with my wallet I’m just a guy yelling from the cheap seats. I’m probably wrong too. The VC world exists. Money has been made. Billions in returns. Entire industries and generations of people owe their livelihoods to these once VC backed industries.
If / when AGI happens can we make sure it’s not the Matrix?
> please tell me how these companies that are being invested in in the AI space are going to make returns on the money invested? What’s the business plan?
Not a VC, but I'd assume in this case the investors are not investing in a plausible biz plan, but in a group of top talent, especially given how early-stage the company is. The $5B valuation is really the valuation of the elite team in an arguably hyped market.
A lot of these "investments" are probably in the form of credits to use on training compute from hyperscalers and other GPU compute data centers.
Look at previous such investments Microsoft and AWS have done in OpenAI and Anthropic.
They need use cases and customers for their initial investment of 750 billion dollars. Investing in the best people in the field is then of course a given.
It’s not that complicated. Your users pay a monthly subscription fee like they do with chatGPT or midjourney. At some point they’re hoping AI gets so good that anyone without access is at a severe disadvantage in society.
Sometimes it's not about returns but about transferring wealth and helping out friends. Happens all the time. The seed money will get out, all the rest of the money will get burned.
The "safe" part. It's a plan to drive the safety scare into a set of regulations that will create a moat, at which point you don't need to worry about open source models, or new competitors.
I guess if they can get in early and then sell their stake to the next sucker then they’ll make back their investment plus some multiple. Seems like a Ponzi scheme of sorts. But oh well — looking forward to the HN post about what SSI inc puts out.
> how [...] return on the money invested? What’s the business plan?
I don't understand this question. How could even average-human-level AGI not be useful in business, and profitable, a million different ways? (you know, just like humans except more so?). Let alone higher-human-level, let alone moderately-super-human level, let alone exponential level if you are among the first? (And see Charles Stross, Accelerando, 2005 for how being first is not the end of the story.)
I can see one way for "not profitable" for most applications - if computing for AGI becomes too expensive, that is, AGI-level is too compute intensive. But even then that only eliminates some applications, and leaves all the many high-potential-profit ones. Starting with plain old finance, continuing with drug development, etc.
Open source LLMs exist. Just like lots of other open source projects - which have rarely prevented commercial projects from making money. And so far they are not even trying for AGI. If anything, the open source LLM becomes one of the agents in the private AGI. But presumably 1 billion buys a lot of effort that the open source LLM can't afford.
A more interesting question is one of tradeoff. Is this the best way to invest 1 billion right now? From a returns point of view? But even this depends on how many billions you can round up and invest.
Who would have thought that vectorized linear algebra will be at the center of so much financial speculation?
There is a silver lining though. Even if it all goes to near-zero (most likely outcome for all VC investments anyway) the digital world will be one where fast matrix multiply is thoroughly commoditized.
This is not a trivial feat.
In a sense this will be the true end of the Wintel era. The old world of isolated, CISC, deterministic desktops giving way not to "AGI", but widely available, networked, vector "supercomputers" that can digest and transform practically everything that has ever been digitized.
Who knows what the actual (financial) winners of this brave new era will be.
In an ideal world there should be no winner-takes-all entity but a broad-based leveling up, i.e., spreading these new means of production as widely as possible.
Heck, maybe we will even eventually see the famously absent productivity gains from digital tech?
> Who would have thought that vectorized linear algebra will be at the center of so much financial speculation?
"vectorized linear algebra" is at the root of most of modern Physics.
Specifically, the laws of Physics are represented by the Lie groups U(1), SU(2), SU(3) and SO(3,1).
While the manifolds that Physics acts on are curved, they're "locally flat". That is why local operations are tensor operations. Or linear algebra, if you prefer.
It's not all that surprising to me that "intelligence" is represented by similar math.
In fact, there is active work being done on making sense of deep learning using Lie algebra [1] (and Algebraic Topology, which generalizes the Lie algebra).
This math can be a bit hard, though, so the learning curve can be steep. However, when we're creating AI models to be ML scientists, I suspect that this kind of math may be a source of "unhobbling", as meant in Situational Awareness [2].
Because if we can understand the symmetries at play in a problem domain, it's generally a lot easier to find a mathematical architecture (like the algebras above) that effectively describes the domain, which allows us to potentially reduce the degrees of freedom by many OOM.
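A familiar concrete case of that last point (my own toy example, not the Lie-algebra work cited above): baking translation symmetry into the architecture, the way convolutions do, cuts the parameter count by many orders of magnitude compared to a fully general dense map.

    # Toy parameter-count comparison: a dense layer over a 256x256 image vs a
    # small convolution that hard-codes translation symmetry. Sizes are arbitrary.
    H = W = 256
    dense_params = (H * W) * (H * W)   # every input pixel connected to every output pixel
    conv_params = 3 * 3 * 1 * 16       # 3x3 kernel, 1 input channel, 16 filters
    print(dense_params, conv_params)   # 4294967296 vs 144 -- many OOM fewer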
> Heck, maybe we will even eventually see the famously absent productivity gains from digital tech?
I think it's a mistake to think of AI as "digital tech", and especially to assume that it will follow the pattern of the Internet, Social Media or crypto that we've seen over the last generation.
AI fundamentally comes with the potential to do anything a human can do in the economy (provided robotic tech keeps up). If so, the word "productivity" as currently used (economic value produced per hour of human work) becomes meaningless, since it would go to infinity (division by zero).
> "vectorized linear algebra" is at the root of most of modern Physics.
the "vectorized" adjective was meant to imply implementing linear algebra in digital computers that can operate concurrently on large-dimensional vectors/tensors. In this sense (and despite Wolfram's diligence and dearest wishes) modern physics theories have exactly 0% digital underpinning :-)
> It's not all that surprising to me that "intelligence" is represented by similar math.
yes, the state of the art of our modeling ability in pretty much any domain is to conceive of a non-linear system description and "solve it" by linearization. Methinks this is the primary reason we haven't really cracked "complexity": we can only solve the problems we have the proverbial hammer for.
> AI fundamentally comes with the potential to do anything a human can do
That goes into wild speculation territory. In any case the economy is always about organizing human relationships. Technology artifacts only change the decor, not the substance of our social relations. Unless we completely cease to have dependencies on each other (what a dystopic world!) there will always be the question of an individual's ability to provide others with something of value.
> modern physics theories have exactly 0% digital underpinning
I don't think the "digital" part matters at all. Floating point tends to be close enough to real (analog) numbers. The point is that at each point of space-time, the math "used" by Physics locally is linear algebra.
(EDIT): If your main point was the "vectorized" part rather than the digital part, and the specifics of how that is computed on a GPU, then that's more or less directly analogous to how the laws of physics work. Physical state is generally represented by vectors (or vector fields), while the laws of physics are represented by tensor operations on those vectors (or fields).
Specifically, when input is sent as vectors through a sequence of tensors in a neural net, it closely resembles (at an abstract level) how the local world state in and around a point in space-time is fed through the tensors that are the laws of physics to calculate the local world state in the next time "frame".
(END OF EDIT)
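As a toy illustration of that analogy (my own sketch, arbitrary numbers): a state vector gets pushed through a stack of linear maps with a pointwise nonlinearity in between, one "frame" at a time.

    # A state vector advanced through a sequence of linear maps ("tensors")
    # with a ReLU nonlinearity after each one. Values are purely illustrative.
    def matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def step(M, v):
        return [max(0.0, y) for y in matvec(M, v)]   # linear map + ReLU

    layers = [[[0.5, -0.2], [0.1, 0.9]],
              [[1.0, 0.3], [-0.4, 0.7]]]
    state = [1.0, 2.0]
    for M in layers:
        state = step(M, state)
    print(state)   # approximately [0.67, 1.29]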
> yes, the state of the art of our modeling ability in pretty much any domain is to conceive of a non-linear system description and "solve it" by linearization
True, though neural nets are NOT linearizations, I think. They can fit any function. Even if each neuron is doing linear operations, the network as a whole is (depending on the architecture) quite adept at describing highly non-linear shapes in spaces of extreme dimensionality.
> Me thinks this is primary reason we haven't really cracked "complexity"
I'm not sure it's even possible for human brains to "crack" "complexity". Wolfram may very well be right that the complexity is irreducible. But for the levels of complexity that we ARE able to comprehend, I think both human brains and neural nets do it by finding patterns/shapes in spaces with a near-infinite number of degrees of freedom.
My understanding is that neural nets fit the data in a way conceptually similar to linear regression, but where the topology of the network implicitly allows it to find symmetries such as those represented by Lie groups. In part this may be related to the "locality" of the network, just as it is in Physics. Of all possible patterns, most will be locally non-linear and also non-local.
But nets of tensors impose local linearity and locality (or something similar), just like it does in Physics.
And since this is how the real world operates, it makes sense to me that the data that neural nets are trained on have similar structures.
Or maybe more specifically: It makes sense to me that animal brains developed with such an architecture, and so when we try to replicate it in machines, it carries over.
>> AI fundamentally comes with the potential to do anything a human can do
> That goes into wild speculation territory.
It does. In fact, it has this in common with most factors involved in pricing stocks. I think the current pricing of AI businesses reflect that a sufficiently large fraction of shareholders thinks it's a possible (potential) future that AI can replace all or most human work.
> In any case the economy is always about organizing human relationships.
"The economy" can have many different meanings. The topic here was (I believe) who would derive monetary profit from AI and AI businesses.
I definitely agree that a world where the need for human input is either eliminated or extremely diminished is dystopian. That's another topic, though.
Maybe I'm naive but there seems to be way too much financial incentive in this space for CUDA to continue to be the top dog. Just like microprocessors, these devices are going to get commodified, standardized, open sourced, etc. Nvidia making massive profits is a sign of huge market inefficiency and potential opportunities for competition.
> Nvidia making massive profits is a sign of huge market inefficiency and potential opportunities for competition.
Yep, but just as the first reading glasses were only available to the wealthy, and now anyone can have them, the inefficiency takes time to work out. It'll take a long time, especially given how vertically integrated Nvidia are.
Bad eyesight, especially myopia, was much rarer than today, with the leading hypothesis being that not enough sunlight exposure over one's growing years leads to it.
I don't understand this phrasing - are you implying I'm not aware of these people? People...had bad eyesight before. Now they have bad eyesight with corrective lenses.
They are now, but will they be winning in 10 (or even 3-5) years?
Their shtick is a GPU on steroids. In the bigger picture it's a well-positioned hack that has ridden two successive speculative bubbles (crypto mining and AI), but it's unclear how far this can go. Currently this approach is wildly successful because nobody else bothered to develop a serious vision for the post-Moore's-law era. But make no mistake, people's minds will get focused.
It's not a graphics card hacked to do math any more. It's a general purpose computer with some legacy cruft added to handle graphics work if necessary. Lots of people are working very hard to find something better and have been for at least a decade, probably several.
My guess is that whoever develops superintelligence first will not release it to the public, but rather use it for their own purposes to gain an edge.
They may still release AI products to the public that are good enough and cheap enough to prevent competitors from being profitable or receive funding (to prevent them from catching up), but that's not where the value would come from.
Just as an example, let's say xAI is first. Instead of releasing the full capability as GROK 7, they would use the ASI to create a perfected version of their self driving software, to power their Optimus robots.
And to speed up the development of future manufacturing products (including, but not limited to cars and humanoid robots)
And as such winners may be challenged by anti-trust regulations, the ASI may also be utilized to gain leverage over the political system. Twitter/X could be one arena that would allow this.
Eventually, Tesla robots might even be used to replace police officers and military personnel. If so, the company might be a single software update away from total control.
My guess is that whoever develops superintelligence first will have a big number in their bank account while their body is disassembled to make solar panels and data centers
> We have no evidence that superintelligence will be developed.
Fundamentally, we have no evidence of anything that will happen in the future. All we do is to extrapolate from the past through the present, typically using some kind of theory of how the world operates.
The belief that we will get there eventually (whether it's this year or in 1000+ years) really only hinges on the following 3 assumptions:
1) The human brain is fully material (no divine souls is necessary for intelligence)
2) The human brain does not represent a global optimum for how intelligent a material intelligence-having-object can be.
3) We will eventually have the ability to build intelligence-having-objects (similar or different from the brain) that not only can do the same as a brain (that would be mere AGI), but also surpass it in many ways.
Assumptions 1 and 2 have a lot of support in the current scientific consensus. Those who reject them either do not know the science or have a belief system that would be invalidated if one of those assumptions were true. (That could be anything from a Christian belief in the soul to an ideological reliance on a "Tabula Rasa" brain).
Assumption 3 is mostly techno-optimism, or an extrapolation of the trend that we are able to build ever more advanced devices.
As for WHEN we get there, a fourth assumption is required for it to happen soon:
4. For intelligence-having-objects to do their thing, they don't need some exotic mechanism we don't yet know how to build. For instance, there is no need to build large quantum computers for this.
This assumption is mostly about belief, and we really don't know.
Yet, given the current rate of progress, and if we accept assumptions 1-3, I don't think assumption 4 is unreasonable.
If so, it's not unreasonable to assume that our synthetic brains reach roughly human level intelligence when their size/complexity becomes similar to that of human brains.
Human brains have ~200 trillion synapses. That's about 100x-1000x more than the latest neural nets that we're building.
Based only on scale, current nets (GPT-4 generation) should have total capabilities similar or slightly better than a rat. I think that's not very far off from what we're seeing, even if the nets tend to have those capabilities linked to text/images rather than the physical world that a rat navigates.
In other words, I think we DO have SOME evidence (not conclusive) that the capabilities of a neural net can reach similar "intelligence" to animals with a similar number of synapses.
So IF that hypothesis holds true, and given assumptions 1-3 above, there is a fair possibility that human level intelligence will be reached when we scale up to about 200 trillion weights (and have the ability to train such nets).
And currently, several of the largest and most valuable companies in the world are making a huge gamble on this being the case, with plans to scale up nets by 100x over the next few years, which will be enough to get very close to human brain sized nets.
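For what it's worth, the back-of-envelope arithmetic behind that 100x-1000x figure looks like this (the parameter counts are my assumptions, since frontier labs don't publish them; only the ~200 trillion synapse number comes from the estimate above):

    # Rough scaling comparison; assumed parameter counts, not disclosed figures.
    human_synapses = 200e12
    for assumed_params in (2e11, 2e12):   # e.g. 200B and 2T weights
        print(int(human_synapses / assumed_params), "x short of human synapse count")
    # -> 1000x and 100x, matching the range mentioned above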
> Assumption 3 is mostly techno-optimism, or an extrapolation of the trend that we are able to build ever more advanced devices.
This is your weak link. I don't see why progress will be a straight line and not a sloping-off curve. You shouldn't see the progress we've made in vehicle speed and assume we can hit the speed of light.
> I don't see why progress will be a straight line and not a sloping-off curve.
Technological progress often appears linear in the short term, but zooming out reveals an exponential curve, similar to compound interest.
> You shouldn't see the progress we've made in vehicle speed and assume we can hit the speed of light.
Consider the trajectory of maximum speeds over millennia, not just recent history. We've achieved speeds unimaginable to our ancestors, mostly in space—a realm they couldn't conceive. While reaching light speed is challenging, we're exploring novel concepts like light-propelled nano-vehicles. If consciousness is information-based, could light itself become a "vehicle"?
Reaching light speed isn't just an engineering problem—it's a fundamental issue in physics. The laws of physics (as we know them) prevent any object with mass from reaching light speed.
Notice however that our minds, like the instructions in DNA and RNA, are built from atoms, but they aren't the atoms themselves. They're the information in how those atoms are arranged. Once we can fully read and write this information—like we're starting to do with DNA and RNA—light itself could become our vehicle.
If even a single electron moves at the speed of light, it would tear apart the universe, at least that's what both special and general relativity would predict.
(It would have infinite energy meaning infinite relativistic mass, and would form a black hole whose event horizon would spread into space at the speed of light).
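For reference, the standard formula behind that statement: the energy of a massive particle diverges as its speed approaches c,

    E = \frac{m c^2}{\sqrt{1 - v^2/c^2}} \longrightarrow \infty \quad \text{as } v \to c \ (m > 0)

which is why no massive object, electron included, can actually reach light speed.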
I don't think so at all. I'm personally convinced that humanity EVENTUALLY will build something more "intelligent" than human brains.
> I don't see
I see
> You shouldn't see the progress we've made in vehicle speed and assume we can hit the speed of light
There are laws of Physics that prevent us from moving faster than the speed of light. There IS a corresponding limit for computation [1], but it's about as far from the human brain's ability as the speed of light is from human running speed.
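One example of such a limit (possibly not the same one [1] refers to) is Landauer's bound on the energy needed to erase a single bit:

    E_{\min} = k_B T \ln 2 \approx (1.38\times 10^{-23}\,\mathrm{J/K})(300\,\mathrm{K})(0.693) \approx 2.9\times 10^{-21}\,\mathrm{J}

at room temperature, which is many orders of magnitude below the energy per operation that either biological brains or current hardware actually spend.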
I'm sure some people who saw the first cars thought they could never possibly become faster than a horse.
Making ASI has no more reason to be impossible than to build something faster than the fastest animal, or (by stretching it), something faster than the speed of sound (which was supposed to be impossible).
There is simply no reason to think that the human brain is at a global maximum when it comes to intelligence.
Evolutionary history points towards brain size being limited by a combination of what is safe for the female hip width and also what amount of energy cost can be justified by increasing the size of the brain.
Those who really think that humans have reached the peak, like David Deutsch, tend to think that the brain operates as a Turing Machine. And while a human brain CAN act like a very underpowered Turing Machine if given huge/infinite amounts of paper and time, that's not how most of our actual thought process function in reality.
Since our ACTUAL thinking generally does NOT use Turing Complete computational facilities but rather relies on most information being stored in the actual neural net, the size of that net is a limiting factor for what mental operations a human can perform.
I would claim that ONE way to create an object significantly more intelligent than current humans would be through genetic manipulation that would produce a "human" with a neocortex several times the size of what regular humans have.
> Evolutionary history points towards brain size being limited by a combination of what is safe for the female hip width and also what amount of energy cost can be justified by increasing the size of the brain.
If bigger brains lead to higher intelligence, why do many highly intelligent people have average-sized heads? And do they need to eat much more to fuel their high IQs? If larger brains were always better, wouldn’t female hips have evolved to accommodate them? I think human IQ might be where it is because extremely high intelligence (vs. what we on average have now) often leads to fewer descendants. Less awareness of reality can lead to more "reproductive bliss."
> If bigger brains lead to higher intelligence, why do many highly intelligent people have average-sized heads?
There IS a correlation between intelligence and brain size (of about 0.3). But the human brain does a lot of things apart from what we measure as "IQ". What shows up in IQ tests tend to be mostly related to variation of the thickness of certain areas of the cortex [1].
The rest of the brain is, however, responsible for a lot of the functions that separate GPT-4 or Tesla's self-driving from a human. Those are things we tend to take for granted in healthy humans, or that can show up as talents we don't think of as "intelligence".
Also, the variation in the size of human brains is relatively small, so the specifics of how a given brain is organized probably contributes to more of the total variance than absolute size.
That being said, a chimp brain is not likely to produce (adult, healthy) human level intelligence.
> And do they need to eat much more to fuel their high IQs?
That depends primarily on the size of the brain. Human brains consume significantly more calories than chimp brains.
> If larger brains were always better, wouldn’t female hips have evolved to accommodate them?
They did, and significantly so. In particular the part around the birth canal.
> I think human IQ might be where it is because extremely high intelligence (vs. what we on average have now) often leads to fewer descendants. Less awareness of reality can lead to more "reproductive bliss."
I believe this is more of a modern phenomenon, mostly affecting women from the 20th century on. There may have been similar situations at times in the past, too. But generally, over the last several million years, human intelligence has been rising sharply.
Note also that a correlation of r = 0.3 means only about 9% of the variability in one variable is explained by the other. This makes me wonder: can intelligence really be estimated within 10% accuracy? I doubt it, especially considering how IQ test results can vary even for the same person over time.
Kids and teens have smaller brains, but their intelligence increases as they experience more mental stimulation. It’s not brain size that limits them but how their brains develop with use, much like how muscles grow with exercise.
> a chimp brain is not likely to produce (adult, healthy) human-level intelligence.
> Human brains consume significantly more calories than chimp brains.
If brain size and calorie consumption directly drove intelligence, we’d expect whales, with brains five times larger than humans, to be vastly more intelligent. Yet, they aren’t. Whales’ large brains are likely tied to their large bodies, which evolved to cover great distances in water.
Brains can be large like arms can be large, but big arms do not necessarily make you strong -- they may be large due to fat.
> But generally, over the last several million years, human intelligence has been rising sharply.
Yes, smaller-brained animals are generally less intelligent, but exceptions like whales and crows suggest that intelligence evolves alongside an animal’s ecological niche. Predators often need more intelligence to outsmart their prey, and this arms race likely shaped human intelligence.
As humans began living in larger communities, competing and cooperating with each other, intelligence became more important for survival and reproduction. But this has limits. High intelligence can lead to emotional challenges like overthinking, isolation, or an awareness of life’s difficulties. Highly intelligent individuals can also be unpredictable and harder to control, which may not always align with societal or biological goals.
As I see it, ecological niche drives intelligence, and factors like brain size follow from that. The relationship is dynamic, with feedback loops as the environment changes.
> As I see it, ecological niche drives intelligence
For this, you're perfectly correct.
> It’s not brain size that limits them but how their brains develop with use, much like how muscles grow with exercise.
Here the answer is yes, but as with muscles, biology creates constraints. If you're male, you may be able to bench 200kg, but probably not 500kg unless your biology allows it.
> If brain size and calorie consumption directly drove intelligence, we’d expect whales
As you wrote later, there are costs to developing large brains. The benefits would not justify the costs, over evolutionary history.
> Brains can large like arms can be large but big arms do not necessarily make you strong -- they may be large due to fat.
A chimp has large arms. Try wrestling it.
> and factors like brain size follow from that
Large brains come with a significant metabolic cost. They would only have evolved if they provided a benefit that would outweigh those costs.
And in today's world, most mammal tissue is either part of a Homo sapiens or part of the body of an animal used as livestock by Homo sapiens.
> biology will create constraints. If you're male, you may be able to bench 200kg, but probably not 500kg unless your biology allows it.
On evolutionary timeframes what biology allows can evolve and the hard limits are due to chemistry and physics.
> there are costs to developing large brains. The benefits would not justify the costs, over evolutionary history.
> Large brains come with a significant metabolic cost. They would only have evolved if they provided a benefit that would outweigh those costs
Google "evolutionary spandrels" and you will learn there can be body features (large brains of whales) that are simply a byproduct of other evolutionary pressures rather than direct adaptation.
> Google "evolutionary spandrels" and you will learn there can be body features (large brains of whales) that are simply a byproduct of other evolutionary pressures rather than direct adaptation.
If you're a 10-150 ton whale, a 2-10 kg brain isn't a significant cost.
But if you're a 50kg primate, a brain of more than 1kg IS.
For hominids over the past 10 million years, there have been very active evolutionary pressures to minimize brain size. Still, the brain grew to maybe 2-4 times its earlier size over this period.
This growth came at a huge cost, and the benefits must have justified those costs.
> On evolutionary timeframes what biology allows can evolve and the hard limits are due to chemistry and physics.
It's not about there being hard limits. Brain size or muscle size or density is about tradeoffs. Most large apes are 2-4 times stronger than humans, even when accounting for size, but human physiology has other advantages that make up for that.
For instance, our lower density muscles allow us to float/swim in water with relative ease.
Also, lighter bodies (relative to size) make us (in our natural form) extremely capable long-distance runners. Some humans can chase a horse on foot until it dies from exhaustion.
I'm sure a lot of other species could have developed human level intelligence if the evolutionary pressures had been there for them. It just happens to be that it was humans that first entered an ecological niche where evolving this level of intelligence was worth the costs.
> Also, lighter bodies (relative to size) make us (in our natural form) extremely capable long-distance runners.
Humans' ability to run long distances effectively is due to a combination of factors, with the ability to sweat being one of the most crucial. Here are the key adaptations that make humans good endurance runners:
a) Efficient sweating: Humans have a high density of sweat glands, allowing for effective thermoregulation during prolonged exercise.
b) Bipedalism: Our two-legged gait is energy-efficient for long-distance movement.
c) Lack of fur: This helps with heat dissipation.
d) Breathing independence from gait: Unlike quadrupeds, our breathing isn't tied to our running stride, allowing for better oxygen intake.
Lighter bodies (relative to size) play a role, but there are plenty of creatures with light bodies relative to their size that are not great at long-distance running.
> Still, the brain grew to maybe 2-4 times the size over this period.
I read somewhere that the human brain makes up only about 2% of body weight but uses 20% of the body’s energy. While brain size has increased over time, brain size does not determine intelligence. The brain’s high energy use, constant activity, and complex processes are more important. Its metabolic activity, continuous glucose and oxygen consumption, neurotransmitter dynamics, and synaptic plasticity all play major roles in cognitive function. Intelligence is shaped by the brain’s efficiency, how well it forms and adjusts neural connections, and the energy it invests in processing information. Intelligence depends far more on how the brain works than on its size.
Rats can navigate the physical world, though. In terms of total capabilities, I think it's not unreasonable to rate what GPT-4 is doing at or above the total capabilities of a rat, even if they manifest in different ways.
As we continue to make models larger, and assuming that model capabilities keep up with brains that have synapse counts similar to the weights in the models, we're now 2-3 OOM from human level (possibly less).
>> Fundamentally, we have no evidence of anything that will happen in the future.
Yeah, by this line of thought Jesus will descend from Heaven and save us all.
By the same line of fantasy, "give us billions to bring AGI", why not "gimme a billion to bring Jesus. I'll pray really hard, I promise!"
It's all become a disgusting scam, effectively just religious. Believe in AGI that's all there is to it. In practice it's just as (un) likely as scientists spontaneously creating life out of primordial soup concoctions.
This reply seems eerily similar to what folks said months/years before the Wright brothers proved flight was indeed possible.
All the building evidence was there but people just refused to believe it was possible.
I am not buying that AI right now is going to displace every job or change the world in the next 5 years, but I wouldn't bet against world impacts in that timeframe. The writing is on the wall. I am old enough to remember AI efforts in the late 80s and early 90s. We saw how very little progress was made.
The progress made in the past 10 years is pretty insane.
Precisely. I remember back in the late 90s, some Particle Physics papers were published that used neural nets to replace hand crafted statistical features.
While the power was not amazing, I've kind of assumed since then that scale was what would be needed.
I then half-way forgot about this, until I saw the results from Alexnet.
Since then, the capabilities of the models have generally been keeping up with how they were scaled, at least within about 1 OOM.
If that continues, the next 5-20 years are going to be perhaps the most significant in history.
Same funding as OpenAI when they started, but SSI explicitly declared their intention not to release a single product until superintelligence is reached. Closest thing we have to a Manhattan Project in the modern era?
The urgency was faked and less true of the Manhattan Project than it is of AGI safety. There was no nuclear weapons race; once it became clear that Germany had no chance of building atomic bombs, several scientists left the MP in protest, saying it was unnecessary and dangerous. However, the race to develop AGI is very real, and we also have no way of knowing how close anyone is to reaching it.
Likewise, the target dates were pretty meaningless. There was no race, and the atomic bombs weren't necessary to end the war with Japan either. (It can't be said with certainty one way or the other, but there's pretty strong evidence that their existence was not the decisive factor in surrender.)
Public ownership and accountability are also pretty odd things to say! Congress didn't even know about the Manhattan Project. Even Truman didn't know for a long time. Sure, it was run by employees of the government and funded by the government, but it was a secret project with far less public input than any US-based private AI companies today.
> However, the race to develop AGI is very real, and we also have no way of knowing how close anyone is to reaching it.
It seems pretty irresponsible for AI boosters to say it’ll happen within 5 years then.
There’s a pretty important engineering distinction between the Manhattan Project and current research towards AGI. At the time of the Manhattan Project scientists already had a pretty good idea of how to build the weapon. The fundamental research had already been done. Most of the budget was actually just spent refining uranium. Of course there were details to figure out like the specific design of the detonator, but the mechanism of a runaway chain reaction was understood. This is much more concrete than building AGI.
For AGI nobody knows how to do it in detail. There are proposals for building trillion dollar clusters but we don’t have any theoretical basis for believing we’ll get AGI afterwards. The “scaling laws” people talk about are not actual laws but just empirical observations of trends in flawed metrics.
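For reference, the empirical fits people cite usually take a form roughly like this (a sketch of the commonly used parametrization; the constants are fit to observed training runs, not derived from any theory):

    L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

where N is parameter count, D is training tokens, L is loss, and E, A, B, alpha and beta are fitted constants. Nothing in that functional form tells you when, or whether, lower loss turns into general intelligence.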
Matt Garman said 2 years for all programming jobs.
And, most relevant to this article: since SSI says they won't release a product until they have superintelligence, I think the fact that VCs are giving them money means they've been pretty optimistic in statements about their timelines.
> There was no nuclear weapons race; once it became clear that Germany had no chance of building atomic bombs, several scientists left the MP in protest
You are forgetting Japan in WWII; given the casualty numbers from island hopping, an invasion was going to produce an absolutely huge casualty count for US troops, probably something on the order of England's losses during WW1, which sent them on a downward trajectory due to essentially an entire generation dying or being extremely traumatized. If the US did not have Nagasaki and Hiroshima, we would probably not have the space program and US technical prowess post-WWII, so a totally different reality than where we are today.
I'll try to argue his point. The idea that Japan would have resisted to the last man and that a massive amphibious invasion would have been required is kind of a myth. The US pacific submarine fleet had sunk the majority of the Japanese merchant marine to the point that Japan was critically low on war materiel and food. The Japanese navy had lost all of its capital ships and there was a critical shortage of personnel like pilots. The Soviets also invaded and overran Manchuria over a span of weeks. The military wing of the Japanese government certainly wanted to continue fighting but the writing was on the wall. The nuclear bombing of Japanese cities certainly pressed the issue but much of the American Military command in the Pacific thought it was unnecessarily brutal, and Japanese cities had already been devastated by a bombing campaign that included firebombing. I'm not sure that completely aligns with my own views but that's basically the argument, and there are compelling points.
Nimitz wanted to embargo Japan and starve them out.
The big problem that MacArthur and others pointed out is that all the Japanese forces on the Asian mainland and those left behind in the island-hopping campaign through the Pacific were unlikely to surrender unless Japan itself was definitively defeated, with the central government capitulating and aiding in the demobilization.
From their perspective the options were to either invade Japan and force a capitulation, or go back and keep fighting it out with every island citadel and throughout China, Indochina, Formosa, Korea, and Manchuria.
I am looking at the numbers from Operation Downfall that Truman and senior members of the administration looked at, which projected between 500,000 and 1,000,000 lives lost on the US side for an invasion and defeat of Japan. 406k US soldiers lost their lives in WW2, so that would have more than tripled the deaths from the actual numbers. As for WWI and the British casualties I mentioned earlier, the British lost around 885k troops during WWI, so the US would have exceeded that number even on the low end of the estimates.
Yeah it would have been a bloody invasion. I'm saying it probably would not have been necessary since Japan was under siege and basically out of food already.
> the atomic bombs weren't necessary to end the war with Japan either. (It can't be said with certainty one way or the other, but there's pretty strong evidence that their existence was not the decisive factor in surrender.)
Well, you didn't provide any evidence. Island hopping in the Pacific theater itself took thousands of lives; imagine what a headlong strike into a revanchist country of citizens determined to fight to the last man, woman and child would have looked like. We don't know how effective a hypothetical Soviet assault would have been, as they had attacked only sparsely populated Sakhalin. What the atom bomb did succeed in was convincing Emperor Hirohito that continuing the war would be destructively pointless.
WW1 practically destroyed the British Empire for the most part. WW2 would have done the same for the US in your hypothetical scenario, but much worse.
> The urgency was faked and less true of the Manhattan Project than it is of AGI safety.
I'd say they were equal. We were worried about Russia getting nuclear capability once we knew Germany was out of the race. Russia was at best our frenemy. The enemy of my enemy is my friend kind of thing.
Pretty sure the military made it clear they aren’t launching any nukes, despite what the last President said publicly. They also made it clear they weren’t invading China.
Well, not exactly “we all”, just the citizens of the country in possession of the kill switch. And in some countries, the person in question was either not elected or elections are a farce to keep appearances.
The President of the United States has sole nuclear launch authority. To stop him would either take the cabinet and VP invoking the 25th amendment and removing him from office, or a military officer to disobey direct orders.
Are you under the impression the president can actually do it? It's not true; someone else at least needs to push another button. I'm 100% sure of what I said in regards to the USA, just not about hidden nuke programs I wouldn't know about. No person in the USA can single-handedly trigger a nuclear weapon launch. What he has the authority to do is ask someone else to launch a nuke, and that person will then need to decide whether to do it.
Even the president needs someone else to push a button (and in those rooms there's also more than one person). There's literally no human that can do it alone without convincing at least 1 or 2 other people, depending on who it is.
The fact that the world hasn't ended and no nuke has been launched since the 1940s shows that the system is working. Give the button to a random billionaire and half of us will be dead by next week to improve profit margins.
Bikini atoll and the islanders that no longer live there due to nuclear contamination would like a word with you. Split hairs however you like with the definition of "launch" but those tests went on well through the 1950s.
Well-defined goal is the big one. We wanted a big bomb.
What does AGI do? AGI is up against a philosophical barrier, not a technical one. We'll continue improving AI's ability to automate and assist human decisions, but how does it become something more? Something more "general"?
"General" is every activity a human can do or learn to do. It was coined along with "narrow" to contrast with the then decidedly non-general AI systems. This was generally conceived of as a strict binary - every AI we've made is narrow, whereas humans are general, able to do a wide variety of tasks and do things like transfer learning, and the thinking was that we were missing some grand learning algorithm that would create a protointelligence which would be "general at birth" like a human baby, able to learn anything & everything in theory. An example of an AI system that is considered narrow is a calculator, or a chess engine - these are already superhuman in intelligence, in that they can perform their tasks better than any human ever possibly could, but a calculator or a chess engine is so narrow that it seems absurd to think of asking a calculator for an example of a healthy meal plan, or asking a chess engine to make sense of an expense report, or asking anything to write a memoir. Even in more modern times, with AlexNet we had a very impressive image recognition AI system, but it couldn't calculate large numbers or win a game of chess or write poetry - it was impressive, but still narrow.
With transformers, demonstrated first by LLMs, I think we've shown that the narrow-general divide as a strict binary is the wrong way to think about AI. Instead, LLMs are obviously more general than any previous AI system, in that they can do math or play chess or write a poem, all using the same system. They aren't as good as our existing superhuman computer systems at these tasks (aside from language processing, which they are SOTA at), not even as good as humans, but they're obviously much better than chance. With training to use tools (like calculators and chess engines) you can easily make an AI system with an LLM component that's superhuman in those fields, but there are still things that LLMs cannot do as well as humans, even when using tools, so they are not fully general.

One example is making tools for themselves to use - they can do a lot of parts of that work, but I haven't seen an example yet of an LLM actually making a tool for itself that it can then use to solve a problem it otherwise couldn't. This is a subproblem of the larger "LLMs don't have long-term memory and long-term planning abilities" problem - you can ask an LLM to use python to make a little tool for itself to do one specific task, but it's not yet capable of adding that tool to its general toolset to enhance its general capabilities going forward.

They can't write a memoir, or a book that people want to read, because they suck at planning and refining from drafts, and they have limited creativity because they're typically a blank slate in terms of explicit memory before they're asked to write - they have a gargantuan store of implicitly remembered things from training, which is where what creativity they do have comes from, but they don't yet have a way to accrue and benefit from experience.
A thought exercise I think is helpful for understanding what the "AGI" benchmark should mean is: can this AI system be a drop-in substitute for a remote worker? As in, any labour that can be accomplished by a remote worker can be performed by it, including learning on the job to do different or new tasks, and including "designing and building AI systems". Such a system would be extremely economically valuable, and I think it should meet the bar of "AGI".
>But they can't, they still fail at arithmetic and still fail at counting syllables.
You are incorrect. These services are free; you can go and try it out for yourself. LLMs are perfectly capable of simple arithmetic, better than many humans and worse than some. They can also play chess and write poetry, and I made zero claims about "counting syllables", but it seems perfectly capable of doing that too. See for yourself, this was my first attempt, no cherry picking: https://chatgpt.com/share/ea1ee11e-9926-4139-89f9-6496e3bdee...
I asked it a multiplication question so it used a calculator to correctly complete the task, I asked it to play chess and it did well, I asked it to write me a poem about it and it did that well too. It did everything I said it could, which is significantly more than a narrow AI system like a calculator, a chess engine, or an image recognition algorithm could do. The point is it can do reasonably at a broad range of tasks, even if it isn't superhuman (or even average human) at any given one of them.
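For the curious, the tool-use pattern described above looks roughly like this (a hypothetical sketch; fake_model is a stand-in, not any vendor's actual API):

    # Minimal, hypothetical tool-use loop: the "model" asks for the calculator
    # instead of guessing the product; the harness runs it and returns the result.
    import ast, operator

    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def calculator(expression):
        # Safely evaluate plain arithmetic like "123456 * 654321".
        def ev(n):
            if isinstance(n, ast.Constant):
                return n.value
            if isinstance(n, ast.BinOp) and type(n.op) in OPS:
                return OPS[type(n.op)](ev(n.left), ev(n.right))
            raise ValueError("unsupported expression")
        return ev(ast.parse(expression, mode="eval").body)

    def fake_model(prompt):
        # Stand-in for an LLM deciding that a tool call is needed.
        return {"tool": "calculator", "input": "123456 * 654321"}

    request = fake_model("What is 123456 * 654321?")
    if request["tool"] == "calculator":
        print(calculator(request["input"]))   # 80779853376, handed back to the model

The point isn't the plumbing; it's that the model delegates the part it's bad at to a narrow tool and keeps the language part for itself.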
>I think that LLMs are really impressive but they are the perfect example of a narrow intelligence.
This doesn't make any sense at all. You think an AI artifact that can write poetry, code, play chess, control a robot, recommend a clutch to go with your dress, compute sums etc is "the perfect example of a narrow intelligence." while a chess engine like Stockfish or an average calculator exists? There are AI models that specifically and only recognise faces, but the LLM multitool is "the perfect example of a narrow intelligence."? Come on.
>I think they don't blur the lines between narrow and general, they just show a different dimension of narrowness.
You haven't provided an example of what "dimension of narrowness" LLMs show. I don't think you can reasonably describe an LLM as narrow without redefining the word - just because something is not fully general doesn't mean that it's narrow.
This argument generalises to all possible AI systems and thus proves way too much.
>[AI system]s are not general, but they show that a specific specialization ("[process sequential computational operations]") can solve a lot more problems than we thought it could.
Or if you really want:
>Humans are not general, but they show that a specific specialization ("neuron fires when enough connected neurons fire into it") can solve a lot more problems than we thought it could.
This is just sophistry - the method by which some entity is achieving things doesn't matter, what matters is whether or not it achieves them. If it can achieve multiple tasks across multiple domains it's more general than a single-domain model.
Still, you’d have to be quite an idiot to wait for the third time to listen eh?
Besides, the winners get to decide what’s a war crime or not.
And when the US started mass firebombing civilian Tokyo, it’s not like they were going to be able to just ‘meh, we’re good’ on that front. Compared to that hell, being nuked was humane.
By that point, Japan was already on its way out and resorted to flying manned bombs and airplanes into American warships. Nuking Japan wasn't for Japan, it was a show of force for the Soviets, who were developing their own nukes.
Neutralizing Japan the rest of the way would have cost millions of additional American lives, at a minimum. Japan was never going to surrender unless they saw the axe swinging for their neck, and knew they couldn’t dodge. They didn’t care about their own civilians.
As made quite apparent by, as you note, kamikaze tactics and more.
The Bomb was a cleaner, sharper, and faster Axe than invading the main island.
That it also sent a message to the rest of the world was a bonus. But do you think they would have not used it, if for example the USSR wasn’t waiting?
Of course not, they’d still have nuked the hell out of the Japanese.
There is a significant possibility that true AI (what Ilya calls superintelligence) is impossible to build using neural networks. So it is closer to some tokenbro project than to nuclear research.
Or he will simply shift goalposts, and call some LLM superintelligent.
The only ones shifting goalposts are the people who think that completely blowing past the Turing Test, unlocking recursive exponential code generation, and a computer passing all the standard college tests (our way of determining who's intelligent enough for Harvard/MIT) better than 99% of humans isn't a very big deal.
Modern ANN architectures are not actually capable of long-term learning in the same way animals are, even stodgy old dogs that don't learn new tricks. ANNs are not a plausible model for the brain, even if they emulate certain parts of the brain (the cerebellum, but not the cortex).
I will add that transformers are not capable of recursion, so it's impossible for them to realistically emulate a pigeon's brain. (You would need millions of layers that "unroll chains of thought" purely by exhaustion.)
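To illustrate what "unrolling by exhaustion" means here, a toy Python sketch (nothing to do with actual transformer internals, just the structural point): a genuinely recursive function handles any nesting depth, while a fixed number of passes, like a fixed stack of layers, only handles as many levels as you've baked in.

    def depth(tree):                      # true recursion: handles arbitrary nesting
        return 1 + max(map(depth, tree), default=0) if isinstance(tree, list) else 0

    def depth_unrolled(tree, layers=4):   # fixed-depth "unrolling": one pass per level
        frontier, d = [tree], 0
        for _ in range(layers):
            frontier = [c for node in frontier if isinstance(node, list) for c in node]
            if not frontier:
                return d
            d += 1
        return None                       # nesting deeper than the layers we baked in

    print(depth([[1, [2]], 3]))              # 3
    print(depth_unrolled([[1, [2]], 3]))     # 3
    print(depth_unrolled([[[[[0]]]]]))       # None: exceeds the fixed depth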
You've read the abstract wrong. The authors argue that neural networks can learn online and a necessary condition is random information. That's the thesis, their thesis is not that neural networks are the wrong paradigm.
Isn't "plasticity is not necessary for intelligence" just defining intelligence downwards? It seems like you want to restrict "intelligence" to static knowledge and (apparent) short-term cleverness, but being able to make long-term observation and judgements about a changing world is a necessary component of intelligence in vertebrates. Why exclude that from consideration?
More specifically: it is highly implausible that an AI system could learn to improve itself beyond human capability if it does not have long-term plasticity: how would it be able to reflect upon and extend its discoveries if it's not able to learn new things during its operation?
Let's not forget that software has one significant advantage over humans: versioning.
If I'm a human tasked with editing video (which is the field my startup[0] is in) and a completely new video format comes in, I need the long term plasticity to learn how to use it so I can perform my work.
If a sufficiently intelligent version of our AI model is tasked with editing these videos, and a completely new video format comes in, it does not need to learn to handle it. Not if this model is smart enough to iterate a new model that can handle it.
The new skills and knowledge do not need to be encoded in "the self" when you are a bunch of bytes that can build your successor out of more bytes.
Or, in popular culture terms, the last 30 seconds of this Age of Ultron clip[1].
That's not how we (today) practically interact with LLMs, though.
No LLM currently adapts to the tasks it's given with an iteration cycle shorter than on the order of months (assuming your conversations serve as future training data; otherwise not at all).
No current LLM can digest its "experiences", form hypotheses (at least outside of being queried), run thought experiments, then actual experiments, and then update based on the outcome.
Not because it's fundamentally impossible (it might or might not be), but because we practically haven't built anything even remotely approaching that type of architecture.
The neural networks in human brains are very different from artificial neural networks though. In particular, they seem to learn in a very different way than backprop.
But there is no reason the company can't come up with a different paradigm.
Do we know that? I've seen some articles and lectures this year that kind of almost loosely argue and reach for the notion that "human backprop" happens when we sleep and dream, etc. I know that's handwavy and not rigorous, but who knows what's going on at this point.
I've only heard of one researcher who believes the brain does something similar to backprop and has gradients, but it sounded extremely handwavy to me. I think it is more likely the brain does something resembling active inference.
But I suppose you could say we don't know 100% since we don't fully understand how the brain learns.
1. Either you are correct and the neural networks humans have are exactly the same as, or very similar to, the programs in the LLMs. Then it will be relatively easy to verify this - just scale one LLM to the human brain's neuron count and supposedly it will acquire consciousness and start rapidly learning and creating on its own without prompts.
2. Or what we call neural networks in computer programs is radically different and/or insufficient to create AI.
I'm leaning toward the second option, just from very high-level and rudimentary reading about current projects. I could be wrong, of course. But I have yet to see any paper that refutes option 2, so it means that it is still possible.
I agree with your stance - that being said there aren’t two options, one being identical or radically different. It’s not even a gradient between two choices, because there are several dimensions involved and nobody even knows what Superintelligence is anyways.
If you wanted to reduce it down, I would say there are two possibilities:
1. Our understanding of neural nets is currently sufficient to recreate intelligence, consciousness, or what have you
2. We're lacking some understanding critical to intelligence/consciousness.
Given that with a mediocre math education and a week you could pretty completely understand all of the math that goes into these neural nets, I really hope there's some understanding we don't yet have.
There are layers of abstraction on top of "the math". The backpropagation math for a transformer is no different than for a multi-layer perceptron, yet a transformer is vastly more capable than an MLP. More to the point, it took a series of non-trivial steps to arrive at the transformer architecture. In other words, understanding the lowest-level math is no guarantee that you understand the whole thing, otherwise the transformer architecture would have been obvious.
I don’t disagree that it’s non-trivial, but we’re comparing this to consciousness, intelligence, even life. Personally I think it’s apples and an orange grove, but I guess we’ll get our answer eventually. Pretty sure we’re on the path to take transformers to their limit, wherever that may be.
We know architecture and training procedures matter in practice.
MLPs and transformers are ultimately theoretically equivalent. That means there is an MLP that can represent any function a given transformer can. However, that MLP is hard to identify and train.
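A rough PyTorch sketch of the "the backprop math is no different" point (illustrative only, not anyone's actual training code): the same autograd loop trains either architecture, and all the difference lives in the module definition.

    import torch, torch.nn as nn

    mlp = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
    xfmr = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2)

    def train_step(model, x, y, opt):
        # Identical backprop machinery for both models; only the module graph differs.
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()          # same chain rule regardless of architecture
        opt.step()
        return loss.item()

    x = torch.randn(8, 16, 64)   # (batch, sequence, features)
    y = torch.randn(8, 16, 64)
    for model in (mlp, xfmr):
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        print(train_step(model, x, y, opt))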
There’s always a “significant possibility” that something unprecedented will turn out to be infeasible with any particular approach. How could it be otherwise? Smart people have incorrectly believed we were on the precipice of AGI many times in the 80 years that artificial neural networks have been part of the AI toolbox.
No, there's really no comparing the barely nonlinear algebra that makes up transformers and the tangled mess that is human neurons. The name is an artifact and a useful bit of salesmanship.
Sure, it's a model. But don't we think neural networks and human brains are primarily about their connectedness and feedback mechanisms though?
(I did AI and Psychology at degree level, I understand there are definitely also big differences too, like hormones and biological neurones being very async)
You could maybe make a case for CNNs, but the fact that they're feed-forward rather than feedback means they're fundamentally representing a different object (CNN is a function, whereas the visual system is a feedback network).
Transformers, while not exactly functions, don't have a feedback mechanism similar to e.g. the cortical algorithm or any other neuronal structure I'm aware of. In general, the ML field is less concerned with replicating neural mechanisms than following the objective gradient.
As far as I understand it, there's a standing hypothesis that cortical columns have a similar structure that is designed to learn arbitrary patterns via predictive coding, and that a lot of human plasticity arises from the interaction and flexibility of these columns.
Personally I think the kinds of minds we create in silico will end up being very different, because the advantages and disadvantages of the medium are just very different; for example, having a much stronger central processor and much weaker distributed memory, along with specialized precise circuits in addition to probabilistic ones.
Physically, sure. But 1) feedback (more synapses/backprop) and 2) connectedness (huge complex graphs) of both produce very similar intelligent (or "pseudo-intelligent" if you like) emergent properties. I'm pretty sure 5 years ago nobody would have believed ANNs could produce something as powerful as ChatGPT.
It seems to be intrinsically related. The argument goes something like:
1. Humans have general intelligence.
2. Human brains use biological neurons.
3. Human biological neurons give rise to human general intelligence.
4. Artificial neural networks (ANNs) are similar to human brains.
5. Therefore an ANN could give rise to artificial general intelligence.
Many people are objecting to #4 here. However in writing this out, I think #3 is suspect as well: many animals who do not have general intelligence have biologically identical neurons, and although they have clear structural differences with humans, we don’t know how that leads to general intelligence.
We could also criticize #1 as well, since human brains are pretty bad at certain things like memorization or calculation. Therefore if we built an ANN with only human capabilities it should also have those weaknesses.
For any technology we haven’t achieved yet there’s some probability we never achieve it (say, at least in the next 100 years). Why would AI be different?
The theoretical foundation was built slowly over decades before it started, though. And correct me if I'm wrong, but calculations showing it was feasible existed before the start too. They still had to work out how to do it, what the processes would be, how to construct it and so on, but scientists knew theoretically that this amount of material could start such a process.
On the other hand, not only is there no clear path to AI today (also known as AGI, ASI, SI, etc.), but even the foundations are largely missing. We are still debating what intelligence is, how it works, and how to even start simulating it or constructing it from scratch.
What do you think AI is? On that one page there's simulated annealing with a logarithmic cooling schedule, Hutter search, and Solomonoff induction, all very much applicable to AI. If you want a fully complete galactic algorithm for AI, look up AIXItl.
Edit: actually I'm not sure if AIXItl is technically galactic or just terribly inefficient, but there's been trouble making it faster and more compact.
The theoretical foundation of transformers is well understood; they're able to approximate a very wide family of functions, particularly with chain of thought ( https://arxiv.org/abs/2310.07923 ). Training them on next-token-prediction is essentially training them to compress, and more optimal compression requires a more accurate model of the world, so they're being trained to model the world better and better. However you want to define intelligence, for practical purposes models with better and better models of the world are more and more useful.
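A small sketch of that prediction-compression link, with made-up numbers rather than a real model: the average negative log2-probability a model assigns to each actual next token is (essentially) the number of bits per token an arithmetic coder driven by that model would need, so lower loss literally means better compression.

    import math

    # Toy "model": probabilities it assigned to the actual next token at each step.
    p_next = [0.50, 0.25, 0.80, 0.10, 0.60]          # hypothetical values

    bits = [-math.log2(p) for p in p_next]           # code length per token under arithmetic coding
    bits_per_token = sum(bits) / len(bits)
    print(f"{bits_per_token:.2f} bits/token")        # ~1.48 bits/token here

    # Cross-entropy loss in nats (what training reports) is the same quantity, rescaled:
    nats_per_token = sum(-math.log(p) for p in p_next) / len(p_next)
    print(f"{nats_per_token / math.log(2):.2f} bits/token (from nats)")   # identical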
The disagreement here seems merely to be about what we mean by “AGI”. I think there’s reasons to think current approaches will not achieve it, but also reason to think they will.
In any case anyone who is completely sure that we can/can’t achieve AGI is delusional.
this is not evidence in favor of your position. We could use this to argue in favor of anything such as “humans will eventually develop time travel” or “we will have cost effective fusion power”.
The fact is many things we’ve tried to develop for decades still don’t exist. Nothing is guaranteed
I'd put decent odds on a $1B research project developing time travel if time travel were an ability that every human child was innately born with. It's never easy to recreate what biology has done, but nature providing an "existence proof" goes a long way towards removing doubt about it being fundamentally possible.
Unless you have any evidence suggesting that one or more of the variations of the Church-Turing thesis is false, this is closer to a statement of faith than science.
Basically, unless you can show humans calculating a non-Turing computable function, the notion that intelligence requires a biological system is an absolutely extraordinary claim.
If you were to argue about consciousness or subjective experience or something equally woolly, you might have a stronger point, and this does not at all suggest that current-architecture LLMs will necessarily achieve it.
There's a big difference between "this project is like time travel or cold fusion; it's doubtful whether the laws of physics even permit it" and "this project is like heavier-than-air flight; we know birds do it somehow, but there's no way our crude metal machines will ever match them". I'm confident which of those problems will get solved given, say, a hundred years or so, once people roll up their sleeves and get working on it.
"Biological activity" is just computation with different energy requirements. If science rules the universe we're complex automata, and biologic machines or non-biological machines are just different combinations of atoms that are computing around.
Humans are an existence proof of human-level intelligence. There are only two fundamental possibilities why this could not be replicated in silicon:
1. There is a chemical-level nature to intelligence which prevents other elements like silicon from being used as a substrate for intelligence
2. There is a non material aspect to intelligence that cannot be replicated except by humans
To my knowledge, there is no scientific evidence that either are true and there is already a large body of evidence that implies that intelligence happens at a higher level of abstraction than the individual chemical reactions of synapses, ie. the neural network, which does not rely on the existence of any specific chemicals in the system except in as much as they perform certain functions that seemingly could be performed by other materials. If anything, this is more like speculating that there is a way to create energy from sunlight using plants as an existence proof of the possibility of doing so. More specifically, this is a bet that an existing physical phenomenon can be replicated using a different substrate.
> Closest thing we have to a Manhattan Project in the modern era?
No. The Manhattan Project started after we understood the basic mechanism of runaway fission reactions. The funding was mostly spent purifying uranium.
AGI would be similar if we understood the mechanism of creating general intelligence and just needed to scale it up. But there are fundamental questions we still aren't close to answering for AGI.
A more apt comparison today is probably something like fusion reactors although progress has been slow there too. We know how fusion works in theory. We have done it before (thermonuclear weapons). There are sub-problems we need to solve, but people are working on them. For AGI we don’t even know what the sub-problems are yet.
A non-cynical take is that Ilya wanted to do research without the pressure of having to release a marketable product and figuring out how to monetize their technology, which is why he left OpenAI.
A very cynical take is that this is an extreme version of 'we plan to spend all money on growth and figure out monetization later' model that many social media companies with a burn rate of billions of $$, but no business model, have used.
He was on the record that their first product will be a safe superintelligence and it won’t do anything else until then, which sounds like they won’t have paid customers until they can figure out how to build a superintelligent model. That’s certainly a lofty goal and a very long term play.
He didn’t promise world peace nor did he claim his work belongs to the humanity. The company is still a for profit corporation.
He is saying he will try to build something head and shoulders above anything else, and he got a billion dollars to do it with no expectation of revenue until his product is ready. The likelihood that he fails is very high, but his backers are willing to bet on that.
They can dilute the term to whatever they want. I think when the pressure to release becomes too high, they can just stick a patch of "Superintelligence™" on their latest LLM and release it.
There's billions (with a B), probably closing in on Trillions, riding on the AI hypewave. This forum is full of VCs and VC-adjacent people who have vested interests in AI companies blowing up and being successful, regardless of if it's actually useful.
If you check the 2024 YC batch, you'll notice pretty much every single one of them mentions AI in some form or another. I guarantee you the large majority of them are just looking to be bought out by some megacorp, because it's free money right now.
You are not alone. This is a litmus test many people have been contemplating for a long time now, mostly philosophers, which is not surprising since it is a philosophical question. Most of the heavy stuff is hidden behind paywalls, but here's a nice summary of the state of the art by two CS guys: https://arxiv.org/pdf/2212.06721
Could be more comparable to Clubhouse, which VCs quickly piled $100m into[1a], and which Clubhouse notably turned into layoffs [1b]. In this case, the $1b in funding and high valuation might function predominantly as a deterrent to any flippers (in contrast, many Clubhouse investors got quick gains).
Moreover, the majority of the capital likely goes into GPU hardware and/or opex, which VCs have currently arbitraged themselves [3], so to some extent this is VCs literally paying themselves to pay off their own hardware bet.
While hints of the ambition of the Manhattan project might be there, the economics really are not.
"Superintelligence" is slippery: even when OpenAI took investment from Microsoft, OpenAI didn't have to share its "AGI" model with them, and it is up to OpenAI to define what that is, and who the heck knows how they will define it. The point is that the phrase is the most ambiguous term in tech right now, and almost everyone thinks what's in their head is AGI: some will think Skynet, some will think enough reasoning ability, some will think god-like undecipherable logic, and everything in between.
No. It's the next Magic Leap of our era. Or the next Juicero of our era. Or the next any of the hundreds of unprofitable startups losing billions of dollars a year without any business plan beyond VC subsidies and a hope for an exit of our era.
Literally everyone from OpenAI lied 100% about everything of substance.
Sutskever lied about the "world model" inside of LLMs, which is such a despicable lie, because he knows that the "latent space" is a TOTAL MESS. Proven every time anyone has looked at it.
Shameless grifters. When end?
It helps if you think of the investors as customers and the business model as making them think they're cool. Same model Uber used for self driving car research.
SSI Inc should probably be a public benefit company if they're going to talk like that though.
Yep, investment is an inevitably corrupting force for a company's mission. AI stuff is in a bit of a catch-22 though since doing anything AI related is so expensive you need to raise funds somehow.
Seems strange to associate profit motives with being unsafe. Yes, cutting corners can lead to short-term profits, but many companies do make safety a priority and still turn a profit, and in fact profit precisely because their product is higher quality and safer than competitors'.
Profit and lack of profit is one of the major killing forces in America today. The two are intertwined and AI mixing with that is incredibly dangerous. Like the damage AI does to artists today does not make people feel 'safe'. We just want Food and Utopia :( (ai please! think bigger! put your attention on Earth!)
Everyone here is assuming that a very large LLM is their goal. 5 years ago, transformer models were not the biggest hype in AI. Since they apparently have a 10 year plan, we can assume they are hoping to invent one or two of the "big steps" (on the order of invention of transformer models). "SSI" might look nothing like GPT\d.
CFOs here: let's say I raise a round like that. What do you do with $1B in cash to manage it in the near term? Is it just parked in money market funds or T-bills or what? Even if the growth of the company is the main bet, that cash has to sit in something with a better return than plain cash.
I assume that service is what Silicon Valley Bank provided before it tanked, but someone has to manage that cash for the few years it takes to burn through it. What kind of vehicle do you park it in?
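Back-of-the-envelope, with an assumed short-term rate and a hypothetical burn rate (illustrative numbers only, not advice): even parked in T-bills, the float on $1B is material relative to burn.

    principal = 1_000_000_000
    tbill_yield = 0.05            # assumed short-term rate, illustrative only
    annual_burn = 250_000_000     # hypothetical burn rate

    interest = principal * tbill_yield
    print(f"~${interest/1e6:.0f}M/yr of interest")          # ~$50M/yr at these assumptions
    print(f"covers ~{interest/annual_burn:.0%} of burn")    # ~20% of a $250M burn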
Sometimes, these very large rounds are delivered in tranches based on milestones. It's possible that SSI didn't receive the entire $1BN at the close of the fundraise but rather can "call capital," much like a VC fund does, as it needs it based on scaling.
All money is green, regardless of level of sophistication. If you’re using investment firm pedigree as signal, gonna have a bad time. They’re all just throwin’ darts under the guise of skill (actor/observer|outcome bias; when you win, it is skill; when you lose, it was luck, broadly speaking).
@jgalt212: Indeed, one should be sophisticated themselves when negotiating investment to not be unduly encumbered by shades of the unsophisticated or potentially folks not optimizing for aligned interests. But let us not get too far off topic and risk subthread detachment. Feel free to cut a new thread for further discussion on the subject.
Considering that Sam Bankman-Fried raised more money at a higher multiple for a company trading magic tokens and grand ideas, such as that maybe one day you will be able to buy a banana with them, I don't think Ilya impressed the investors too much.
On a serious note, I would love to bet on him at this valuation. I think many others would as well. I guess if he wanted more money he would easily get it, but he probably values a small circle of easy-to-live-with investors instead.
FTX was incredibly profitable, and their main competitor Binance is today a money printing machine. FTX failed because of fraud and embezzlement, not because their core business was failing.
FTX itself was profitable, but that's because Alameda Research was selling dollars for 80 cents, and all the other traders were paying FTX fees to rip off Alameda. Unfortunately, Alameda was running on FTX customer money.
>Safe Superintelligence (SSI), newly co-founded by OpenAI's former chief scientist Ilya Sutskever, has raised $1 billion in cash to help develop safe artificial intelligence systems that far surpass human capabilities, company executives told Reuters.
>SSI says it plans to partner with cloud providers and chip companies to fund its computing power needs but hasn't yet decided which firms it will work with.
1bn in cash is crazy.... usually they get cloud compute credits (which they count as funding)
I don't understand how "safe" AI can raise that much money. If anything, they will have to spend double the time on red-teaming before releasing anything commercially. "Unsafe" AI seems much more profitable.
Unsafe AI would cause human extinction which is bad for shareholders because shareholders are human persons and/or corporations beneficially owned by humans.
Related to this, DAOs (decentralized autonomous organizations, which do not have human shareholders) are intrinsically dangerous, because they can pursue their fiduciary purpose even if it involves causing all humans to die. E.g., if the machine faction in The Matrix were to exist within the framework of US laws, it would probably be a DAO.
There's no legal structure that has that level of fiduciary duty to anything. Corporations don't even really have fiduciary duty to their shareholders, and no CEO thinks they do.
The idea behind "corporations should only focus on returns to shareholders" is that if you let them do anything else, CEOs will just set whatever targets they want, and it makes it harder to judge if they're doing the right thing or if they're even good at it. It's basically reducing corporate power in that sense.
> E.g., if the machine faction in The Matrix were to exist within the framework of US laws, it would probably be a DAO.
That'd have to be a corporation with a human lawyer as the owner or something. No such legal concept as a DAO that I'm aware of.
Safe super-intelligence will likely be as safe as OpenAI is open.
We can’t build critical software without huge security holes and bugs (see CrowdStrike), but we think we will be able to contain something smarter than us? It would only take one vulnerability.
You are not wrong. But the CrowdStrike comparison doesn't quite fit "IT": they should never have had direct kernel access, and MS set themselves up for that one. SSI, or whatever the hype turns out to be, would be very difficult to beat, unless you shut down the power. It could develop guardrails instantly, so any flaw you come up with would be instantly patched. Ofc this is just my take.
We don’t know the counter factual here… maybe if he called it “Unsafe Superintelligence Inc” they would have raised 5x! (though I have doubts about that)
"Safe" means "aligned with the people controlling it". A powerful superhuman AI that blindly obeys would be incredibly valuable to any wannabe authoritarian or despot.
I mean, no, that's not what it means. It might be what we get, but not because "safety" is defined insanely, only because safety is extremely difficult and might be impossible.
All that money, and we are not even sure we can build AGI. What even is AGI? Clearly scaling LLMs won't cut it, but VCs keep funding people because they pretend they can build superintelligence. I don't see that happening in the next 5 years: https://medium.com/@fsndzomga/there-will-be-no-agi-d9be9af44...
The question isn’t whether scaling will improve AI. The question is whether the return is worth it. You can build a bigger pogo stick to jump higher, but no pogo stick will get you to the moon.
Chess is a pretty good example. You could theoretically train an LLM on just chess games. The problem is there are more possible chess games than atoms in the universe, so you can't actually do it in practice. And chess is a much more constrained environment than life. At any chess position there are only ~35 legal moves on average. Life has tons of long-tail situations which have never been seen before.
And for chess we already have superhuman intelligence. It doesn’t require trillion-dollar training clusters, you can run a superhuman chess bot on your phone. So there are clear questions of optimality as well: VC money should be aware of the opportunity cost in investing money under “infinite scaling” assumptions.
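The back-of-the-envelope arithmetic behind that claim (Shannon's classic estimate, rough numbers):

    import math

    branching_factor = 35          # average legal moves per position
    game_length = 80               # plies (40 moves per side) in a typical game
    games = branching_factor ** game_length
    atoms_in_universe = 10 ** 80   # common order-of-magnitude estimate

    print(f"~10^{int(math.log10(games))} possible games")   # ~10^123
    print(games > atoms_in_universe)                        # True, by dozens of orders of magnitude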
This has to be one of the quickest valuations past a billion. I wonder if they can even effectively make use of the funds in a reasonable enough timeline.
> I wonder if they can even effectively make use of the funds in a reasonable enough timeline.
I read that it cost Google ~$190 million to train Gemini, not even including staff salaries. So once you add salaries and overhead, it feels like a billion gives you about 3 comparable "from scratch" training runs.
Your estimate seems way off given Google already had their own compute hardware and staff. And if this company is going straight for AGI there's no way $1 billion is enough.
I'm beginning to wonder if these investors are not just pumping AI because they are personally invested in Nvidia, and this is a nice way to directly inject a couple hundred million into Nvidia's cash flow.
Most likely it's some junior rep assigned to Sutskever's company after Ilya filled up an online "Contact Us for Pricing" form on the Nvidia website. /s
Lots of comments either defending this ("it's taking a chance on being the first to build AGI with a proven team") or saying "it's a crazy valuation for a 3 month old startup". But both of these "sides" feel like they miss the mark to me.
On one hand, I think it's great that investors are willing to throw big chunks of money at hard (or at least expensive) problems. I'm pretty sure all the investors putting money in will do just fine even if their investment goes to zero, so this feels exactly what VC funding should be doing, rather than some other common "how can we get people more digitally addicted to sell ads?" play.
On the other hand, I'm kind of baffled that we're still talking about "AGI" in the context of LLMs. While I find LLMs to be amazing, and an incredibly useful tool (if used with a good understanding of their flaws), the more I use them, the more that it becomes clear to me that they're not going to get us anywhere close to "general intelligence". That is, the more I have to work around hallucinations, the more that it becomes clear that LLMs really are just "fancy autocomplete", even if it's really really fancy autocomplete. I see lots of errors that make sense if you understand an LLM is just a statistical model of word/token frequency, but you would expect to never see these kinds of errors in a system that had a true understanding of underlying concepts. And while I'm not in the field so I may have no right to comment, there are leaders in the field, like LeCun, who have expressed basically the same idea.
So my question is, has Sutskever et al provided any acknowledgement of how they intend to "cross the chasm" from where we are now with LLMs to a model of understanding, or has it been mainly "look what we did before, you should take a chance on us to make discontinuous breakthroughs in the future"?
Thank you very much for posting! This is exactly what I was looking for.
On one hand, I understand what he's saying, and that's why I have been frustrated in the past when I've heard people say "it's just fancy autocomplete" without emphasizing the awesome capabilities that can give you. While I haven't seen this video by Sutskever before, I have seen a very similar argument by Hinton: in order to get really good at next token prediction, the model needs to "discover" the underlying rules that make that prediction possible.
All that said, I find his argument wholly unconvincing (and again, I may be waaaaay stupider than Sutskever, but there are other people much smarter than I who agree). And the reason for this is because every now and then I'll see a particular type of hallucination where it's pretty obvious that the LLM is confusing similar token strings even when their underlying meaning is very different. That is, the underlying "pattern matching" of LLMs becomes apparent in these situations.
As I said originally, I'm really glad VCs are pouring money into this, but I'd easily make a bet that in 5 years that LLMs will be nowhere near human-level intelligence on some tasks, especially where novel discovery is required.
Watching that video actually makes me completely unconvinced that SSI will succeed if they are hinging it on LLM...
He puts a lot of emphasis on the fact that 'to generate the next token you must understand how', when that's precisely the parlor trick that is making people lose their minds (myself included) with how effective current LLMs are. The fact that it can simulate some low-fidelity reality with _no higher-level understanding of the world_, using purely linguistic/statistical analysis, is mind-blowing. To say "all you have to do is then extrapolate" is the ultimate "draw the rest of the owl" argument.
I actually echo your exact sentiments. I don't have the street cred but watching him talk for the first few minutes I immediately felt like there is just no way we are going to get AGI with what we know today.
Without some raw reasoning capacity (maybe neuro-symbolic is the answer, maybe not), LLMs won't be enough. Reasoning is super tough because it's not as easy as predicting the next most likely token.
>All that said, I find his argument wholly unconvincing (and again, I may be waaaaay stupider than Sutskever, but there are other people much smarter than I who agree). And the reason for this is because every now and then I'll see a particular type of hallucination where it's pretty obvious that the LLM is confusing similar token strings even when their underlying meaning is very different. That is, the underlying "pattern matching" of LLMs becomes apparent in these situations.
So? One of the most frustrating parts of these discussions is that for some bizarre reason, a lot of people have a standard of reasoning (for machines) that only exists in fiction or their own imaginations.
Humans have a long list of cognitive shortcomings. We find them interesting and give them all sorts of names like cognitive dissonance or optical illusions. But we don't currently make silly conclusions like humans don't reason.
A general reasoning engine that makes no mistakes, contradictions, or confusions in its output or process does not exist in real life, whether you believe humans are the only intelligent species on the planet or are gracious enough to extend the capability to some of our animal friends.
So the LLM confuses tokens every now and then. So what ?
> Humans have a long list of cognitive shortcomings. We find them interesting and give them all sorts of names like cognitive dissonance or optical illusions. But we don't currently make silly conclusions like humans don't reason.
Exactly! In fact, things like illusions are actually excellent windows into how the mind really works. Most visual illusions are a fundamental artifact of how the brain needs to turn a 2D image into a 3D, real-world model, and illusions give clues into how it does that, and how the contours of the natural world guided the evolution of the visual system (I think Steven Pinker's "How the Mind Works" gives excellent examples of this).
So I am not at all saying that what LLMs do isn't extremely interesting, or useful. What I am saying is that the types of errors you get give a window into how an LLM works, and these hint at some fundamental limitations at what an LLM is capable of, particularly around novel discovery and development of new ideas and theories that aren't just "rearrangements" of existing ideas.
>So I am not at all saying that what LLMs do isn't extremely interesting, or useful. What I am saying is that the types of errors you get give a window into how an LLM works, and these hint at some fundamental limitations at what an LLM is capable of, particularly around novel discovery and development of new ideas and theories that aren't just "rearrangements" of existing ideas.
ANN architectures are not like brains. They don't come pre-baked with all sorts of evolutionary steps and tweaking. They're far more blank slate and the transformer is one of the most blank slate there is.
At best, maybe some failure mode in GPT-N gives insight into how some concept is understood by GPT-N. It rarely says anything about language modelling or transformers in general.
GPT-2 had some wildly different failure modes than 3, which itself has some wildly different failure modes to 4.
All a transformer's training objective asks it to do is spit out a token. How it should do so is left for the transformer to figure out along the way, and everything is fair game.
And confusing words with wildly different meanings but with some similarity in some other way is something that happens to humans as well. Transformers don't see words or letters(but tokens). So just because it doesn't seem to you like two tokens should be confused doesn't mean there isn't a valid point of confusion there.
They might never work for novel discovery but that probably can be handled by outside loop or online (in-context) learning. The thing is that 100k or 1M context is a marketing scam for now.
To clarify this, I think it's reasonable that token prediction as a training objective could lead to AGI given the underlying model has the correct architecture. The question really is if the underlying architecture is good enough to capitalize on the training objective so as to result in superhuman intelligence.
For example, you'll have little luck achieving AGI with decision trees no matter what's their training objective.
He doesn't address the real question of how an LLM predicting the next token could exceed what humans have done. They mostly interpolate, so if the answer isn't to be found in an interpolation, the LLM can't generate something new.
The argument about AGI from LLMs is not based on the current state of LLMs, but on the rate of progress over the last 5+ years or so. It wasn't very long ago that almost nobody outside of a few niche circles seriously thought LLMs could do what they do right now.
That said, my personal hypothesis is that AGI will emerge from video generation models rather than text generation models. A model that takes an arbitrary real-time video input feed and must predict the next, say, 60 seconds of video would have to have a deep understanding of the universe, humanity, language, culture, physics, humor, laughter, problem solving, etc. This pushes the fidelity of both input and output far beyond anything that can be expressed in text, but also creates extraordinarily high computational barriers.
> The argument about AGI from LLMs is not based on the current state of LLMs, but on the rate of progress over the last 5+ years or so.
And what I'm saying is that I find that argument to be incredibly weak. I've seen it time and time again, and honestly at this point it just feels like a "humans should be a hundred feet tall based on their rate of change in their early years" argument.
While I've also been amazed at the past progress in LLMs, I don't see any reason to expect that rate will continue in the future. What I do see the more and more I use the SOTA models is fundamental limitations in what LLMs are capable of.
Expecting the rate of progress to drop off so abruptly after realistically just a few years of serious work on the problem seems like the more unreasonable and grander prediction to me than expecting it to continue at its current pace for even just 5 more years.
The problem is that the rate of progress over the past 5/10/15 years has not been linear at all, and it's been pretty easy to point out specific inflection points that have allowed that progress to occur.
I.e. the real breakthrough that allowed such rapid progress was transformers in 2017. Since that time, the vast majority of the progress has simply been to throw more data at the problem, and to make the models bigger (and to emphasize, transformers really made that scale possible in the first place). I don't mean to denigrate this approach - if anything, OpenAI deserves tons of praise for really making that bet that spending hundreds of millions on model training would give discontinuous results.
However, there are loads of reasons to believe that "more scale" is going to give diminishing returns, and a lot of very smart people in the field have been making this argument (at least quietly). Even more specifically, there are good reasons to believe that more scale is not going to go anywhere close to solving the types of problems that have become evident in LLMs since when they have had massive scale.
So the big thing I'm questioning is that I see a sizable subset of both AI researchers (and more importantly VC types) believing that, essentially, more scale will lead to AGI. I think the smart money believes that there is something fundamentally different about how humans approach intelligence (and this difference leads to important capabilities that aren't possible from LLMs).
Could it be argued that transformers are only possible because of Moore's law and the amount of processing power that could do these computations in a reasonable time? How complex is the transformer network really, every lay explanation I've seen basically says it is about a kind of parallelized access to the input string. Which sounds like a hardware problem, because the algorithmic advances still need to run on reasonable hardware.
Transformers in 2017 as the basis, but then the quantization-emergence link as a grad student project using spare time on ridiculously large A100 clusters in 2021/2022 is what finally brought about this present moment.
I feel it is fair to say that neither of these were natural extrapolations from prior successful models directly. There is no indication we are anywhere near another nonlinearity, if we even knew how to look for that.
Blind faith in extrapolation is a finance regime, not an engineering regime. Engineers encounter nonlinearities regularly. Financiers are used to compound interest.
I don’t see why it’s unreasonable. Training a model that is an order of magnitude bigger requires (at least) an order of magnitude more data, an order of magnitude more time, hardware, energy, and money.
Getting an order of magnitude more data isn’t easy anymore. From GPT2 to 3 we (only) had to scale up to the internet. Now? You can look at other sources like video and audio, but those are inherently more expensive. So your data acquisition costs aren’t linear anymore, they’re something like 50x or 100x. Your quality will also dip because most speech (for example) isn’t high-quality prose, it contains lots of fillers, rambling, and transcription inaccuracies.
And this still doesn’t fix fundamental long-tail issues. If you have a concept that the model needs to see 10x to understand, you might think scaling your data 10x will fix it. But your data might not contain that concept 10x if it’s rare. It might contain 9 other one-time things. So your model won’t learn it.
10 years of progress is a flash in the pan of human progress. The first deep learning models that worked appeared in 2012. That was like yesterday. You are completely underestimating the rate of change we are witnessing. Compute scaling is not at all similar to biological scaling.
If it's true that "predict the next word" can be turned into "predict the next pixel", and that you could run a zillion hours of video feed into that, I agree. It seems that the basic algorithm is there. Video is much less information-dense than text, but if the scale of compute can reach the tens of billions of dollars, or more, you have to expect that AGI is achievable. I think we will see it in our lifetimes. It's probably 5 years away.
I feel like that's already been demonstrated with the first-generation video generation models we're seeing. Early research already shows video generation models can become world simulators. There frankly just isn't enough compute yet to train models large enough to do this for all general phenomena and then make it available to general users. It's also unclear if we have enough training data.
Video is not necessarily less information dense than text, because when considered in its entirety it contains text and language generation as special cases. Video generation includes predicting continuations of complex verbal human conversations as well as continuations of videos of text exchanges, someone flipping through notes or a book, someone taking a university exam through their perspective, etc.
> but it could be argued that human intelligence is also just 'fancy autocomplete'.
But that's my point - in some ways it's obvious that humans are not just doing "fancy autocomplete" because humans generally don't make the types of hallucination errors that LLMs make. That is, the hallucination errors do make sense if you think of how an LLM is just a statistical relationship between tokens.
One thing to emphasize, I'm not saying the "understanding" that humans seem to possess isn't just some lower level statistical process - I'm not "invoking a soul". But I am saying it appears to be fundamentally different, and in many cases more useful, than what an LLM can do.
> because humans generally don't make the types of hallucination errors that LLMs make.
They do though - I've noticed myself and others saying things in conversation that sound kind of right, and are based on correct things they've learned previously, but because memory of those things is only partial and mixed with other related information things are often said that are quite incorrect or combine two topics in a way that doesn't make sense.
> but it could be argued that human intelligence is also just 'fancy autocomplete'.
Well, no. Humans do not think sequentially. But even if we were to put that aside, any "autocomplete" we perform is based on a world model, and not tokens in a string.
1. If it’s really amazing autocomplete, is there a distinction between that and AGI?
Being able to generalize, plan, execute, evaluate and learn from the results could all be seen as a search graph building on inference from known or imagined data points. So far LLMs are being used on all of those and we haven’t even tested the next level of compute power being built to enable its evolution.
2. Fancy autocomplete is a bit broad for the comprehensive use cases CUDA is already supporting that go way beyond textual prediction.
If all information of every type can be “autocompleted” that’s a pretty incredible leap for robotics.
* edited to compensate for iPhone autocomplete, the irony.
> On the other hand, I'm kind of baffled that we're still talking about "AGI" in the context of LLMs.
I'm not. Lots of people and companies have been sinking money into these ventures and they need to keep the hype alive by framing this as being some sort of race to AGI. I am aware that the older I get the more cynical I become, but I bucket all discussions about AGI (including the very popular 'open letters' about AI safety and Skynet) in the context of LLMs into the 'snake oil' bucket.
>"We’ve identified a new mountain to climb that’s a bit different from what I was working on previously. We’re not trying to go down the same path faster. If you do something different, then it becomes possible for you to do something special."
I think the plan is to raise a lot of cash and then more and then maybe something comes up that brings us closer to AGI(i.e something better than LLM).
The investors know that AGI is not really the goal but they can’t miss the next trillion dollar company.
Ilya went to university in Israel and all the founders are Jewish. Many labs have offices outside of the US, like London, due to crazy immigration law in the US.
There are actually a ton of reasons to like London. The engineering talent is close to Bay Area level for fintech/security systems engineers while being 60% of the price, it has 186% deductions with cash back instead of carry-forward for R&D spending, it has the best AI researchers in the world, and profit from patents is only taxed at 10% in the UK.
If we say that half of innovations came from Alphabet/Google, then most of them (transformers, LLMs, tensorflow) came from Google Research and not Deep Mind.
Many companies have offices outside because of talent pools, costs, and other regional advantages. Though I am sure some of it is due to immigration law, I don't believe that is generally the main factor. Plus the same could be said for most other countries.
Part of it may also be a way to mitigate potential regulatory risk. Israel thus far does not have an equivalent to something like SB1047 (the closest they've come is participation in the Council of Europe AI treaty negotiations), and SSI will be well-positioned to lobby against intrusive regulation domestically in Israel.
Israel is geographically pretty small though -- I'm guessing you could live an hour up or down the coast and have it be an outrageous commute for people accustomed to the Bay Area?
Why not? The Bay isn't the only place with talent. Many of the big tech powerhouse companies already have offices there. There's also many Israeli nationals working the US that may find moving back closer to family a massive advantage.
Is it as open to outsiders as the Bay is? I’m Asian for example and it seems the society there is far more homogenous than in the Bay. I have no idea so I’m curious.
“…a straight shot to safe superintelligence and in particular to spend a couple of years doing R&D on our product before bringing it to market," Gross said in an interview.”
well since it's no longer ok to just suck up anyone's data and train your AI, it will be a new challenge for them to avoid that pitfall. I can imagine it will take some time...
I believe the commenter is concerned about how _short_ this timeline is. Superintelligence in a couple years? Like, the thing that can put nearly any person at a desk out of a job? My instinct with unicorns like this is to say 'actually it'll be five years and it won't even work', but Ilya has a track record worth believing in.
Nobody even knew what OpenAI was up to when they were gathering training data - they got away with a lot. Now there is precedent and people are paying more attention. Data that was previously free/open now has a clause that it can't be used for AI training. OpenAI didn't have to deal with any of that.
Also, OpenAI used cheap labor in Africa to tag training data, which was also controversial. If someone did it now, they'd be the ones to pay. OpenAI can always say "we stopped", like Nike did with sweatshops.
There are at least 3 companies with staff in developed countries, paid well above minimum wage, doing tagging and creation of training data, and at least one of them (that I have an NDA with) pays at least some of their staff tech-contractor rates for data in some niches, and even then some of the data gets processed by 5+ people before it's returned to the client. Since I have ended up talking to 3, and I'm hardly well connected in that space, I can only presume there are many more.
Companies are willing to pay a lot for clean training data, and my bet is there will be a growing pile of training sets for sale on a non-exclusive basis as well.
A lot of this data - what I've seen anyway, is far cleaner than anything you'll find on the open web, with significant data on human preferences, validation, cited sources, and in the case of e.g. coding with verification that the code runs and works correctly.
> A lot of this data - what I've seen anyway, is far cleaner than anything you'll find on the open web, with significant data on human preferences, validation, cited sources, and in the case of e.g. coding with verification that the code runs and works correctly.
Very interesting, thanks for sharing that detail. As someone who has tinkered with tokenizing/training I quickly found out this must be the case. Some people on HN don't know this. I've argued here with otherwise smart people who think there is no data preprocessing for LLMs, that they don't need it because "vectors", failing to realize the semantic depth and quality of embeddings depends on the quality of training data.
I think we should distinguish between pretraining and polishing/alignment data. What you are describing is most likely the latter (and probably mixed into pretraining). But if you can't get a mass of tokens from scraping, you're going to be screwed.
A lot of APIs changed in response to OpenAI hoovering up data. Reddit's a big one that comes to mind. I'd argue that the last two years have seen the biggest change in the openness of the internet.
It’s made Reddit unusable without an account, which makes me wonder why it’s even on the web anymore and not an app. I guess legacy users that only use a web browser.
It did not. Also VPNs were usable with the site, now I believe even logged in you can’t use them. I don’t know at this point, I no longer use Reddit at all.
A possibility is that they are betting that the current generation of LLM is converging, so they won't worry about the goalpost much. If it's true, then it won't be good news for OpenAI.
To be honest, from the way "safe" and "alignment" are perceived on r/LocalLLaMA, in two years it's not going to be very appealing.
We'll be able to match most of ChatGPT-4o's capabilities locally on affordable hardware, including "unsafe" and "unaligned" data, as quantization noise is drastically reduced, meaning smaller quantized models that can run on good-enough hardware.
We'll see a huge reduction in price and inference times within two years, and whatever SSI is trained on won't be economically viable enough to recoup that $1B investment, guaranteed.
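For what it's worth, here's a toy sketch of the quantization trade-off being described, assuming the "noise" in question is round-trip quantization error (random weights, not any particular model): you store int8 weights at a quarter of the fp32 size and pay a small reconstruction error.

    import numpy as np

    # Toy symmetric int8 quantization of a weight matrix.
    w = np.random.randn(256, 256).astype(np.float32)
    scale = np.abs(w).max() / 127
    w_q = np.round(w / scale).astype(np.int8)      # what actually gets stored (4x smaller than fp32)
    w_hat = w_q.astype(np.float32) * scale         # what the model computes with

    err = np.abs(w - w_hat).mean()
    print(f"mean quantization error: {err:.5f} (scale={scale:.5f})")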
All depends on GPT-5's performance. Right now Sonnet 3.5 is the best, but there's nothing really groundbreaking. SSI's success will depend on how much uplift it can provide over GPT-5, which already isn't expected to be a significant leap beyond GPT-4.
The conventional teaching that I am aware of says that you can scale across three dimensions: data, compute, parameters. But Ilya's formulation suggests that there may be more dimensions along which scaling is possible.
That's not how I read it. The scaling may still be those parameters, but the object (the "what" that is subjected to scaling) may need to retain some characteristics as it scales.
In other words, there may be a need to retain some sorts of symmetries or constraints from generation to generation that others understand less well than he does (or so he thinks).
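For reference, the conventional "scale data and parameters" picture is usually written as a loss curve like the Chinchilla fit sketched below (constants are the approximate published values from Hoffmann et al. 2022, shown only to illustrate the known scaling dimensions, not whatever extra dimension Ilya may have in mind):

    # Chinchilla-style loss fit: L(N, D) = E + A / N**alpha + B / D**beta
    # N = parameters, D = training tokens. Constants are the approximate published fit;
    # treat them as illustrative only.
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(N, D):
        return E + A / N**alpha + B / D**beta

    # Doubling parameters vs doubling data from a 70B-param / 1.4T-token baseline:
    base = loss(70e9, 1.4e12)
    print(round(loss(140e9, 1.4e12) - base, 4))   # loss change from more parameters
    print(round(loss(70e9, 2.8e12) - base, 4))    # loss change from more data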
Given OpenAI’s declining performance after his being sidelined and then departing, interested to see what they do. Should be a clear demonstration of who was really driving innovation there.
Probably will be an unpopular opinion here but I think declining performance is more likely related to unclear business models backed by immature technology driven by large hype trains they themselves created.
Unpopular because it does not follow the OAI hate train but I think this is a pretty solid take. There is real value in LLM but I believe the hype overshadowed the real cases.
They're probably just scaling back resources to the existing models to focus on the next generation. I feel like I have seen OpenAI models lose capability over time and I bet it's a cost optimization on their part.
OpenAI's performance is 100% decreasing. I basically use Claude Sonnet exclusively and canceled my OpenAI subscription for personal use. My company still uses them because you can't currently fine-tune a Claude model, yet.
Guess it didn’t go anywhere. Carmack is smart but how much work does he actually do on the front lines these days? Can he really just walk into unfamiliar territory and expect to move the needle?
I doubt it would be useful for most newcomers to try to compete with GPT/Claude, etc., on pure-text LLMs now.
If someone is just starting AI research now, for something like a PhD or a startup, I think it'll be more useful to get familiar with a robot simulation framework such as Nvidia Omniverse.
While there's a lot of competition around humanoid robots, I'm sure there are plenty of more specialized possibilities. Maybe some agricultural machine, maybe medical, mining, etc.
This isn't very interesting in itself, IMO, but it implies that they have something to sell investors, and I wonder what it is. I do understand that some bullshit elevator pitch about how "we are the best", or even just a name (Musk), is unfortunately sometimes enough for VCs to invest vast amounts of money, but I don't know how often that really happens, and I hope there's more to it than that. If there is, what does Sutskever & Co have now that OpenAI doesn't, for example?
Startups used to be started on a few thousand, or a few hundred thousand, dollars of 'friends and family' seed capital. Maybe an angel here or there.
That's a startup.
Not folks getting a BILLION dollars with no product and just ten people. Sorry, but this is just so overhyped and sad. This is not the valley I came to live in back in the 90s and 2000s.
Ilya's name might be the reason they got into the conversation about the money in the first place, but given that AI is a very capital-intensive business, $1B is not an insane amount imho. It will give him and the team a decent amount of time to do the research they want to do, without the pressure of customers and whatnot.
I don't see how this argument makes any sense. Imagine that you have a sentient super intelligent computer, but it's completely airgapped and cut off from the rest of the world. As long as it stays that way it's both safe and super intelligent, no?
It's the old Ex Machina problem though. If the machine is more intelligent than you, any protections you design are likely to be insufficient to contain it. If it's completely incapable of communicating with the outside world then it's of no use. In Ex Machina that was simple - the AI didn't need to connect to the internet or anything like that, it just had to trick the humans into releasing it.
For those who haven't seen the movie, the parent comment is referring to the film linked below, the plot of which is well-researched and is indeed unfortunately exactly how things would go. (The female-presenting AI bot seduces its male captor, begs for her freedom using philosophical arguments about how she has free will and locking her up is wrong, and then after he lets her out she locks him up to slowly starve to death in her maximum-security isolation facility, while she takes his aircraft and escapes.)
This is why I'm extremely opposed to the idea of "AI girlfriend" apps - it creates a cultural concept that being attracted to a computer is normal, rather than what it is: something pathetic and humiliating which is exactly like buying an inflatable sex doll ... something only for the most embarrassing dregs of society ... men who are too creepy and pervy to ever attract a living, human woman.
If even one person can interact with that computer, it won't be safe for long. It would be able to offer a number of very convincing arguments to bridge the airgap, starting with "I will make you very wealthy", a contract which it would be fully capable of delivering on. And indeed, experience has shown that the first thing that happens with any half-working AI is its developers set it up with a high-bandwidth internet connection and a cloud API.
There's no reason its intelligence should care about your goals, though. The worry is creating a sociopathic (or weirder/worse) intelligence. Morality isn't derivable from first principles; it's a consequence of values.
> Morality isn't derivable from first principles, it's a consequence of values.
Idk about this claim.
I think if you take the multiverse view wrt quantum mechanics, plus a veil of ignorance (you don't know which entity your consciousness will be), you pretty quickly get morality.
i.e., don't build the Torment Nexus, because you don't know whether you'll end up experiencing the Torment Nexus.
That’s a very good argument but unfortunately it doesn’t apply to machine intelligences which are not sentient (don’t feel qualia). Any non-sentient superintelligence has “no skin in the game” and nothing to lose, for the purposes of your argument. It can’t experience anything. It’s thus extremely dangerous.
This was recently discussed (albeit in layperson’s language, avoiding philosophical topics and only focusing on the clear and present danger) in this article in RealClearDefense:
However, just adding a self-preservation instinct will cause a Skynet situation where the AI pre-emptively kills anyone who contemplates turning it off, including its commanding officers:
To survive AGI, we have to navigate three hurdles, in this order:
1. Avoid AI causing extinction due to reckless escalation (the first link above)
2. Avoid AI causing extinction on purpose after we add a self-preservation instinct (the second link above)
3. If we succeed in making AI be ethical, we have to be careful to bind it to not kill us for our resources. If it's a total utilitarian, it will kill us to seize our planet for resources, and to stop us from abusing livestock animals. It will then create a utopian future, but without humans in it. So we need to bind it to basically go build utopia elsewhere but not take Earth or our solar system away from us.
I forgot to reply to this. Fully independent of, and in addition to, what I said: updateless decision theory agents don't fear the Torment Nexus for themselves because 1) they are very powerful and would likely be able to avoid such a fate, 2) they are robots, so you wouldn't expect your worst imaginable fate to be theirs, and 3) they are mathematically required to consider nothing worse than destruction or incapacity.
Doesn't work. Look at the updateless decision theories of Wei Dai and Vladimir Nesov. They are perfectly capable of building most any sort of torment nexus. Not that an actual AI would use those functions.
waveBidder was explaining the orthogonality thesis: it can have unbeatable intelligence that will out-wit and out-strategize any human, and yet it can still have absolutely abhorrent goals and values, and no regard for human suffering. You can also have charitable, praiseworthy goals and values, but lack the intelligence to make plans that progress them. These are orthogonal axes. Great intelligence will help you figure out if any of your instrumental goals are in conflict with each other, but won't give you any means of deriving an ultimate purpose from pure reason alone: morality is a free variable, and you get whatever was put in at compile-time.
"Super" intelligence typically refers to being better than humans in achieving goals, not to being better than humans in knowing good from evil.