Character.ai CEO Noam Shazeer Returns to Google (techcrunch.com)
116 points by treesciencebot 7 months ago | 83 comments



Key quote (from Character.AI):

“Over the past two years, however, the landscape has shifted; many more pre-trained models are now available. Given these changes, we see an advantage in making greater use of third-party LLMs alongside our own. This allows us to devote even more resources to post-training and creating new product experiences for our growing user base.”

My interpretation is that Character.AI realized they don't actually need to train their own foundation models from scratch to support their product - they can build cheaper, faster and probably better if they use LLMs trained by other companies (could be GPT-4o/Claude/Gemini via APIs, could be Llama 3.1 self-hosted).

If they're not training foundation models any more, the talents of people like Noam Shazeer aren't so important to them. They need to focus on product development instead.


I think this highlights the winner-take-all stakes of intelligence. It also suggests that there is little to be gained by specialization. Building a brand might actually be more short-term profitable since you can swap in the latest AI models as they become available. In other words, if advancing the SOTA AI is your dream, a product company may not be the right place. And if building a product company is your dream, then building foundational AI might not be the best strategy.


> if advancing the SOTA AI is your dream, a product company may not be the right place.

Does Meta get in the way of this?

It's hard to compete with a company that is dead set on spending billions and seemingly wants to drive your SOTA AI product revenue to 0.

If you are OpenAI or Anthropic right now, it seems like trying to run a great restaurant at a reasonable price right next to a good (great?) restaurant that is serving everyone for free.


Yes. Meta's current strategy is extremely disruptive to other companies that are trying to build a business in the foundation model space.

Presumably this is because Meta desperately want to avoid becoming dependent on other companies in this new generative AI world. Mark Zuckerberg talks about not wanting a repeat of the Apple tax in his post about Llama 3.1 here: https://about.fb.com/news/2024/07/open-source-ai-is-the-path...


Seems to me, if the zuck is dropping a gorillion dollars to put free trained models out there (for small outfits to fine-tune to their purposes), the play is now to take the enormous field of small outfits and put them on a level playing field against zuck's enemies such as Google. So he's wrecking Google's chances indirectly, clever play. Now as far as zuck, yes, not ideal all his tech is out there, but that doesn't mean he doesn't have better tech behind locked doors, and the silver lining is now in effect he has a gorillion people tuning and optimizing HIS model - priceless. I wish Google entered into such a war to be honest.. and especially OpenAI


My take is that this has more to do with the coming years than the current climate.

I think it is just a consequence of the cost of getting to the next level of AI. The estimates for training a GPT-5 level foundational model are on the order of 1 billion. It isn't going to get cheaper from there. So even if your model is a bit better than the free models available today, unless you are spending that 1 billion+ today then you are going to look weak in 6 months to 1 year. And by then the GPT-6+ model training costs will be even higher, so you can't just wait and play catch up. You are probably right as well, in that there is a fear that a competitor based on an open source model gets close enough in capability to generate bad publicity.

I imagine character.ai (like inflection) did calculations and realized that there was no clear path to recoup that magnitude of investment based on their current product lines. And when they brainstormed ways to increase return they found that none of the paths strictly required a proprietary foundational model. Just my speculation, of course.


What does "GPT-5" and "GPT-6" even mean? I gently suggest they aren't currently meaningful; it's not like CPU GHz frequency steppings. If anything it's more akin to chip fab processes, e.g. 10nm, 5nm, 3nm. Each reduction in feature size requires new physical technology, chip architecture, and a black box bag of tricks to eke out better performance.

Where is the data that costs a billion dollars to train on going to come from? These companies are already training on most of the available valuable information that exists.

While training will surely be expensive, I think it's even more expensive and challenging to organize and harness the brainpower to figure out and execute the next meaningful step forward.


I think the comparison with fab nodes is suitable. We do not know how much performance gains are going to come from it, but we do know it is going to be very expensive.

Data availability for LLMs is becoming trickier. There are at least two avenues being explored: A) Synthetic data (in controlled ways) and B) Video data, in particular multi-modal embeddings between image/audio/text sequences. This may enable an increase in compute of several orders of magnitude.


> What does "GPT-5" and "GPT-6" even mean?

That is shorthand for "the next two generations of LLMs created by OpenAI". It is not meant to be a forward-looking statement on how those models will be branded in the consumer market. It also isn't meant to be a prophecy that OpenAI will maintain its premier position, since Anthropic or even a new entrant into the field might be the company to achieve that next step level.

> I think it's even more expensive and challenging to organize and harness the brainpower

Then you should invest with that in mind. What I find interesting is that Microsoft (with its acquisition of the research arm of inflection) and Google (with its acquisition of the research arm of character.ai) seem to see the foundational model and product categories as distinct. It is that distinction I am interested in.

There is no doubt some huge value in productizing these LLMs. However, it appears that the productization of LLMs and the advancement of the foundational models themselves are being decoupled by the market. That is, it seems they are segregating risk. Product companies can raise money to build products, "platform" companies (e.g. Microsoft, Google) can raise money to build foundational models. What seems less popular based on these recent moves is companies able to raise money to build foundational models for the purposes of specific products.


> Where is the data that costs a billion dollars to train on going to come from? These companies are already training on most of the available valuable information that exists.

This is a lack of imagination.

All books written; all movies that exist on DVD; all music released on CD; all TV programs; all radio programs; all WhatsApp messages; all of YouTube; blueprints from architecture and mechanical engineering.

The copyright and logistics are definitely an issue, but there is more data.


I actually did imagine many of these data sources (and some of your ideas are new to me :), but question the level of additional useful capability they would provide for an LLM in the context of responding to user queries. Is more data always better? Or what level of curation results in the most useful model?

At some point I expect putting in too much data from semi-random or very old sources will have a detrimental effect on output quality.

In the extreme case, you could feed /dev/urandom. Haha, only kidding, but I'm sure you get my idea.

Now I'm wondering what a model trained on the past 45 years of Usenet would be like. Or all of the history of public messages on IRC servers like EFNet or Freenode (afaik they are not fully logged). It is an interesting topic, but I'm still curious and uncertain what effect adding some multiples of data in the form of often lower fidelity sources (e.g. WhatsApp messages) will have on the capability of the final model. It's hard to understand how such sources would be helpful or useful.


Specializing will happen in product implementation, not model implementation.

LLMs are becoming akin to tools, like programming languages. They’re blank slates, but require implementation to become special.


I'd argue that their own foundational models are getting outperformed by the Llama finetunes on HF and at this point they're shifting cost structures (getting rid of training clusters in favor of hosted inference).


Strange take since Noam was CEO. He didn't get kicked out. He left. Character AI and its remaining employees are going to have a tough time ahead for survival as Google gouged it empty.


> If they're not training foundation models any more, the talents of people like Noam Shazeer aren't so important to them.

Why is the CEO important to model development regardless of talents? They've raised $150m+, have $15m+ ARR and ~200 employees, etc. Shouldn't the CEO be CEOing?

Edit: reading the comments below, it seems like maybe he thought the expected value of attempting to clear the hurdle of their valuation/liquidation preferences at a $250k/year salary as CEO was lower than a $5m+/year salary/RSUs from Google?


Noam Shazeer is a long-time Googler who worked on machine learning for quite some time (take a look at his patent history, for example https://scholar.google.com/citations?view_op=view_citation&h...). He was a favorite of Jeff Dean and did some of the most impressive work on ML that I saw at Google. I think at some point in the past, Noam saw that Google wasn't supporting his work very well (Google research went through a dark time where many researchers with creative ideas were shut down, either for business or reputation reasons, see https://www.businessinsider.com/google-ai-characterai-ceo-no...) and I figure he made a startup because it was a more convenient position for him to do research, even if there wasn't a strong revenue model. Now he returns to Google in a position of deep strength and will be able to continue to pursue extremely ambitious ideas with far less restraint.


I think you strongly underestimate how fast Google forgets about you. Deep strength is a very rosy way to put it.


They forget about most people quick. Noam Shazeer is not most people.


This sort of toxic worship is so insane.


Toxic worship? The guy is one of the authors of the LLM paper (Attention Is All You Need). Are you saying it's not important?


You should see the reaction of Google employees on Twitter, it's as if Jesus came back from the dead.


and they aren't wrong


Do you hear yourself?


>Character.AI co-founder Daniel De Freitas is also joining Google with some other employees from the startup. Dominic Perella, Character.AI’s General Counsel, is becoming an interim CEO at the startup.

So basically leaving a shell of a company and the GC to try and run it / wind it down.


Is this a cheat code for doing an acquisition without FTC scrutiny?


Yes. Character.ai usage is way down from the peak and it is only getting easier for startups to compete in the space. However, the plausibility of an actual FAANG acquisition is almost zero, so this is the next best option… at least for certain stakeholders. Microsoft did this with Inflection.


What was the peak? I see estimated 211 million visits in June 2024, vs 86 million in Feb 2023.


Out of curiosity, where can one find reliable estimates?


Visits is an entirely meaningless metric


and conjecture is a better one?


This almost seems exactly it, just as the last paragraph of the TechCrunch article hints at.


And screw investors


per The Information -- investors don't get screwed, the board repurchased all shares at $88/share ($2.5B valuation).


The money to repurchase shares must be coming from an external source?

I assumed Character.AI wasn’t profitable, like most AI startups.


Absolutely not yet profitable, not even close.

Currently a money furnace.


These two stories are hard to reconcile unless the Google licensing deal is a lump sum of several billion dollars. Which feels borderline impossible. Even Microsoft didn’t do that with OpenAI and the AI trade is cooking very fast.


They raised ~$200M total, and $150M at the series A. So if they're just paying back investors, they'd "only" need $500M.


I'm puzzled. Did Noam get $500mm+ from this?

Character.ai could've been at $100mm+ ARR if they did a bit more of a monetization push based on my very rough estimates. If it was an acquisition I would've been imagining $3b+ price range.

Huge get by Google! (Side note, Gemini 1.5's new alpha release from this week is now at the top of the LMSYS leaderboard and sentiment on Twitter for it is that it's strong, maybe as strong as Sonnet 3.5, so it'll continue to be an interesting race between Meta, OpenAI, Anthropic, GDM.)

Edit: Okay perhaps it's - Noam keeps his C.ai stock, gets a big pay package from Google ($5mm-$15mm/year kinda range? not sure). Most of the value in C.ai remains, and he keeps his stock.


Noam is pre-IPO Google with extensive work done on the most important infrastructure in the company's history; likely he has large monetary reserves as well as a pool of investors more than happy to give him money simply to continue doing his research, regardless of future revenue. You can assume that in returning to Google, he will get some base pay which will be prodigious, but also extensive pay in the form of bonuses and stock. From what I know of him, the real motivation is to get more access to large TPUs.


Having personally met him, this was my impression. He has zero concerns about finance.

Actually a nice and polite guy, too.


> I'm puzzled. Did Noam get $500mm+ from this?

Where did you get the $500mm number?


I imagine he still had 15%+ of the company, so if there was a big payout, he would've gotten a big slice. Treat all my estimates with a grain of salt, though, please. I'm imagining he didn't get a huge payout and he actually kept most of his equity?


I think he just steps down as CEO; there is no company acquisition happening, so no one gets paid. I think the company essentially failed to generate revenue and/or get a new investment round.


https://www.theinformation.com/articles/google-hires-charact...

Investors are being bought out at a $2.5B valuation


I don't trust that site. They spread unconfirmable rumors behind a paywall.

Based on those numbers, the $150M Series A at a $1B valuation would be bought out now for $375M (which is high for a failed startup). My understanding is that such a transaction has to be disclosed.


This is public info. $2.5B and $88 per share for all


Info becomes public when it is officially announced by some of the transaction participants or disclosed in some filings.

I may be too skeptical, but it is just hard to believe someone (who?) throws $400M (to seed + A investors) plus a significant amount to founders into an essentially failed startup.


Yes that seems right. Though I think they could've gotten a new investment round if they wanted one, I'd be surprised if that was the cause.


I suppose a lot of the current batch don't have a business model that is sustainable. Another candidate is Stability AI.


This is another Inflection-style "acquisition." Highly unethical of the founders and screws over all your employees and investors who are left holding the bag.

For those asking, c.ai has very high costs and looks like a typical consumer company that burns money for usage, so they were decent on revenue but not near profitability.


> Character’s leaders told staff on Friday that investors would be bought out at a valuation of about $88 per share. That’s about 2.5 times the value of shares in Character’s 2023 Series A, which valued the company at $1 billion, they said.

https://www.theinformation.com/articles/google-hires-charact...


> investors would be bought out at a valuation of about $88 per share. That’s about 2.5 times the value of shares in Character’s 2023 Series A, which valued the company at $1 billion

Those investors almost certainly have a liquidation preference. How much did employee shareholders get? I'd guess zero.

“I am confident that the funds from the non-exclusive Google licensing agreement, together with the incredible Character.AI team, positions Character.AI for continued success in the future,” Shazeer said in a statement given to TechCrunch.

That's a pretty hilarious statement from a Founder/CEO, given the circumstances.


>liquidation preference. How much did employee shareholders get? I'd guess zero.

But they aren't filing for chapter 11? I assume all shareholders will be bought out, including the employees, and this will be paid for by Google who will license their models, presumably as a scheme to pay off the investors as I doubt they actually need those models at all.

(assuming the linked source is correct.)


Got it. So instead of employees and investors being screwed, only employees are screwed..


To do better by the employees, the CEO really should have fought harder to have the whole company get acqui-hired, even if Google would have shut down the service. Maybe there were some other considerations that I’m not seeing (i.e. there are good reasons the company should be kept going, and there’s a good path to success even without Noam. The article doesn’t specify how many employees are going over, so it’s hard to tell.) Landing the employees a relatively cushy Google SWE gig after helping build your company is the least you could do for them.


When you don't understand reasoning, always look at who the beneficiaries are in these situations.


I'm also wondering how much money they spend on legal fees, given that they are copying the likenesses of many celebrities without their permission (that's the only way I've heard about them before).


Everybody has a price


don't forget Adept


Have some familiarity with the players here. The founders are core model nerds and accidentally happened upon success as a majority sex chat product. They have little interest in that and investors are likely saying they won’t support any more core model research. The Google deal lets the founders go back to doing core model work for Google and the company to focus on a consumer-only product that uses third-party models.


> core model research

What do you mean by this? Fundamental research? It makes sense, most of the innovations now will be in things like clever caching mechanisms and reducing compute.


AI is quickly becoming a commodity which is awesome for consumers. I think we’re going to see a huge shakeout of companies who dazzled with the initial allure of LLMs being replaced by the “killer AI apps” where we really start to rethink modern computing.


This is likely an acqui-hire structured to not trigger an antitrust probe.

If there was any real intent to give c.ai a chance as a real business they would have hired a new CEO before the announcement.


>In a big move, Character.AI co-founder and CEO Noam Shazeer [...]

It's like those rich guy/poor guy jokes.

Poor guy leaves his job, tries to bootstrap his own company and fails, comes back to his old job. "What a loser".

Rich guy leaves his job, gets 150M to start a company in a blue ocean with a significant competitive advantage over 99.99% of humans alive. Still manages to fail and comes back to his old job. "What a bold move!"


> gets 150M to start a company

Oh there is a place where rich guys can "get" $150M, but poor guys couldn't.


I think the rank and file employees will keep vesting at the 2.5 bn valuation for the next two years, according to The Information


At this point that valuation is almost certainly a mirage.


People seem to forget or be unaware running a startup is far from cozy and usually doesn't work out - it isn't for everyone and honestly - isn't for most.

The more money you raise the more the pressure - and for deep researchers, this is usually noise that takes you away from your passion.

Shazz may be the kind of person that just wants to concentrate on research and doesn't really care about the rest...


Seems like a way to do an "acquisition" while avoiding the brand risk of buying a mostly-porn company.


And I'm sure Google is hopeful that it also carries low regulatory risk from the FTC/DOJ


what is your basis for saying it is mostly-porn? they do a lot to make it SFW


I haven't used Character.ai personally, but assuming the user base is the same as the Silly Tavern userbase, and assuming that the Silly Tavern user base is accurately represented by the subreddit, then it's like 95% ERP


The new VC playbook could be:

1. Find an unhappy senior AI exec who's published a few papers. 2. Start a new org around them and hire a few key people with crazy salaries (which you can offer cause the time horizon for the company isn't that long). 3. Train a few models, release some good looking benchmarks (bonus points if big tech lend you their GPUs as part of some 'accelerator deal'). 4. Maybe find PMF and become incredibly rich. 4. If that fails, sign a massive but undisclosed licensing deal for your tech with big tech and give them your staff.

Seems like a good way to take big bets in AI, while hedging most of the risk.


In the CNN boom days after AlexNet the playbook was: take your research lab, slap a C corp and new logo on it, keep doing your research and get acquired by FANG for high 8 figures.

Only difference in this cycle is that you can't do real LLM research in academia these days so all of the top researchers are already at FANG.


This isn't #4, there's no acquisition happening.


This playbook won't work within a year as M&A market cools and everyone wakes up to the real truth - that no one has any edge in LLM / AI infra development


Was this an Inflection.ai style acquisition considering C.AI was profitable?


In contrast to something like Adept.ai, this appears to just be the CEO + some key employees exiting, not a complete team poaching that leaves the company a shell of what it was.

It doesn't necessarily imply Character.ai's business isn't doing well, but a CEO leaving is still indeed weird.


do we know they were profitable? I doubt it, if they pulled this. I think they had high DAU/MAU but low paid users.

this is basically Inflection 2.0.


All public information about their finances only mentions their revenue, not their profits, which means they're almost certainly not profitable.


I would be absolutely floored if any of this new crop of AI companies were profitable


They are likely unprofitable.




