YaLM-100B: Pretrained language model with 100B parameters

narrator · on June 23, 2022

I love Yandex. They are the best search engine by far for politically controversial topics. They also release a language model to benefit everyone even if it says politically incorrect stuff. They also name their projects "cocaine" probably to perhaps to prevent western competitors from using them.

You look at OpenAI and how they don't release their models mainly because they fear "bad people" will use them for "bad stuff." This is the trend in the west. Technology is too powerful, we must control it! Russia is like... Hey, we are the bad guys you're talking about so who are we keeping this technology from? The west has bigger language models than we do, so who cares. Also their attitude to copyright and patents, etc. They don't care because that's not how their economy makes money. Cory Doctorow's end of general purpose computing[1] and locked down everything is very fast approaching. I'm glad the Russians are around and aren't very interested in that project.

[1]https://csclub.uwaterloo.ca/resources/tech-talks/cory-doctor...

abra0 · on June 23, 2022

>They are the best search engine by far for politically controversial topics.

This is an interesting take given the political censorship in Russia (for some ineffable reason much harsher now than it used to be 4 months ago) and cases like https://twitter.com/kevinrothrock/status/1510944781492531208.

narrator · on June 23, 2022

Search Google and Yandex for "2020 election fraud." The results are VERY different. The Zach Vorhies leak shows that Google regularly does blatant censorship for political purposes.[1]

[1]https://www.breitbart.com/tech/2021/08/19/google-whistleblow...

px43 · on June 23, 2022

Totally, just like how if you want to find out what really happened in Tiananmen Square in 1989, your best bet is Baidu. Totally different results than what Google gives you!

I sincerely have deep respect for Yandex for releasing this, and Baidu for some of the amazing research they've released over the years, but both are deeply deeply beholden to their local governments in a way that is incomparable to the relationship between Google and the US government.

Remember that the NSA was literally digging up and tapping fiber around Google data centers in a secret program called MUSCULAR because they didn't think Google was being cooperative enough when handing over data that they were requesting.

https://en.wikipedia.org/wiki/MUSCULAR_(surveillance_program...

alphabetting · on June 23, 2022

Google: 118M results. Top link is the best resource on verified election fraud cases.

Yandex: 9M results. The top two links are pretty suspect. Top link promotes Dinesh D'Souza's 2000 Mules documentary in the banner which at best is a one-sided take on election fraud. At worst, very misleading.

https://i.imgur.com/n5a9LOd.png

stainablesteel · on June 23, 2022

This is a weird comment because yes, that's exactly what the above person was saying. It shows results that google won't give you.

Secondly, I've yet to see any criticisms of 2000 mules data that aren't addressed by the stringency in the analysis they claim to have done.

I thought the information they presented was extremely valuable. Are we going to overturn an election at this point? No. But the vulnerabilities to the mail-in ballots were obvious, then lied about, then ignored, then clearly taken advantage of. I want to live in a democracy because voting matters. I especially don't want NPO's destroying this by taking advantage of flawed voting infrastructure.

If there are legitimate criticisms of the methods, the data, or anything else coming out of this film I expect a legitimate presentation that can break it down using the actual data in question. All I've seen so far is shilling and gas-lighting.

1024core · on June 23, 2022

> then clearly taken advantage of.

If you have any evidence, any evidence at all, of significant mail-in ballots fraud, then you should write it up and publish it; and even present it to the USDOJ, because you would have succeeded where Trump's highly-paid teams of lawyers failed.

If you don't have proof, then please STFU.

jokethrowaway · on June 24, 2022

It's not about whether it happened or not, it's whether it's reported or not.

I personally believe (with obviously no proof) there was definitely fraud going on, on both sides. With such an archaic system and such a great economic and power incentive, you would be stupid not to do it. For sure mail in ballots made it even easier than in the past.

I've heard about Russian hacking the elections after Trump won for a good 2 years.

jokethrowaway · on June 24, 2022

Precisely the point.

I didn't even hear about 2000 Mules until I heard some right wing commentator talk about it months after it was released.

Instead I'm shoved the latest Greta Thunberg song (You can shove your climate crisis up you **) 5 milliseconds after she sung it.

As an avid newspaper and news reader, the media bias shifted tremendously in the last 30 years.

biaachmonkie · on June 26, 2022

Why should 2000 Mules be promoted in results at all? It's total BS, it's progenitor is a criminal that's was guilty of Election Campaign Finance Fraud.

https://www.fbi.gov/contact-us/field-offices/newyork/news/pr...

Or maybe that means Dinesh D’Souza is really the most qualified on election fraud since he has done it himself???

GTFO!

skrebbel · on June 23, 2022

I don't know man, "thegatewaypundit.com" as a top reputable source? seems to me like it's not "honest two-sided results" but just, well, a rather random mix of result of widely varying quality. Mad Altavista vibes!

What I'm trying to say is that even if you believe that "was the 2020 US election stolen?" is worth debating, which it isn't, the yandex results are shit.

narrator · on June 23, 2022

If you get all your information through mainstream channels, and you don't want to see anything contradicting those channels then you should continue to use Google because they explicitly implement the algorithms on controversial topics to prefer mainstream news sources[1]. What I mean by "better" in terms of controversial searches is that on controversial matters, it will rank the searches the same way it does for all other searches. I mean yeah, I don't have access to the internal code base of Yandex, but it certainly feels more organic.

[1]https://www.breitbart.com/tech/2019/05/12/study-the-cnn-sear...

zaptrem · on June 23, 2022

Why link to Breitbart of all places instead of the original source?

https://www.cjr.org/tow_center/google-news-algorithm.php

Btw Wikipedia’s first few sentences on Breitbart are not inspiring

> Its journalists are widely considered to be ideologically driven, and much of its content has been called misogynistic, xenophobic, and racist by liberals and traditional conservatives alike.[10] The site has published a number of conspiracy theories[11][12] and intentionally misleading stories.[13][14]

narrator · on June 23, 2022

This is the association fallacy, which is, unfortunately, how most people determine what to believe these days.

An absurd example of this fallacy would be, Wikipedia, which you cite, has articles that indicate tobacco smoking may cause disease. The nazis were also anti-smoking[1]. Therefore Wikipedia is Nazi propaganda and you should not trust anything on there.

[1]https://www.amazon.com/Nazi-War-Cancer-Robert-Proctor/dp/069...

SR2Z · on June 24, 2022

It is not the association fallacy; the role of a news site is to provide news, which includes fact-checking the work of their "journalists."

If Breitbart pulled a Fox News and argued in court that their goal was to entertain and not inform, then you have a point! But until then, you have a terrible misunderstanding of journalistic integrity and what it means for a publisher to attach their name to a journalist's work.

jhgb · on June 23, 2022

> This is the trend in the west. Technology is too powerful, we must control it!

I take it that you're either too young or too untraveled to be aware of the level of state control of technology in "the east". Xerographic machines, mimeographs, and other similar reprographic devices used to be highly controlled machinery behind the Iron Curtain. This is absolutely not something exclusive or even peculiar to "the west".

risyachka · on June 23, 2022

>> They are the best search engine by far for politically controversial topics

FYI, they are Russian subject that follows ALL their censorship laws (and oh boy do they have a lot of it).

>> probably to perhaps to prevent western competitors from using them The irony here. All yandex products are exact copies of western, adjusted to local market.

cpursley · on June 23, 2022

Actually they're not, some of the Yandex products are actually better and pretty innovative (ignoring the political stuff). Maps and Go are especially good. Ditto with Russian banking apps, they put American bank apps to shame.

orbital-decay · on June 24, 2022

>some of the Yandex products are actually better and pretty innovative (ignoring the political stuff). Maps and Go are especially good.

Yeah, the same Yandex Maps that stopped showing state borders recently, as they are now "more focused on natural objects", in their words.

shakow · on June 24, 2022

At the same time, from a consumer perspective, this sacrifice to internal political pressures won't hamper your usage of the product. It's not like if international borders were the most important annotation on a map for everyday use.

Edit: in fact, I just checked, yandex maps still shows state borders.

orbital-decay · on June 24, 2022

It doesn't distinguish state and region borders anymore, like it did before. No borders on the overview map either. Just zoomed into a random place on the map, and I can tell where Turkey ends and Bulgaria begins only because the city names are different.

shakow · on June 24, 2022

Ah ok, I see what you mean.

I didn't see it as I was looking at France. It's weird, because at large scale there are no borders, at medium scale there is a weird mix of national and local borders (Western EU countries have state-level borders, RU/BY/UA/US/CN have local borders, ...).

And to take your example, I have to zoom quite close for BG/TR to switch from state-wide to local borders.

jhgb · on June 23, 2022

Wait, so you're saying it's a Russian company breaking Russian laws and getting away with it?

Barrin92 · on June 23, 2022

> it's a Russian company breaking Russian laws and getting away with it

I don't think you've lived in Russia if you need to ask that question. Breaking the law and getting away with it is a way of life in Russia, that goes for all institutions and social strata

jhgb · on June 23, 2022

Breaking random laws? Sure. Breaking laws specifically made to enable central government control of independent media? Uh, how do you do that? Have you not noticed for how minor things regarding freedom of expression have been Russian people getting into jail recently?

Barrin92 · on June 23, 2022

>Uh, how do you do that?

By being on the internet. Russia has always been good at literally hitting you with a physical club if you're crazy enough to take a sign to the streets, but the Russian state doesn't understand the internet, or really anything that's sort of underground or intangible.

There's a reason the country is probably the world's largest place for all things piracy related, scihub and so on. It's not just laxer IP laws, it's also that tech in particular has always skirted all kinds of regulation freely, it's why the country has a relatively healthy tech industry despite at times suffocating regulation. The prevalence of cybercrime in the country is another example of it. Being censorious doesn't make you competent.

Even Telegram which was at some point supposedly blocked was still used by everyone, including funnily enough the foreign ministry itself. These things never really work in Russia.(https://www.reuters.com/article/us-russia-telegram-ban-idUSK...)

cpursley · on June 23, 2022

I could have worded that better. I just mean ignoring the general political situation in Russia, Yandex makes some really good products.

make3 · on June 24, 2022

It's widely accepted that OpenAi doesn't release its models to make money from them, not because they really think they would be harmful

Moldoteck · on June 25, 2022

They literally have blocklist of sites that kremlin doesnt like and it acts somehow similar to yandex news in this part. The difference here is more that google filters stuff for usa and yandex for russia

throwaway_1928 · on June 24, 2022

> Hey, we are the bad guys you're talking about so who are we keeping this technology from?

Laughed out loud!

winddude · on June 24, 2022

I feel like you could be paid, or coerced by some country...

chinathrow · on June 23, 2022

Is this sarcasm?

braingenious · on June 23, 2022

This is one of the funniest threads I’ve ever seen on this website. People are yelling at eachother about the CIA and the legitimacy of Israel and Assange and the definition of fascism and… anything that pisses anybody off about international politics in general. In a thread about a piece of software that’s (to me and likely many others) prohibitively expensive to play around with.

Anyway I hope somebody creates a playground with this so I can make a computer write a fan fiction about Kirby and Solid Snake trying to raise a human baby on a yacht in the Caspian Sea or whatever other thing people will actually use this for.

braingenious · on June 23, 2022

What if Street Sharks were mormon missionaries? How would Emily Dickinson describe Angie Dickinson in a poem? How would Ramses II have used Bitcoin?

THESE are the important things to talk about when it comes to this topic.

lumost · on June 23, 2022

To add a voice of skepticism. The recent rush to open source these models may be indicative that the tens of millions that’s spent training these things has relatively poor roi. There may be a hope that someone else figures out how to make these commercially useful.

dandiep · on June 23, 2022

There are tons of commercial uses for these models. I've been experimenting with an app targeted toward language learners [1]. We use large language models to:

- Generate vocabulary - e.g. for biking: handlebars, pedals, shifters, etc

- Generate translation exercises for given topic a learner wants to learn about - e.g. I raised the seat on my bike

- Generate questions for the user - e.g. What are the different types of biking?

- Provide more fluent ways to say things - I went on my bike to the store -> I rode my bike to the store

- Provide explanations of the difference in meaning between two words

And we have fine tuned smaller models to do other thing like grammar correction, exercise grading, and embedded search.

These models are going to completely change the field of education in my opinion.

1) https://squidgies.app - be kind it's still a bit alpha

ketzu · on June 24, 2022

I started to work on something similar but way behind your project. I really believe AI models can help us as humans learn better! Do you have a blog or any other writeups on how you approached these problems?

tikwidd · on June 24, 2022

How does the vocabulary generation work?

MivLives · on June 23, 2022

We're using these at where I work (large retail site) to help make filler text on generated articles. Think the summary blurb no one reads at the top. As for why we're writing these articles (we have a paid team that writes them too), the answer is SEO. This is probably the only thing I've seen done with a text model in production usage. I'm not 100% sure what model they're using.

tobr · on June 23, 2022

Sorry but every part of that sounds so terrible.

MivLives · on June 25, 2022

Yeah I'm not a huge fan of it. I'll never forget the look in our UX person's eyes when she realized that our team doesn't exist to make customer's experience better (there's a ton of other teams for that) but to make Googlebot's experience better. Right now we're in the process of getting publishers you've heard of to write blurbs for best lists, but we're supplying the products so it's not really a best list.

I can't say I'm a big fan but my teams is great and I don't have time to look for a job right now.

varispeed · on June 23, 2022

I hate this so much. These tools are getting better, so often you realise only half way through that you are reading AI text. Then you have to flush your brain and take a mental note, to never visit that site again.

MivLives · on June 25, 2022

I'm not a big fan either. At least the pages are just like:

- useless ai generated intro text

- ten products that actually are the best reviewed per category by users

- brief ai blurb on product

- 3 actual user reviews of the product

So even with the ai text there's still some benefit to the page.

BonoboIO · on June 23, 2022

Content made for machines. Probably a billion dollar industry.

fab1an · on June 23, 2022

Content made for machines serving humans made by machines pretending to be human

jquery · on June 23, 2022

Made by machines, for machines. It’s poetic.

blippage · on June 25, 2022

There's a line in one of Douglas Adams' books where he says something along the lines that things like VCRs were invented to watch TV programs so that you don't have to.

Who would have thought that one man's joke would become a reality?

jandrese · on June 23, 2022

You just know that some Amazon listings are written by GANs.

MasterScrat · on June 23, 2022

HuggingFace will soon release their BigScience model: https://twitter.com/BigScienceLLM/status/1539941348656168961

"a 176 billion parameter transformer model that will be trained on roughly 300 billion words in 46 languages"

So anything smaller than that will become worthless. May be a factor, companies have a last chance to make a PR splash before it happens.

Read more about it: https://bigscience.huggingface.co/blog/model-training-launch...

lairv · on June 24, 2022

"worthless" huh, not everyone can afford inference of a ~500gb models, depending on the the speed/rate you need you might definitely go for smaller model

But maybe your sentence was more about "after BigScience model, open-sourcing anything smaller than that will be useless" which isn't necessarily true either, because there is still room to improve parameter efficiency, i.e. smaller models with comparabale performances

rahidz · on June 23, 2022

Not necessarily, only ~30% of the database is in English, so it likely won't be as good as a smaller model trained solely or mostly on English words.

https://bigscience.huggingface.co/blog/building-a-tb-scale-m...

TaylorAlexander · on June 24, 2022

It kinda seems like a model trained on multiple languages would to some extent be better at English than a model trained only on English? I mean so much of English comes from other languages, and understanding language as a concept transcends any specific language. Of course there are limits and it needs good English vocabulary and understanding, but I feel the extra languages would help rather than hinder English performance.

vgel · on June 23, 2022

My guess is they're mostly vanity projects for large tech companies. While the models have some value, they also serve as interesting research projects and help them attract ML talent to work on more profitable models like ad-targeting.

lostmsu · on June 23, 2022

They did not publish benchmarks about quality of the models, which is very suspicious.

I personally squinted hard when they said removing dropout improves training speed (which is in iterations per second), but said nothing about how it affects the performance (rate of mistakes in inference) of the trained model.

jasonphang · on June 23, 2022

I agree that the lack of benchmarks makes it hard to determine how valuable this model is. But on the topic of dropout, dropout has been dropped for the pretraining stage of several other large models. Off the top of my head: GPT-J-6B, GPT-NeoX-20B, and T5-1.1/LM.

gfodor · on June 23, 2022

An equally plausible frame is that once a technology becomes replicated across several companies, it makes sense to open source it since the marginal competitive advantage are the possible resultant external network effects.

I don't know if that's the right way to think about the open sourcing of large language models. I just think we really can't read too much into such releases regarding their motivation.

jenny91 · on June 27, 2022

Yes, commoditize your complements.

mumblemumble · on June 23, 2022

From what I've seen, using these huge models for inference at any kind of scale is expensive enough that it's difficult to find a business case that justifies the compute cost.

Voloskaya · on June 23, 2022

Those models aren't trained with the objective of being deployed in production. They are trained to be used as teachers during distillation into smaller models that fit the cost/latency requirements for whatever scenario those big companies have. That's where the real value is.

f311a · on June 23, 2022

Yandex uses it for search and voice assistant

HeavyStorm · on June 23, 2022

Maybe training it is not that expensive?

I know from practice that it takes a really really long time to train even a small nn (thousands of params) , so you'll need a lot more hardware to train one with billions... But, it's expensive to buy the hardware, not necessarily to use it. If you, for some reason, have a few hundred GPU lying around, it might be "cheap" to do the necessary training.

Now, that's not your point - cost != price. But, still...

raducu · on June 24, 2022

> If you, for some reason, have a few hundred GPU lying around

Not to nitpick, but that is like saying that if you have a Lamborghini lying around, a Sunday trip in one is not so expensive.

can16358p · on June 24, 2022

I can't think of anyone having a few hundred GPUs around unless:

- They were into Ethereum mining and quit.

- They've already built a cluster with them (e.g. in an academic setting).

- They live in a datacenter.

- They are a total psychopath.

But even assuming one magically has all those GPUs available and ready to train, I don't want to calculate the power cost of it anyway. Unless one has access to free or extremely cheap electricity it would still be very expensive.

t_mann · on June 24, 2022

Only half-joking use case: active communities like this one on HN make sites attractive to human visitors. A new site could use bots to fake activity. Not sure it would work in the long run though.

alexb_ · on June 23, 2022

I have to wonder if 10 years down the line, everyone will be able to run models like this on their own computers. Have to wonder what the knock-on effects of that will be, especially if the models improve drastically. With so much of our social lives being moved online, if we have the easy ability to create fake lives of fake people one has to wonder what's real and what isn't.

Maybe the dead internet theory will really come true; at least, in some sense of it. https://www.theatlantic.com/technology/archive/2021/08/dead-...

dav_Oz · on June 23, 2022

The bots/machine vs human reminds me of that famous experiment from the 30s in which Winthrop Kellogg[0], a comparative psychologist, and his wife decided to raise their human baby (Donald) simultaneously with a chimpanzee baby (Gua) in an effort to "humanize the ape". It was set out to last 5 years but was relatively quickly abrupted after only 9 months. The explicit reason wasn't stated only that it successfully proved the hereditary limits within the "nature vs nurture" debate of a chimpanzee, the reticent statement reads as follows:

>Gua, treated as a human child, behaved like a human child except when the structure of her body and brain prevented her. This being shown, the experiment was discontinued

There have been a lot of speculation as to other reasons of ending the experiment so prematurely. Maybe exhaustion. One thing which seemed to dawn on the parents - if one reads carefully - is that a human baby is far superior at imitating than the chimpanzee baby, frighteningly so, that they decided to abort the experiment early on in order to prevent any irreversible damage in the development to their human child which at that point had become far more similar to the chimpanzee than the chimpanzee to the human.

So, I would rephrase "the internet is dead" into "the internet becomes increasingly undead" because humans condition themselves in a far more accelerated way to behave like bots than bots are potentially able to do. From the wrong side this could be seen as progress when in fact it's opposite progress. It sure feels like that way for a lot of of people and is a crucial reciprocal element often overlooked/underplayed (mostly in a benign effort to reduce unnecessary complexities) when analyzing human behaviour in interactions with the environment.

[0]https://en.m.wikipedia.org/wiki/Winthrop_Kellogg#The_Ape_and...

boplicity · on June 23, 2022

Case in point: recently, I've noticed that I'm getting more and more emails with the sign off "Warm regards." This is not a coincidence. It is an autosuggestion from Google. If you start signing off an email, it will automatically suggest "Warm regards." It just appears there -- probably an idea generated from an AI network. There are more and more of these algorithmic "suggestions" appearing every day, in more and more contexts. This is true for many text messaging programs: There are "common" replies suggested. How often do people just click on one of the suggested replies, as opposed to writing their own? These suggestions push us into conforming to the expectations of the algorithm, which then reinforces those expectations, creating a cycle of further pushing us into the language use patterns generated by software -- as opposed to idiosyncratic language created by a human mind.

In other words, people are already behaving like bots; and we're building more and more software to encourage such behavior.

alephxyz · on June 23, 2022

Those suggestions appear in Google chat too and even if you don't click on them, the simple fact of reading the suggestion makes you much more likely to type it yourself. There's clearly a priming effect to it.

yoyopa · on June 23, 2022

depends on your personality

AlecSchueler · on June 23, 2022

On average, it doesn't. This is why advertising and magic work.

bgroat · on June 23, 2022

I'm a magician and a developer by training.

Now primarily employed in a marketing capacity.

Over my career I've worked with: - Doctors - Lawyers - Engineers - Fund managers - Academics (hard and soft sciences) - Mentalists/Hypnotists

All of them believed that they're specific training and temperament made them immune from simple persuasion techniques and that they were purely rational actors.

None of them struck me as any more rational/more independent thinkers than anyone else off the street

ocimbote · on June 23, 2022

It is typical to rate yourself above your actual self.

Even when someone rates oneself down like when saying of themself that they're dumb, ugly or whatever, they generally mean it in a lesser fashion than for any other peer they'd attribute as such.

ClumsyPilot · on June 23, 2022

But its not above, its ascribung a mythical ability that does not exist - we don't talk about people who think they are psycic as optimistic, we call them crazy.

these guys are similar, except it's common belief.

jcelerier · on June 23, 2022

nope, just being exposed to text influences you whether you want it or not

Izkata · on June 23, 2022

They're saying it might influence you to type something different. Some of us are just contrary.

Dylan16807 · on June 23, 2022

Sometimes. Good luck keeping that up the majority of the time something tries to influence you.

codeviking · on June 23, 2022

Which is why it's important for folks to start applying AI to more interesting (but harder, more nuanced) problems. Instead of making it easier for people to write emails, or targeting ads, it should be used to help doctors, surgeons and scientists.

The problem is that these problems are less profitable. And that the companies with enough compute to train these types of models are concerned about getting more eyeballs, not making the world a better place.

wskinner · on June 23, 2022

The problem is not that those problems are less profitable. The problem is a combination of 1. Those problems are much harder 2. The potential harm from getting them wrong is much larger

codeviking · on June 23, 2022

Yup, I definitely agree that they're harder (and noted this). But I'm not sure I agree with your second point. Or rather, I think there's some nuance to it.

Sure, using AI to treat people without a human in the loop would clearly do harm. But using AI as an assistant, to help a doctor make the right diagnosis, seems like it'd do the opposite. It'd help doctors serve a larger patient population, make less mistakes, and probably equate to less harm in the long run.

Anyway, I think we can all agree that using AI for anything other than ad targeting is a net win.

remram · on June 23, 2022

Those suggestions are very few so I suspect they were hand-picked.

mattnewton · on June 23, 2022

I don’t know if this is how it still works, but early attempts were modeled as classification problems with hundreds of hand picked completions. Can’t predict something really bad if it isn’t in your prediction list. This limits the surface of bad things to cases of tone mismatch like “sounds great” when talking about someone grieving a loss or something.

orbital-decay · on June 23, 2022

Doesn't GMail collect the data in some form of federated learning nowadays, like GBoard does? Federated learning does seem to be able to create the unintended positive feedback loop, converging on a single phrase and causing the users to lock themselves in a bubble.

r3trohack3r · on June 23, 2022

A tangentially related thought:

Actors attempt to imitate humans. “Good acting” is convincing; the audience believes the actor is giving a reasonable response to the portrayed situation.

But the audience is also trying to imitate the actors to some degree. Like you point out, humans imitate. For some subset of the population, I’d imagine the majority of social situations they are exposed to, and the responses to situations they observe, are portrayed by actors.

At what point are actors defining the social responses that they then try to imitate? In other words, at what point does acting beget acting and how much of our daily social interactions actually are driven by actors? And is this world of actors creating artificial social responses substantially different than bots doing the same?

jdsully · on June 23, 2022

This is a common phenomena where the fake is more believable than the real thing due to over exposure of the imitation.

Famously the bald eagle sounds nothing like it does in tv and the movies and explosions are rarely massive fireballs. For human interaction it’s much harder to pin down cause and effect but if it happens in other cases it would be very surprising to not happen there.

futureshock · on June 23, 2022

This is famously theorized by postmodernism. See: https://en.m.wikipedia.org/wiki/Simulacra_and_Simulation

dvirsky · on June 23, 2022

Someone wrote once about how Wall Street people started behaving like the slick image projected of them in movies in the 80s, namely of Michael Douglas; before that they were more like the "boring accountant" type.

freewizard · on June 23, 2022

So maybe the Turing Test is not about AI are smart enough, but about how stupid humans become?

rexpop · on June 23, 2022

Not stupid; imaginative and agreeable.

bgroat · on June 23, 2022

These are also the elements that make a good hypnosis subject.

I can't put a dumb person under.

I need someone with an active imagination who wants to work with me (for best results)

CRConrad · on June 25, 2022

Are we sure there's a difference?

iforgotpassword · on June 23, 2022

It's the commonly believed reason; the child starting to take on habits from Gua, like noises when she wanted something, and the way monkeys scratch themselves. No authoritative source for it though, it's what I've been told during a lecture back in college, and I think PlainlyDifficult mentions it too in their video about it.

https://youtu.be/VP8DD9TGNlU

alcover · on June 23, 2022

Nice post! But to me your analogy does not really stand : bots are the ones catching up with human conversation in an "accelerated way", feeding on a corpus that predates them. Bots are not an invariant nature that netizens imitate.

CRConrad · on June 25, 2022

I sincerely regret that I had only one upvote to give you. This shit is so insidious that IMO everyone should just simply stop doing it until they've thought it through a lot more.

> ...humans condition themselves in a far more accelerated way to behave like bots than bots are potentially able to do.

Than bots can condition themselves to behave like humans, I presume. They can already behave exactly like bots. :-)

gigglesupstairs · on June 23, 2022

Wow this is such a mind bending perspective. Thanks for sharing it.

deathemperor · on June 23, 2022

My mind is blown. Thanks for sharing. Especially with the movie analogy. I’m a very movie person and I imitate my personality traits a lot based on characters on movies…

jazzyjackson · on June 23, 2022

thanks for the awesome analogy, I always had the sinking feeling that the bots are finding it increasingly easy to fit in among the humans because the humans on social media act increasingly like bots.

"monkey see, monkey do"

tiborsaas · on June 23, 2022

I think there will be a trend where model's size will shrink due to better optimization / compression while hardware specs keep increasing.

You can already see this with Chinchilla:

https://towardsdatascience.com/a-new-ai-trend-chinchilla-70b...

Comevius · on June 23, 2022

That's definitely the future, personalized entertainment and social interactions will be big. I could watch a movie made for me, and discuss it with a bunch of chat bots. The future will be bubbly as hell, people will be decaying in their safe places as the hellscape rages on outside.

Peritract · on June 23, 2022

> I could watch a movie made for me

We're a long, long way from this. Stringing words/images together into a coherent sequence is arguably the easy bit of creating novels/films, and computers still lag a long way behind humans in this regard.

Structuring a narrative is a harder, subtler step. Our most advanced ML solutions are improving rapidly, but often struggle with coherence over a single paragraph; they're not going to be doing satisfying foreshadowing and emotional beats for a while.

jb_s · on June 23, 2022

For many movies, sure.

I'm pretty sure the Marvel franchise is shat out by an algorithm.

orbital-decay · on June 23, 2022

You jest, but it really is the case. When your movie has a goddamn board of directors, you can be 100% sure it will be A/B tested until it transmutes the surrounding air into gold.

fumblebee · on June 23, 2022

Maybe. But I think a lot of folks have a short term memory; it was not so long ago that Word2Vec and AlexNet were SOTA. Remember when the thought of a human besting a world-class player at Go was impossible? Me too.

We've come ludicrously far since then. That progress doesn't guarantee that innovation in the space will continue at its current pace, but it sure does feel like it's possible.

importantbrian · on June 23, 2022

I actually wouldn't be surprised if the technology catches up to this faster than we realize. I think the actual barrier to large scale adoption of it will be financial and social incentives.

A big reason all the major studios are moving to big franchises is that the real money is in licensing the merch. The movies and TV shows are really just there to sell more merch. Maybe this will work when we all have high quality 3d printers at our desks and we can just print the merch they sell us.

The other big barrier is social. A lot of what people watch, they watch because it was recommended to them by friends or colleagues, and they want to talk about what other people are talking about. I'm sure that there will be many people who will get really into watching custom movies and discussing those movies with chatbots, but I bet most people will still want to socialize and discuss the movies they watch with other humans. FOMO is an underestimated driver of media consumption.

axg11 · on June 23, 2022

> We're a long, long way from this.

We’re probably 18 months away from this. We’re probably less than 5 years away from being able to do this on local hardware. AI/ML is advancing faster than most people realise.

thatwasunusual · on June 23, 2022

> Structuring a narrative is a harder, subtler step.

You can say that about many movies/series made entirely by humans today. :)

natly · on June 23, 2022

We're probably a long way away from narrative, but dall-e for video is probably only a year or two away from now (they're probably training the model as we speak).

htrp · on June 24, 2022

https://video-diffusion.github.io/

Google's got you!

pydry · on June 23, 2022

I get the feeling that creative sci fi used to kind of help inoculate us against these kinds of future but it seems like there's much less of it than there used to be.

"Black mirror" was good but it's not nearly enough.

rasz · on June 23, 2022

You really dont want to live in Mindwarp (1992 Bruce Campbell movie) or in this !114! year old short story https://en.wikipedia.org/wiki/The_Machine_Stops

dTal · on June 23, 2022

The Machine Stops is eerily prescient - or perhaps just keenly observant of trends visible even at the time - but in fairness the humans in it are not socially isolated, as such; they do not converse with bots, but rather with each other. The primary social activity in the The Machine Stops is the Zoom meeting.

I do not look forward to the day when that story becomes an optimistic view of the future.

mrguyorama · on June 23, 2022

That story is already an optimistic view compared to our own: They have no ads

espadrine · on June 23, 2022

> I have to wonder if 10 years down the line, everyone will be able to run models like this on their own computers.

Isn’t that already the case? Sure, it costs $60K, but that is accessible to a surprisingly large minority, considering the potency of this software.

alexb_ · on June 23, 2022

...what? 60 thousand dollars for a dedicated computer that you can't use is not everyone, not on their own computers, and is also a crazy large amount of money for nearly everyone. Sure there are some that could, but that's not what I said.

H8crilA · on June 23, 2022

Indeed. What "everyone" can use is a ~$200 smartphone, so there's a ~300x gap to be bridged.

ben_w · on June 23, 2022

Most of the cost of a phone isn’t the processor, so probably closer to x1000. Hardware may get that much cheaper, but it was never guaranteed, and we’re not making progress as fast as we used to.

wjnc · on June 23, 2022

log(300) / log(2) = only 8.2 doublings away. That's near future material.

thfuran · on June 23, 2022

Maybe at 90s hardware growth rates, but not now.

rictic · on June 23, 2022

The dream of the 90s is alive in the GPU market: https://aiimpacts.org/2019-recent-trends-in-gpu-price-per-fl...

Moore's law didn't stop, just Dennard scaling. Expect graphics and AI to continue to improve radically in performance/price, while more ordinary workloads see only modest improvements.

thfuran · on June 23, 2022

GPU TDP seems on the verge of going exponential, cost per transistor isn't really decreasing so much at the very latest nodes, and even that article seems to suggest it'd likely be decades before 300x flops/$

paganel · on June 23, 2022

Plus, that's the energy costs involved when running a computer now worth 60k, I'm pretty sure that in the current socio-economic climate those power costs will surpass the initial acquisition cost (those 60k, that is) pretty easily.

colechristensen · on June 23, 2022

An 80GB nvidia A100 goes for $20k and uses 300 watts, the energy costs of using one (or three) isn’t going to surpass the hardware costs for… a while.

paganel · on June 23, 2022

I wanted to add that I was writing it metaphorically in a way, as in, seeing as how high those energy bills will be they might as well all add up to 60k.

Not sure about most of the people in here, but I would get really nervous at the thought of running something that eats up 3x300 watts per hour, for 24/7, just as part of a personal/hobby project. The incoming power bills would be too high, you have to be in the wage-percentile for which dropping 60k on a machine just to carry out some hobby project is ok, i.e. you’d have to be “high-ish” middle-class at least.

The recent increases in consumer power prices are a heavy blow for most of the middle-class around Europe (not sure about how things are in the States), so a project like this one is just a no-go for most of middle-class European programmers/computer people.

colechristensen · on June 23, 2022

At full power 3 of those would cost me ~$3.50 per day ($0.15 per kWh is what I paid for last month's electricity, though I could pay less if I made some difference choices), I occasionally have a more expensive coffee order, or have a cocktail worth three times as much.

Things are getting more expensive here but nothing like the situation in Europe (essentially none of our energy was imported from Russia, historically ~10% of oil imports but that was mostly to refine and re-export, we have all the natural gas locally that we need) The US crossed the line into being a net hyrdocarbon energy exporter a while ago (unsure what the case is recently but it is at worst about at parity)

golem14 · on June 23, 2022

You must not have a pool :)

px43 · on June 23, 2022

Eh, 60k is just a bit more expensive than your average car, and lots of people have cars, and that's just how things are today. I imagine capabilities will be skyrocketing and prices will fall drastically at the same time.

fuzzer37 · on June 23, 2022

> 60k is just a bit more expensive than your average car

If by "A bit" you mean about 30-40k

CRConrad · on June 25, 2022

> > 60k is just a bit more expensive than your average car

> If by "A bit" you mean about 30-40k

30k more expensive: Than your very-low-end-"average" car.

40k more expensive: Than your average used car.

AFAICS. All in what one sees as an "average" car, I suppose.

joshvm · on June 23, 2022

You could just run this on a desktop CPU, there's nothing stopping you in principle, you just need enough RAM. A big memory (256GB) machine is definitely doable at home. It's going to cost 1-2k on the DIMMs alone, less if you use 8x32GB, but that'll come down. You could definitely do it for less than $5k all in.

Inference latency is a lot higher in relative terms, but even for things like image processing running a CNN on a CPU isn't particularly bad if you're experimenting, or even for low load production work.

But for really transient loads you're better off just renting seconds-minutes on a VM.

sascha_sl · on June 23, 2022

From the readme, it looks like you need that RAM on your GPU.

joshvm · on June 23, 2022

There isn't any reason you can't run a neural net on a CPU. It's still just a bunch of big matrix operations. The advantage of the GPU is it's a lot faster, but "a lot" might be 1 second versus 10 seconds, and for some applications 10 seconds of inference latency is just fine (I have no idea how long this model would take). All the major ML libraries will operate in CPU-only mode if you request it.

visarga · on June 23, 2022

They are pretty slow even on GPU. The problem is that it's an autoregressive model. So it needs to do a forward pass for each token.

uniqueuid · on June 23, 2022

Nitpick: This uses 8x A100 which are at least $10k a piece to my knowledge. Add in the computer and you're closer to $100k.

taink · on June 23, 2022

I believe you're confusing the amount of A100 graphics cards used to train the model (the cluster was actually made up of 800 A100s), and the amount you need to run the model :

> The model [...] is supposed to run on multiple GPUs with tensor parallelism.

> It was tested on 4 (A100 80g) and 8 (V100 32g) GPUs, [but should work] with ≈200GB of GPU memory.

I don't know what the price of a V100 is, but given $10k a piece for A100s we would be closer to the $60k estimate.

uniqueuid · on June 23, 2022

The $10k price is for an A100 with 40GB ram, so you need 8 of those. If you can get your hands on the 80GB variant, 4 are enough.

Also, if you want to have a machine with eight of these cards, it will need to be a pretty high-spec rack-mounted or large tower. To feed these GPUs, you will want to have a decent amount of PCIe-4 lanes, meaning EPYC are the logical choice. So that's $20k for an AMD EPYC server with at least 1.6kw PSUs etc etc.

justinlloyd · on June 23, 2022

You don't need a "decent amount" of PCIe-4 lanes. You just need 16 of them. And they can be PCIe 3.0 and will work just fine. Deep learning compute boxes predominantly use a PCIe switch. e.g. the ASUS 8000 box, which handles eight cards just fine. You only need a metric tonne of PCIe bandwidth if you are constantly shuttling data in and out of the GPU, e.g. in a game or exceedinyl large training sets of computer vision data. A little latency of a few hundred milliseconds moving data to your GPU in a training session that will take hours if not days to complete is neither here nor then. I suspect this model, with a little tweaking, will run just fine on an eight way RTX A5000 setup, or a five-way A6000 completely unhindered. That puts the price around $20,000 to $30,000. If I put two more A5000s in my machine, I suspect I could figure out how to get the model to load.

It also sounds like they haven't optimized their model, or done any split on it, but if they did, I suspect they could load it up and have it infer slower on fewer GPUs, by using main memory.

riku_iki · on June 23, 2022

There is also $5k A6000 with 48GB

justinlloyd · on June 23, 2022

Which will work just fine with NVIDIA SWITCH and a decent GPU compute case from ASUS or IBM or even building your own out of an off-the-shelf PCIe switch and consumer motherboard.

taink · on June 23, 2022

Do you happen to know the cost of the 80GB variant?

uniqueuid · on June 23, 2022

The PNY variant is pretty much the only one you can try to buy as an individual part, and those go for ~$15k. If you can get them.

Note that A100 like other datacenter GPUs are passively cooled. You need a strong airflow and duct in any case that would house them.

colechristensen · on June 23, 2022

Dell sells them for $20k

uniqueuid · on June 23, 2022

Or ~25k in Euro. Ouch.

sascha_sl · on June 23, 2022

And also, NVIDIA does not sell them to the consumer market whatsoever. Linus Tech Tips could only show one because someone in the audience sent theirs over for review.

kamray23 · on June 23, 2022

You're grossly overestimating. People who make 60k annually are getting a bit rarer nowadays, it's not like everyone can afford it. For the majority of people it'd be a multi-decade project, for a few it might only take 7 years, very few people could buy it all at once.

wellthisisgreat · on June 23, 2022

What kind of computer would they be?

Can you spec it out roughly?

zackmorris · on June 23, 2022

Unpopular opinion: something will stop egalitarian power for the masses. I had high hopes for multicore computing in the late 90s and early 2000s but it got blocked every step of the way by everyone doubling down on DSP (glorified vertex buffer) approaches on video cards, leaving us with the contrived dichotomy we see today between CPU and GPU.

Whatever we think will happen will not happen. A less-inspired known-good state will take its place, creating another status quo. Which will funnel us into dystopian futures. I'm just going off my own observations and life experience of the last 20 years, and the way that people in leadership positions keep letting the rest of us down after they make it.

nradov · on June 23, 2022

In what sense is the dichotomy between CPU and GPU contrived? Those are designed around fundamentally different use cases. For low power devices you can get CPU and GPU integrated into a single SOC.

zackmorris · on June 24, 2022

That's a good question. I wish I could answer it succinctly.

For me, the issue is that use cases and power usage are secondary to the fundamental science of computation. So it's fine to have matrix-processing stuff like OpenGL and TensorFlow, but those should be built on general-purpose hardware or else we end up with the cookie cutter solutions we have today. Want to run a giant artificial life simulation with genetic algorithms? Sorry, you can't do that on a GPU. And it turns out that most of the next-gen stuff I'm interested in just can't be done on a GPU.

There was a lot of progress on transputers and clusters (the old Beowulf cluster jokes) in the 80s and 90s. But researchers came up against memory latency issues (Amdahl's law) and began to abandon those approaches after video cards like the 3dfx Voodoo arrived around 1997.

But there are countless other ways to implement concurrency and parallelism. If you think of all the techniques as a galaxy, then GPUs are way out at the very end of one spiral arm. We've been out on that arm for 25 years. And while video games have gotten faster (at enormous personal effort by millions of people), we've missed out on the low hanging fruit that's possible on the other arms.

For example, code can be auto-parallelized without intrinsics. It can be statically analyzed to detect contexts which don't affect others, and the instructions in those local contexts could be internally spread over many cores. Like what happens in shaders.

But IMHO the greatest travesty of the modern era is that those innovations happened (poorly) in GPUs instead of CPUs. We should be able to go to the system menu and get info on our computer and see something like 1024+ cores running at 3 GHz. We should be able to use languages like Clojure and Erlang and Go and MATLAB and even C++ that auto-parallelize to that many cores. So embarrassingly parallel stuff like affine rasterization and blitters would run in a few cycles with ordinary for-loops instead of needing loops that are unrolled by hand or whatever other tedium that distracts developers from getting real work done. Like, why do we need a completely different paradigm for shaders outside of our usual C/C++/C# workflow, where we can't access system APIs or even the memory in our main code directly? That's nonsense.

And I don't say that lightly. My words are imperfect, but I do have a computer engineering degree. I know what I'm talking about, down to a very low level. Wherever I look, I just see so much unnecessary effort where humans tailor themselves to match the whims of the hardware, which is an anti-pattern at least as bad as repeating yourself. Unfortunately, the more I talk about this, the more I come off as some kind of crackpot as the world keeps rushing headlong out on the GPU spiral arm without knowing there's no there there at the end of it.

My point is that for all the progress in AI and rendering and simulation, we could have had that 20 years ago for a tiny fraction of the effort with more inspired architecture choices. The complexity and gatekeeping we see today are artifacts of those unfortunate decisions.

I dream of a day when we can devote a paltry few billion transistors on a small $100 CPU to 1000+ cores. Instead we have stuff like the Cerebras CS-2 with a trillion transistors for many thousands of dollars, which is cool and everything, but is ultimately gatekeeping that will keep today's Anakin from building C-3PO.

https://en.wikipedia.org/wiki/Multi-core_(computing)#Hardwar...

ur-whale · on June 23, 2022

You're an optimist.

Before any of the things you describe happen, most states will mandate the equivalent of a carry permit to be able to freely use compute for undeclared and/or unapproved purposes.

arathore · on June 23, 2022

If by running models you mean just the inference phase, then even today you can run large family of ML models on commodity hardware (with some elbow grease, of course). The training phase is generally the one not easily replicated by non-corporations.

natly · on June 23, 2022

I know it's a sort of exaggerated paranoid thought. But like these things do all come down to scale and some areas of the world definitely could have the amount of compute available to make dall-e level quality full scale videos which we might be consuming right now. It really does make you start to wonder at what point we will rationally be able to have zero trust that not everything we watch online is fabricated.

thelamest · on June 23, 2022

Historically, hard-to-falsify documents are an anomaly, the norm was mostly socially conditional and enforced trust. Civilizations leaned and still lean on limited-trust technologies like personal connections, word of mouth, word on paper, signatures, seals, careful custody etc. I agree losing cheap trust can be a setback, just want to point out we’re adaptable.

ggktk · on June 23, 2022

I'm predicting that the upcoming Mac Pro will be very popular among ML developers, thanks to unified memory. It should be able to fit the entire model in memory.

Combine that with the fact that PyTorch recently added support for Apple silicon GPUs.

tehsauce · on June 23, 2022

upcoming mac pro will have pretty poor ML performance when compared to even an old nvidia gpu sadly.

uniqueuid · on June 23, 2022

Although memory capacity may matter more than speed for inference. As long as you're not training or fine tuning, the mac pro / studio may be just fine.

apart from the fact that you can't use any of the many nvidia-specific things; if you're dependent on cuda, nvcuvid, AMP or other things that's a hard no.

calrizien · on June 24, 2022

What are the current best ML language models to play with on the M1Max?

time_to_smile · on June 23, 2022

Comments like this make me feel like I'm losing my mind.

I think it's far more likely that in 10 years we'll all become more used to rolling blackouts, and fondly remember we all used to be able to afford to eat out, and laugh over a glass of cheap gin about how wild things were back in the old days before things got really bad.

10 years ago was a much more exciting and hopeful time than today. I remember watching Hinton show off what deep learning was just starting to do. It was frankly more interesting that high parameter language models. Startups were all working on some cool problems rather than just trying to screw over customers.

That's just technology. Economically, socially and ecologically things looks far brighter in 2012 than they do now, and in 2032 I suspect we'll feel the same about today, but far more dramatically.

We've already pass the peak of "things are getting better all the time!" but people are just in denial about this.

alluro2 · on June 24, 2022

You're not alone - especially based on observations during the pandemic, it seem we are woefully unaware and unprepared for how fragile the structures supporting our current way of living are, and how easy they would collapse into a much worse state of living conditions when it comes to power and food, let alone luxuries and internet as we know it...

It also seems to me that most people would not be ready to give up more than 10% of their luxuries / way of living up-front in order to protect those structures and would continue to watch funny TikTok videos and post IG photos until the very moment their internet access goes out and doesn't come back.

lostmsu · on June 23, 2022

Running the models like this on own computer is already possible with DeepSpeed. I think it even supports training albeit it would be extremely slow.

https://www.deepspeed.ai/

pgt · on June 23, 2022

The Move to the Edge is one of the strongest trends in technology. So, yes. I would never best against it.

(applies to computing and other technologies like power production and agriculture)

user3939382 · on June 23, 2022

When I see AWS, cloud, and server side rendering frameworks it seems like we’re moving the other way in some sense.

ubercore · on June 23, 2022

There's a strong trend to push to the edge of the cloud though -- cloudfront workers, deno.deploy, etc

psychoslave · on June 23, 2022

I don’t know for you, but most of my online interactions are text based. Context of interpretation matter far much than the form of the content. If you know it’s easy to fake text exchanges, you might be more careful about text origin, and other contextual hints. Even it’s the syntax imitate your children verbal oddities, you may not necessarily run to comply thoughtlessly to an unusual demand you just receive by SMS from their phone number. Trust and check.

TuringNYC · on June 23, 2022

>> I have to wonder if 10 years down the line, everyone will be able to run models like this on their own computers.

Do you mean train or run? My assumption was all these models could be run on most computers, probably with a simple docker container, as long as there is sufficient RAM to hold the network, which should be most laptops > 16gb ram.

Speaking of which, anyone have recommendations on pre-trained docker containers with weights included?

nonrandomstring · on June 23, 2022

> one has to wonder what's real and what isn't.

And whether it really matters. That's the bigger question.

I think, for most of us, it does matter. But we're not sure why and what a loss of human reality would really mean.

For a few who wholeheartedly embrace it there's some resonance with the psychedelic/60s creed that sees this as some kind of "liberation".

Byamarro · on June 23, 2022

It could be possible with analog chips. I.e. ones that Mythic works on.

redox99 · on June 23, 2022

I'm not sure why you got downvoted. Yes, ASICs (either analog or digital) that have some model hardcoded in would probably make it feasible, but it won't be programmable which is the interesting part.

trasz · on June 23, 2022

Totally not my field, but why wouldn't they be programmable? Analog FPGA's already exist.

redox99 · on June 23, 2022

Yes, true. I was referring to the Mythic ones the other comment mentioned which are only for inference of a specific model.

simonh · on June 23, 2022

It's more likely, if not inevitable that these things will become ubiquitously available remotely, like Siri and Alexa. It's access that's important, not hosting.

unixhero · on June 23, 2022

Yes, the vision is that everyone has an AI cube in their house.

shaky-carrousel · on June 23, 2022

Then, we'll hack all those cubes to build an AGI.

Jimmy · on June 23, 2022

There’s a very simple solution, of course: turn off the computer and physically interact with real people.

ketzu · on June 23, 2022

Seeing those gigantic models it makes me sad that even the 4090 is supposed to stay at 24GB of RAM max. I really would like to be able to run/experiment on larger models at home.

thejosh · on June 23, 2022

It's also a power issue. The 4090 sounds like you're going to need a much, MUCH higher PSU than you currently use.. or it'll suddenly turn off as it uses 2-3x the power.

You'll need your own wiring to run your PC soon :-)

melenaboija · on June 23, 2022

I think it is a stupid question, but does the power consumption needed by processors to infer compared to human brains demonstrate that there is something fundamentally wrong for the AI approach or is it more physics related?

I am not a physicist or biologist or anything like that so my intuition is probably completely wrong but it seems to me that for more basic inference operations (lets say add two numbers) power consumption from a processor and a brain is not that different. It’s like seeing how expensive it is for computers to infer for any NLP model, humans should be continuously eating carbs just to talk.

agalunar · on June 23, 2022

Around room temperature, an ideal silicon transistor has a 60 mV/decade subthreshold swing, which (roughly speaking) means that a 10-fold increase in current requires at least a 60 mV increase in gate potential. There are some techniques (e.g. tunneling) that can allow you to get a bit below this, but it's a fairly fundamental limitation of transistors' efficiency.

[It's been quite a while since I studied this stuff, so I can't recall whether 60 mV/decade is a constant for silicon specifically or all semiconductors.]

googlryas · on June 23, 2022

> but it seems to me that for more basic inference operations (lets say add two numbers) power consumption from a processor and a brain is not that different

Sure it is - it is too hard to figure it out based on 2 numbers number, but lets multiply that by a billion - how much energy does it take a computer to add two billion numbers? Far less than the energy it would take a human brain to add them.

visarga · on June 23, 2022

The AI is much faster than the brain, if you batch requests the cost goes down.

PartiallyTyped · on June 23, 2022

I bought a 1500w psu soon after the previous crypto collapse for around $150, one of the best purchases I did.

Dylan16807 · on June 23, 2022

The RAM is not using all that much of the power, and I think that scales more on bus width than capacity.

perryizgr8 · on June 23, 2022

Nvidia deliberately keeps their consumer/gamer cards limited in memory. If you have a use for more RAM, they want you to buy their workstation offerings like RTX A6000 which has 48G DDR6 RAM or A100 which has 80G.

justinlloyd · on June 23, 2022

What NVIDIA predominantly does on their consumer cards is limit the RAM sharing, not the RAM itself. The inability for each GPU to share RAM is the limiting factor. It is why I have RTX A5000 GPUs and not RTX 3090 GPUs.

Voloskaya · on June 23, 2022

If you don't care about inference speed being in the 1-5sec range, then that should be doable with CPU offloading, with e.g. DeepSpeed.

qayxc · on June 23, 2022

200+ GiB of RAM still sounds like a pretty steep hardware requirement.

Voloskaya · on June 23, 2022

If you have an nvme deepspeed can offload there as a second tier once the RAM is full.

175 GB aggregate on both RAM and nvme is in the realm of home deep learning workstation.

As long as you aren’t too fussy about inference speed of course.

justinlloyd · on June 23, 2022

Oh yeah, that $750 for 256GB of DDR-4 is going to totally break the bank.

kfrzcode · on June 23, 2022

Damn I didn't know ram was so cheap

justinlloyd · on June 23, 2022

It only gets expensive if you insist on sourcing it from enterprise vendors. The first 256GB I paid $2,400 for. The second 256GB I paid $1,200 a little over a year later. And the third 256GB I paid $800 about seven months later. I've got a workstation with 768GB DDR4 and I am considering upping that to 1.5TB if the prices on the 256GB sticks will come down.

josu · on June 23, 2022

For the people that didn't click on the link:

>but is able to work with different configurations with ≈200GB of GPU memory in total which divide weight dimensions correctly (e.g. 16, 64, 128).

out_of_protocol · on June 23, 2022

Take a look at Apple's M1 Max, a lot of fast unified memory. No idea how useful though

jeroenhd · on June 23, 2022

What's the difference between Apple's unified memory and the shared memory pool Intel and AMD integrated GPUs have had for years?

In theory you could probably assign a powerful enough iGPU a few hundred gigabytes of memory already, but just like Apple Silicon the integrated GPU isn't exactly very powerful. The difference between the M1 iGPU and the AMD 5700G is less than 10% and a loaded out system should theoretically be tweakable to dedicate hundreds of gigabytes of VRAM to it.

It's just a waste of space. An RTX3090 is 6 to 7 times faster than even the M1, and the promised performance increase of about 35% for the M2 will means nothing when the 4090 will be released this year.

I think there are better solutions for this. Leveraging the high throughput of PCIe 5 and resizable BAR support might be used to quickly swap out banks of GPU memory, for example, at a performance decrease.

One big problem with this is that GPU manufacturers have incentive to not implement ways for consumers GPUs to compete with their datacenter products. If a 3080 with some memory tricks can approach an A800 well enough, Nvidia might let a lot of profit slip through their hands and they can't have that.

Maybe Apple's tensor chip will be able to provide a performance boost here, but it's stuck on working with macOS and the implementations all seem proprietary so I don't think cross platform researchers will really care about using it. You're restricted by Apple's memory limitations anyway, it's not like you can upgrade their hardware.

zaptrem · on June 23, 2022

Apple gets significant latency and frequency benefits from placing their LPDDR4 on the SoC itself.

thereddaikon · on June 23, 2022

Unified memory is and always has been a cost cutting tactic. Its not a feature not matter how much manufacturers who use it try to claim it is.

postalrat · on June 23, 2022

Apple is selling M1's with > 200gb ram? Have a link so I can buy one?

MrBuddyCasino · on June 23, 2022

Wondering if Apple Silicon will bring arge amounts of unified main memory with high bandwidth to the masses?

The Mac Studio maxes out at 128GB currently for around $5K, so 256GB isn't that far out and might work with the ~200GB Yandex says is required.

Havoc · on June 23, 2022

Perhaps on quantity. Substantially slower though around ~3x from what I can tell…substantial roadblock if you’re training models that take weeks.

MrBuddyCasino · on June 23, 2022

I meant for inference, not training. People just want to run the magic genies locally and post funny AI content.

Havoc · on June 23, 2022

ah right - gotcha

EugeneOZ · on June 23, 2022

Can Apple Silicone's unified memory be an answer?

lostmsu · on June 23, 2022

I downloaded the weights and made a .torrent file (also a magnet link, see raw README.md). Can somebody else who downloaded the files as well doublecheck the checksums?

https://github.com/lostmsu/YaLM-100B/tree/Torrent