LLM discourse needs more nuance (proofinprogress.com)
108 points by timdaub on Feb 1, 2023 | hide | past | favorite | 155 comments



Is this a good summary?

LLMs have solved the language processing problem. Their responses are fluent and rapidly becoming indistinguishable from human output. If anything, their responses are too good to be mistaken for those of an average human.

However, that's different from producing accurate knowledge and insights. They're often bullshitting, in the sense that their convincing prose often turns out not to reflect reality.

The reference problem seems to be one example of this.


People were worried that if we created an AI in our image, it would be a belligerent entity. But we did create an AI in our image, and it's a bullshitter.


"Artificial Inanity" as Neal Stephenson called it in Anathem:

“Well, bad crap would be an unformatted document consisting of random letters. Good crap would be a beautifully typeset, well-written document that contained a hundred correct, verifiable sentences and one that was subtly false. It’s a lot harder to generate good crap. At first they had to hire humans to churn it out. They mostly did it by taking legitimate documents and inserting errors-swapping one name for another, say. But it didn’t really take off until the military got interested.”


On a philosophical level, I wonder if ChatGPT gives us mostly nonsense because most human output is bullshit.

To paraphrase a famous quip, what if this is proof that 99% of everything humans have created is bullshit? The problem perhaps has never been machine learning but identifying the process that enables us humans to somehow go through life and create objective societal and scientific progress out of a massive pile of nonsense.

I might be completely wrong here, but the next ChatGPT will assume this comment of mine to be absolute truth and happily ingest it wholesale.


A key problem is that the transformer operates over very short fragments of text. Long range correlations need to be maintained to ensure 100% accuracy in reproduction/inference of relationships.

I realize this is a philosophical take. It also suggests a technical solution: model longer range correlations. GPTs seem so capable because previous models were not as good at this, but the context lengths modeled with transformers are still extremely short relative to those that we work with.

As a result, GPTs are very good at translation and reformatting over short ranges, but struggle with long-form texts; a typical technique is to make summaries of summaries to build out a model of a longer text, as sketched below. That's hardly ideal, and it's not how our minds work.
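To make that workaround concrete, here is a minimal sketch in Python; summarize() is a hypothetical stand-in for whatever single LLM call you would actually use, and the chunk size is an arbitrary assumption, not a real context limit.

    # Minimal sketch of "summaries of summaries"; summarize() stands in for one LLM call.
    def summarize(text: str) -> str:
        # Placeholder so the sketch runs end to end; in practice this would
        # prompt a model with something like "Summarize the following text: ...".
        return text[:200]

    def summarize_long(text: str, chunk_size: int = 3000) -> str:
        # Split the document into chunks that fit the context window, summarize
        # each chunk, then recurse on the combined summaries until they fit.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        partials = [summarize(c) for c in chunks]
        combined = "\n".join(partials)
        if len(combined) > chunk_size:
            return summarize_long(combined, chunk_size)
        return summarize(combined)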


That's a good observation, though I'm not sure it's just a matter of longer-range correlations, i.e. fixable by merely adding more computational power. The truth problem is fundamentally different from what those neural networks are trying to solve. We got the learning part almost right; we have made no progress towards understanding.

I might be completely off the mark, my knowledge of AI is very limited. But earlier today, while reading about Lisp's history and its demise during the AI Winter, I very much enjoyed a Wikipedia article describing how symbolic AI, the main paradigm of AI research from the 60s to the 2000s, solves a completely different problem than the currently hyped neural networks, which are very good at learning but not much more. We might need a complementary, unifying model of both approaches to reach AGI.

https://en.m.wikipedia.org/wiki/Symbolic_artificial_intellig...


The average of two true statements can easily become false, so models like ChatGPT will spout nonsense even if they were only trained on 100% truthful statements.


Can you give an example of what you mean, please?


One classic example from linguistics is the two (true) sentences

"Politicians lie."

"Cast iron sinks."

Combine them syntactically and you get "Politicians lie in cast iron sinks", perhaps not completely false but certainly less useful than the component parts.

EDIT: It seems this is actually from Douglas Hofstadter


It gives us nonsense because its goal is to generate something that looks like it fits in with the training data. You can have training data that is all completely factual and still have a generator spit out nonsense that looks completely correct without fact-checking.


So, exactly like people then?


Sure, a lot of human output is incoherent, but current models give nonsense for a different reason: they lack many mechanisms typical of human intelligence.


Seems to me it is more of a babbler. You know, like a young toddler just learning to talk.


Or a million monkeys with typewriters, only these monkeys are not entirely random; they have been conditioned to write things similar to what people actually write. Most of the time the output looks very natural, but at other times you get nonsense because the noise is too loud.


ChatGPT is only non-belligerent because it's biased with a prompt preface that tells it it's not belligerent. Raw LLMs will absolutely spit out racist, violent garbage and threaten you.

Ironically, the only reason people get LLMs to say they hate humans and that AI is going to rise up and kill us all is that their training data has so much text from us speculating about a robot revolution. If we'd just stayed silent about those concerns instead of talking about robots taking over in most conversations and movies about AI, then LLMs at least would be basically incapable of even conceiving of that. So it's a bit of a self-fulfilling prophecy.


Isn't a bullshitter a belligerent entity?


Belligerence is an intention. Bullshitting is a technique.

The bot is incapable of intention so the entity here is still the human who prompted it. What the bullshit it produces is used for depends on their intentions.


I think belligerence can also be a technique and bullshitting can also be an intention.


Hook a gun up to it and you have a belligerent bullshitter. Seems fairly accurate for a good bit of humanity.


I think a lot of people (like OP) seize on the problems with using LLMs for search while ignoring the many things that it can already do more or less flawlessly.

- Summarize text: If you paste in a large amount of text, it can almost perfectly deliver a good summary. Imperfections are virtually never hallucinations; they're grammatical oddities.

- Create content: As long as you're supplying the underlying facts, LLMs are great at writing blog posts/emails/etc. Intercom just released a feature that allows you to type a response and use AI to change the tone, expand it out, shorten it up, etc. LLMs are fantastic at this.

- Search/information retrieval in some categories: Look, we get it, LLMs hallucinate. And of course, as many people point out, you can easily get them to hallucinate by asking specifically odd questions. But the reality is they're still really good at finding a lot of things - recipes are a great example. I've yet to see ChatGPT throw in a bizarre ingredient or omit something critical.

I think too much criticism of LLMs inaccurately assumes a single use case and throws out the whole concept because they're imperfect for that use case.


>As long as you're supplying the underlying facts, LLMs are great at writing blog posts

I have trouble agreeing.

I've experimented with this and found that ChatGPT can produce very pedestrian prose that might pass muster on a content farm and possibly other sites with fairly low quality standards but I certainly wouldn't call it "great."

That said, it could serve as an assistant for a human writing certain types of posts.


I think so, yes. I think it has solved the problem of making believable human language with some sense of consistency. That's no small feat, and for some tasks (writing limericks, reformulating a letter, translation) it's even the whole point of the exercise, so for those tasks AI has made it a "solved problem", much as digit recognition became one years back when models started beating humans at it.

But for other tasks the key isn't just producing believable consistent text, but something else. Perhaps the text must be factually correct or creative or traceable or have some other trait like that. And that's the majority of applications by far. And I'm sure it's the majority of monetizable applications.

I think at this point people, especially non-technical people, are lured by the glitter of ChatGPT. When it spits out an amazing limerick, I immediately hear managers in my company raving about how it could revolutionize our product help documentation. They ask it a question like "How do you make text bold in [ourproduct]?", to which it replies "You select the text and hit Ctrl+B", and the answer happens to be correct. The manager is now immediately convinced ChatGPT is intelligent enough to basically run the company. I'm trying to downplay the answer, explaining that it's more a statistical sentence-completing thing than world knowledge, and that the Ctrl+B might just have been a generic answer...

With that said, I also have an opposite argument here: that current LLMs, while not intelligent, are very close to a tipping point that is going to be as important as AGI will be:

The big AI revolution will be when we have AI's that people think are intelligent, not when they actually are.

There was much said about the Google engineer who thought his chatbot was intelligent. And while it might not have been, we clearly saw the societal impact of AI right there. It's not what the AI can do for us, it's what it will do TO us. Imagine if your pocket AI tamagotchi was the only being you had opened up to about something, for the last 10 years, and you felt as though it really listened. Now how would you feel if someone broke it? What happens to our communications when we can't tell artificial from real? This "societal singularity" we'd have to deal with might be a few years away, even if actual AGI is centuries away.


> I'm trying to downplay the answer, explaining that it's more a statistical sentence-completing thing than world knowledge, and that the Ctrl+B might just have been a generic answer...

Yes; one of the problems seems to be that ChatGPT is playing right into some of our biggest cognitive loopholes: confirmation bias, selection bias, narrative bias. As with fortune tellers, people only remember the right answers, or will silently contort the wrong ones to fit what they expected to hear.

I do believe that AI will continue to progress, but the current form is nothing but two mirrors face to face, with us standing in the middle.


> Imagine if your pocket AI tamagotchi was the only being you had opened up to about something, for the last 10 years, and you felt as though it really listened. Now how would you feel if someone broke it?

See also, Her (2013).


> See also, Her (2013).

And also, Ex Machina (2014).


> Is this a good summary?

On par with best available AI summary! ;) [1]

[1] https://labs.kagi.com/ai/sum?url=https://proofinprogress.com...


> They're often bullshitting, in the sense that their convincing prose often turns out not to reflect reality.

Maybe that’s all we as humans want, output to communicate with other humans?


"LLMs have solved the language processing problem."

No, they haven't. They have made great strides in certain directions, but are woefully deficient in others.

For one thing, as impressive as it looks, it's really quite difficult to use. The "knowledge" it has about text isn't an obviously usable understanding of the grammar, or some other symbolic representation of the information that could be used by other technologies; it's more a self-referential knowledge embedded in opaque neural net values that can almost only be used to extend the text. That is not 100% true, but using it for anything else is really hard. I would expect a technology that has "solved" the language processing problem to be useful for a much wider variety of tasks, to be, for instance, something I could hook up to the Unix shell to provide a safe and reliable human language interface to it.

(Note that prompting this model for some shell commands is light years from what I mean; along with the dangerous unreliability of the resulting commands, it also isn't actually hooked up to your shell and operating in a space where it knows your directories, files, and their contents. I want "give me a list of files relating to my business deals with Comcast" to literally work on the shell, to the point the result could then be piped to some other plain-text request, not what ChatGPT can do.)

"Its responses are fluent and rapidly becoming indistinguishable from human output."

Again, in some directions.

"It’s often bullshitting, in the sense that it’s convincing prose often turns out to not reflect reality."

While I acknowledge the second clause makes your sentence accurate on your terms, I would still say a better way to think of an LLM is that it is always bullshitting, always confabulating. It's just built in a way that the maximum-probability confabulation on a factual topic may happen to be the fact. I mean, if you think about it, that shouldn't seem that surprising a statement. I've done it myself in real life, just accurately guessed a fact I didn't really know. Less often than I've been wrong when making such guesses, of course, but it's still something not outside our range of experience.

But it's not like the LLM in any sense "knows" when it is telling the truth and "knows" when it is bullshitting. You can get spun into a real philosophical tizzy about what that "knows" means, but it doesn't matter, because for any sensible definition this particular model doesn't "know". It isn't built to "know". It just spits out the maximum-probability continuation. This is not a claim about all AI architectures, it's just about LLMs. They don't know. Even if you convince one to output that it doesn't know, it doesn't know that either; it just thinks that's the most likely logical continuation. While this may be surprising to anyone who has been on the Internet for a while, there are in fact samples of people saying "I don't really know" on the Internet that would have worked their way into the training data.

I will say this: I think GPT does serve as the final refutation of the Turing Test as a measure of AI. It turns out to be entirely possible to build an "AI" that is basically optimized to pass it, yet strangely incapable of almost anything else we would expect an "AI" to be capable of.

I'm actually much more impressed by GPT than I sound on HN. It is a legitimately interesting technology and very impressive. The problem is, the Gartner hype cycle has massively overshot the degree to which it is interesting and impressive, so I sound very down on it as I try to bring people back to Earth. I keep going back to the video game metaphor... video games look far more interesting and complex than they often actually are, because the graphics can look so amazing and awesome and we've made such exponential advances in that field over the past few years that we can forget that the state of the art in, say, NPC dialog remains the "choose your own adventure" dialog tree that has been a familiar staple in games for over 30 years now, with the only elaboration since then being full voice acting (by humans). ChatGPT is the video game graphics of the AI world. Still impressive on its own terms! It's amazing what we pump to our screens at 120Hz/4K with the right hardware... yet what is behind those graphics is nowhere near as impressive. Similarly for ChatGPT. Very impressive! But it's a munchkin that dumped all its stat points into "sounding good to humans" and has hardly any stat points anywhere else.


Try some prompt engineering courses. It’ll change your perception. Some of my prompts are like 1200 words just to set context. But I use it for email, marketing outreach, blogs, you name it. It’s drastically increased my productivity as a solo consultant.


That wouldn't even remotely change my impression. I'm already aware, as I said, that ChatGPT is very good at continuing text.

Indeed, the scariest thing about ChatGPT as a technology is that the thing it is most useful for, the most obvious use case, is exactly what you are using it for: Increasing the amount of bullshit text in the world without even the gate of a human having to write it, which was already not much of a gate at all.

The problem is, that's by far the thing it is best at, and by far the most useful use for it. But that's a net loss for humanity overall. Hooray. Unfortunately, it is not something that can be stopped; there are so many people rushing to take a dump in that commons that it is unstoppable now. If indeed it hasn't already been happening for a while now (Dead Internet theory; if it isn't already true, it certainly will be).


I write around 100-200 emails a day, coordinate across various projects and write a lot of copy around the solutions and systems the company develops.

Explain to me how this is 'bullshit text'. It sounds to me like you are just repeating other criticisms you've heard because you don't like the tool. To me, it would be rude to respond to certain emails as tersely as what I actually tell ChatGPT; I can give ChatGPT a very basic framework and it will respond in a way that is not overly laconic.

Writing certain copy toward a requirement or proposal is also not "bullshit". ChatGPT is useful for sketching out a rough outline, and it's easier to go and edit it to your liking than to write it from scratch.

I reject your definition of emails, copy and proposals as bullshit. They are necessary, and if I can use ChatGPT to help me save time by proofreading, generating outlines, managing tone and generally making my communication more streamlined and shorter, how could you call all of that "bullshit text"?


ChatGPT ~is Auto-tune for text. It already kinda sucked to be poor but it is going to really suck going forward.

https://en.wikipedia.org/wiki/Auto-Tune


Do you have some courses you recommend?


I kind of feel like that is exactly what your comment is: you synthesized a lot of other material and are throwing it out there for analysis. We go through a lot of bad ideas that humans believe that are basically bullshit.

This applies to my own comment/response as well; it is the same bullshit, and it may be right. Time will tell.

I guess my point is, what if this is exactly how new human ideas are created?


It's an interesting situation because on one level, the technology is at least somewhat misrepresented (best-case results rising to the top of everyone's feeds), but on the other side, we haven't even begun to see what integrated ChatGPT will bring. If I could replace Siri with ChatGPT, I would do so immediately, as it's objectively better. The author makes some points about the eye-watering costs; however, if this is something that people truly want, and I think we do want it, the market will find a way to deliver it at scale in a cost effective way.


I agree, it is a bit of an odd moment. On the one hand, you have the normal amount of hype and bullshit. The people calling out that hype are right of course, as they have been right about 1000 other vaporware demos and outright marketing lies. The groundbreaking product does not yet exist and may never exist.

But on the other hand, there is something real and incredible here. It has massively pushed forward the frontier of what a computer can do. If you compare the output of these models to sci-fi like Star Trek, the fantasy was too conservative. It calls into question what it means to be human, what makes us special, and what is art. It’s the most interesting moment in art since postmodernism.

So many people are rightly looking at the trend and trajectory. Incredible things have already been created. Humans have been beaten at their own games. Human performance has already been matched on a wide range of tasks. If you factor in 1 more year of progress, 5 more, 50 more, you might reach some terrifying conclusions. It's only human.


> It calls into question what it means to be human, what makes us special, and what is art.

How?


> the market will find a way to deliver it at scale in a cost effective way

Who is this market, who is so wise in the ways of science?


Says the person typing the message on a personal supercomputer that has enough battery power to last all day and fit in their pocket at the same time....

(assuming that is, that this message came from a portable device)


Portable devices haven’t achieved cost effectiveness without dangerous conditions, slavery and child labor on the other end as externalized costs


All costs are externalized by the premature heat death of the universe.

And I mean really, if this is the bar you want to set, then no product is going to meet it, so there's no huge point in singling out electronics versus, say, just about anything you eat on a given day or the clothes you wear.


Uh, no product is going to meet the goals of avoiding slave and child labor and hazardous work conditions? Not sure why you're going to bat for that besides to defend that it's hugely profitable. I think many products outside portable devices do meet that goal, but it's not in service of market competition and isn't a scalable solution without a bigger systemic change, so I do agree in a sense that at scale no product meets that goal.


Honestly, if the price of a smartphone was even $10,000, I would be happy to pay it. The value I get out of it is incredible. In my case (living in a mid-sized city), if I had to choose between my phone and my car, I'd definitely sell the car first.


Well, solutions for these problems are not going to come from consumers independently deciding to pay more to "greenwash" their purchases.


You might enjoy SiriGPT: https://gpt3demo.com/apps/chatgpt-sirigpt-apple-ios

(It's voice-activated ChatGPT on your iOS device using shortcuts.)


> the market will find a way to deliver it at scale in a cost effective way

This really isn't a given. Look at the problems encountered trying to scale Bitcoin - slow transaction speed and high energy use. They're still unsolved, because some properties of a given technology are inherent to that technology.

There's no guarantee something is scalable, especially something that might (as the article suggests) have exponentially increasing costs in its default configuration.


> Look at the problems encountered trying to scale Bitcoin - slow transaction speed and high energy use.

They're not unsolved, they're actually trivially solved by using an exchange. That just comes with its own risks.


How is the high energy cost of mining Bitcoin solved by exchanges? Also if I want to use crypto for anything except speculation within an exchange, the transaction speed is still an issue, even with apps like Coinbase. (Or at least it was last time I used it).

I know Lightning Network helps but that's like a layer on top of Bitcoin because the Bitcoin tech itself doesn't scale well


> the market will find a way to deliver it at scale in a cost effective way

I wonder if it would be possible to build the model in hardware as an ASIC. That would probably bring down cost a lot but I'm not sure it makes sense if they expect to release new improved models regularly. The hardware might be obsolete by the time it reaches production.


I think that if we see hardware specific to a particular model, it would be in automotive. And the focus would be on fitting inside a power envelope, not cost in particular. So it could still be that the cost of an ASIC + operation will be greater than GPU/TPU + operation, but it will fit more easily into whatever limit (1500 watts? I don't know) is designed for driving systems.


Recently wrote about the economics of both training and inferencing: https://sunyan.substack.com/p/the-economics-of-large-languag...


Submit it as a top level post. It’s a good article.


Maybe eventually, but there are plenty of examples of things taking quite a long time before they are economically viable.


Do we have an idea of what the costs to run ChatGPT might be? Are we costing OpenAI $5 every time we ask it to generate Python in the style of Donald Trump?



So way cheaper than an SMS not so long ago


SMS messages are free to carry (they use up leftover space in the signalling protocol a phone needs to use to talk to towers and inform the network of where it can be found - a process you're already charged for). Phone networks collecting money for no effort (which is why SMS trends towards free) doesn't seem like a fair comparison to running an expensive supercomputer with very real hardware, energy and real estate costs.


We can probably make a decent guess by looking at the price they charge for GPT-3, where asking the most expensive model for its literal "two cents" response gives you a thousand tokens.
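As a back-of-envelope sketch (assuming that "most expensive model" is davinci at the published $0.02 per 1,000 tokens, and guessing 500 tokens of prompt plus response per chat turn - that token count is my assumption, not a measured figure):

    # Rough cost per query at public GPT-3 API pricing.
    price_per_1k_tokens = 0.02      # dollars: the literal "two cents" per thousand tokens
    tokens_per_exchange = 500       # assumption: prompt + response for one chat turn
    cost_per_query = price_per_1k_tokens * tokens_per_exchange / 1000
    print(f"${cost_per_query:.3f} per query")   # ~$0.010, i.e. roughly a cent

That lands in the same ballpark as the "$0.02 per query" estimates mentioned elsewhere in the thread.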


I've read estimates of $0.02 per query


Probably training it on specific data sets and contexts is the expensive part.


Running a gigantic model across multiple GPUs isn't cheap either


I’ve asked ChatGPT about hardware. Here’s the response:

> As for your specific computer, with 64 GB of RAM and a high-performance GPU like the GeForce 1080 Ti, it should have sufficient resources to run a language model like me for many common tasks.

Based on the models open sourced by OpenAI, they are using PyTorch and CUDA. This means their stack requires nVidia GPUs. I think the main reason for their high costs is a single sentence in the EULA of GeForce drivers: https://www.datacenterdynamics.com/en/news/nvidia-updates-ge...

It’s technically possible to port their GPGPU code from CUDA somewhere else. Here’s a vendor-agnostic DirectCompute re-implementation of their Whisper model: https://github.com/Const-me/Whisper

On servers, DirectCompute is not great because Windows Server licenses are expensive. Still, I did that port alone, and spent a couple of weeks doing it.

OpenAI probably has the resources to port their inference to vendor-agnostic Vulkan Compute, running on Linux servers equipped with reasonably priced AMD or Intel GPUs. For instance, the Intel A770 16GB only costs $350 but delivers similar performance to the Nvidia A30, which costs $16,000. Intel consumes more electricity, but not by much: 225W versus 165W. That's something like a 40x difference in the cost efficiency of that chat.


I couldn't resist and tried it:

Prompt: Write a small python program in the style of Donald Trump.

chatGPT: I'm sorry, but it wouldn't be appropriate to write a program in the style of a political figure, especially if the language and tone used may be considered offensive or disrespectful. Additionally, OpenAI's policy prohibits the creation of content that is harmful, abusive, or hateful. Is there anything else I can assist you with?


Is the author conflating the cost to train an AI model with the cost to use the resulting models, here?

I can certainly believe that model training cost is exponential as the # of parameters goes exponential. But that is a one time(-ish) cost relative to actually using those models.

Or am I completely off base here?


Inference costs are non-trivial, and I wouldn't be surprised if the cost of running ChatGPT (given the 3M/day figure) has surpassed that of training it. Without optimizations, training only uses ~3 times as much memory as inference, so exponential parameter/cost scaling still affects both.

There’s ongoing research to reduce the computational costs of inference, but to my knowledge they only offer linear improvements (although I wouldn’t bet against more substantial reductions in the near future, particularly as these techniques are compounded).
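For a crude sense of scale, a sketch assuming a GPT-3-sized model of 175B parameters stored as fp16 (2 bytes per parameter); the ~3x training factor is the rough number mentioned above, and real deployments vary:

    # Rough memory footprint: weights only, no activations or KV cache.
    params = 175e9                                   # GPT-3-scale parameter count
    bytes_per_param = 2                              # fp16 weights (assumption)
    inference_gb = params * bytes_per_param / 1e9    # ~350 GB just to hold the weights
    training_gb = inference_gb * 3                   # rough ~3x factor cited above
    print(inference_gb, training_gb)                 # ~350 GB vs ~1050 GB

Either way, the model does not fit on a single GPU, which is where a lot of the serving cost comes from.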


OpenAI has stated that their costs per query are a few cents. Not nothing, but nothing outrageous either. The cost will go down to < 1c within a couple of years I'm sure. The cost may be prohibitive to use LLMs for absolutely everything, but the universe of applications where companies can afford to spend 1c on an LLM answer is very large.


> the universe of applications where companies can afford to spend 1c on an LLM answer is very large.

I hear this a lot, but don't see it very often in practice. GPT and its ilk are coming up on 3 years old now, and our most novel application thus far is the same textbox + response that Talk to Transformer had. That's sad.

I've heard the sell before ("imagine AI spreadsheets!") but I don't think people are willing to pay for answers that are regularly wrong in the long run. If your bridge only works half the time, people probably won't be inclined to pay your toll anymore.


Today, the typical customer support experience involves waiting for hours/days to get a response that is useless and/or wrong. LLM powered customer support software can be transformational, and this is just one example.


LLM-powered customer support can also be useless and wrong. Even in an ideal case, you're still retaining some customer support employees and only displacing the pitifully-paid call-center workers that staff the current Chat-As-A-Service offerings.

Until I see something like this rolled out with widespread success, I'm gonna doubt it. The second someone puts an AI agent on their website, it's a race to get the brand to endorse the most abhorrent thing possible. Then what?


You don't want the LLM to be completely unsupervised. The goal is assistance, not the equivalent of full self-driving. It takes a few seconds for a support tech to approve a response that would take 15 minutes to compose by a human. That's where the value is. You automate the rote repetitive work so the interesting intelligent work remains.


"The value" is in the self-driving part though. The support tech wants the good answer, not the part where they refresh ChatGPT 3 times because the output is unintelligible. If you're going to pay a human to do the job either way, I'd bet that a skilled technician out-performs an AI model operated by a classifier employee.


In the long run, a model that isn’t regularly updated with new data won’t be very viable for many purposes. The (re)training cost therefore will have some relevance.


Well, the number of parameters is directly related to the size of the machine to keep the model around. And so while you can host GPT-2 on a single consumer GPU, for ChatGPT you allegedly need 4-8 GPUs at $10k a piece, which also consume 8-16x the power of your consumer GPU. So if you continue that, the cost to society will be too high to bear for this kind of game (and in my eyes already is, as the outputs of large generative models are 100% used to generate more stuff to consume for the same number of already information-overloaded people).


A single international flight literally consumes a swimming pool's worth of fuel. A few GPUs are nothing, especially as we transition to wind and solar.


well, then stop flying and the useless content-creation and let's get back to our "few GPUs" for consumption, when living, food and flying have transitioned to wind and solar. Then people can start setting up their text-based Holodeck-precursor consuming 1kW during usage.

Just be clear: the alternative to letting a future GPT-derivative write your novel is to go to the library and pick a book. That works now and has negligible emissions. A future GPT (currently you won't get a consistent long form text) might do the same - optimistically it will do it on only 2-3kWh of energy (which amounts to the cost of creating a physical book). Granted, you only can sustain a human on that for a day, but then the humans don't go away because of AI.

Conclusion to me: the things are fascinating and so are H-bombs...

Question: which kind of time do you expect people to spend with these models? Which human needs will they satisfy? How will their energy consumption then still be negligible?


This reads like you ran it through a translator.


so? translation is not the point of this models anymore.


> Well, the number of parameters is directly related to the size of the machine to keep the model around.

I bet this has some potential.


I pretty much always wanted to know the cost of running these models, not training them.

That should be an order of magnitude less.


Stable Diffusion will generate an image in ~30 seconds on an Nvidia 3060. Power use is on the order of 170 watts during generation, i.e. roughly 0.0014 kWh per image, which at around 5 cents per kWh puts the electricity cost at a small fraction of a cent per image.


You also have to calculate the life expectancy of the card against 30 seconds of its life as an ongoing cost.


Which is also on the order of 0.01c per query (answering 5 million queries in the lifetime of a $500 card).
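Putting the two comments together as a sketch (the 30 seconds, 170 watts, 5 cents/kWh, $500 card and 5 million images are the figures already quoted above; the rest is arithmetic):

    # Per-image cost: electricity plus amortized hardware, using the figures above.
    seconds_per_image = 30
    watts = 170
    kwh_per_image = watts * seconds_per_image / 3_600_000   # ~0.0014 kWh
    electricity = kwh_per_image * 0.05                       # ~$0.00007 at 5c/kWh
    card_price = 500
    images_per_card = 5_000_000     # ~5 years of back-to-back 30-second generations
    amortized = card_price / images_per_card                 # $0.0001 per image
    print(electricity + amortized)                           # ~$0.00017, well under a cent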


Just ask ChatGPT what the cost is? ;)


Sure, but just remember that it can't do math, so any mathy answer it gives is just bullshit.


I'm not sure that there's anything truly new in this article. Transformer-based models are very compute- and parameter-heavy, yes - but that's because they're optimized for generality and easily parallelizable training, even at a very large scale. The ML community has always been aware of this as an unaddressed issue. Once compute cost for both training and inference becomes a relevant metric, there's lots of things you can do as far as model architecture goes to make smaller and leaner varieties more applicable.


Arguably, that was the defense for cryptocurrency too.

"Oh, power consumption? Totally on our radar... what if we switch to Proof of Work?

Personally, I don't think power consumption is the only legitimate argument against the application of GPT. There's the fact that it's proprietary, unreliable and even consistently wrong on certain topics. It's expensive to apply at-scale and most AI-generated text sounds sterile and impersonal. Even now, after years of development, GPT is too unreliable to be called anything other than a novelty.

The dream of a "suitable application" for AI is like the dream of a "suitable application" for the blockchain. Both are technology-first solutions to social problems. Cool in concept, but regularly broken in execution.


> "Oh, power consumption? Totally on our radar... what if we switch to Proof of Work?

I think you meant "Proof of Stake".

But yes, when Bitcoin hit the transaction limit years ago, it's hard to overstate just how much on the radar it was. Differences in opinion on how to solve it were the root cause of the Bitcoin Cash fork.

There are two solutions floating about now: proof of stake and Lightning. Being completely distributed, Lightning is near infinitely scalable, to the extent that it may well end up being the cheapest global currency we have.

Thus crypto has made big leaps forward on its scaling issues. If your analogy is correct, then AI will do the same. But if it follows the same pattern, it will take decades.


> Personally, I don't think power consumption is the only legitimate argument against the application of GPT.

Well, it depends what you mean by "the application of GPT". As a toy and a technical proof of concept, it works quite well. The problem is that people want something that's mostly correct and won't just make stuff up - and that's just not what GPT is for.


Right. Fiction-as-a-service is a fantastical API, but not a very practical one. When you boil it down, you're paying a lot of money for some carefully-arranged noise to get sent over the internet.


In a sense, some people are getting ahead of themselves, which by the way is normal in technological cycles, with financial innovation typically going beyond what the underlying innovation can support.

If AI fails to move into high risk applications and instead stays largely confined to low risk applications, then the revenue pools will stay limited to more consumer and small business facing services. Those are big pools, but it will also not create the trust needed for certain high risk applications.


> If AI fails to move into high risk applications and instead stays largely confined to low risk applications, then the revenue pools will stay limited to more consumer and small business facing services.

This.

The output from the majority (if not all) of black-box neural-net-based AI is not transparent or trustworthy, and the models cannot explain their own decisions. Usually such a model either gets confused by a single pixel or confidently generates garbage output, like ChatGPT does, and there is no transparent way for it to explain its decisions.

It is for that reason that it is unsuitable for applications that involve high risk to human safety, financial matters and legal situations, all three of which are life changing, which the AI hype squad seems to be forgetting.


Why are most people blind to the most important point about these models: the progress is moving at an astonishing pace?

IMO it is a useless discussion to debate the current SOTA or economics. In two years we will certainly have some breakthrough, as we have seen in past years. And things are speeding up.

edit: typo.


> And things are speeding up.

That’s not how technical progress works. It’s not continuous. The past doesn’t predict the future. Progress may stall at any time.


Sorry, I didn't want to sound too sarcastic and negative: I don't disagree with you. But looking at a broader timescale, it is clear that in general we're speeding up our technological progress on a logarithmic scale.

I have been following the NLP field closely for over a decade and the progress is speeding up. Can it stall? Sure. But everything points to it not stalling.


I kinda disagree. We've made some leaps and bounds, but our actual "progress" has mostly been defined by orgs like OpenAI throwing money at the problem. It's only technological progress in the sense that grinding stone bricks is "technological" or "progress".

The difference between ChatGPT and Talk to Transformer is frankly not that large, at least when treated as a black-box. 90% of the people freaking out over ChatGPT on Twitter would have also freaked out over the original GPT, had they known it existed. The extra nuance that we're adding on feels like stalling, and while some of the optimizations have been cool (gpt-neo-2.7b, Stable Diffusion) it feels like we're hitting the top of our progress curve.


Surely we could have said that at any point over the last decades of human existence.


There are extremely interesting technical and economical critiques in the article. They may be right or wrong but at least they’re intellectually rigorous - much more than the average GPT commentary.

Your comment, in a nutshell and if I'm reading it right, is that it's not worth engaging with any of these ideas because progress - whatever that means, however that's measured - is fast.


Unfortunately, whether something was worth engaging with can only really be determined in hindsight.

If it were 1890 we'd be talking about how our cities will soon be buried in animal dung, which will lead to the collapse of mankind. The people debating that could not have reasonably foreseen that 100 years later CO2 would be the greater risk, and 100 years from now the greater risk will be something most of us have not imagined.

Are the issues you point out worth talking about? Of course, but are they worth the amount of time and effort that we will debate them? Get back to me in 5 years and we'll see.


> And things are speeding up.

GPT-3 came 3 years after transformers were invented. Now we are close to 3 years after GPT-3; did anything nearly as big happen during that time? Things aren't speeding up at all.


> we will certainly have some breakthrough, as we have seen in past years. And things are speeding up.

In the past years we have seen a tremendous scaling of a) workforce, b) hardware, c) data. For me, a breakthrough would be in data/energy efficiency. What areas are promising there?


Investors don't "believe" in AI any more than they "believed" in crypto. But they can smell a bubble, and hope to bail out at the top.


They'll never admit it now, but a lot of investors genuinely bought into crypto and blockchain bullshit.

That doesn't change that their plan was to bail at the top (isn't that what VCs do normally?), but many of them really thought that crypto had a chance to eat into Visa/MC's market share, or that blockchain would somehow "solve" supply chain, identity management, etc.

The idea that crypto/blockchain investors knew all along it was a ponzi/pyramid/casino/scam/whatever-you-want-to-call it is revisionist history.


> While personally, the experience lived up to some of the hype, social media claims of it implementing entire front ends turned out to be fake. Clearly, this model performs well on in-sample tasks, and is far off on anything else.

That wasn't my experience. ChatGPT dealt with everything I threw at it, except when it refused to, and even then I could get round it by telling it to pretend etc. Every wild example that I saw on Twitter I was able to reproduce.


Agree. Here is a Twitter thread with video examples of using Ghostwriter, an LLM-based assistant similar to ChatGPT, to build several example sites.

https://twitter.com/Replit/status/1620445121476202497?t=6Mm9...

I think the nuance here is that using an LLM is a bit like googling. It doesn't seem like it would be a skill but it is. You kind of have to nudge it in a way similar to what you would do in your own mind after finding the almost relevant stackoverflow post.


Things it performed badly on for me (out of sample):

- Implement quadratic voting (a concept defined in Radical Markets by Weyl). Even when I tried to explain the concept, it still invented something that looked good.

- Implement a first-in-first-out calculation for filing tax returns where, within a 1-year holding period, the tax rate is 0%. It calculated some really implausible stuff and it seemed easier to just figure it out myself.

- There were many such cases for me, but good outcomes too.


>> But only those that users liked and retweeted ended up circulating, which contributed to a strange spread between a users' expectations and experiences they had when first talking to chatGPT.

Which is exactly what makes this so dangerous. In our current culture, only those voices that are promoted and amplified by social media matter. So a tech that can produce material in industrial quantities, even if most of it is tripe, could be a powerful manipulation tool. It will certainly be cheaper than hiring flesh-and-blood people to man a troll farm.


JUMP! (Mike Solana)


A lot of what these takes are missing is that the unreliability of those models is not a hurdle for a tremendous number of applications. If there is a human in the loop who can now write a couple of words in a fuzzy human language, check or select or edit the resulting response, and thus do their job twice as fast or even 10x as fast, that is absolutely a game changer.

My main criticism for LLMs are:

- The way they were rolled out was counterproductive. Unleashing a chatbot that pretends it knows everything, without any background and guardrails, is directly responsible for the untethered hype discourse that is prevalent in the mainstream. In the literature and among practitioners, everybody is well aware that these things don't "think".

- For the first time, it feels like a significant amount of my value as a programmer will be fully owned by a corporation and trickled out to me for $99.95 a month. It's already the case with Copilot. I can't imagine going back to a world where I work without GPT-3 and Copilot, which gives me no choice but to fully embrace my corporate overlords. I fully feel what farmers feel with regard to their tractors.

The best I can do for now is figure out what the real usecases are for me and how to leverage GPT3, and start looking heavily into open models, so that I can help out with whatever unix <> bsd situation we are going to end up with.

None of this has anything to do with the end of human culture, education, discourse, or the end of quality in software. If software quality could be any lower, under capitalism, it would be. It's not like I can't already get really shitty code that pretends to do something for $5/h on Upwork.


I'll stick to waiting this one out, because I know "investor" personality types are mostly comparable to an individual sitting on a pink cloud wearing sunglasses that won't allow a single light particle to enter.

I'm hearing quite little about the downfall of AI route that science fiction predicted at times. Always good to waddle into that middle outcome.


The part about compute cost ruining the economics of AI companies is somewhat hilarious and eye rolling to me for the following simple reason: users have computers right in front of them.

Some models are small enough to run locally. For the rest, is it conceivable to ask the customer to contribute cycles? Especially for anything cheap or free?

There might be major technical challenges here but that’s not my point. My point is that I get the distinct impression this has barely occurred to most of these people as a possibility to even explore.

Are we so far into peak SaaS now that the industry has forgotten about local compute the same way we forgot about data centers during peak PC?


It's not just a question of cycles; training these large models requires a lot of RAM and I/O, far beyond what's on a personal machine. Training these kinds of large models is not conducive to being parcelled-out in small chunks.


Training maybe but most of a model’s use is evaluation.


> Are we so far into peak SaaS now that the industry has forgotten about local compute the same way we forgot about data centers during peak PC?

I mean, yes? You can host your own email server at home, but Google has still made billions selling email. Regardless, paying a company to allow them to run cycles of their proprietary model on my hardware doesn't feel like a great deal to me. Why would I choose that?


If it cost less? Or if you bought the model and ran it locally instead?

The reason it’s hard to DIY a mail server is spam and all the attendant black lists and white lists. It’s not a resource problem.


Not sure about ChatGPT, but Midjourney et al have definitely transformed many markets already. Why would I get any illustrations from fiverr.com, when I can get better turnaround times for $0 from Midjourney?


I tried to use a couple of the AI image generators when building a landing page recently, so I can answer this.

Look at the homepage for a popular site like Stripe, Figma, Digital Ocean, whatever, take your pick. Try to get Midjourney to generate high-quality art like you see on those homepages, in a format where it is ready to go directly into a site without further processing in an image tool. My experience was that I could have gotten better results on Fiverr.


There's no way you're going to get top quality like those sites at a place like Fiverr either. They've spent millions on branding and marketing.

I can see Midjourney already replacing the low end. The question is how good can it get, and then as the bar is raised what can be done to differentiate. Answer to the latter ironically may be going back to old school human interfaces.


> There's no way you're going to get top quality like those sites at a place like Fiverr either.

There is a grain of truth to this, but again just choose whatever lower-budget website you want. And then just try the exercise I suggested.

Midjourney isn't producing work at a "bad Fiverr" quality level, it's not producing usable work at all yet (at least without a lot of hands-on prompt tuning, at which point why not just use Fiverr?). At least with Fiverr, I am likely to end up with something I can put on the site, which is not true of Midjourney yet.


The language models are probably more actually useful today. I've played around with writing hypothetical articles using them. For topics I know, where the piece is fairly straightforward, e.g. five qualities some job role needs, it does a "decent" job.

Decent in this context doesn't mean something I could just hand to an editor. But it does mean a pretty good starting point that I could amend, flesh out, add some links, add a quote or two. I could certainly see using it to give me a sort of pre-draft on some fairly evergreen topic.

I could also use it to generate some boilerplate definitions or historical background to include in an article.

But, sadly, I think you'll see the LLMs being used to generate a lot of blog/article content with a minimum of human effort for even less than the small amount being paid for a lot of this today.


Supposedly the Fiverr contractor will soon just be a prompt-tuner with a large cache of "good" images.


I fired up a T100 instance of Google Colab and got something usable in about an hour of being focused on what it was I wanted, using inpainting and having it redraw some parts (I was trying to get a guy holding a guitar; hands are hard and took about 20 minutes). Then the lighting was off, so I threw it into GIMP, split out the part I wanted with magic scissors/lasso or whatever, color corrected that part of the image using the dropper's average sampling and a layer mask based on my cutout, then changed the layer type.

I never would have done any of this stuff before, and there was a LOT of super low quality art in the space (D&D character art), so low quality I never paid for it as it just didn’t tickle my fancy.

Now I can get stuff that looks like a Frank Frazetta painting, which I can’t get at any price since old Frankie died in 2010.


That's really great for your use case! However, it fails my more workaday needs in a couple of ways:

- you had to spend your time iterating on the prompts, redrawing, etc.

- you had to do post-processing yourself

- you had to pay for a GPU

In all -- not a good substitute for e.g. Fiverr for typical work-related artwork.


With a lot of prompt hacking, you can get pretty impressive results from the generative art models especially if they don't involve people's anatomy and they just need to be displayed as fairly small images on the web.

But, hiring someone with decent drawing ability for simple tasks is pretty cheap and they can produce something to your exact specifications (and/or show some creativity). A co-worker did a book cover for me recently that wasn't technically complex but was far better than anything I would have designed on my own or an AI would have come up with. Also a lot of Creative Commons and public domain content out there too.

Furthermore, no company is going to (or at least should) make official use of generative AI work until any potential legal issues get worked out. The open source IP lawyers I know think it's probably OK but no one is really sure.


Correct me if I'm wrong, but aren't systems like Midjourney inadvertently stealing from the people who are paid to make the art they're trained on?

Also, in some ways that’s like asking “why should I shop local when I can buy from Walmart and get the exact same product?“ Well obviously if it’s all down to simple dollar costs, you do you. But there are several motivations/considerations for choosing why we shop where we shop.


Even lawyers aren't sure, the arguments can go either way: https://youtu.be/G08hY8dSrUY

Also note some new research: https://arxiv.org/abs/2301.13188


So the legal thinking I have been reading is around the reproduction of copyrighted works.

However, to get to a point where you _can_ recreate copyrighted works implies copyright infringement in the dataset. That's not really been tackled.

Also, there is the GDPR angle here as well: collecting people's faces on the internet for something other than what the person reasonably intended is also on grey legal ground, I would argue.


I suspect you're wrong that the stealing is inadvertent.

I'm suspicious the makers of the art generator AI systems know exactly what they're doing and see no issues with it.


I guess I’m just trying to be a little generous. They probably know it’s a risk and either rationalize it away with something along the lines of “well, it’s such a tiny minuscule piece from so many different people that it’s not really stealing“ or just pretend they don’t know. At this point the ignorance does seem willful.

Either way, I'm more just curious what other people have to say on the matter, as I am not as knowledgeable on the subject. I am on the creator side and it's all bad news over here.


> “well, it’s such a tiny minuscule piece from so many different people that it’s not really stealing“

RMS wrote "The Right to Read" because of issues like this, and if you've not read it, I recommend you do. "A bunch of little bits of different things" isn't stealing, it's society and culture. If you got your wish of everything I create being fully copyrighted, you'd find two things. One, that you're last in line and don't come up with original ideas often, even in original artwork. And two, that monied corporations would quickly buy up rights to everything and make life as an artist completely impossible.


I'm not even sure that's what is happening to be honest, I'm just speculating as to how somehow could reason it away for themselves. I don't know what the line is for "too much inspiration" where it becomes stealing. I'll make sure to check that out though, appreciate it!


Your choice to be generous is a kind and good one.

I don't have great insight into copyright issues, but as one who loves music and used to play a lot of it, I'm right there with you on the "it's all bad news over here".

It seems to me that the current crop of art generators ape styles but don't come up with their own.

I worry that if they put most human artists out of work, the arts will stagnate badly.

I guess we'll see.


>I worry that if they put most human artists out of work, the arts will stagnate badly.

Right there with you. And it's hard to even decide what qualifies as "adapt or die" you know? Do I just use AI-assisted tools? Do I "just" pivot my career and go somewhere else?


If a bunch of random people on the internet can see the potential copyright infringement and the need for clarification, then it's doubtful the makers of art generator algorithms didn't think of the same potential issues and choose to ignore them.


There are a whole bunch of people who think copyrights have held back progress, and this is a technological solution to allowing the sharing of information more freely.


You're definitely screwing the other person in the prisoner's dilemma.


If ChatGPT and its AI friends are the solution, what is the problem? If we push the AI madness aside, the problem definition seems to be: I want to query large amounts of textual data to find an "optimal match/answer" to a given user input/"question".

What is the precise nature of this optimal match? It is unclear (as it is essentially encoded in a black box algorithm and its training data set). Different algorithms and different training data sets would provide different "answers".


The tweet from Sam Altman in this article made me laugh out loud.


Was he trolling?


I think, but am not sure, that he was trying to sell OpenAI as unimaginably (near exponentially) awesome.



To show the cost of extra parameters is exponential, they made a graph with price linear and parameter count logarithmic and then fit an exponential? It seems very misleading:

https://proofinprogress.com/assets/images/cost-llms.png

With a linear or polynomial increase in cost per parameter, it would also look exponential with that graph setup.
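A quick sketch of the objection, using a made-up cubic cost curve (the constant 1e-27 is arbitrary): plotted against log10(parameter count) with a linear price axis, a plain polynomial already traces an exponential-looking curve.

    import numpy as np

    # Hypothetical polynomial cost in dollars: cost = k * params^3 (k arbitrary).
    params = np.logspace(8, 11, 50)        # 100M to 100B parameters
    cost = 1e-27 * params ** 3
    x = np.log10(params)
    # On those axes, cost equals 1e-27 * 10**(3*x): an exponential in x,
    # even though it is only polynomial in the parameter count itself.
    print(np.allclose(cost, 1e-27 * 10 ** (3 * x)))   # True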


Mhh, yeah, you might have a point. I didn't do it on purpose though. I'll go through this again.


Matrix multiply scales polynomially so don't be thrown off by that. You'd also expect some increases in cost where things move from single GPU to multi-GPU to networked machines. I'm still not sure if that pushes things to exponential or would just step up cost at each transition.


As long as these AI models are unable to transparently explain themselves, all of this is essentially a vacuum of hype for those who don't understand that such black-box models are essentially "throw more money and data at it": a gamble on accuracy, with the risk of overfitting and wasting money.

No different to the majority of proof-of-waste cryptocurrencies like Bitcoin.


I think everybody overestimates in the beginning, and here it's the same. Will AI solve every problem? Nope, it won't, but it will help.


I feel like we already passed that low point. I mean, years ago that was a promise and a trend, then I got disillusioned, but now talking to ChatGPT is actually making me more positive about AI again.


I spent lots of time thinking of the title "The AI Crowd is Mad" because I wanted to hit all the triggers for virality (belief, belonging, behavior). A bit disappointed it was moderated/changed.


Reading this article is like walking through deep mud. This is largely due to the author's excessive and baffling use of commas. But also due to myriad grammatical errors and ambiguous sentence constructions. Parsing some of his sentences is like looking at an optical illusion or an ambiguous painting. All things considered, I had a bad time attempting to read this article. I do not look forward to reading more of this author's writing in the future.


Granted. I didn't use Grammarly this time to fix the article's numerous mistakes. But frankly, it's also a way to prove that my article wasn't ChatGPT-generated hahahahaha


What a pompous style. I got annoyed trying to read it.

By the way, https://www.theatlantic.com/technology/archive/2023/01/chatg... shows that at least some of the broader press gets what's going on with "AI."


Not the plane with the dots again!


content warning: crypto/web3 bro


:‘(


Imagine spending hundreds of thousands of dollars on infrastructure to serve out content that isn't even yours to begin with, and then wondering why there's such a blowback against its usage.


I don't think anyone is being "deceived" when it comes to ChatGPT. In fact I've found it to be just the opposite. People are still completely unaware of how game-changing and revolutionary it is until they sit down and actually use it. It's impossible to describe to someone just how different this is from anything that came before, as the AI well has been so thoroughly poisoned by the false promises of prior tools.



