
Even if ClosedAI doesn't get the government to ban sales of powerful GPUs to the public, homebrew models will not be of comparable quality in the near future. Don't be fooled by how cheap ChatGPT is; it is not profitable.



"Homebrew" Stable Diffusion beat DALL-E 2. There's zero reason this won't repeat for LLMs. All indications are that it will.


Sorry, but if you need to wait half an hour to see a result, I don't think it's really the same thing.

You don't need to worry about a deluge of AI content if it can only come from a few crazy people with top of the line hardware spending days iterating on prompts. Especially if really good GPUs start requiring a license. At some point it's just faster to learn to write properly.


We already have publicly-available models that are good enough for spam and scams. When running on CPU they are already faster than most people can type.

The cat is out of the bag. While the "AI alignment" jackasses were writing their Terminator fanfiction and wringing their hands about paperclips, they had already destroyed the world wide web as we know it.


Keep in mind the topic: it's not about the WWW, which can be totally swamped by a handful of dedicated malicious actors with powerful hardware. It's about verifying that applicants did not cheat.

These models you talk about are either not good or unbearably slow. The output of a model that today runs "as fast as you can type" on average hardware will never be reliably mistaken for the real thing. If you try to cheat with it, it is more likely to fail you than if you spend $20 on a human freelancer to write stuff.

The only factor that breaks this today is the cheap availability of ChatGPT and the like. They are reasonably high quality but unprofitable to run; they are subsidized to get the public hooked so that MS can later safely jack up prices (ideally after getting an exclusive AI license from the government).


You could have said the same things about image generation 18 months ago.


I could, and actually I did, and if I did not then I will now, say the same things about image generation today. Unless you know something I don't know or have top of the line hardware, general purpose image generation with homegrown models is either unbearably slow or poor quality.


You can run Stable Diffusion on an iPhone 11 and it completes in under a minute or two. Running on CPU generally takes around 5 minutes. My almost top of the line macbook runs a batch of four in around 30 seconds on Metal, and I'm sure it's much faster with a mid-range GPU considering how unoptimized Metal is with Torch. And yes, you can go take a look around reddit and 4chan, the vast majority of those aren't dreambooth/MJ/remote models.

That's not even taking into account local LoRAs and scripts that are possible instead of some company's untweakable crap. The open source around this is healthy, has pushed past DALL-E, and there's no real roadblock to Open Source LLMs except of course, the training cost. Even still, people are getting $200k+ models in their hands for free from various training runs and donated computing and LoRAing them and fine tuning them all to make them comparable to the closed off remote models.

Any "cryptographic" scheme with the generations of these will just catch the lazy. The lazy already include the confabulated sources in their papers, and don't try to normalize the Error Level Analysis in generated images (probably the quickest way to determine whether an image is generated), so I don't think it's actually a net benefit. It's a cat and mouse game, and will push the mice further into the walls.

You can't possibly say that generative images like this are "poor quality"

https://www.reddit.com/r/StableDiffusion/comments/131lpks/my...

https://i.imgur.com/3iDf43z.png


Again, poor quality or slow on conventional hardware.

The first of your examples was generated on a desktop computer with a 2080 Ti, and even then it still has glaringly uncanny hands. We don't know how long it took, but I think the reason for the hands is that it's too slow to generate a dozen of these in the hope that the hands would come out right.

I can see the other one being done on any laptop in a few minutes, but it's more primitive, just a monochrome sketch. I'm skipping over obvious issues, e.g. with the shape of the glasses.

For both examples you don't need any specialized tools or watermarking to notice this stuff.

Maybe you see what I mean about why indie homegrown AI is not such a big deal ;) Sure, there are people who will invest in hardware, but those people are not, and for now won't be, mainstream enough to matter. Especially if it gets licensed; most people don't like to violate laws. Most people will just use ChatGPT or DALL-E.


I don't see what your point is. The regulatory capture going on right now with the attempt to license things is ludicrous and akin to licensing matrix multiplications. It won't stick. Stable Diffusion and Approximated Functions (neural networks) are not something magical despite the fear they want to impart on them.

Commercial AI has all of those issues you mentioned, and more, and less. Midjourney is just a bunch of LoRAs layered on top and scripting to generate the images. But since they do that, midjourney images have a specific "feel" that it can't seem to get rid of. It's nothing really out of the reach for someone sufficiently motivated to reproduce.

DALL-E is laughable now, and it's only been a year. Certainly has been surpassed by open source, and outside competitors. I'm not sure what your motivation is to discount open source. People are already running LLM inference on their phones.


My point: 99% of people will use ChatGPT etc. because homebrew alternatives are either bad (easy to detect with the naked eye) or slow. Microsoft will probably also make sure no competitors can offer good enough AI by pushing for regulation. So if those big platforms are required to watermark/detect their own AI results, that's good enough. The remaining 1% of crazy people don't count.

Your point?


That midjourney et al are also detectable to the naked eye.

Why do you need to watermark them, again? The error level analysis is off the charts with generative images. They light up like a Christmas tree. Just because you and uninformed legislators and journalists don't know how to check the ELA of an image, doesn't mean they're undetectable. And the cheaters include the bogus sources spit out by ChatGPT already. The cryptographic qualities will be lost as soon as an editor gets their hands on it, automated editing or not. It's a cat and mouse game.

And also, I find it telling that you think if someone doesn't have high end hardware, they're going to pay $20/mo to OpenAI. For $20/mo, you can buy a mid-range video card and write it off. For an extra $10/mo, you can depreciate the cost and buy a high end laptop for that price, if you're a professional, and you're not locked into OpenAI. You're also assuming that 1. hardware doesn't get better and 2. techniques don't improve to run them on limited hardware.


> That midjourney et al are also detectable to the naked eye.

Again, either too slow, requiring outrageous hardware, or obviously noticeable. So far no examples to the contrary.

Don't forget, the topic is using special measures to detect what is undetectable with the naked eye. When you can simply see the screwed-up hands in a photo, it's not even necessary.

> Why do you need to watermark them, again?

Why do you think I need to watermark them again?

> a mid-range video card

and a PC to put it in, a space to put the PC in, etc. With a laptop we're back to waiting an hour to see a result.

> hardware doesn't get better and 2. techniques don't improve to run them on limited hardware

We can revisit this if consumer hardware gets good enough...


>With a laptop we're back to waiting an hour to see a result.

Any laptop within the last five years with decent memory can run stable diffusion on the cpu in around 12 minutes. My MacBook Pro runs a batch of four on Metal in around 30 seconds.

>We can revisit this if consumer hardware gets good enough...

I mean, I just showed you a quantized llama running on a Pixel 5 and 6. And with all of this hype, I wouldn't discount most of the next generation of hardware having ML coprocessing like MacBooks, iPhones, and Pixels do.


> Any laptop within the last five years with decent memory can run stable diffusion on the cpu in around 12 minutes.

The majority of output is bad, so you need to try dozens of takes to get a result that is reasonably realistic. Multiply 12 accordingly.

> quantized llama

I don't know what that means but if it's better than chatgpt/gpt4 then sure.


You are fooling yourself if you think the next generation of CPUs won't have ML coprocessors.


Apple's neural engine has been around in phones and laptops for years. Not even close to what's required for offline chatgpt. What's your point? Which of us is fooling themselves?

Average consumer hardware is not showing any signs of being capable of this in the near term. Hype is hype, but use your own head.

Don't forget that these homegrown models also require training data, scraped and cleaned; an individual or nonprofit can't do that.


> Apple's neural engine has been around in phones and laptops for years. Not even close to what's required for offline chatgpt.

Well, yeah. It's been around for years and chatgpt is new. My years old GPU doesn't run the latest games either, and chatgpt is novel technology so the hardware will of course lag behind a bit. But it will come.


>an individual or nonprofit can't do that.

https://laion.ai/

https://www.mosaicml.com/blog/mpt-7b

There are dozens if not hundreds more individuals and nonprofits doing these things.


Yes, these are used by mainstream commercial AI thingies. But they are not their only sources. Plus they have armies of people preparing and cleaning this data. Are you ready to employ one?


People are literally doing this for free. You said that there are no individuals, and no non-profits, that can do this. Which is laughable. They're doing it right now: https://arxiv.org/abs/2305.11206

It may even be the case that all of that RLHF training that OpenAI does simply lessens the quality of generations, as suggested by one of their own papers and the paper above.


65 billion parameters, and what hardware does this require to run fast enough to be usable?)



Stop wasting time. Yes, you can run this on any hardware. The point is that it is either too slow, requires outrageous hardware, or is obviously bad (no tool required to see that it is generated).


>requires outrageous hardware

Who is going to pay $20/mo other than professionals? You'd assume professionals have professional hardware. A mid-range GPU or a video editing laptop is not exactly breaking the bank.

>no tool required to see that it is generated

But again, that also applies to commercial generative images. They're easily discernible, if you just look. Midjourney is stable diffusion with a bunch of LoRA stacked on top. And it can't shed the "midjourney look" because of that. That's not in dispute by anyone.


> They're easily discernible, if you just look.

100%, although dall-e is getting better with hands specifically.

Anyway, Microsoft can keep throwing compute at it until it becomes impossible to distinguish fakes by sight, at which point a mechanism like the one suggested will make sense.

But with homegrown ones I don't see it happening soon. Only those who spend a lot of money on top of the line GPUs and keep desktop PCs may get to that point. Those GPUs will jump in price like they did at the crypto mining peak, or become impossible to buy if Microsoft gets the government to require an "AI license" for them.

> Who is going to pay $20/mo other than professionals

Apple Music costs $10 and people easily spend ten times that on Patreon...



