DALL-E 2 open source implementation (github.com/lucidrains)
520 points by jonas_kgomo on May 1, 2022 | 152 comments

One thing I’ve not seen mentioned or asked about DALL-E 2, which I’m really intrigued by, is what hardware it typically runs on and how long image generation takes on average. Maybe I’ve somehow missed it, but I’m really curious what their hardware setup is and how long it takes, on average, to generate the eight images from a user’s text input. I’m guessing it’s really quick, but I honestly have no idea… How does this system scale? Typical cost per image generated?

Not a complete answer to your question but you may find this discussion interesting:


Inference cost and scale seem to be much more favourable than for large language models (for now).

In case anyone else is put off by the link referencing an answer that then links to something else with most likely higher hardware requirements that are not stated, the end of the rabbit hole seems to be here: https://github.com/openai/dalle-2-preview/issues/6#issuecomm...

TL;DR: A single Nvidia A100 is most likely sufficient; with a lot of optimization and stepwise execution, a single 3090 Ti might also be within the realm of possibility.

Allegedly, based on info from a DALL-E 2 paper, it takes about 100k-200k GPU hours to train the model:


Assuming those are V100 GPUs at $3/hr, that’s at least $300k.
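As a rough sanity check on that figure (assuming the quoted 100k-200k GPU-hour range and a nominal $3/hr rental rate - both assumptions, not official numbers):

```python
# Back-of-the-envelope training cost. The GPU-hour range comes from the
# discussion above; the $3/hr rate is a rough on-demand cloud price.
GPU_HOURS_LOW, GPU_HOURS_HIGH = 100_000, 200_000
RATE_PER_HOUR = 3.0  # USD per GPU-hour (assumption)

cost_low = GPU_HOURS_LOW * RATE_PER_HOUR
cost_high = GPU_HOURS_HIGH * RATE_PER_HOUR
print(f"${cost_low:,.0f} - ${cost_high:,.0f}")  # $300,000 - $600,000
```

Note this is compute at list price only; it excludes failed runs and experimentation.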

Relative to compensation for the developers and researchers working on projects like these, this looks like a small component.

GPT was rumored to cost in the millions, perhaps the hours estimate is conservative?

And presumably one should also factor in the cost of experimentation before a good setting for the final training is found.

How feasible is it to evaluate the quality of chosen parameters/settings without going through an extensive training? I mean <BIGNUM> GPU hours to discover your results are meh, jesus.

(Small timer with absolutely no idea of practical ML here)

4.6 million

Also very interested in this. AFAIK, the best alternative to DALL-E-type generation is CLIP-guided generation (such as Disco Diffusion [1] and MidJourney [2]), which can take anywhere from 1-20 minutes on an RTX A5000.

[1]: https://github.com/alembics/disco-diffusion

[2]: https://twitter.com/midjourney?t=-kKC5UE-gjIkMvAb709SyQ&s=09

I think it would be relatively quick.

The Latent_Diffusion_LAION_400M notebook generates six 512x512 images in about 45 seconds on a K80 on Colab.

DALL-E 2 is more complicated but presumably also better optimised.

This technology always reminds me of the TNG episode Schisms, where the holodeck renders an alien site based on evolving text descriptions.


I seem to remember once they finally found a way to wake up in the real alien world it was a bit different than the group had imagined it, which was a nice touch.

Certainly one of the creepier TNG episodes, even if it was a one-off meant to cash in on the popularity of alien abduction in media at the time.

"Delete the wife":


Yes! I have this exact recurring memory when I'm playing around in DALLE2. Specifically the part where they're trying to make the table.

The original clickbait: “Clicking noises. More. Faster. Louder.”

The overlap between the hardware requirements for model training and web3 mining has me thinking: could there be a coin where the proof of work is training a model?

Unfortunately, ML model training is too easy to 'cheat' - i.e. look at the test data and fine-tune the model to overfit: do really well at test time, but actually be useless for real-world tasks.

To work out, you'd need a way to keep some test data secret, which in turn kinda kills decentralisation.

You could do something like Kaggle does, where you have some test data but the final score is determined by a larger held-out test set that you don't have access to.

Wouldn't you then be penalized when you train a model that yields bad results on the test data, even if you're doing what you're meant to be doing?

AFAIK it works like this: you have test data you develop against, and some secret (bigger) test data that is only used for the final score. While you are developing you can overfit if you want, but then you probably won't perform that well on the secret tests. What you are meant to do is perform well on the test data without overfitting. Even if it is not optimal, it mostly solves the overfitting problem. It might work for a cryptocurrency too.
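A minimal sketch of that public/private split, with toy data and a toy 'model' (everything here is illustrative, not Kaggle's actual mechanism):

```python
import random

def accuracy(threshold, data):
    """Score a toy threshold 'model' on labelled (x, y) points."""
    return sum(1 for x, y in data if (x > threshold) == y) / len(data)

random.seed(0)
# Toy labelled data: the true rule is y = (x > 0.5)
dataset = [(x, x > 0.5) for x in (random.random() for _ in range(1000))]

public = dataset[:200]   # visible during development; can be overfit freely
private = dataset[200:]  # held back; only this determines the final score

model = 0.5  # a "trained" threshold
print("public score :", accuracy(model, public))
print("private score:", accuracy(model, private))
```

The point is that a model tuned only against `public` gets no guarantee on `private`, which is what discourages pure overfitting.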

I think the problem described is the following:

1. A researcher puts out a model with bad initial parameters/data.

2. The chain workers/miners train the model as requested.

3. The model fails at the test or verification dataset due to the bad setup.

In this case, the miners would not get paid despite doing exactly what was asked of them.

You don't need to test that they perform well, just that they perform the same (for algorithms that should be bit for bit identical) / similarly (for ones that are less so). If multiple people train the same thing and the results are the same you could trust they have faithfully run the training as asked. That rewards "train as asked" rather than "train and get a good result".
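A sketch of that 'same job, same result' check, assuming the training run can be made bit-for-bit deterministic (fixed seeds, deterministic kernels - a real assumption that is nontrivial on GPUs):

```python
import hashlib
import struct

def train(seed, steps=1000):
    """Stand-in for a deterministic training run: a toy scalar 'weight'
    updated by a fixed rule. Real training would produce weight tensors."""
    w = float(seed)
    for i in range(steps):
        w = (w * 1.000001 + i) % 97.0
    return w

def weight_hash(w):
    """Hash the final weights so results can be compared cheaply on-chain."""
    return hashlib.sha256(struct.pack("<d", w)).hexdigest()

# Two independent "miners" run the same job; their hashes must agree.
h1 = weight_hash(train(seed=42))
h2 = weight_hash(train(seed=42))
print(h1 == h2)  # True: identical job, identical result
```

Verifiers then only compare hashes; they never need access to any secret test data.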

Lack of trust there can be addressed with a few other things like staking, and a broad desire for all miners that people trust the system as a whole. Quite how to design those things is a complex problem but not insurmountable I think.

Disclaimer, this kind of decentralised more useful kind of work is something I'm investigating now.

Many coins are not really decentralized; there is some centralization anyway, as we recently saw with the Ronin bridge attack, among many earlier examples.

I am not sure that is a really big turn off.

I don't think there's much overlap. Crypto stuff is all integer maths as far as I know (maybe there are some weird floating point coins?) but ML is all low precision floating point. ML training hardware has a ton of powerful FPUs but much less integer compute.

Interesting idea for proof of work: a large nonlinear function where you find weights so that f(x_0) = x_1 < x_0, or something like that, to exploit ML advances.

I meant the hardware, not the math. While there are new GPUs that disable mining, and GPUs and ASICs designed for mining, the fact is that for the last five years the GPU market has been hogged by mining.

Potentially, I was recently introduced to this: https://bws.xyz/

Not affiliated and haven't researched yet but looks interesting. There are a few other crypto mining companies offering general compute to manage changing power/price/hardware resources.

There's been some discussion below on OpenAI not being open. Contrary to most arguments, I have to share that I like their current approach. We can all easily imagine the sort of negative things that could come out of the misuse of their models, and it is their responsibility to ensure it drips into public domain rather than just putting it out there. It also allows them to more carefully consider any implications that they might have missed during development. And even when you do have access to the models, at least in my experience they make sure to properly caveat some uses even during runtime (e.g., in their playground for the DaVinci model). So while I'm not associated with OpenAI in any way, I would like to go out and say I broadly support their "closedness" for now, and hope we can all be in a place in the future where the release of high-performance models does not need to be so carefully considered, because people would be more reasonable overall.

> We can all easily imagine the sort of negative things that could come out of the misuse of their models, and it is their responsibility to ensure it drips into public domain rather than just putting it out there.

Can we? I’m not sure about that - one issue being that they’re reproducible, so other people can recreate them without constraints. It turns out to not be that hard.

I think a better explanation is there’s a bunch of spare “AI ethicists” hanging around because they read too much material from effective altruism cultists and decided they’re going to stop the world conquering AGI. But as OpenAI work doesn’t actually produce AGI, the best they can do is make up random things that sound ethical like “don’t put faces in the training output” and do that.

Btw, their GPT3 playground happily produces incredibly offensive/racist material in the same friendly voice InstructGPT uses for everything else, but there they decided to add a “this text looks kind of offensive, sorry” warning instead of not training it in at all. (And to be clear I think that's the right decision, but man some of the text is surprising when you get it.)

Derailing a little here but I have to say I'm not a fan of implying this kind of 1:1 link between having an interest or participating in effective altruism and the AI ethics scene. The Venn diagram is not a circle, and the overlapping part may well be a loud minority. I get that "cultists" may be talking about just that minority but it doesn't much read like that.

Well, GiveWell and GiveDirectly are good, I've given to them.

But every time I meet someone who talks about this/has related keywords in their twitter profile/is in a Vox explainer, they always have strange ideas about existential risks and how we're going to be enslaved by evil computers or hit by asteroids or reincarnated by future robot torture gods. I think as soon as you decide you're going to be "effective" using objective rational calculations, you're just opening yourself to adversarial attacks where someone tells you a story about an evil Russell's teapot and you assign it infinite badness points so now you need to spend infinite effort on it even though it's silly. You need a meta-rational step where you just use your brain normally, decide it's silly and ignore it without doing math.

> It also allows them to more carefully consider any implications that they might have missed during development.

Do you have some proof of this? That they follow a thoughtfully considered ethical framework for each project? Because I never see that addressed, at all…

This is a weird take in a thread whose very topic is an open source implementation of the model, less than a full month after it originally came out. So obviously an evil mastermind can already copy them, and anyone strapped for cash to buy the hardware can hire people on Fiverr to do the same thing.

I think the primary reason for the closedness is that it hypes up the research and makes it look more significant than it is. Same reason Palantir is allegedly happy about negative news coverage, it just makes them seem like James Bond villains.

Software developer that knows nothing about ML: What would it take to mess around with this?

From a skim of the README it seems like install the package, and then find some training dataset, and then ??? to use it.

Well you most likely need a CUDA compatible GPU with a non-negligible amount of vram to load the model. And a few hours of your time to mash your head at the keyboard in frustration because of the inevitable dependencies that won't want to resolve properly.

Most "open source" models don't actually upload the final models either, just the code used to train them (not sure if this is the case here), so the next step after that is downloading 100G of data and running your workstation for a few days to train the actual model first.

I recall at one point I attempted to run BERT or GPT-J or one of those lighter language models locally, and it was going well until I realized I needed 24G of VRAM to even load the model, hah.

Do CUDA AI projects not containerize well? I understand that AI projects often have many, many dependencies, but this seems like something containers ("docker") would be ideally suited for. Something like `docker pull dall-e` and `docker run -it --name dalle dall-e`, probably volume-mounting in a local folder? I'm curious what the hurdles are here for packaging something like this.

They do containerize but it’s not enough. You still need the right drivers and CUDA stuff installed on the docker host, which can be finicky to setup. Not to mention figuring out how to get docker to actually pass the GPU to the container.

I see this has not gotten any better in the last 5 years. For future reference, if you as a dev are interacting with a product that is this hard to use and there is no other option but to use it (CUDA), you should buy their stock.

I'd much rather look out for competitors.

Resting on laurels won't last. If a company stops improving its offering, a competitor might catch up.

The competition has even more broken software (AMD). At least Nvidia's works when you need it to.

I think OpenCL support has been getting better, and it may be possible to run a lot of models with it, but that just doubles the already frustrating number of hours spent trying to set the damn thing up.

> Not to mention figuring out how to get docker to actually pass the GPU to the container.

Should just be passing `--gpus all` these days, shouldn't it?

> I'm curious what the hurdles are here for packaging something like this.

No hurdle, it's just that Docker is not part of the standard toolkit of ML people. ML people all use conda (or virtualenv etc.) which already solves most of the dependency problems, making learning Docker not especially appealing.

But virtually all training/inference platforms (including the ones used by OpenAI) use Docker. It's not a technical limitation.

No, ML people use Docker to deploy their models. Nobody got time or patience to replicate environments for inference.

Why not have some ML-rendition of BitTorrent designed for sharing pre-trained models? At least then you're not paying for AWS egress ha

That's not the issue. You can already download weights of other models through BitTorrent or from large archive sites like Hugging Face or the-eye. OpenAI just doesn't want to release them, so that they can sell access to it through their cloud service.

An older, but similar and still impressive alternative is available here: https://github.com/CompVis/latent-diffusion

If you have a decent amount of VRAM, you can use it to start generating images with their pre-trained models. They're nowhere near as impressive as DALL-E 2, but they're still pretty damn cool. I don't know what the exact memory requirements are, but I've gotten it to run on a 1080 Ti with 11GB.

EDIT: I also tried a 980 with 4GB of VRAM a while back, but that failed... so you probably need more than that.

This uses several models. One of them is already trained and released, which you can use, or you can train it yourself too. The other models need training. Training means having a labelled dataset and letting the model run for many iterations until the loss function reaches a good value, at which point the model is usable. This code is quite complex, as it is not one model but several, and you can also make changes to generate bigger images and to use better models for some parts. Once you have everything trained (you might need a GPU or a cluster of GPUs to train it for several days/weeks/months), you can use it.

I know this won't help you very much. But either you are willing to spend a lot of time messing around with the code and have access to good hardware to train the models, or you will have to wait until someone releases something already trained.

> Once you have everything trained (you might need a GPU or a cluster of GPUs to train it for several days/weeks/months), you can use it.

It takes a whole team to replicate an advanced model. Take a look at our open source working groups: Eleuther, LAION, BigScience. They worked for months on one release and burned millions of dollars of GPU time (gracefully donated by well-meaning sponsors).



The BigScience group is massive:


I didn't even know anything like those groups existed. That's very helpful. Thank you!

The readme says they're working on a command line interface (CLI). You might want to wait for that if you don't want to learn PyTorch.

Once that's done it might be as simple as: install the package, and then find some training dataset, and then run the training CLI (for days or weeks or more), and then run the image generation CLI. Or even: just download an already trained model and use the image generation CLI immediately.

It looks like big-sleep [1] is their CLI? I did:

    pip install big-sleep
    dream "a pyramid made of basketballs"
and after a few hours got this: https://i.imgur.com/FxdfdmV.png

Not nearly as cool as the real DALL-e, but maybe I'm missing something.

[1] https://github.com/lucidrains/big-sleep

Is there a reason why the pretrained models are not included as well? Or perhaps this is coming at some point in the future? There are many users who simply want to play around with the functionality, who lack the time and hardware required to replicate the training steps.

As someone who has trained transformers with only ~5 layers on 1000 data point inputs for a decoder of length like 10 (so far fewer than the 12 billion parameters of DALL-E [1]), I learned two things:

1. Training is very finicky and time-consuming.

2. PyTorch model files are pretty large.

For point 1, when I was training a different autoencoder, for example, I would run it for a week on 4 GPUs, only to return and see that its results were subpar. For point 2, I would get 100 MB model files for a minuscule transformer (relative to DALL-E numbers). Combined with the strong dependence of the transformer on the training data, this can make it problematic when it comes to sharing pre-trained models for tasks that the user doesn't specify a priori.

[1] - https://venturebeat.com/2021/01/05/openai-debuts-dall-e-for-...
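On point 2, checkpoint size scales roughly linearly with parameter count. A rough estimate, assuming fp32 weights (4 bytes per parameter) and ignoring optimizer state, which can double or triple the file:

```python
def checkpoint_gb(n_params, bytes_per_param=4):
    """Approximate checkpoint size in GB for fp32 weights only."""
    return n_params * bytes_per_param / 1e9

print(checkpoint_gb(12_000_000_000))  # DALL-E's quoted 12B params -> 48.0 GB
print(checkpoint_gb(25_000_000))      # a small transformer -> 0.1 GB (~100 MB)
```

which is consistent with ~100 MB files for a tiny transformer and tens of gigabytes for DALL-E-scale models.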

I will gladly download a terabyte of model data if that’s what it would take, your numbers don’t scare me!

Large models generally require beefier GPUs and more RAM. Wide (not just deep) models require more GPU cores, total model size requires more RAM. Some models require multiple GPUs to train and/or run. Some require non-consumer server GPUs. These are non-trivial hardware investments if purchased, and non-trivial cloud bills if rented, even at preemptible spot pricing.

I'd pay money for pretrained models that can run on consumer gaming GPUs (8gb vram) for things like this. I wonder if anyone is selling these.

The main issue is that what neural nets are doing is just very fancy heuristic compression of vast amounts of data into a few gigs of vram (and a useful way to access it), so the smaller the model is the less it'll be able to do.

Most of the pretrained models I see for the stuff I'm working on (Coqui, ESRGAN, etc.) are really tiny in size. I'm sure the much better models are bigger but still fit under 8gb vram. It seems like there could be a market for models that fit into 8gb vram, but I'm not familiar enough with users of ML to say for sure.

> Is there a reason why the pretrained models are not included as well?

Because everyone else, too, lacks the time and hardware required to replicate the training steps.

There just aren't that many organizations that can pay for 100-200 thousand hours of training on A100 GPUs.

So the reason the pretrained model is not included is that it does not exist (yet).

There should be a system in place like Folding@Home or SETI@Home for situations like this.

Agreed. This feels like the next big step in open source machine learning. I would love to be able to dedicate CPU / GPU towards chunking out open-source GPT-3 / DALL-E 2 clones so long as the trained weights are shared with the world afterwards.

That feels more long-term productive to me than crunching coins.

For what it's worth, the architecture for seti@home doesn't make much sense any more in the modern era for most types of data processing. It's simply too slow and error-prone to transfer large amounts of data between user machines. Modern radio telescope data analysis happens at datacenters with a lot of GPUs in them, just like modern AI model training.

Okay, gotcha. Just out of interest, why is it error prone? Because of some kind of merge conflicts of the results?

No, I just mean you design your system expecting the data from machine 1462 to get analyzed at some point, but that machine belongs to some random person who uninstalled your software without you knowing about it, so you never get back an analysis. Or you have bugs that come from 100 different versions of your software running because you can't simply upgrade the software on all the machines of the datacenter.

I understand.

> you can't simply upgrade the software on all the machines

You can update a website. And a website can use the local GPU.

The "break the training task into smaller pieces, parallelize, and distribute it" part is indeed beyond my skills.

In a way, it is similar to our approach at Q Blocks. Decentralized GPU computing for ML.

> There is just not that many organizations that can pay for 100-200 thousands hours of training on A100 GPUs.

How much is this in money?

With no discount, one A100 is about $3/hour, so we are talking about $600K for a full training. Probably closer to $1M, since in practice you never train successfully to the end on the first shot.

IIRC a K80 goes for between $0.30 and $3 per GPU-hour.

Pretty expensive all in all

between a few hundred K and a few million, IMO

It literally says:

> To train CLIP, you can either use x-clip package, or join the LAION discord, where a lot of replication efforts are already underway.

Lucidrains is an incredible guy. Nice guy in his discord too.

He has started quite a few other reimplementations, see https://github.com/lucidrains . Before DeepMind open sourced AlphaFold2, his implementation was one of only two available.

He really is... how has he not been snatched up by OpenAI etc.? He embodies the commitment to open source.

OpenAI isn't open. Hopefully it never happens, haha

> He embodies the commitment to open source.

That's probably why, since OpenAI isn't open source.

Is there a pretrained model for it? What's the smallest cluster of GPUs you'd need to run this?

what is his discord? is it publicly accessible?

It's in the readme

is it LAION ?

He left Discord lol

Is it just me or is OpenAI the exact opposite of open?

They may as well just call it ClosedAI at this point, seems like every interesting thing they do is kept under wraps.

I feel like their initial mission of "democratizing AI" was just a way to recruit people. They're everything but "open".

Right. The switch away from openness coincided with their conversion to a for-profit and the Microsoft investment.

Democratising AI... through our opaque proprietary API...

> ... through our opaque proprietary API...

after you get approved on their wait list of course

They're as open as microsoft is micro.

Pretty sure Microsoft still makes `soft`ware for `micro`computers

Based on the constant pushes I get in Windows to use MS services, I believe the software is just an advertising platform, and the actual moneymakers are cloud services and selling users' data.

It’s a grey area, but I’d say they haven’t made software for microcomputers for some time.


I think that's a stretch, but even if you use that definition Windows IoT core fits the bill.

Ya sounds about right. Also curious if DALL-E 2 allows people to request edits to the generated images, or just one-shot attempts

You can use it to infill subregions. Or you could add a "but <your requested edit>" to your prompt and run again. It's not super robust to prompts involving complicated composition, though, so that might not be a reliable strategy.

I agree; perhaps they are afraid of negative impacts of releasing their work to the public. Either way, I’m not sure I understand the name choice at this point.

Honestly, I think the main problem is that training large models like this, or GPT-anything, or anything like that, costs a lot of compute - which costs a lot of money. So if you're just investing a lot of compute but giving everything away for free, that isn't a strategy that can be maintained.

So what we really need, are ways to train models in a distributed fashion. Their initial goal was to 'democratize AI'. But democratizing these models doesn't just mean giving everyone access to your trained parameters, it also means that everyone should be able to spend an effort on improving it. I think a good solution for this would be huge.

So when they formed OpenAI and thought "we are gonna hire the most brilliant scientists in the field" they forgot to account for hardware costs?

At this point it's more like doublespeak.

They just announced they'll be rebranding as OpenYourWallet

Do they have a revenue model? I thought it was a non-profit / public benefit company.

Training is a one-time computation, though; running the model still costs cycles, and people have to pay for it (see sites like Hugging Face or GooseAI). What people are asking about is the code itself and the one-time-computed trained weights. They're not asking to run the trained model for free.

Anyway, getting cycles isn't too hard, expect CoreWeave or another compute-centric company to hop on board in due course.

> They're not asking to run the trained model for free.

I am :D

> So what we really need, are ways to train models in a distributed fashion.

How is the replication effort for this project handling that? They have over 600 people on their Discord right now, do they have some way for them to cooperate on training a single model?

As far as I can see, no.

GPT-J and GPT-Neo are previous open source reproductions of GPT, but distributed training is overrated. There’s way too much communications overhead. I’m not sure any of the various @home projects have ever accomplished much for that matter.

Instead they use donated time on a TPU or a GPU cloud provider.

An exception is Leela Zero since you can realistically have it play chess at home.

> Instead they use donated time on a TPU or a GPU cloud provider.

So instead distribute the funding? And distributing running models too? Something something crowdfunding blockchain cryptocurrency

Or alternatively https://www.microsoft.com/en-us/research/publication/decentr...

>donated time on a TPU or a GPU cloud provider

hmmm... what about people who mine crypto on their hardware, and then that hardware goes into a pool of shared $ used to rent out a TPU

They could have chosen to work on smaller models and used their already substantial funding to be truly open. Instead they chose a path that demonstrates the Open part of OpenAI was just marketing.

That was their fear with GPT-3, because it would enable whole "fake news" articles to be written without much effort. At this point, why should they release trained models? AI developers have a problem now: either show their work (locked down and limited to protect their image), or don't show it. If they do show it, it'll be reproduced. What's interesting to me is how we are approaching the free, open source use of these systems. An open source DALL-E 2 could create some very abhorrent imagery.

It's the same phenomenon as when a protocol or data interchange format brands itself as "simple".

This uses OpenAI’s CLIP model which is open source: https://github.com/openai/clip

I still think what happens is that they started out on a mission of "we're going to democratize AGI access!" and then shortly thereafter, the not-crazy people managed to convince them what an astoundingly terrible idea that would be.

Doesn't AGI fall under the first amendment? ;)

I would say that they are not as open as when they formed as a non-profit.

That said, their commercial APIs for GPT-3, etc. are very good and my monthly bill for using them seems like very good value.

What do you use the API for? I've not really seen any interesting commercial uses other than customer support bots that are slightly more convincing, but not something consumers actually want (they actually want the human who can solve their problem).

Github copilot?

Isn't that common for startups that want to gain users anyway? (Not charging entirety of what market will bear, anyway)

It’s important to recognize the definition of “open” that sama and team embrace versus the definition of open source and “free as in beer.” The early team gathered around the idea of achieving lift (as in marketing lift) from sharing research results and building upon open source. They do that really well, relative to MS Research and DeepMind. OpenAI was never going to focus primarily on open source or “free as in beer” - their goal was to market themselves so as to draw attention away from that. Well, I’m sure some of the team wanted that, but I’m also sure sama was more than willing to embrace his own definition of success versus that of the dev community.

ClosedAI is a bit harsh, but they have definitely both divided the community somewhat on the definition of “Open” and been very comfortable with focusing on their toys and ignoring the conversation. I mean sama, especially when it comes to Worldcoin, has been very direct about ignoring criticism in general.

How much does it matter? Everyone's practically forgotten that CompVis and Midjourney were just in the past few months, Sberbank has ruDALL-E, whose largest model is somewhere between DALL-E 1 and 2, FB has Make-A-Scene, which is somewhat below GLIDE/DALL-E 2, and Tsinghua's CogView2 paper was released just days ago and is claiming roughly DALL-E 2 parity. AI capabilities are advancing fast enough that whether or not OA releases something, there will be competitors soon. (Heck, you want a GPT-3 competitor? There are like a dozen of them of similar or better quality, several of which are public or API - BigScience, Aleph Alpha, & AI2 come to mind.)

I'm not in this space... do you mind explaining more, please? I'm really curious.

They were supposed to be a non-profit company that would help develop AI. The AI community is very open, and here you have someone unaffiliated with OpenAI releasing an open source implementation of OpenAI's research (because they didn't release their code).

It seems like it started open, but then they might have switched directions

This needs distributed training...

Years ago I made a shared tensor library [1] which should allow people to do training in a distributed fashion around the world. Even with relatively slow internet connections, training should still make good use of all the compute available, because the whole lot runs asynchronously with highly compressed and approximate updates to shared weights.

The end result is that every bit of computation added has some benefits.

Obviously for a real large scale effort, anti-cheat and anti-spam mechanisms would be needed to ensure nodes aren't deliberately sending bad data to hurt the group effort.

[1]: https://github.com/Hello1024/shared-tensor
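A toy illustration of the "highly compressed, approximate updates" idea (this is my own sketch, not the shared-tensor library's actual wire format): quantize each weight delta to one signed byte before shipping it, and apply it on arrival.

```python
import struct

def compress(delta, scale=127.0):
    """Quantize a float weight delta into a single signed byte (lossy)."""
    q = max(-127, min(127, round(delta * scale)))
    return struct.pack("b", q)

def decompress(blob, scale=127.0):
    return struct.unpack("b", blob)[0] / scale

# A node computes an update, ships 1 byte instead of 8, and the shared
# weight is nudged by the approximate value whenever the packet arrives.
weight = 0.5
update = -0.03
weight += decompress(compress(update))
print(weight)  # ~0.4685: close to 0.47, off only by the quantization error
```

Because updates are small and approximately commutative, applying them asynchronously and out of order still moves the shared weights in roughly the right direction.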

Sorry, I'm being too lazy to run the code, but are there any examples of its output? Like, how comparable is it to OpenAI's?

Per the discussions, a prototype trained version of the model was released a few hours ago.

There probably won't be demos for a bit, but sooner-than-later.

We are at the beginning of a post-verification society.

Any image or video of anything will soon be faked by commoners. This will be a wild ride!

One solution is chain-of-custody attestation from trusted hardware. Have cameras sign the video or images (and audio) they record.
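A minimal sketch of that signing step, using a symmetric HMAC from the standard library as a stand-in (real trusted-hardware attestation would use an asymmetric key pair plus a certificate chain, so verifiers don't need the device secret):

```python
import hashlib
import hmac

# Stand-in for a per-device key stored inside the camera's secure element.
DEVICE_KEY = b"secret-key-inside-the-camera"

def sign_frame(frame_bytes):
    """Camera-side: bind the captured bytes to the device key."""
    return hmac.new(DEVICE_KEY, frame_bytes, hashlib.sha256).hexdigest()

def verify_frame(frame_bytes, tag):
    """Verifier-side: recompute and compare in constant time."""
    return hmac.compare_digest(sign_frame(frame_bytes), tag)

frame = b"raw sensor data for one frame"
tag = sign_frame(frame)
print(verify_frame(frame, tag))         # True: untampered
print(verify_frame(frame + b"!", tag))  # False: edited after capture
```

Any post-capture edit invalidates the tag, which is the chain-of-custody property described above.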

In a world where "no, mRNA vaccines don't change your DNA" has to be explained (often, unsuccessfully), I'm not sure I wanna try explaining digital signatures with elections on the line.

My peer at work didn't take the shot because he doesn't believe in "new technologies". He's an engineer with an MSc and a PhD, and worked as a researcher. I mean, I'm not talking about someone totally outside science.

People who deal with complexities understand that experience is the encounter with the disruptive unexpected. Similarly to the Socratic principle, ancient counterpart of the Dunning-Kruger effect. Trust is heavily "modulated" in the experienced, it's kept at a distance. It's normal.

So, back to the topic, we may have a few ways to build new shields against the new weapons, but this will be not trivial in technical and social matters, and their interactions - having the involved parts adequately understand the new realities: "a document may not be just trusted", "there is no boolean trust value but a 0..1 fraction", "signatures may not be a total warranty"...

To some people, "If you received an e-mail using that sender's name, it does not mean he sent it; no, his account was not necessarily hacked (etc.)" is already a massive blow to their inner world, one that makes them slump into their chair, finding it unbearable if taken seriously and incomprehensible if considered.

And, some people will find it "less painful" to distrust /you/ than to break the laws in their inner world (it's a phenomenon you may also meet in this very forum) - some minds observe a regimen of strict conservative low expenditure. Complexity involves social dangers.

> Similarly to the Socratic principle, ancient counterpart of the Dunning-Kruger effect.

I don't remember ever hearing of "the Socratic principle" – what do you mean by that?

I had a quick google. There are only 12 pages of results for the phrase, i.e. it's not commonly used, and every hit for "the Socratic principle" I find seems different. The first four I found: "Virtue is knowledge", "follow the argument wherever it leads", "Wherever a man posts himself on his own conviction that this is best or on orders from his commander, there, I do believe, he should remain", and "Whenever we must choose between exclusive and exhaustive alternatives which we have come to perceive as, respectively, just and unjust or, more generally, as virtuous and vicious, that very perception of them should decide our choice. Further deliberation would be useless, for none of the non-moral goods we might hope to gain, taken singly or in combination, could compensate us for the loss of a moral good." If I had to say what I thought Socrates' main principle was, it might be "study humans and ethics, not nature/physics or maths". It doesn't sound like you meant any of those, but I'm not sure.

Apologies for the unintended obscurity - I thought the meaning would be apparent from the context and from the mention of Dunning-Kruger as a later counterpart. I meant that "the more you investigate, the more you realize that the most solid knowledge you have is that of how little you know".

(The Pythia, the Oracle of Delphi, was said to have declared Socrates the wisest man in Athens: Plato has him comment in the Apology that if there is one thing that makes him wise, it could be that he does not delude himself about his ignorance of all the things he does not know.)

By such a "Socratic principle on wisdom and knowledge", the wiser you get, the more you see the limits of your knowledge; by its counterpart, the D-K, lack of experience leads subjects to underestimate their ignorance.

People experienced in dealing with problems experience a large amount of unexpected obstacles, unexpected in the mental framework they brought: after some long while, difficulties and complications are expected preemptively, they are part of the picture one forms. Experience brings diffidence.

> People experienced in dealing with problems experience a large amount of unexpected obstacles, unexpected in the mental framework they brought: after some long while, difficulties and complications are expected preemptively, they are part of the picture one forms.

i.e. "Experienced people expect obstacles."

That of «obstacles» is not really the point. It is more that the experienced person knows ignorance, their own and others', so they will build distrust.

The experienced person knows that such a naïve «mental framework» is very common: they will expect naivety.

Is manual image editing not already faking?

We're talking about the scale. This is "manual image editing" on steroids.

Time stamping and trust authorities are going to be very valuable. Keysigning ceremonies will become the law of the land.

I’m not an AI person, when can I use my MacBook to make crazy cool images I’ve seen on various twitter threads via DALL-E 2?

If you have some command-line experience, you can rent some compute and run it on that hardware. I am renting an A40 with 48 GB of VRAM for $0.40/hr on Vast.ai right now to make cool images. I am still using Big Dream, which I think is still based on DALL-E, not DALL-E 2. I couldn't figure out how to run this latest iteration (yet!).

Ballpark answer: ~5 years until this is running on local hardware owned by the average person. Hardware is always improving and the algorithms will get more efficient.

Until then, you'll likely be able to use an API to generate images within the next 6 months.

I remember lucidrains from his site, Epicmafia. Good times! Made so many friends there, many of whom I still talk to today.

This is a wonderful thing. But does it have any positive applications outside of entertainment?

I can think of lots of malicious things it can be used for.

I expect all usages of this tool to border on fiction in some way. After all, the images don't depict anything real. They might accompany a nonfiction article as an "artist's impression," though, as so often happens in popular science articles about things we have no images of yet.

It's become increasingly common to start off articles with an image, even when the article is describing an abstract concept, just to avoid having a "wall of text." If that's what you need to do, it seems like a more interesting alternative to clip art.

This honestly seems like just a fun activity. Why not start every doc with a DALL-E generated image?

I'd much prefer tech being open to everyone rather than kept to megacorporations that OpenAI deems worthy (and has received funding from). A neural network isn't a lethal weapon, the potential for malicious use is much more limited than its potential positive creative applications.

> A neural network isn't a lethal weapon

If a click generates a deepfake that causes someone to commit suicide, it might be considered lethal in some jurisdiction.

Let’s consider two individuals A and B.

A is a mentally healthy person living a fulfilling life. Will watching a disturbing deepfake affect A severely enough to cause them to commit suicide? No.

B leads a life of extreme isolation and faces troubling financial difficulties. They come across a deepfake that pushes them over the edge at the wrong time, and they give up on their life. Is the deepfake to blame? Or were there more important factors that we could have focused on to save B and potentially thousands of others?

And since we're discussing the unintended consequences of deepfakes, let's not forget that a carefully crafted one could hypothetically even save lives.

When I read the parent comment I was thinking a deepfake designed to cause extreme reputational damage (cheating on spouse, etc.) in which case even person A would be at risk of having their social circle destroyed.

I think it’s a lot easier to create a harmful deepfake than a helpful one. It’s easier to create harmful lies than helpful truths.

Imagine - you could convince a parent that their child has died. Show them photographic evidence. I don’t know how close to the edge they’d need to be.

The guy drawing funny memes for John Oliver's Last Week Tonight (like the following) can be replaced by this!


Infinite supply of Dixit cards?
