That's because AI art was a mere research toy back then.
Now that it has massive economic potential, do you really think downloading, say, 100TB of images is a big deal to any company? Less than $100,000 for an invaluable dataset that's safe and sound. It's also not prohibitive at all to just clone this data onto a NAS array and literally ship the hard drives to whoever needs them.
Google has already downloaded terabytes of images, and it stores resized thumbnails of those images on its own servers. If Google can do it, why can't everyone else?
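The cost claim above is easy to sanity-check with back-of-envelope arithmetic. The per-terabyte prices below are rough assumptions for illustration, not quotes from any vendor:

```python
# Back-of-envelope cost of acquiring and storing ~100 TB of images.
# Both prices are assumptions for illustration only.
TB = 1_000_000_000_000  # bytes per terabyte (decimal)

dataset_bytes = 100 * TB
price_per_tb_hdd = 15.0       # assumed $/TB for commodity hard drives
price_per_tb_transfer = 5.0   # assumed $/TB for bulk bandwidth

storage_cost = (dataset_bytes / TB) * price_per_tb_hdd
transfer_cost = (dataset_bytes / TB) * price_per_tb_transfer
total = storage_cost + transfer_cost

print(f"storage ~${storage_cost:,.0f}, transfer ~${transfer_cost:,.0f}, "
      f"total ~${total:,.0f}")
```

Even if you multiply these assumed prices several times over for redundancy and labor, the total stays well under the $100,000 figure.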
1) That's not how it works.
LAION is a set of URLs, not images.
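To make that concrete, here's a minimal sketch of what "a set of URLs, not images" means in practice. The field names are illustrative, loosely modeled on LAION's released metadata, and the URLs are placeholders:

```python
# A LAION-style record carries a URL plus metadata; anyone rebuilding
# the dataset has to fetch each image themselves, from the host's
# servers, at whatever state the image is in at fetch time.
import csv
import io

laion_style_csv = """url,text,width,height
https://example.com/a.jpg,a painting of a ship,512,512
https://example.com/b.png,studio photo of a cat,1024,768
"""

records = list(csv.DictReader(io.StringIO(laion_style_csv)))

# Scraping step (not run here): issue an HTTP GET per record["url"] and
# store the bytes -- this is exactly where a site-side change to the
# source images (watermarking, poisoning) takes effect on new scrapes.
urls_to_fetch = [r["url"] for r in records]
print(urls_to_fetch)
```

The point being: the distributed artifact is the metadata table, and the images themselves live on (and are served by) the original hosts.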
Actually distributing the images would be blatant copyright infringement and, even if you don't care about that, the dataset is prohibitively massive to host.
So, yes, modifying the source images will be effective against new players who want to scrape the dataset... and yes, as the tech to 'build your own stable diffusion' becomes more commoditized, you can be 100% assured that multiple groups will be doing it in the future.
...the tool's FAQ also states that even a limited number of poisoned images (e.g. the most recent art by Greg) can cause artifacting in outputs.
2) ...but more importantly, are you joking?
Artists complain, and ArtStation does what, update their TOS?
That obviously does nothing.
So, companies start inventing anti-AI watermarking tech that shops like DeviantArt and ArtStation can pick up, drop into their pipelines, and turn into a pro feature.
If you think it's not going to happen, you are 100% deluding yourself. The playbook is so obvious, I'd be amazed if it's not already a WIP for these places.
Artist-friendly. Makes money. Big players can work around it, and small players are screwed out of any good images to use for training.
It's going to happen.
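As a toy illustration of where such a watermarking step would sit, here's a pure-Python sketch that flips each pixel's least-significant bit to embed a repeating tag. Real adversarial cloaking (Glaze-style perturbations) is far more sophisticated; this only shows the pipeline hook, and every name here is made up for the example:

```python
# Toy "watermark on upload" step: overwrite each pixel's least-significant
# bit with the next bit of a repeating tag. Pixel values shift by at most
# 1, so the change is visually negligible on 8-bit images.

def embed_tag(pixels, tag_bits):
    """Return a copy of `pixels` whose LSBs carry `tag_bits`, repeated."""
    out = []
    for i, p in enumerate(pixels):
        bit = tag_bits[i % len(tag_bits)]
        out.append((p & ~1) | bit)
    return out

original = [120, 121, 122, 123, 124, 125]   # one row of grayscale pixels
tagged = embed_tag(original, [1, 0, 1, 1])  # hypothetical site tag

# Each pixel moved by at most 1, and the LSBs now spell out the tag.
assert all(abs(a - b) <= 1 for a, b in zip(original, tagged))
print([p & 1 for p in tagged])
```

An LSB tag like this is trivially stripped by re-encoding, which is exactly the asymmetry in the comment: big players can afford the workaround, small scrapers mostly won't bother.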
3) '40 minutes to return a single image...'
Oh come on. Are you objecting because you think the process has no technical merits, or because you don't like it?
Technically, this isn't a limitation. These places already have massive image processing pipelines for resizing, thumbnailing, etc.
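For a sense of what one step of such a pipeline looks like, here's a minimal pure-Python nearest-neighbor thumbnail pass; a watermarking transform would be one more function call in the same per-upload loop. This is a sketch under that assumption, not any site's actual code:

```python
# Minimal per-upload transform of the kind image hosts already run at
# scale: downscale a 2D grid of grayscale pixel values by sampling the
# nearest source pixel for each output pixel.

def thumbnail(grid, out_w, out_h):
    """Downscale a 2D list of pixel values via nearest-neighbor sampling."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# An 8x8 synthetic "image" reduced to a 4x4 thumbnail.
image = [[(x + y) % 256 for x in range(8)] for y in range(8)]
thumb = thumbnail(image, 4, 4)
print(len(thumb), len(thumb[0]))
```

Since every upload already flows through transforms like this, adding one more pass is an integration task, not a new capability.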