as a layman, i imagine for someone at the scale required it may not be worth the risk or the added effort vs paying or using a different model but it'd be funny if we see companies creating a subsidiary that just acts as a web-passthrough to "legalize" llama2 output as training data
as long as it's about content they generated which wasn't modified (at most cropped or scaled), this is quite viable
e.g. you can encode a subtle pattern in the generated image which survives compression and isn't really human visible
then you make a browser extension to spot that pattern and indicate it to the users "in some way"
given that there is an overlap between AI company owners and the biggest browser producers and mobile OS vendors, this doesn't even need to be an extension but can be built in
obviously any bad actor is likely able to remove it or otherwise still trick users
> encode a subtle pattern in the generated image which survives compression and isn't really human visible
This is basically a contradiction in terms. Compression attempts to throw away any and all data that "isn't really human visible," that's how it works. There isn't space for invisible watermarks by design. You can kind of get away with something "at the edge" that survives an initial JPEG encoding, but there's no way it's going to reliably survive e.g. resizing, cropping, and recompressing and still remain invisible.
Also, most AI-generated content is presumably going to be text, not images. Good luck watermarking text that's a paragraph long. (There are potential tools that can operate on text the size of a news article, but they are also trivially defeated by swapping a few prepositions and synonyms.)
> There isn't space for invisible watermarks by design.
Very incorrect. Steghide [1] supports JPEG. JPEG and other lossy image formats are ultimately just fancy file formats; there's nothing preventing you from encoding arbitrary messages in a compressed image.
> You can kind of get away with something "at the edge" that survives an initial JPEG encoding, but there's no way it's going to reliably survive e.g. resizing, cropping, and recompressing and still remain invisible.
I am pretty sure that I can design a steganography algorithm that disperses a small message across a JPEG in a way that is:
1. invariant to resizing (absolutely certain this is possible),
2. robust to cropping (invariant to cropping up to some limit is definitely possible; e.g. if you crop 100% of the image then obviously everything goes out the window),
3. robust or even invariant to recompression. This seems a lot harder but I'm pretty sure it's possible.
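For what it's worth, the easy half of item 3 can be sketched in a few lines. The following is a minimal illustration, not the scheme the commenter has in mind: it hides one bit in a single DCT coefficient of an 8x8 block using quantization index modulation (a technique swapped in here for illustration), then pushes the block through a crude stand-in for recompression. The names `embed_bit`, `extract_bit`, `recompress`, the coefficient position `POS`, and the step `DELTA` are all hypothetical, `DELTA` is deliberately coarse rather than tuned for invisibility, and nothing here addresses resizing, cropping, or an adversary.

```python
import math

N = 8            # JPEG-style block size
POS = (2, 1)     # coefficient carrying the bit (arbitrary choice for this sketch)
DELTA = 40.0     # embedding step: coarse on purpose, not tuned for invisibility

def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        out.append(scale * sum(v[n] * math.cos(math.pi * (n + 0.5) * k / N)
                               for n in range(N)))
    return out

def idct_1d(c):
    """Inverse of dct_1d."""
    out = []
    for n in range(N):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / N) * c[k]
                       * math.cos(math.pi * (n + 0.5) * k / N)
                       for k in range(N)))
    return out

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(block):
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])

def idct_2d(coeffs):
    cols = [idct_1d(c) for c in transpose(coeffs)]
    return [idct_1d(r) for r in transpose(cols)]

def embed_bit(block, bit):
    """Snap one coefficient to a multiple of DELTA whose parity equals the bit."""
    coeffs = dct_2d(block)
    u, v = POS
    ratio = coeffs[u][v] / DELTA
    k = round(ratio)
    if k % 2 != bit:
        k += 1 if ratio >= k else -1   # step toward the original value
    coeffs[u][v] = k * DELTA
    return idct_2d(coeffs)

def extract_bit(block):
    """Read the parity of the nearest multiple of DELTA."""
    u, v = POS
    return round(dct_2d(block)[u][v] / DELTA) % 2

def recompress(block, q):
    """Stand-in for lossy re-encoding: quantize every coefficient with step q,
    then round pixels to integers in 0..255. Not a real JPEG codec."""
    coeffs = [[round(c / q) * q for c in row] for row in dct_2d(block)]
    return [[min(255, max(0, round(p))) for p in row] for row in idct_2d(coeffs)]

# Hypothetical host block: a smooth gradient well away from 0 and 255.
host = [[110 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
for bit in (0, 1):
    marked = embed_bit(host, bit)
    once = recompress(marked, q=12)
    twice = recompress(once, q=7)
    print(bit, extract_bit(marked), extract_bit(once), extract_bit(twice))
```

The bit survives here because each recompression moves the chosen coefficient by less than `DELTA / 2`; a sufficiently coarse quantizer would erase it.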
> Also, most AI-generated content is presumably going to be text, not images. Good luck watermarking text that's a paragraph long. (There are potential tools that can operate on text the size of a news article, but they are also trivially defeated by swapping a few prepositions and synonyms.)
Yeah, text seems more difficult. Images are also difficult/impossible if you assume the model user is adversarial and competent, which I'm not sure why you wouldn't assume.
For any particular model you can probably do detection with a fair bit of inaccuracy. But I would definitely put detection in the "doomed" category.
I also think the threat is real but wildly over-stated relative to the non-AI status quo. We're slightly democratizing Photoshop and copywriting skills, which weren't exactly scarce to begin with. It's not an AI problem, and it's barely a technology problem. It's primarily a political problem.
> 3. robust or even invariant to recompression. This seems a lot harder but I'm pretty sure it's possible.
No, that's my main point. By definition, "perfect" compression will discard everything not human-noticeable, which leaves no room for watermarks/steganography. So the only room for watermarks is in the margin where compression is currently imperfect, i.e. encoding more detail than needed.
But that's relying on artifacts that vary dramatically with compression technique (JPG vs PNG vs WebP etc.), with basic image manipulation (adjusting brightness, contrast, color, etc.), and other basic operations like resizing. So as soon as you chain any of these together, watermarking falls apart.
> Steghide [1] supports JPEG.
Yes, I already said in my comment 'You can kind of get away with something "at the edge" that survives an initial JPEG encoding'. But as I'm saying, it's not robust or reliable as images get reused. The whole point of a watermark is that it survives copying -- e.g. it would show up as dark text if you xeroxed a watermarked document. That type of robustness or reliability is just not possible here as users download and re-upload images that get re-encoded, because the entire point of image compression is to try to throw away anything and everything the human eye doesn't care about.
I think you're being distracted and confused by imprecise and inaccurate descriptions of the intent of various algorithms, instead of considering what actual algorithms actually do.
No existing compression algorithm was designed to be "perfect" in your sense of the word, and none are. They were designed to be good enough, under a lot of different constraints (ease of implementation, extant mathematical tools/knowledge at the time, computation time, etc.)
Instead of talking in hand-wavy terms about hypothetical objects in a wishy-washy way, let's remember that lossy image formats and compression schemes are just pieces of mathematics. E.g., the basic JPEG algorithm is a fairly simple procedure that can be explained to any college student and even moderately above average middle schoolers. Is it perfect? No. Is it what actually gets used in reality? Yes.
> That type of robustness or reliability is just not possible here as users download and re-upload images that get re-encoded, because the entire point of image compression is to try to throw away anything and everything the human eye doesn't care about.
Let F be a set of functions f : IMG -> IMG that most/all users and platforms use for compressing images. The question is whether there exists a pair of functions s,t such that for an image i \in IMG:
1. s(i) is roughly the same as i to the bare human eye but contains a message m. (Or not? Depends on the use-case.)
2. t(s(i)) ~= m for some notion of similarity ~= which is sufficient for watermarking.
3. for any f1, ..., fn \in F, t(fn(...f1(s(i))...)) ~= m.
We can relax constraint 1 because we probably only care about a subset of IMG, etc. etc.
Your impossibility conjecture about the existence of s,t for common extant F's doesn't seem nearly as obvious as you're claiming. And there are CERTAINLY choices of F for which s,t do exist. E.g. for the basic JPEG algorithm I'm pretty darn confident I can design s,t that are robust to various parameters and also where you only need at least k pixels uncropped to recover a message ~= m. And not just design it, but write a fairly short and intuitive mathematical proof explaining precisely why it works.
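To make the definitions concrete, here is a toy instance in which s and t do exist for a (very friendly) choice of F. It is a sketch under loud assumptions, not the construction the comment alludes to: the image is a flat list of grayscale values, `s` adds a keyed pseudo-random +/-3 pattern (a basic spread-spectrum watermark, chosen here for illustration), `t` reads the message bit from the sign of a correlation, and F contains made-up stand-ins (`quantize`, `brighten`, `lower_contrast`, `add_noise`) rather than real codecs. Cropping and resizing are deliberately left out, since they would desynchronize this naive pattern, and there is no adversary.

```python
import random

WIDTH = HEIGHT = 256
KEY = 1234   # hypothetical shared secret between s and t

def carrier(key, n):
    """Keyed pseudo-random +/-1 sequence shared by s and t."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def clip(p):
    return min(255.0, max(0.0, p))

def s(image, m, key, strength=3.0):
    """Embed bit m: add the carrier for 1, subtract it for 0."""
    sign = 1.0 if m else -1.0
    return [clip(p + sign * strength * c)
            for p, c in zip(image, carrier(key, len(image)))]

def t(image, key):
    """Recover the bit from the sign of the correlation with the carrier."""
    mean = sum(image) / len(image)
    corr = sum((p - mean) * c for p, c in zip(image, carrier(key, len(image))))
    return 1 if corr > 0 else 0

# A toy F: stand-ins for what platforms do to images, not real codecs.
def quantize(image, step=8.0):
    return [clip(round(p / step) * step) for p in image]

def brighten(image, delta=10.0):
    return [clip(p + delta) for p in image]

def lower_contrast(image, factor=0.8):
    return [clip(128.0 + factor * (p - 128.0)) for p in image]

def add_noise(image, sigma=4.0, seed=7):
    rng = random.Random(seed)
    return [clip(p + rng.gauss(0.0, sigma)) for p in image]

F = [quantize, brighten, lower_contrast, add_noise]

# Synthetic host image (plays the role of i): a smooth diagonal gradient.
host = [(x + y) / 2.0 for y in range(HEIGHT) for x in range(WIDTH)]

for m in (0, 1):
    out = s(host, m, KEY)
    for f in F + F[::-1]:      # chain every f in F, forwards then backwards
        out = f(out)
    print(m, t(out, KEY))      # embedded bit, then recovered bit
```

This only satisfies constraint 3 for the four toy functions above; it says nothing about real codecs, which is where the actual disagreement lies.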
In fact, if you know how JPEG works and other applications of Fourier transforms (e.g. in acoustics and perhaps also crypto), you might see why it would be more surprising if doing this were NOT possible at least for various JPEG implementations/parameterizations!
Stepping aside from JPEG in particular, you might need to know something about how each function in F works, perhaps intimately, and there is probably some clever mathematics involved for many choices of F.
You could also view this as an optimization problem and use various tools that got really popular in 2016 or so to build quite robust solutions that don't depend on the particularities of your choice of F. I'm less certain this would give absolute guarantees, but I bet you'd end up with stuff that works well in practice.
But in any case, it's unclear why you are so convinced this is impossible.
I would like to know the same thing. In the case of JPEG and similar schemes in particular, an impossibility result that isn't unrealistically narrowly scoped (again assuming a non-adversarial user) would be highly surprising.
> then you make a browser extension to spot that pattern and indicate it to the users "in some way"
If that's an Open standard, and if that browser extension is Open Source, then anyone who wants to avoid that can mess with the final image until the Open Source free standard that everyone is using no longer detects the image.
The only way this is viable is with DRM or a closed service; if it's a standard everyone follows, then circumventing it is trivial. It only works if it's shrouded in secrecy and attackers can't freely red-team the tool. These kinds of watermarks work when there's a very limited pool of people checking for the watermarks, they don't tell people how the watermarks are generated, and they don't tell people how to check for the watermarks.
But that's not really useful for the current situation -- we don't want to further entrench these companies and we don't want it to be costly to check if an image is AI-generated.
I don't think it's viable to do this without significantly curtailing user agency or designing a system that is fully opaque and inaccessible to most people.
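The circumvention argument can be made concrete with a toy example. This is a sketch under stated assumptions, not a description of any deployed system: the watermark is a keyed pseudo-random +/-3 pattern, the published detector (`detect`) thresholds a correlation score, and because the standard is open the attacker is assumed to know the key. All function names and parameters are hypothetical.

```python
import random

WIDTH = HEIGHT = 256
KEY = 99   # public, because the detector is assumed to be open source

def carrier(key, n):
    """Keyed pseudo-random +/-1 pattern used by embedder and detector."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(image, key, strength=3.0):
    """Add the carrier at a small amplitude."""
    return [p + strength * c for p, c in zip(image, carrier(key, len(image)))]

def detect(image, key, threshold=1.5):
    """Public detector: mean correlation with the carrier above a threshold."""
    mean = sum(image) / len(image)
    pattern = carrier(key, len(image))
    score = sum((p - mean) * c for p, c in zip(image, pattern)) / len(image)
    return score > threshold

def evade(image, key, step=0.25):
    """Attacker loop: peel off a little more of the carrier until the
    public detector stops firing. Returns the image and the amount removed."""
    pattern = carrier(key, len(image))
    attacked = list(image)
    removed = 0.0
    while detect(attacked, key):
        removed += step
        attacked = [p - removed * c for p, c in zip(image, pattern)]
    return attacked, removed

# Synthetic host image: a smooth gradient kept away from 0 and 255.
host = [32.0 + (x + y) / 4.0 for y in range(HEIGHT) for x in range(WIDTH)]
marked = embed(host, KEY)
attacked, removed = evade(marked, KEY)

print(detect(host, KEY), detect(marked, KEY), detect(attacked, KEY), removed)
```

In this toy setup the attacker changes each pixel by no more than the watermark itself did, which is the sense in which an open detector doubles as a removal tool.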
Yeah, this is my general feeling about why this area is doomed. It's why I haven't bothered to write up a patent even though I had some good ideas a few years ago. Maybe that was a mistake since governments and corporate politicians are stupider than I assumed.
If you have a central source of authority then the problem is totally trivial and the fact that the images are AI-generated (or not) is a complete red herring.
If you don't have a central source of authority then any reasonable adversarial model makes the watermarking problem somewhere between very difficult and impossible.
Detection from known models is still possible, at least for images. But that's not really watermarking per se.
In a weird and perhaps slightly twisted way I am hoping that LLMs become so good at spam and creating garbage to flood the net in english that they forfeit the ability to do so in other languages resulting in a balkanization (in a good way) of discussion and thought back to the native languages of peoples and thus more integrated with their culture.
in practice translation engines (e.g. deepl or LLMs themselves - though i still expect deepl to be better) probably will throw a wrench into this, but perhaps some localization approach in CAPTCHAs or simply just outright banning geoips not belonging to the countries with natives of that language (or significant minority populations) is a quick enough fix. i know some imageboards (e.g. british ones) use this because otherwise they would be flooded with americans.
In practice it may well mean the anglophone internet fills up with spam as the ESLs realize their formerly dying (or dead) websites and forums and boards are basically free from digital black death, and intellectual thought makes a retreat from the world language.
The irony here is that the Finnish internet will probably be less impacted by AI spam as a result. Perhaps in the end internet times it will be nothing but bots and Finns. A fitting outcome.
There's a lot of stuff in various news articles about various forms of imperialism and such, but i would eat my hat if the internal workings of LLMs make their high journalistic cutoff for publishing.
Tokenization strategies will almost certainly play a significant role in language extinction, and as someone with tremendous respect for linguistics and the role that language plays in thought and culture, it pisses me the hell off.
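A rough way to see one mechanism behind this: assuming a byte-level tokenizer (an assumption, since the comment names none), a text can never need more tokens than it has UTF-8 bytes, so bytes per character gives a crude ceiling on how expensive a script can get before any learned merges help. The snippet below only measures that ceiling for three sample phrases; it does not run a real tokenizer, and the phrases and the helper name `utf8_cost` are illustrative choices.

```python
def utf8_cost(text):
    """Return (characters, UTF-8 bytes, bytes per character) for a string."""
    chars = len(text)
    size = len(text.encode("utf-8"))
    return chars, size, size / chars

# Escapes are used so the exact code points are unambiguous.
samples = {
    "English": "good day",
    "Finnish": "hyv\u00e4\u00e4 p\u00e4iv\u00e4\u00e4",    # "hyvää päivää"
    "Hindi": "\u0928\u092e\u0938\u094d\u0924\u0947",         # "namaste" in Devanagari
}

for language, text in samples.items():
    print(language, utf8_cost(text))
```

Real tokenizers narrow this gap for languages that are well represented in their training data, so the ceiling matters most for the languages that are not.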
It unironically helped accelerate the death of the game.
This wasn't counter-strike with millions of players and a huge esports scene, this was a tiny mod with maybe a few thousand players that required ~5v5 or higher for a good experience
What a surprise, an arrogant German who has never played Natural Selection is giving a lecture.
This is about the game's community, not the game itself. This man didn't program the game, only a small add-on to an add-on.
I used to play NS a ton, even was in a clan for a while. It was one of the most original and novel video games i've ever played, only planetside comes close.
Combat played a sizeable role in killing the original game and turning it into something i'd expect from a CS mod. An entry level, bland generic version of the original that gutted all the strategy and coordination and teamwork that you required to win on ns_ maps. (A good commander alone was worth gold)
You built upon what many people considered cancer (combat to some extent, but definitely xmenu combat) - tumors get serious when they hit the lymph nodes, and they probably saw it as that.
I quit before this era happened but I probably would have hated you if i hadn't - even if you aren't technically the one to blame.
I'm the author, and I agree with pretty much everything you wrote - NS was special and trying to preserve that specialness was why I put the buildings into Combat in the first place. I miss that era of experimentation both in mods and commercial releases. I would've loved to be online during the heyday of Tribes 2 or Planetside.
Ironically, I don't think I ever consciously built in xmenu<->buildmenu support (though I don't have the .amxx file to hand to check); that was a surprise to me. IIRC, both ExtraLevels and Combat Buildings used the NS module for AMXX to read and write the "unspent level points" for the player. Placing buildings was a lot less crazy when you had to spend one of your nine precious levelups on a welder or a gorge morph, and then spend another on each building.
Combat Buildings made Combat pretty stupid, ExtraLevels made Combat really stupid, but the combination was way stupider than the sum of its parts.
Fate was probably determined at the point that combat came out, xmenu just accelerated it with absurd upgrades that let one person steamroll the server (though it was sometimes fun to be that one person, it wasn't for the other half of the server...)
At least your mod brought back the building aspect, cheers for that.
Both Planetside and Tribes were amazing. This was really before the age of competitive gaming, so it was actually fun to play and not just another day job.
I couldn't remember if I'd played this mod - then I saw the teeth/mouth that surrounds the screen when you play on the alien team in one of the screenshots and it all came flooding back.
Original HL had an absolutely awesome modding community. As well as the huge one (CS) there were tweaks to make it better in all sorts of ways:
- Adrenaline Gamer or AG - which changed the movement a little to make deathmatch super fast-paced.
- Bubble Mod - which made a few gameplay tweaks to make deathmatch more sensible and manageable in large servers
...and I even wrote my own for the deathmatch games we played on our LAN at Uni, which announced a particularly annoying player's name whenever he got fragged.
You guys should try the current crop of FPS games like "Hell Let Loose" which actually require a lot of strategy and coordination! HLL, for example, has huge battles with commanders and several squads of different types with squad leaders, and you earn upgrades and abilities over the game as you take positions and so on. I haven't played that one for a while, but it could scratch that same itch that NS scratched.
I play NS2 sometimes, which still has a pretty active community.
It's a tough game to get into and even harder to get good at commanding since being a bad commander will sink your whole team. But it feels extremely rewarding to play on a well-coordinated team.