I can't tell if this is a joke app or seriously some snake oil (like AI detectors).
Isn't it trivially easy to just detect these unicode characters and filter them out? This is the sort of thing a junior programmer can probably do during an interview.
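To be concrete about what I mean by "detect": here's a rough Python sketch that just flags format-category characters (zero-width spaces and the like). I'm assuming that's roughly what an app like this inserts; the category list is my guess, not anything from the app itself.

    import unicodedata

    # Flag characters in the format / private-use / unassigned categories,
    # which is where zero-width and other invisible code points live.
    SUSPECT_CATEGORIES = {"Cf", "Co", "Cn"}

    def suspicious_chars(text):
        for i, ch in enumerate(text):
            if unicodedata.category(ch) in SUSPECT_CATEGORIES:
                yield i, hex(ord(ch)), unicodedata.name(ch, "UNKNOWN")

    sample = "Hel\u200blo wor\u2060ld"  # zero-width space and word joiner hidden inside
    print(list(suspicious_chars(sample)))
    # flags the ZERO WIDTH SPACE and the WORD JOINER with their positions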
Let me clarify: when I conduct interviews, I tell my candidates they can do _everything_ they would do in a normal job, including using AI and googling for answers.
But just to humor you (since I did make that strong statement), without googling or checking anything: I would start with basic regular-expression ranges, something like [^A-Za-z\s.\-*], and do a find-replace on whatever that matches until things looked coherent without too much loss of words/text.
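Something like this sketch is what I have in mind (the allowlist is purely illustrative; you'd tune it per language and keep checking the output):

    import re

    # Keep a basic ASCII allowlist (letters, digits, whitespace, common
    # punctuation) and drop everything else, exotic Unicode included.
    DISALLOWED = re.compile(r"[^A-Za-z0-9\s.,;:'\"!?()\-*]")

    def scrub(text):
        return DISALLOWED.sub("", text)

    print(scrub("ple\u200base ign\u2060ore this watermark"))
    # -> "please ignore this watermark"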
But the problem isn't me, is it? It's the AI companies and their crawlers, which can trivially be updated to get around this. At the end of the day, they have access to all the data, so they know exactly which unicode sequences actually appear in words, etc.
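e.g. a crawler's cleanup pass could be little more than this (rough sketch; the homoglyph table here is a tiny made-up stand-in for whatever full mapping they'd derive from their corpus):

    import unicodedata

    # NFKC already folds many look-alikes (fullwidth letters, math
    # alphanumerics) back to ASCII; cross-script homoglyphs like Cyrillic
    # 'a' need an explicit table, shown here with a tiny illustrative sample.
    HOMOGLYPHS = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})

    def normalize_for_training(text):
        return unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)

    print(normalize_for_training("\uff57\u0430t\u0435rm\u0430rk"))
    # fullwidth 'w' + Cyrillic vowels -> "watermark"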
Good point. Then it's actually an active attempt, right?
Also, I realize my statement was a bit harsh; I know someone probably worked hard on this. I just feel it's easily circumvented, as opposed to some of the watermarks in images (like Google's, which they really should open source).