The items you're supposed to select often overlap the grid cells, so it becomes a kind of Keynesian Beauty Contest at that point; I assume they validate based on how well you align with previous answers, so it becomes a problem of “Which nearby cells would a reasonable person select when there’s overlap?”. Or you’re tasked with selecting <some_item>, you see <some_item> in the distant background of the image you’re supposed to classify, and you need to determine how visible <some_item> would need to be for a reasonable person to classify it as being in the image.
On top of this, when you’re clicking through image after image after image, the additional frustrating thing is that you’re helping train their algorithms; you’re doing work because their service isn’t smart enough to know you’re human, or, you’re seen as a marginal customer they can piss off by forcing you to work for them for free.
It’s a frustrating experience when it fails, and my current strategy is to leave the website when the ‘I’m not a robot’ checkbox fails.
now that's overthinking :D
It's easily the worst UX I encounter online. I can't describe the relief I feel when I come across a "normal" CAPTCHA. It's coming to the point that if I could pay a few cents to outsource every reCAPTCHA, I gladly would simply to avoid this atrocity.
It's interesting to find out that this might have nothing to do with my ability or inability to recognize objects in blurry photos.
I don't think Google's publicly said what they do with any data derived from the new "I'm not a robot" reCAPTCHA, but, given the content it usually uses, it seems likely that they're still using it for image classification in Street View, or for their self-driving car projects.
Given that I'm not doing anything unusual, it really feels to me like reCAPTCHA, for all its complexity, boils down to "what's your history using Google software? Oh you rarely use it? I'm gonna give you a captcha". It didn't use to be this aggressive, but it's really ramped up in the past few weeks.
> During testing I had to shut it off because it became more and more complicated with every page reload. First it was just one page of traffic lights, but 20 minutes later I was having to click through 5-6 pages of images. This worries me that users might get pissed off.
Why are you now badmouthing someone else for deleting Spotify over this exact same issue?
But if they are doing so because they are disabled, and the difference means they receive a worse experience, that may result in an ADA complaint (especially if a government service falling under Section 508 is involved).
That only really leaves deafblind people out, at which point we might be reaching the limits of any technology to provide access to everyone without a tooooon of work.
But honestly, I wish it would just die.
Some financial and government benefit web sites query web trackers as an extra factor in the enrollment process.
Making (online or offline) life more difficult for people who don't want to use company X products could escalate to the point where you either accept the yoke and are admitted to the walled garden of "society" which company X has firmly cemented themselves under -- or you say no and find yourself unable to drive/fly/get a job/go to college/buy groceries in your town. It sounds like a big leap to make right now, but is a real possibility if Amazon/Google/FB don't get broken up soon.
So perhaps it's Safari and/or the ad blocker that are to blame? Hard to say, though.
I'm convinced that part of the reason Google released headless Chrome is as a honeypot for bot authors to use. The idea is that instead of going through the effort of fingerprinting and identifying new bot software, release something that bot authors will use instead that you have a capability to detect.
Somewhere inside of headless Chrome, there's one or more subtle changes that make it so Google can detect whether you're using headless Chrome or normal Chrome. There's no limit to how subtle the indicator could be - maybe headless Chrome renders certain CSS elements slightly slower than normal Chrome, etc.
It sounds pretty crazy/complicated, but I could definitely see it being worth it if it means detecting $X,000,000 worth of ad fraud every year.
I once ran into a piece of code from the scammy advertising world that tried to redirect users to a phishing site. They cleverly tried to hide themselves from the automated quality checks some ad networks do, by checking for these functions and appearing benign if they saw them. One of the checks even created an exception and then inspected the stack trace for certain flags that apparently are only there on some type of headless browser. Clever!
Login attempts are usually spread over a massive botnet of residential IPs as well, where they'll only use each IP for one or two login attempts before moving on to the next.
It's a very fascinating problem space
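Since per-IP rate limiting is useless against a botnet that burns each residential IP after one or two tries, defenders typically rate-limit on the targeted account instead. A minimal sketch of that idea in Python (hypothetical class and parameter names, not any particular vendor's implementation):

```python
from collections import defaultdict, deque
import time

class AccountVelocityLimiter:
    """Sketch: when attackers rotate residential IPs (one or two login
    attempts per IP), per-IP rate limits see nothing unusual, so the
    velocity check has to key on the account being targeted instead."""

    def __init__(self, max_attempts=5, window=300):
        self.max_attempts = max_attempts
        self.window = window                   # seconds
        self.attempts = defaultdict(deque)     # account -> attempt timestamps

    def allow(self, account, now=None):
        now = time.monotonic() if now is None else now
        q = self.attempts[account]
        # Drop attempts that have aged out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_attempts:
            return False                       # too many recent attempts
        q.append(now)
        return True
```

This blocks the fourth rapid attempt against "alice" no matter how many distinct IPs the attempts arrive from.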
In a parallel universe of fluffy niceness we willingly provide our help and in that way we get all those old books converted to ASCII and available for us to read online. Our efforts are for the good of mankind. Similarly with the newer challenges, we help the maps be up to date and again this is for the good of mankind and those needing help getting around.
Clearly this doesn't work in an era where the 'don't be evil' mantra is long forgotten and people only see Google as some advertiser friendly capitalist monopolist beast.
Google needs to work on its relationship with its customers, to be a benevolent dictatorship of sorts. They are lousy at customer service and there are other pain points they are oblivious to. I don't see how this helps.
Google's customers are those who buy advertising. The rest of us are just cannon fodder.
Have come to the conclusion it really means any patch of grass.
Seems pretty bilateral.
Checking the box is actually not that hard. There are no advanced measurements of your mouse and touch speed. That's an Internet myth. It's more a game of cookies, making them age well, and having an organic set of headers.
From their site:
Is scraping legal?
In the United States, scraping public resources falls under the Fair Use doctrine, and is protected by the First Amendment. See the LinkedIn Vs. hiQ scraper ruling for more information. This does not constitute legal advice, and you should seek the counsel of an attorney on your specific matter to comply with the laws in your jurisdiction.
ROFL. I guess if you are able to ignore the layers of other issues (ToS, technology built specifically to exclude your use case, etc.) and are only willing to apply some very tangential case law to your reasoning, it is "legal".
> Employee #427's job was simple: he sat at his desk in room 427 and he pushed buttons on a keyboard.
> Orders came to him through a monitor on his desk, telling him what buttons to push, how long to push them, and in what order.
And the community rules try to block people from writing firm "you're full of shit"-like answers, even though every other answer on Quora is full of lies like "Linux is fast, because it was designed for 16-bit computers".
Quite a lot of the “extremely good looking” answers on Quora straight up said that you couldn’t do e-mail in AWS. These were answers from after workmail was a thing by the way.
So I started looking at other Quora answers on stuff I wouldn’t normally need an answer for, and it’s frightening how often completely wrong answers look correct.
Don’t get me wrong, there are a lot of truly amazing answers as well, and it’s entirely possible that I just suck at it, but I don’t think I can always tell the amazing answer from the completely wrong one.
This assessment isn't quite right. The actual question is about how the captcha differentiates between a human and a bot at the box checking stage.
You're right, though, that it is both full of filler and also doesn't address the question as posed at all.
> Why can’t a bot tick the 'I'm not a robot' box?
It can, by taking over the mouse...except...
LUCKILY, the top answer (on my screen at least it's https://www.quora.com/Why-can-t-a-bot-tick-the-Im-not-a-robo...) does actually try to answer the question.
I feel like the OP submission might have just been some sort of submarine self-promotion for the "CEO of <redacted>".
It also links to a patent describing a novel mobile captcha invented by the author, so they might have some knowledge about the domain.
You can also add in some fundamental attribution error, as in "I am just looking out for everyone. You are being difficult. They are engaging in bad faith."
Google isn't going out of their way to punish you for trying to protect your privacy. They're trying to stop unwanted traffic. By unfortunate happenstance, you appear to be disguising yourself in the exact same way a shocking amount of bad traffic is.
I use Firefox with a few basic extensions (Privacy badger, uBlock, Google Container) yet every time I am presented with having to pick out traffic lights over and over and over again. I usually have about 5 or 6 "challenges" before I give up and use another site.
My timezone has not changed, my IP address and rough location have not changed, my screen size has not changed, my broadband speed has not changed, and my general computer dexterity has not changed, yet I am relentlessly targeted. On Chrome I never saw these challenges, but on Firefox with the privacy plug-ins I am always always always challenged.
At this stage I think the only signal it is using is "is there a google cookie in this browser? and if so has the google cookie got some 'normal' looking activity logged against it?" I.e. they are checking their server-side logs for a given cookie ID and seeing if that looks normal or not (i.e. seen on google search, seen on youtube, seen ads from a variety of third parties on various different sites, mixed up with time of day and speed of viewing etc etc).
Since I have got Google in a container in Firefox, I am guessing that my google cookie is not present when the captcha loads (due to the containers and privacy badger et al) so there is no identity back in the mothership to compare me against.
The captcha is Google's master blow against ad blockers.
A regular user, on whom they have all the info, earns them dollars per ad impression. You, with your Do Not Track (ha! that was a joke) and privacy addons, earn them only cents per ad impression.
You are Google's enemy. Remember this when you get stuck in captcha hell (and are consequently censored from most sites until you change device/IP).
I rarely see the "I am not a robot" box, and haven't seen image recognition tasks for a long, long time.
If they lose enough customers over this, they will probably remove the captcha.
The clever part from Google's perspective is that you have to trade one of these things to Google in order to get access to sites that do not belong to Google at all. Google convinced site owners to have their users pay a tax to Google.
When someone uses a recaptcha, they should think about why they are doing so. It's one thing to use it to save a business model, but it's another to use it to protect information that should be free anyway. The elephant in the room is government data. Many government agencies think that selling their data can be a nice source of side revenue, and a recaptcha is a good way of enforcing it. In reality, they just increase the costs for everyone, and those with means can obtain the information while those without means cannot.
Governments need to release their data, freely, without captchas or fees for single users and bulk users, no exceptions.
The counterargument is that they do a great job with trivial stuff like registered dogs' names, and less well with sensitive/important issues like policing.
What's the right way to leverage the platform developed for the first into the second?
Totally agree. Fortunately the Dutch government is trying to make as much data open as they reasonably can, and regularly organise events to encourage developers to use their open APIs.
That's because Google isn't just profiling "Tor users". They're going after anyone who values privacy in any way or technology.
Simply put, you're being punished for ensuring privacy. And anybody who uses Google's captcha services is an accessory to that.
That's a massive load of bullshit. Google has a captcha challenge that only humans can solve. That alone is already sufficient to prevent unwanted traffic. That is how every captcha system works. However google is an exception. If you're logged in to a google account or are using chrome then google can use that information to track your captcha history. Privacy minded people avoid google like the plague and therefore they cannot be tracked.
>Google isn't going out of their way to punish you for trying to protect your privacy.
Except this is exactly what happens. It's not "unfortunate". It works like this by design.
If google cannot track you then the captcha will force you to do something that no other captcha system does: give you even more challenges even if you have solved them correctly. You will spend the next 5 minutes solving captchas correctly and then at the end it will tell you you've failed. This again is unique to google: correct answers lead to failure. The problem immediately goes away if you let google track you, it doesn't matter how bot infested the network is. No other captcha system does it this way.
Google is clearly doing this to get free labour to label their datasets, force people to have a google account and encourage them to use chrome.
If you do everything you can to prevent google from knowing who you are, don't be surprised when they behave like they don't know who you are.
I took that to mean they were blocking cookies
A lot of the behavior that captcha exhibits is in part a function of feature analysis from ML models - features that may seem ridiculous to layman humans but make sense to a neural net plugged into the data.
It's not bullshit, it just depends whether your website is being targeted directly or not. We're targeted directly and the robots hitting us are getting the CAPTCHAs solved, presumably with human help.
Another thing that sets it off is virtual machine usage, I can be logged into chrome and gmail on the same residential IP for hours but the moment I try to search google for a problem inside a VM it's a minute of slow loading captchas.
Have moved to bing instead, that sort of wasted productivity is a burden.
Also, I’m just not interested in the remaining 10% "legit" traffic from people who are aggressively paranoid about their privacy. Almost all of them ended up being dickheads who were using TOR to abuse other members of our community.
To the people who think every website should treat TOR users with respect, please understand that you are intentionally making yourself indistinguishable from the mountain of robotic junk, abuse and human dickheads. It's not my fault that you have chosen to do this, and it's not my job to provide you with tools to prove you're not a dickhead.
...or maybe voting me down will change the facts.
Yeah, that's totally going to work.
My VPN must have gotten white listed (or cracked down on some of their traffic patterns) because that stopped.
I personally don't care too much about the hassle, but I really don't like the idea that I'm basically playing Artificial "Intelligence"/doing clickworking for the not so community oriented efforts of Google.
Do you have any data on this?
> On the other hand, anonymity is also something that provides value to online attackers. Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious. That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers. A large percentage of the comment spam, vulnerability scanning, ad click fraud, content scraping, and login scanning comes via the Tor network.
The obvious caveats apply, of course. It's completely possible what Cloudflare saw at the time is no longer true and TOR is no longer mostly spam. It's equally fully possible that the traffic Cloudflare sees is wildly unrepresentative of what TOR traffic actually looks like, and it's mostly people worried about their privacy. This is just the data we have at the moment.
And when you have malicious traffic swimming in an anonymous pool, there's no practical alternative but to block all of it.
And also, isn't cloudflare also the one to allow booters and stressers to be online behind CF - and they used stolen CC's to boot?
The decision to screw Tor users over is just the cherry on top. Especially egregious is when a captcha is demanded on even a simple static page. Seems pretty obvious what's going on here.
You continued to post flamewar comments. We ban accounts that do that repeatedly, so could you please stop? We've already had to ask you more than once before.
We're not robots. We can shun white supremacists and leave everyone else alone. This isn't a slippery slope, it's just good sense (no more white supremacists, hey!). Humankind will get along just fine if we tack on that one extra rule and all follow it.
The definition is commonly agreed upon, and what "white supremacist" means is not at all controversial to most people. It certainly isn't so arbitrary as to be meaningless.
Now, the term may be misapplied at times, as may any term, but for it to be misapplied, it has to have an accepted application to begin with. A term without a definition can't be subject to definition creep, and the possible creep of a term like "white supremacist" isn't that wide to begin with.
Murder now, for some, includes abortion.
Censorship now includes, for some, private companies removing bad actors from their private systems.
Come on, that's lazy to dismiss it that way when society literally changes all the time.
Whether or not abortion is murder is not about the definition of "murder"; it is about the definition of "human being".
There's no doubt that abortion involves killing a living creature; the whole pro-choice vs. pro-life debate is basically about one simple question: "is a fetus a human being?". If you answer that with "yes", then every abortion becomes murder, plain and simple.
This also explains why there will never be a compromise between two crowds: it is logically impossible to compromise on yes/no questions.
If you're putting constraints on Tor traffic, it's not because of raw throughput. It's because it's extremely poor quality traffic.
It should be the default for the Tor Browser for sure; if just a few people use it, it decreases the anonymity set.
The issue with just doing memory-consuming work client-side is that it only marginally slows down spamming. Spammers tend to use compromised machines they don't own. Unless you can make it prohibitively expensive to calculate something using machines you don't pay for - perhaps not a trivial ask - you wind up needing a different set of tools. This is why Google tends to look at things that will exhibit human variation rather than pure computation.
It's not that your ideas aren't good. I'm sure ARGON2 has a use here! It's that this might not be a problem easily solved by consuming more resources.
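For illustration, here is a toy memory-hard proof-of-work in Python. This is not Argon2 (a real deployment would just use Argon2 itself); it only sketches the principle the comment relies on: the buffer is walked with data-dependent indices, so a solver can't trade memory for time cheaply.

```python
import hashlib

def memory_hard_tag(seed: bytes, blocks: int = 1 << 8) -> str:
    """Toy memory-hard function: fill a buffer of hash blocks, then walk
    it data-dependently so the whole buffer must stay resident.
    Illustrative only; use Argon2 in practice."""
    buf = [hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
           for i in range(blocks)]
    h = buf[-1]
    for _ in range(blocks):
        idx = int.from_bytes(h[:4], "big") % blocks  # data-dependent index
        h = hashlib.sha256(h + buf[idx]).digest()
        buf[idx] = h                                 # overwrite as we go
    return h.hex()

def solve(challenge: bytes, difficulty: int = 2) -> int:
    """Find a nonce whose tag starts with `difficulty` zero hex digits."""
    nonce = 0
    while not memory_hard_tag(
            challenge + nonce.to_bytes(8, "big")).startswith("0" * difficulty):
        nonce += 1
    return nonce
```

The server only has to compute `memory_hard_tag` once to verify, while the client grinds through nonces; the point of the thread's objection is that spammers on compromised machines don't pay for that grinding.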
Click all boxes with traffic lights. Ok, well, this one box just barely contains the bottom right corner of the traffic light. Click. Nope, that little corner didn't count. Try again. Ok, well on this one, the right side of the traffic light is only barely over the line, so I won't click it. Nope, that sliver of the light mattered this time. MF!
Spambots will solve the Sorites paradox!
My naive assumption is that you should click the "refresh" button in these cases.
Another one is "click the mountains". It typically won't let you through unless you click anything with trees on the horizon, even if the terrain is clearly flat. Google's robot thinks mountains are made out of wood, and any human who disagrees is labeled a robot. It's insanity.
I'm guessing they do something like load up a batch of images and once N people agree on one, record the answer and remove it from the rotation. You end up left with the ambiguous images where people couldn't agree.
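That guessed-at scheme can be sketched in a few lines of Python (hypothetical, since Google hasn't published how retirement actually works): retire an image once N users agree, and keep the disagreement cases in rotation.

```python
from collections import Counter

def retire_when_agreed(votes_by_image, n=3):
    """Sketch of the batching idea above: an image is retired with a
    label once n users agree on it; images people keep disagreeing on
    stay in rotation (and are exactly the ambiguous ones users see)."""
    retired, in_rotation = {}, []
    for image, votes in votes_by_image.items():
        label, count = Counter(votes).most_common(1)[0]
        if count >= n:
            retired[image] = label
        else:
            in_rotation.append(image)
    return retired, in_rotation
```

Under this model the images that survive in the challenge pool are, by construction, the ones humans couldn't agree on, which would explain the sliver-of-a-traffic-light experience.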
And, does the pole count?
The whole thing is way more stressful than it needs to be for what it is.
Does this look like a mountain to you? https://0x0.st/zzvr.jpg
Google's image classifier would think that's a mountain. If you disagree, google will classify you as a robot. After failing these sort of challenges a few times the user decides to play along and tell google what they think google wants to hear, rather than the truth.
Especially if this is all used for learning, enough people saying "that is clearly not a mountain" would reinforce that it's, in fact, probably not a mountain. Even if I got classified as a robot, I'm not sure I would think "oh, a system designed to classify images would think this not-a-mountain is a mountain", so I definitely wouldn't double down and keep marking it as a mountain. I'd, well, not. And assume the system is at least as good as classifying the images it chooses to use as I am.
Because every single time it asks me to classify mountains it rejects my answers if I don't click on trees on the horizon (and often trees on the horizon are the only "mountains" presented) and every single time it accepts the answer that such trees are mountains. I've gotten the mountains challenge dozens of times, the results are very consistent. If there is a group of trees on the horizon, that is asserted to be a mountain.
> "enough people saying "that is clearly not a mountain" would reinforce that it's, in fact, probably not a mountain."
Totally irrelevant because if I am trying to get through a google captcha, it's because that captcha is standing in the way of me doing something. My interest is in passing the captcha, not correcting Google's shitty image classifier. So I have absolutely no incentive to make my life harder by insisting on correct answers, and every incentive to tell Google what they want to hear.
I guess this is where the misunderstanding is. You don't think Google wants to hear the correct answer?
Trying to guess at what the daily/monthly flavor of "correct" is seems like it'd do more harm than good, resulting in some kind of nondeterministic guessing game of "well, trees on the horizon are probably assumed to be a mountain" that never settles on actually-correct answers (and, I'd wager, is often more inconvenient to the user than just answering correctly would be, because now there's a layer of indirection on what they think a system thinks of an image, rather than just what they think of that image).
If everyone just answered "no, that's trees" instead of a hand-wavy "I think you think it's a mountain", I feel like this captcha would be significantly easier for us humans (because we could actually give real answers), as well as less inconvenient for people who just want to pass on through and get on with whatever they were doing before a site wanted to verify they weren't a bot (because they can just, well, identify images instead of playing a game of "what does the machine think?").
They may want it but they don't reward it. I don't care what sort of answer they want, I only care what sort of answer they accept. I'm not going to donate my time to these bastards by doing anything more than what's necessary to pass their captcha.
> "If everyone just answered "no, that's trees" instead of a hand-wavy "I think you think it's trees","
That's just not going to happen: https://en.wikipedia.org/wiki/Prisoner%27s_dilemma
Thankfully they'll eventually fall back to the "click the images of _object_ until there are no pictures left with a(n) _object_ in it" kind, but the ones where you click blocks of a single picture are super frustrating.
It's always annoying though.
The image classifications that you do, however, are used to train the computer vision system.
TOR doesn't protect your privacy, it just lumps you in with—and makes you indistinguishable from—the worst crap on the internet. If you don't want to be treated like crap, don't try to blend in with the crap.
A very good addon against the shit from Cloudflare and Google.
The difficulty is probably cranked all the way up
if you care about all that, run a node without internet exit, and also strive to make your sites available on tor (hate the "hidden service" nomenclature)
I recall the modern (non-text) captchas used to be cars pretty much every time. Then, the images started getting grainier as they apparently wanted to improve their recognition in different conditions. Then crosswalks and store fronts became quite common, eventually with the same kinds of noise distorting images. Now I've started seeing things like buses, bridges, motorcycles, bicycles, etc. It feels like they've finished getting enough data for improving Google Maps and have begun moving towards collecting data for their self-driving car projects.
Also Google: "Our standard for 'what a machine couldn't possibly do' is identifying a stop sign."
Stuff you get now often requires cultural information. "Sidewalk" isn't a cross-cultural name; I'd guess almost everyone knows it, but meh. What counts as a store? Is a lawyer's office a store? Also, I seem to recall I had "click on all minivans"?? Not sure what one of those is, nor really what is classed as a car in the USA. Is an MPV a car [I'd guess that's what a minivan is]? Do pedestrian crossing lights (green/red man) count as [part of] traffic lights? I've often wanted a short description of the scope of the terms they're using. Of course it never tells you if you failed; it just gives you a further captcha, which it might have done anyway.
Current Assignee: Juniper Networks Inc
The V2 was just annoying and badly designed because the questions were badly put.
I'd always assumed that was noise carefully tuned to throw off one machine learning model or another that was being used to beat the captcha, sort of like this: https://www.theverge.com/2017/11/2/16597276/google-ai-image-...
I think it might be the same when they switch to other types of objects (like crosswalks or bikes). Someone's model got too good, so they had to change to something else. I also get the impression that they add delays to the tile refresh before they do that.
I suspect Google now uses robots to generate captchas for humans, under the assumption their image recognizers are far better than anyone else's. They already have some very good ones for other products (self-driving cars, Street View) and lots of street-level city imagery. That would explain why their captchas are so difficult for humans to solve -- they're testing whether you see things like their "AI," not like other humans.
I was thinking it was trying to dirty up the image just like the lenses on cameras get dirty. What happens to the image recognition when there's water spots, dirt, mud, etc on the lens that keeps parts of the image obscured?
If you have the clean version of the image, you need to get that classified by a human - then you can throw noisy versions of it into the training set for your AI. You don’t need to ask a human, hey, I added noise to a picture of a yield sign. Is it still a picture of a yield sign?
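A minimal sketch of that augmentation step in Python (toy representation: an image as a flat list of pixel intensities in [0, 1]) shows why only the clean image needs a human label:

```python
import random

def augment(clean_image, label, variants=5, noise=0.1, seed=0):
    """One human label on the clean image is reused for many
    synthetically degraded copies, so the noisy training examples
    need no extra human labeling."""
    rng = random.Random(seed)
    out = []
    for _ in range(variants):
        noisy = [min(1.0, max(0.0, p + rng.uniform(-noise, noise)))
                 for p in clean_image]
        out.append((noisy, label))   # the label carries over unchanged
    return out
```

Real pipelines would add blur, occlusion, or simulated lens dirt rather than uniform pixel noise, but the label-reuse logic is the same.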
With the possibility of almost uniquely identifying us on the web through fingerprinting... Google, of all companies is in the perfect position to know that my web request was made by me... And therefore I'm not a robot.
You can only conclude that reCAPTCHA is an ML training exercise.
They're not secretive about it
The article explains that this is part of what reCAPTCHA does, e.g.:
> Finally they combine all of this data with their knowledge of the person using the computer. Almost everyone on the Internet uses something owned by Google – search, mail, ads, maps – and as you know Google Tracks All Of Your Things. When you click that checkbox, Google reviews your browser history to see if it looks convincingly human.
But your point is otherwise right in that it's used for ML training, which Google admits as another commenter pointed out.
Human [n]: Entity that uses Google®-brand services.
— Google Dictionary, 2020 edition
Or, maybe they feel I'm not pulling my own weight, seeing as I rarely ever click on ads. They probably need more monkeys to feed the beast, so I get selected to train their AI beast.
Edit: It could also be that I'm always running on incognito mode.
If one looks at the history of Google Books one can see that they started with big ambitions, but hit copyright in quite intensive ways. That also changed their approach to other projects. Clearing all rights internationally isn't easy.
This is false. EU governments have already placed significant restrictions and fines upon US tech companies in the past. There is no reason to believe that they won't be able to again.
It's a great situation for the US economy but a very bad strategic position for Europe.
But I'm kinda hoping that the reason I keep having to identify cars and store fronts is that my refusal of third-party cookies is causing them to have no idea who I am. But that might be the optimistic view.
In any case, I wouldn't mind if sites stopped using recaptcha.
there was a decent write up from a whitehat showing the damage, but I can't find it
I can't imagine some forum has enough traffic to meaningfully screw up their data, and they don't tell you which of the two words is the unknown word, so you're just going to fail a lot doing that.
If I remember correctly, Google later on also sometimes showed two "known" words or, if they had actual other evidence that you are human, two unknown words.
- Couldn't a computer just temporarily hire a human to prove there is a human involved?
- Why are we using recaptcha or verifying humanity anyway? I understand stopping spam, scams, and fraud, but scraping already public data doesn't present significant harm.
I don't love the compromise of paying for things with my data or by training Google's AI, but it's hard to say users aren't getting anything out of it. That said, I do miss the old reCaptcha.
Very few low-traffic blogs that I see use (or need) CAPTCHAs. I know that the ones I run don't.
> I don't love the compromise of paying for things with my data or by training Google's AI, but it's hard to say users aren't getting anything out of it.
I don't think they are getting much, if anything out of it -- aside from being increasingly punished for defending themselves against being spied on by Google.
If this simple thing comes from a popular WordPress plugin, the equation for the spammer changes, of course.
But it also sucks the first day you get an attacker who solves it once and then spams you thousands of times.
Modern spam tools are pretty impressive these days and minimize the targeted work the human spammer needs to do in these cases. In the early 2000s, you could set a custom question and then assume no attacker is going to manually code for your little blog.
But even in 2008 I was using spam software (out of curiosity) where you could import a massive blog list, and it would pause spamjobs with failed comment submissions, let you pencil in a value for this unknown field, and then click resume.
You could also choose other actions for that field like "prompt me each time" and sit at your computer multiplexing your labor across hundreds of blogs. And that was pretty polished ten years ago.
Fair enough, but you won't get Google's spam filter or availability either, which your privacy was paying for.
My point was just that even if something is provided to the customer for free, that doesn't mean it's easy to produce. That causes a lot of the issues my non-tech friends have with understanding the scope of work. Just because social media is free and easy to set up as a customer doesn't mean developing a social media platform is easy at all.
Overall it’s been a good experience. I run into a few sites that classify my email as spam or grey-list my sending IP when I send to them, so mail doesn’t get through quickly, but I used to have the same spam problem with some sites when running my own domain through Google Apps.
This book offers one set of proposals for "Data as Labor", inspired by Jaron Lanier: http://radicalmarkets.com/chapters/data-as-labor/
And there's going to be a lot of discussion of the idea at the RadicalxChange conference in March (https://radicalxchange.org/), including with Jaron himself as well as the book authors. (Disclosure: I do the conference website as a volunteer).
If you're really paranoid, randomise combinations of distinguishable fields (name, email, phone, age and hidden fields) every time you generate the form, so even if a bot herder manually maps names to fields one time, it'll fail the next. At this stage it'll be cheaper for the bot herder to use Mechanical Turk, after which even Google's captcha is compromised.
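The randomised-field idea above can be sketched roughly like this: a minimal, stateless Python sketch, assuming the server keeps a secret key and round-trips a per-form nonce in a hidden input (all names and helpers here are made up for illustration):

```python
import hashlib
import hmac
import secrets

SECRET = b"server-side-secret"  # assumption: load from config, not source
LOGICAL_FIELDS = ["name", "email", "comment"]

def random_name(logical, nonce):
    # Derive an unguessable field name from the logical name plus a
    # per-form nonce, keyed with the server secret (stateless: no DB needed).
    mac = hmac.new(SECRET, f"{nonce}:{logical}".encode(), hashlib.sha256)
    return "f_" + mac.hexdigest()[:12]

def render_form():
    # Return (nonce, {randomized name -> logical name}) for templating.
    # The nonce rides along in a hidden input so the server can re-derive
    # the same names when the form comes back.
    nonce = secrets.token_hex(8)
    return nonce, {random_name(f, nonce): f for f in LOGICAL_FIELDS}

def parse_submission(nonce, posted):
    # Map randomized names back to logical ones; None means the field map
    # was stale or guessed, i.e. likely a bot replaying an old form.
    result = {}
    for logical in LOGICAL_FIELDS:
        key = random_name(logical, nonce)
        if key not in posted:
            return None
        result[logical] = posted[key]
    return result
```

A bot that scraped the form once will post field names derived from the old nonce, so `parse_submission` rejects the replay without the server storing any per-form state.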
Or a blind user who might actually rely on both labels and names. That's a bit like what arXiv does: they have hidden links that ban your IP when you crawl them, but the links aren't hidden from AT users. I got myself banned that way once.
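A gentler variant of this kind of trap is a honeypot form field that is hidden from everyone, screen readers included, so AT users can't trip it the way they can with arXiv's hidden links. A minimal sketch, assuming a plain HTML form; the field name and helper names are hypothetical:

```python
HONEYPOT_FIELD = "website"  # hypothetical: a field name bots love to fill in

def honeypot_html():
    # Off-screen positioning hides the trap visually; aria-hidden and
    # tabindex=-1 keep screen-reader and keyboard users from ever landing
    # in it, unlike a bare visual-only honeypot.
    return (
        f'<div style="position:absolute;left:-9999px" aria-hidden="true">'
        f'<label for="{HONEYPOT_FIELD}">Leave this field blank</label>'
        f'<input id="{HONEYPOT_FIELD}" name="{HONEYPOT_FIELD}" '
        f'tabindex="-1" autocomplete="off">'
        f'</div>'
    )

def looks_like_bot(posted):
    # Humans never see the field, so any non-empty value flags automation.
    return bool(posted.get(HONEYPOT_FIELD, "").strip())
```

Crude bots that fill every input get caught, while users relying on assistive tech are never exposed to the trap in the first place.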
Accessibility is very important, and if accessibility features are implemented well they'll often be useful even to people without disabilities, but do any CS/SE or code bootcamp programs take the topic seriously? I'm sure it must be taught somewhere, but it doesn't seem to be common at all. Can you even imagine a 21st-century university architecture department that didn't cover ADA compliance? That'd be unthinkable.
I can easily imagine it: architecture departments from universities in other countries don't necessarily have to cover compliance with USA laws.
Maybe, to save themselves a few dollars on bots and spam (bandwidth and storage are very cheap today), they're losing new users by the thousands (and traffic acquisition is far more expensive than the former).
Recaptcha isn't obnoxious for fun, it's obnoxious because this is the state of the arms race right now. There's also the challenge of creating a captcha that allows blind people in.
For example, since you're here and HN uses Recaptcha on its register/login form, it seems like the compensation was adequate.
Which is one of the reasons why the presence of reCAPTCHA is strong push to avoid that site.
> since you're here and HN uses Recaptcha on its register/login form, it seems like the compensation was adequate.
Perhaps so. I don't remember doing a CAPTCHA to sign up, but I don't dispute that I did it. However, I've never been presented with one after signup. If I was, I wouldn't be here.
You give Google training for ML models.
Google gives the site provider the service of excluding bots from submitting the form.
The site provider gives you whatever was provided by the form you were trying to submit.
No one is uncompensated.
First, it seems tacky scrounging for peanuts from the users' captcha work. Or it's like a product/services website showing Adsense ads. It's a cheapening message to send.
Second, since you make more money from more captcha volume, you're incentivized to maximize your use of captcha, which is at odds with every complaint in this comments section about captcha. Most sites only use captcha to gate low-volume actions like register/login (e.g. HN).
They created their own Ethereum token too which always puts a bad taste in my mouth these days.
Finally, it doesn't address the upstream complaint that someone else is profiting off the user's "work" rather than the user. Though I don't find that complaint very reasonable. And a tiny fraction of a cent sounds about right. The truth is that users benefit from anti-abuse systems. The number of bots that HN's recaptcha on register/login has stopped is worth that tiny fraction of a cent to most users.
Sites can set the difficulty level necessary for their application. Some are under continual targeted attack, others are mainly keeping out rogue automated spambots from their comments section.
The user is typically getting a free service, a better site experience due to less bot traffic, or both. I think sharing the value of their work with the website is a fair deal.
As for using blockchain tech for ledger functions, that is all under the hood: websites can cash out to dollars as they prefer.
(disclosure: work on bot detection at hCaptcha.com)
Yes, mainly because we're talking about fractions of cents. Also, it's not for free; the website and its users get a good anti-abuse measure in return.
There's a big difference between something that cannot make money and something that makes pennies for the site. But, to be fair, 99.9% of users aren't going to notice the difference in captcha branding either way unlike my example of a banner ad on a retail site.
My main reaction is that the UX incentive to minimize user exposure to captchas seems to work against the primary pull of using hcaptcha in the first place.
Though one site I can think of that has a captcha behind every action (every post) is 4chan. Maybe you can get them on hcaptcha one day. It would at least help you test your tagging system against vandalism. :)
I didn't find any pricing examples on hcaptcha's website. For all I know, people are bidding 5 cents per image.
Anyways, I definitely want to see more serious contenders in the captcha space so that we all aren't contributing to Google's middle-manning of the entire internet, and I'd like to try hcaptcha even out of curiosity.
If it means no/fewer ads to support a site then the user benefits because they don't have to pay real money to keep the site up.
If you use my referral URL I get a bump in the queue:
When you search for an address on Google Maps, that tiny house number on the house was once a captcha image; now Google knows the number, so it can take you to the exact location on the map when you search for it.
Everyone helps train the machine so when they want something from the machine then the machine is better at finding what they asked for. That seems pretty democratic to me.
disclaimer: work for google, nothing related to reCAPTCHA though. opinions are my own, etc.
A dividend on this could probably provide for a basic income.
Is it worth it?
Manipulative user-hostile websites can rot.
Personally I now use...
- iCloud.com instead of Gmail
- DDG for search though I do have to !g like 20 to 30% of the time for things like driving directions (from X point to Y point), local movie times nearby and flights.
I still use
- Google Maps some, as it's great for getting the distance between X and Y
- Google News (is there a better substitute?)
- Google Photos (is there anything that compares?)
Hoping in time to rely a ton less on Google products.
Apple Maps works for me. I appreciate that's not the case for everyone, but it's come a LONG way.
I sincerely use Apple news (on iOS) and have been loving it, but appreciate it's not for everyone's use case.
Google photos.. yeah wow. There really isn't much like it. I've resigned myself to storing my photos on a private server and slowly making albums/things come together. But I have to NOT use Google Photos. It's too scary.
Gmail was easy
Youtube I use a fake gmail account that's not linked to me in the slightest and only use it on 1 iPad, else not logged in.
It's a quest. But I'll get there. Someone really ought to make a Google Photos competitor though, there's nothing that has the same level of polish right now.
The Flickr app has auto-upload from Android at least; I'd guess Flickr is Google Photos' closest competitor?
You could further approximate that with: "How much does Google's AI think this human's time is worth in future revenue?"
I for one intentionally inject errors into their image classifier until it lets me in anyway.
You know how they usually give you several questions to solve, even if you're quite convinced you solved a question correctly?
Turns out if you click randomly, they keep showing you new questions as well. If, after a handful of purposely wrong answers, you answer one correctly, they let you through.
I now purposely mess up the answers a few times. It seems neither slower nor faster than actually taking the time to do it right, but it takes less mental load, and it makes me not feel like doing slave labour for a machine.