It reset itself the next day, but it really opened my eyes to the power they have slowly obtained over my life.
It took about 5 minutes to get past the (Google) captcha.
First I clicked that I'm not a robot, but then it gave me the photos of doom. After clicking all the right pictures of, say, a bus, I had to wait while they faded out and new ones faded in, rather slowly. There were about 5 rounds of selecting. Then it would pick another noun and ask for that. After a few nouns of a number of rounds each, it cycled back to the first noun. A doubly-nested loop of time-delayed captchas!
Thankfully it wasn't important.
But it makes me appreciate how someone getting locked out of their own Gmail or Google Docs must feel.
Needed to do your job today? Tough luck, better luck tomorrow maybe.
I suspect that if you have a machine that is less likely to be fingerprinted uniquely, you have to solve more captchas. In this case, I happened to be using Firefox on Linux with a few privacy-related add-ons running. If they believe you are a bot, because you're not watching their ads, or because you are running a privacy-enhanced environment, I suspect their algorithms will slow you down.
It would be interesting to test this hypothesis. I wonder whether you get right through the captchas if you browse with Windows + Chrome + no ad blockers, compared with Linux + Firefox + an ad blocker, or maybe even the Tor browser.
(Alternative hypothesis: If you solve a captcha correctly, they intentionally give you a few more, to get more good data for their AI algos.)
I was recently developing a web frontend that had a captcha implementation on the login form. The first couple of times I solved the captcha there was no problem. Then, once it got to the fourth and fifth one, it started making me solve a few captcha pages in a row. Eventually it was forcing me to solve several pages of "Select the picture of the fire hydrant" over and over, often marking an answer as incorrect when it definitely wasn't, and then restarting from the beginning, so each run-through took multiple minutes and dozens of captcha pages before it accepted the result as correct.
So, to add to your hypothesis, they're probably doing this intentionally not only to get more data for their AI algos, but also to mitigate people training bots on CAPTCHAs.
EDIT: Also the fact that you're being downvoted is baffling to me but my tinfoil hat persona makes me wonder if it's some orchestrated downvote because you're pointing out a hypothesis that Google is trying to keep hidden.
The fact stands that google locked me away from my documents and intentionally undermined my ability to understand what had happened or rectify the situation.
Collateral damage is still damage.
It also might just be a broken integration with recaptcha.
If you select the unknowns as well as all of the known answers (and don't select the wrong answers) they can train their AI robots that the extras you selected are also a part of the "has traffic lights" dataset
Sometimes I use one of these operators by mistake and get instantaneously hellbanned from using google. The first time, the captchas kept reappearing for a whole day. I read somewhere that the ban is reset the next day, but the next day I was still getting the captchas and couldn't use google for anything.
I deleted all cookies and suddenly I wasn't banned anymore. So this looks to me like a captcha bug, or a chrome bug, or both. Probably both.
There are many other search engines as well as other ways to access the google index.
Assuming it's traditional Google search from your ISP, there are alternatives such as using startpage.com or Tor.
It's a perfect example of passive-aggressive dysfunctionality when it comes to big tech.
I can dictate an email to my phone or search videos based on the semantic interpretation of their automated audio transcriptions (yes I've noticed Youtube will sometimes do this) but I can't search for some unpopular technical term because I'm obviously a bot if I do this. Thanks for correcting me Google, of course I meant to search about "laser hair removal", not "laser air breakdown".
Spam detection algorithms usually have a training set. One of the features could have been "uses_feature_x", which, according to the training set, is known to have a high probability of being a spam bot (because humans rarely use those features)
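To make that concrete, here is a minimal sketch of how a feature like "uses_feature_x" could end up looking spammy. The training set and its counts are invented purely for illustration, and nothing here is known to be how Google actually does it.

```python
# Hypothetical labeled training set: (uses_feature_x, is_spam).
# All counts below are made up for illustration.
training_set = (
    [(True, True)] * 90        # bots that used the feature
    + [(True, False)] * 10     # humans that used the feature (power users)
    + [(False, True)] * 100    # bots that did not use it
    + [(False, False)] * 9800  # humans that did not use it
)


def spam_rate(rows, uses_feature):
    """Fraction of rows with the given feature value that are labeled spam."""
    labels = [is_spam for feature, is_spam in rows if feature == uses_feature]
    return sum(labels) / len(labels)


rate_with = spam_rate(training_set, True)      # 90 of 100 rows are spam
rate_without = spam_rate(training_set, False)  # 100 of 9900 rows are spam

print(f"spam rate with feature:    {rate_with:.2%}")
print(f"spam rate without feature: {rate_without:.2%}")
```

With numbers like these, the handful of humans who do use the feature get lumped in with the bots, which is the collateral damage being described.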
Power users are the ones who know how things work; they are difficult to manipulate and mislead, will use their knowledge to consume content in the way they want (blocking ads, stripping DRM, rooting devices, etc.), and are in general not docile and "obedient".
In summary: power users know how to control their destiny, and this is something companies like Google are opposed to, because those companies want to be the ones in control.
Why is everyone so actively opposed to power users? I blame the field of UX that
- instead of working to improve products so that ordinary people can achieve more
- always decides to work on dumbing down the products so "everyone" can use them.
World of Warcraft became more and more accessible over time, and eventually reached a point where it was so accessible that no one wanted to play it since it wasn't fun and there was nothing to achieve.
Diablo 3 was meant to appeal to the mass market but it was really obvious to everyone that the game was horrible since it wasn't actually meant for anyone in particular. It literally had design choices that made it obvious even to teenagers that it was a profit-making machine.
I haven't really been a video gamer in years but AFAIK they don't have the same reputation and are just an ordinary company now due to changing their model from "amazing product" to "mass market product".
* Power users are far fewer than ordinary people.
* Power users are hard to manipulate, and hard to satisfy.
That, of course, also tends to take care of spammers, malware as you described and just plain buggy bots. Those can be tackled with additional measures, but first and foremost Google cares about search latency. (And quality: higher latency reduces quality, too.)
Ridiculous. I'd rather wait longer for better results.
Even in the real world there is a correspondingly analogous saying: "Think carefully before you open your mouth."
I think Google have just stopped even trying to maintain their "flagship" Search service. That's the only way I can explain how their search quality and website responsiveness could decline to the level of the old Yahoo.com and Altavista. And don't get me started about the sorry state of their Google Groups archive...
While more variety in search engines would be great, they'd still be choke points.
What's really needed is a decentralized search engine.
Okay, that's fine, but then I'd rather not see them in my google results. If wapo wants to put their content behind a paywall, cool, but google should be an index of things I can read. So I've been experimenting with various search flags to exclude them. Started getting hit with captchas just now.
I use this to lock out sources that are behind a paywall, or that only feed Apple News one paragraph and require you to click/scroll through to their site for the full story. Often nothing is lost because those sites are only barely functional on mobile anyway.
Well, kinda. But it's been a few years since it really did. It defaults to "show them what we guess they want rather than what they literally asked for".
Most people are absolutely terrible at keyword-based searching. Which is not very surprising, since most people are terrible at spelling out what they're searching for to begin with, and they're also terrible at turning a sentence into its important keywords, so doing those two things at once brings all the bad.
To this day, even with current Google helping them out a ton more, I'm still amazed at most people's complete inability to type out what they're searching for in any meaningful way.
Pay attention to how people ask their question when looking for something and you quickly realize they don't ask it in a logical way, or using the words you would expect. Pay attention to how many times, when someone asks you something, you feel like saying "What are you actually trying to find/do/achieve? What is actually your issue, instead of the half-way mess you just said that made no sense?". And people are also really bad at realizing that and taking a step back, and tech users are not exempt; something like half of Stack Overflow questions are broken that way (totally made up stat).
Then add a layer of "this is a computer I need to computerize my query !" and you enter the land of weird.
PS: with that said, allowing a full verbatim mode AND an option to keep it activated at all time for my account would be totally awesome
You can mostly solve this by creating a keyword search that launches a verbatim search. Depending on your browser, the syntax probably looks something like this (with the essential part being the 'tbs=li:1'):
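A rough sketch of what such a keyword search might expand to. Only 'tbs=li:1' comes from the comment above; the https://www.google.com/search endpoint, the %s placeholder convention, and the helper name are assumptions for illustration.

```python
from urllib.parse import quote_plus

# Assumed keyword-search template: %s is the placeholder most browsers
# substitute with the typed query. Only tbs=li:1 is taken from the thread.
KEYWORD_TEMPLATE = "https://www.google.com/search?q=%s&tbs=li:1"


def verbatim_url(query: str) -> str:
    """Fill the template the way a browser keyword search would (hypothetical helper)."""
    return KEYWORD_TEMPLATE.replace("%s", quote_plus(query))


print(verbatim_url("laser air breakdown"))
```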
Of course, this still leaves the problem that verbatim on Google doesn't always mean 'verbatim' anymore, but that part will have to wait until Google changes their ways (unless you put quotes around every word).
This is the same reason clicking once on the address bar in Chrom(ium) selects the entire text, and two clicks narrows it down to a single location - which seems the reverse of what usually happens. The design justification for this on Monorail was something like "most people use the address bar to perform new searches", instead of editing the URL they were already at, so it was better to save them that one click.
Current attempts to try to make everything a conversational AI are pure failure. Whoever is in charge of Siri/Alexa/Googlette/Whatever thinks people are saying "hey X, do Y" as if talking to a human. In reality, those people are trying to figure out the magical sequence of words and intonations that will make the thing do what they want, frustratingly and poorly reverse-engineering some inhuman and constantly-changing robot.
Other examples that bother me: randomly, I don't see my flights in the Google Assistant on my Pixel. I then have to type "My flights" in the search box to see them. Same for weather, hotel bookings, etc.
Sure, you are right, and moreover Google could interpret the request as an AND; but they don't return results that _correctly_ fulfil such a query.
So, practically the distinction seems useless?
To recapitulate, it's not my interpretation that the results don't match a logical AND, it's a fact.
The first obvious mistake in the list is that google hasn't defaulted to AND for ages. A list of two terms will return some random combination of one of them, maybe both of them, and occasionally none (and no, this doesn't happen due to fetch/indexing lag or stemming or autocorrect).
To get true inclusive searches you need to quote the terms, individually. People keep pointing at verbatim search, but verbatim performs phrase search, which is not what you want in most cases.
DDG suffers from the same. I curse them both. I've used some js to quote individual terms before performing the search to get back useful searches for technical terms.
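The quoting trick can be sketched roughly like this. It is not the commenter's actual script, just a Python approximation, and the rule for which tokens to leave alone (existing phrases, "-" exclusions, "site:"-style operators, OR) is an assumption.

```python
import re

# Matches either an already-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def quote_terms(query: str) -> str:
    """Wrap each bare term in double quotes, leaving operators untouched."""
    out = []
    for token in TOKEN.findall(query):
        is_operator = (
            token.startswith('"')    # already a quoted phrase
            or token.startswith("-")  # exclusion
            or ":" in token           # site:, filetype:, etc.
            or token == "OR"
        )
        out.append(token if is_operator else f'"{token}"')
    return " ".join(out)


print(quote_terms("laser air breakdown"))
print(quote_terms('"laser air" breakdown -hair site:example.com'))
```

The result is what you would paste (or have a userscript submit) instead of the raw query.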
But really, the number of times I now get pages which do not contain the exact terms I'm looking for is subjectively increasing.
also google purposefully returns error codes when searching certain numerical string ranges, probably to prevent people from searching for carelessly indexed credit card and social security databases
If 85% of searches are repeated queries, does this mean Google will treat a query differently if it is a repeated one?
If a query looks like a repeat query, is it funnelled into retrieving a set of predetermined results?
(To be clear, I am referring to queries that do not match exactly.)
No doubt this would be much faster than a "dynamic" search where any similarities to prior queries are ignored.
To keep things "fast" it might be necessary to treat the 85% differently from the 15%.
It might be beneficial for the search engine provider to encourage users to use repeat searches rather than create new ones.
As we know, Google is not transparent regarding the series of steps they use in providing search results from their web cache, so these questions are likely to remain unanswered.
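Purely as a thought experiment, the "funnel repeat-looking queries into predetermined results" idea could look like the sketch below. Everything in it is hypothetical: the normalization rule (lowercase, collapse whitespace, ignore word order) is an assumption chosen to show non-exact matches hitting the same entry, and it says nothing about what Google really does.

```python
def normalize(query: str) -> str:
    """Map near-duplicate queries to one key (assumed rule: case, spacing, word order ignored)."""
    return " ".join(sorted(query.lower().split()))


class CachedSearch:
    """Serve repeat-looking queries from a cache; fall back to the slow path otherwise."""

    def __init__(self, slow_search):
        self._slow_search = slow_search  # the expensive "dynamic" search
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def search(self, query: str):
        key = normalize(query)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        results = self._slow_search(query)
        self._cache[key] = results
        return results


# Stand-in for a real backend: just echoes the query it was asked.
engine = CachedSearch(lambda q: [f"result for {q}"])
first = engine.search("laser air breakdown")
second = engine.search("Breakdown  laser AIR")  # not an exact match, same key
print(engine.hits, engine.misses)
```

Note the side effect: the second query gets the results computed for the first one, which is exactly the kind of "we decided what you meant" behaviour people complain about in this thread.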
>Search for X and Y. This will return only results related to both X and Y. Note: It doesn’t really make much difference for regular searches, as Google defaults to “AND” anyway. But it’s very useful when paired with other operators.
The author must be using a different google than me. It's been many, many years since google functioned as an AND search. It very frequently decides to drop one or more words from my search if there is a low number of results, it's extremely annoying. Once you could force the old behaviour with +<term>, and then later with quoting ("term"). Both of those now also tend to drop search terms for me.
If anyone has an actually reliable way to get a real AND search out of google, I'm all ears.
which helps us to create such fascinating queries like
Less stupid A/B tests (sometimes I'd get idiotic results until I report an error. It'll then fix itself magically as soon as my account is removed from the test ;-)
Less annoying "we know better than you what you wanted to search for".
Though I use it so... not giving any data to Google is the killer feature for me. I like the idea that the results only depend on the keywords and not me. Instant results are quite nice too.
I'd like not to indirectly depend on Microsoft though.
Edit: and I'm not bound to Duck Duck Go. I'd be glad to use something else too.
Also there's a whole bunch of wrong info in the article. The most blatant one is that they claim google by default performs an AND search, which even a cursory use of google will demonstrate is not true.
I think the coolest thing about DDG are the 'Bang' searches: https://duckduckgo.com/bang
I'd use it if the typical browsers didn't already have that feature built-in.
EDIT: It occurs to me that they don't on mobile, so I guess it's useful there.
Also, even if they have thousands, I don't think it would be too hard to find a site they don't support. Who knows how long they'd take to add the keyword on request, if they decide to add it.
Pros and cons. No solution is perfect. I guess I'd find DDG's feature more useful if I relied more on my phone for such searches or used random computers I can't set up for myself.
Their product is worth the money if you work in digital marketing.
My search terms often get turned into something else for no reason. If I search for any less common word, there's a good chance the results would not contain that word by default. Maybe I'm crazy, but is Google really improving over time, or actually getting worse?
If space equates to AND, then how is it useful? There's really no technical reason to use AND, right? I mean, even if you nest:
good1 OR (good2 AND -bad)
good1 OR (good2 -bad)
The : is not needed; one can use a space. Also, I imagine this isn't really an operator, since it doesn't make sense to mix it with any other. Same with "weather:".
Heck, give me ten minutes and I will prepare 100 of these operators. Weak article.