Yes, they link to you below the featured snippet.
No, more people don't click, because they've taken the answer from your website and displayed it right in their search results.
For example: If I'm searching for "best nail for cedar wood" Google gives me the answer: STAINLESS STEEL - and I never had to click through to the website that gave the answer: https://bit.ly/2MdovdP
• Yes, this is good for users (it would also be good for users if Netflix gave away movies free)
• Overall, the publishers who "rank" for this query receive fewer clicks
• Google earns more ad revenue as users stick around on Google longer
Ironically, Google has a policy against scraping their results, but their whole business model is predicated on scraping other sites and making money off the content - in many cases sending no traffic (or significantly reduced traffic) to the publisher of the content.
It's for this reason that I've stopped embedding microdata in the HTML I write.
Microdata only serves Google. Not my clients. Not my sites. Just Google.
Every month or so I get an e-mail from a Google bot warning me that my site's microdata is incomplete. Tough. If Google wants to use my content, then Google can pay me.
If Google wants to go back to being a search engine instead of a content thief and aggregator, then I'm on board.
I don't get why Google thinks it's acceptable to critique my site without prompting. It honestly just feels rude. They want me to do a whole bunch of micro-optimizations on a site that already works fine, because it doesn't fit their standard of "high quality". I think I've gotten exactly 0 clicks from Google search results ever, and I don't really ever want any.
If it were possible to get a human's attention at Google I'd start sending my own criticism their way but of course it doesn't work like that...
On the other hand, it's surprising that you would get a notification if you had crawling disabled. Did you set this robots.txt up recently?
(Disclosure: I work at Google, commenting only for myself)
<meta name="robots" content="noindex">
<meta name="googlebot" content="noindex">
If Googlebot is not respecting robots.txt, and is crawling something it's been instructed not to crawl, let me know and I can file a bug?
(Disclosure: I work for Google but not on Search, speaking only for myself)
How do you tell Googlebot to not crawl your site and to not index it either?
Previously, one could use the undocumented "Noindex" directive in robots.txt, but this will be disabled soon: https://webmasters.googleblog.com/2019/07/a-note-on-unsuppor...
You can specify your index preferences in Webmaster Tools. Don't know if there's a domain-wide off switch in there, but there probably is.
(Still speaking only for myself)
Thank you for your comment. I'm replying via email to send some info I'd rather not share on HN, but will post the same, redacted, on HN. Back when I was starting my web-dev career, I ran the one-man development team of a web agency, and all our development/pre-prod sites (that had to be unauthed) had a robots.txt to disallow all bots, but they still popped up in Google. Searching some of the old domains in Google, I found an example here: http://***.***/***, and attached is an example of it showing up in a SERP and what the robots.txt looks like (and I'm pretty sure that the robots.txt has looked like that since that page was created).
In this case it is just one page that nobody will care about, and since I'm no longer working on projects that are open but "robots.txt hidden", I don't know if it is as bad as it used to be, but I regularly see pages with the "No information is available for this page" message whose domains have robots.txt files that disallow all bots but still show up in Google.
Please let me know if I missed anything :)
The robots.txt protocol gives instructions to crawlers about how they should interact with the site. If you instead want to give instructions to indexes, use the noindex meta tag.
In this example the robots.txt has clearly told all bots not to crawl this site, but the only way to read the meta tag (or equivalent header) is to crawl the site. So I assume that in this case Google either assumes that it is fine to crawl URLs that it has found elsewhere while ignoring the robots.txt, or it assumes that pages disallowed by robots.txt are "open for indexing/linking", which would mean that any page both disallowed by robots.txt and carrying a noindex meta tag would still show up, right?
What is the intended behavior if a page is disallowed by robots.txt and still linked by another indexed page? Will it get crawled or just assumed to be okay for indexing/linking? Is there any way to tell Google not to index/link and not to crawl?
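This crawl-versus-index distinction is easy to see with Python's standard-library robots.txt parser - a minimal sketch, assuming a blanket Disallow like the one described above (the domain is made up):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that disallows all crawling, as in the setup above.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler must not fetch any page on the site...
print(rp.can_fetch("Googlebot", "https://example.com/secret.html"))  # False

# ...so it can never read a noindex meta tag inside those pages. The URL
# itself can still end up in an index purely from links on other sites,
# which is exactly the behavior being described in this thread.
```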
They can index without scraping: it is enough that other websites have links to your site. So Googlebot follows the rules in robots.txt to the letter. "noindex" is the way to stay away from Google.
Without the Noindex directive in robots.txt (which Google decided to stop honoring not so long ago), this is not solvable.
I have a feeling the PDF viewer triggered it, because on mobile it defaults to showing the whole page, which results in tiny text, but that's easily fixed by the user, so I prefer to leave it like that.
The rest of Google has a policy of "Engineers will probably say the wrong thing if we let them talk in public"
While I understand the problems with Google scraping content, as a user these snippets help me find what I'm searching for faster. If that's all you're optimizing for, Google is fantastic. There are certainly good arguments to be made for other models, but for search, stealing content helps. I'm not advocating stealing content, I'm just saying that it produces more useful results.
The first three results lead you to fake Android blogs telling you how you can easily root every Chinese Android device, and specifically the M89 tablet...
The real authoritative result (xda-developers) only appears in fourth position, out of sight. It will tell you that if you follow the instructions given in the fake blog posts from the first two or three results, you will brick your tablet.
In a similar way, the word "cbd" (for cannabidiol) has been hijacked by dubious commercial companies through fake blog posts filling page after page of Google results telling you how great CBD is for the treatment of every disease on earth... But there is no trace of an actual study in these results. You have to go with the less popular word "cannabidiol" to start seeing some serious articles about it.
Google results can be hijacked and Google does little about it. Maybe because the ads shown in these fake blog posts are from Google's ad network? I don't know...
But Google results have clearly deteriorated these last years, and the company's authority is no longer what it was in the past.
OK, some people believe anything they read (especially if it confirms their existing biases), but that problem has always existed. I think Google’s occasional snippet fuck-ups are a drop in the ocean compared to the spread of false information through social networks.
But the long tail is important too. It's fixed now (yay) but for years you could search for "calories in corn" and Google would confidently present an answer 5x the true value, scraped from a site with profoundly wrong information. As Google moves to present more direct answers and fewer links, this risk increases.
It looks like they have backed off on the direct answers somewhat which is good news.
Very few new blogs and content websites are being set up.
All content is moving into apps and walled gardens. Part of the reason for that is that running a well researched blog will never pay for your time, so becomes a hobby thing, and most people are fine to use Facebook for that.
Well it also serves Google's users, to be clear. Though I should also be clear that I don't think that justifies it, since I think it's bad for the ecosystem in more subtle ways than are expressed in immediate user satisfaction.
And if you view Google instead as a connection broker, e.g. a middle-man between publisher and consumer, then Google is destroying their own business by snubbing publishers. Assuming that Google is still making rational, intelligent decisions, it follows that Google no longer sees itself like that.
A search engine is inherently a content aggregator; the functions are inseparable.
I mean, maybe not yours specifically. But snippets are great for users in the typical case.
I don't, either.
A site I used to own had a discussion forum on it. It contained a message along the lines of "Real Estate Agent X is a great guy. Real Estate Agent Y is a complete sleazebag."
The blurb that Google displayed for it was "Real Estate Agent X... is a sleazebag." And that was the first result for anyone who searched for that agent's name.
As you can imagine, I received many angry e-mails, phone calls, and legal threats. No, you can't explain to angry people that it's "just" an algorithm that told the world that they're a sleazebag.
I ended up editing the post so that Google would display a different version after its next scrape.
Turns out I didn't have optic neuritis.
I think it gives that one-shot answer to questions people have, even when the real answer is nuanced and multi-faceted.
Then the snippets show up, and they are presented in a similarly trustworthy fashion. But the snippets are really just the result of whichever site has the best SEO, and that's often a really worthless metric these days. The time zone and currency stuff is easy, because it's math, but opinions aren't. The thing is, though, that even if Google didn't have the snippets, those sites that get snippets would still be the top results that we clicked, and we'd still get the wrong information. That would probably be better, because it might be easier to spot obvious bad sources, but I still think there is just a fundamental flaw in how SEO professionals have learned to game the Google bot to bring the world useless information.
I mean, part of it is certainly on Google. No one in their right mind wants to comply with Google's ranking terms unless you make money from Google searches. Which means a lot of useful personal blogs have dropped off the face of the internet, unless you're really lucky to see them linked in a place like HN.
I wish libraries would band together and make a privacy focused and curated search engine, because librarians are actually kind of good at finding you the correct information.
And anything Google does, is done at vast scale, which makes me, at least, think it might be substantially affecting society.
Although sometimes the site is actually correct and Google still gets it wrong by copying the info incorrectly or losing some context or qualifiers.
I loved zero-click results back when DuckDuckGo first introduced them, but I'm less enthusiastic about Google's implementation of them.
Yes, but even when they are curated the curators are usually unreliable and sometimes malicious.
The “therefore” is misplaced; curated snippets aren't always correct, either.
I stopped using Google a few years ago, but just in case I keep this (or similar) add-ons in my Firefox.
I have no idea of the popularity of such addons, but they would also impact the tracking that Google does.
The click-through Google redirect also allows them to track things like relevancy of the content and time on site (if you return to the Google SERP by clicking the back button), in case the target site isn't using Google Analytics (unfortunately most sites do).
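For what it's worth, the redirect in question just carries the destination as a query parameter, so it's easy to see what's going on - a sketch using Python's standard library, with a made-up example URL (the real parameter set varies):

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical Google result redirect; the destination sits in the
# "q" query parameter, percent-encoded.
redirect = "https://www.google.com/url?q=https%3A%2F%2Fexample.com%2Fpage&sa=U"

# parse_qs decodes the percent-encoding for us.
target = parse_qs(urlparse(redirect).query)["q"][0]
print(target)  # https://example.com/page
```

Google only learns about the click because the browser passes through google.com/url on the way to the target.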
 https://html.spec.whatwg.org/multipage/links.html#hyperlink-auditing
Hyperlink auditing can be blocked with uBlock Origin / uMatrix
(Disclosure: I work for Google)
(Disclosure: I work for Google, speaking only for myself)
It's embarrassing that I wasn't aware of this extension, given how useful it seems - thanks!
Asking because I'm not sure of the answer to this question, and lately I've become even warier, so I decided to uninstall everything except things I absolutely must have, like ColorZilla, Grammarly and full-page screen capture. For ad blocking I use Brave, and never ever touch Firefox, Opera or Chrome.
There's an extension that appends a share=1 parameter to all Quora links to prevent them from forcing you to sign in in order to view a post. I like it, but I'm trying to minimize my extensions footprint, and I'd rather write my own script to perform the same function.
The question is, how do you get to be sure that an extension is safe?
Yeah, a couple of days ago I was checking the Places API, which they've built off user-generated content and scraping Yelp and others. They charge $17 / 1000 calls for certain items, and don't you dare cache anything for too long.
Great way to build a business: get data for free, wall it off and put a hefty price tag on it, then put your best lawyers around the moat for good measure!
That doesn't seem right.
It's like putting a sculpture in your front yard and getting upset when someone points it out on their neighborhood tour - even worse, because yard ornaments don't have standard accepted methods of saying "don't use".
1) use robots.txt
2) don't put it on the internet
This is the kind of argument people used to use as they flagrantly violated your copyright by cloning your article on their own site. "You put it on the Internet, so it's free for everyone to copy."
The law says no such thing, at least not in any jurisdiction that I'm familiar with. Contrary to popular belief in some quarters, normal laws do still apply on the Internet.
If you infringe copyright, it's still infringement even if what you copied was freely available on someone else's site.
And if you state something that is misleading and harmful, it might still be defamation, even if what you stated was just an automatically generated snippet that takes a small part of someone else's site and shows it out of context.
It is more like the guide that used to send visitors to your property has set up their own booth on the best spot on the sidewalk next to you and is raking in money from the (often useless, these last few years) ads they have plastered all over it.
Even if it is an educational non-profit resource, you don't want that, as some of the details get lost when visitors only read the guide's summary instead of taking a closer look for themselves.
And according to people in this thread, they will also complain and/or offer suggestions about how you can make it even more useful to them.
And for that, it's a question of copyright. It turns out that in the US, something being publicly available does not place it in the public domain. The original author still retains copyright unless explicitly stated otherwise.
There is an exception to this, though, called fair use. For that, I'd recommend reading this: https://amp.theatlantic.com/amp/article/411058/ - Google's book snippets were deemed fair use.
So the question remains: would website snippets similarly count as fair use? What would the federal courts' ruling be? When it comes to fair use, that's the only way to know whether it is or not.
In answer to your final question, I'm not sure whether this use of snippets in search engine results has been tested in any US courts yet, but the issue of search engines showing enough content from the sites they link to that users never actually go through to the original site is sufficiently controversial that the EU's recently passed copyright directive includes specific provisions aimed at exactly that sort of situation.
Here is where your argument falls apart. The web is a public space - it's not your property or your front yard. It's more akin to going to the town square wearing a fancy hat and getting upset if people look at you and your weirdly shaped headwear.
You're wrong here. Just because it's a public space does not mean nobody owns the property. As a simple example, a shopping street is usually a public place. That does not mean that all window displays, doorways and adjacent buildings are automatically a free-for-all.
In fact, only "the tubes" of the web are a public space. The rest is owned property, even if there are no visible fences.
It is very surprising to read this on a board where many people write code: if a dev found unlicensed code, they would certainly not think it is public domain.
Citation? I thought snippets are just for display, not ranking.
TIL. That's actually a good idea. Does that eliminate all kinds of snippets? NOARCHIVE may also be of use.
Who made this contract? I never signed one. If I came to your place of business and copied your content and provided it somewhere else, I would be infringing your copyright. Do I have to put up signs specifying that at my place of business? Why is this any different? My web content is not the property of someone else and by publishing my information that is in no way an implicit grant of the right to reproduce it.
Although there are other criteria to consider, Google's snippets clearly violate that particular tenet.
See #4 https://fairuse.stanford.edu/overview/fair-use/four-factors/
I wonder how true that assumption really is any more. The quality of traffic Google drives to sites I operate is very low compared to all other major sources, with much less engagement by any metric you like, notably including conversions. The only reliable exception is when we're running marketing campaigns in other places, which often result in spikes in both direct visitors landing on our homepage and search engine visitors arriving at our general landing pages.
There is this conventional wisdom that SEO, and in particular playing by Google's rules to rank highly in its results pages, is the only way you can run a viable commercial site these days. Our experience has been exactly the opposite: our SEO is actually quite effective, in that we do rank very highly for many relevant search terms, but it makes a relatively small contribution to anything that matters. And really, when I write "SEO" here, I'm only talking about general good practices like being fast, having a good information architecture and working well on different devices. We don't change the structure of our pages just because Google's latest blog post says X or Y is now considered a "best practice" or anything like that.
Of course I have no way to know how representative our experience is. YMMV.
Hoping to stand out in Google results as a business plan is a recipe for failure. You are one algorithm change away from going out of business.
Then why can't publishers scrape google?
Interesting way to put it - the biggest bully with the most money wins!
- This site is optimized with the Yoast SEO
- This site is optimized with the Schema plugin
Yeah, optimized to death
Searching for "best car engine oil" has certain brands displayed straight on the featured snippet. Who cares about the click if Google found your customer for you and got your message through for free?
That means your marketing department can no longer justify investing money in Google SEO, which means less optimization towards Google's crawler, which means less reliable search results, which means less Google searches in the long run.
This one is actually reversed. Google Search doesn't net Google any money if people don't actually click the links, since ad revenue for Google Search is per click, not per view (per mille).
The incentives for them are actually the reverse - increasing the number of clicks on external websites, specifically advertised links, increases their revenue (which is why there are so many advertised links on a search page).
I know that Aylien has an API for this but it's out of my price range.
Even if they were responsible, it's still legal to lie. You don't see pseudoscience websites being taken down because they are objectively false either.
Similarly for Twitter.
This is 100% wrong, the opposite is true. The law explicitly protects website operators from being liable for content posted by 3rd parties while simultaneously granting them the explicit freedom to curate content that they deem objectionable.
> No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.
So content indexed by google absolutely falls under the definition of "provided by another information content provider"
Of course, there is plenty of room for google attorneys to wiggle, but in the end the objective for them is to 1) give credibility to a source and 2) to get the benefits of being the providers of information.
Google and other tech never claimed to be common carriers, and even internet service providers have been cleared of that status - barely anyone is legally required to transmit without discretion (it's pretty much just phone companies). So why make it about Google and Twitter, and start with ISPs?
On top of that every Locksmith and towing company had names like “AAAAA Aaron’s Locksmith”.
If they were trying to monetize you they'd show you an ad that links to your answer and take a profit on the click. Directly giving the user the answer they want is great for the user, but guarantees that Google won't earn any revenue.
So why does Google do it? Simple: because their competitors do. That's the free market for you. Google didn't start that feature, another competitor did; Microsoft made it their primary differentiating feature in fact (remember the "bing and decide" ads?). Google had to adopt the same behavior or lose their customers.
So no, don't blame Google, blame capitalism. This is precisely the kind of feature that you wouldn't get if Google was able to behave as a monopoly.
But as Google sucks up the consumer surplus, it's going to be harder and harder to make money from internet businesses, and the final result a few years down the road will be toxic.
The internet isn't going to work too well if it's solely reliant on hobbyists.
Whereas Google was previously a way for sites to be discovered and for sites to generate revenue, it is increasingly becoming the sole source system where data is scraped and imported into Google, and Google keeps all of the revenue to itself.
Having to scroll down past ads, unrelated news, unrelated youtube videos, and ever more of these info boxes has pushed the actual content I'm looking for out to the second page. It's made it much easier to use ddg as default and use the !g flag only when absolutely necessary.
Their results have gone into the toilet - I ragequit Google search about once a day and do something else like forum searches.
Imagine you're searching for tail lights for your car or something, but you don't know the size, so you search "Astra tail light size". This might bring up headlights. Wrong but no matter, you'd go on to google "Astra tail light size -headlight -head" or something.
What Google seems to have been doing to me recently is ignoring those negated terms, ignoring quotes, and just giving me the same results again and again. It's really getting annoying. Google seems to assume it knows what I'm looking for, and that my search query is just completely wrong and not what I want.
Note that the car stuff is just an example, I'd expect Google to not give you headlights the second time. It generally but not exclusively happens to me when searching things that are more technical. ESPECIALLY when it's a consumer level thing I'm trying to get info on, Google likes to assume it's giving you errors and you're trying to fix it. Which makes sense for most users, but god it's frustrating when every combination of advanced search parameters you try does nothing!
Google search needs a checkbox or something to turn off its cleverness and just do an actual search.
How likely a search term is "Yices", ffs? Feels like something that exotic ("statistically unlikely") is probably meant to be in the results by default.
This is SO annoying.
The problem is the business model breeds for this, and we end up replacing one abusive monopoly with another, until we can break that cycle.
For a time it seemed Free Software might ... free us ... from that, though as even that effort's biggest boosters (Eben Moglen, Bradley Kuhn, RMS) freely admit these days, we've been regressing of late, and at an increasing rate.
What's it going to take?
Also, it has gotten very hard for new sites to rank in Google. Blackhat SEO tactics rule, and even local businesses use them. Google went from win-win to win-get-lost.
Sadly, now that webmasters need it the most, investments in alternatives to search and advertising have dried up. There is almost nothing except G.
Today, I struggle to get sites I make to show up on Google at all. For my most recent website, even searching for phrases that are unique to my website doesn't cause it to rank. This is more frustrating to me, because I have a Google ad for my website, which drives all of my traffic - so I know Google knows about my website and what keywords are relevant to it.
Google regularly cannot turn up results that I know exist - then a modest change that has no relation to the query's meaning, and the results often do turn up.
This is not the Google Search Engine I remember.
https://www.internetlivestats.com/total-number-of-websites/ indicates the number of web sites is still growing at breakneck speed.
Somewhat dated but still relevant:
says the growth rate for the number of pages is still substantial.
I wondered yesterday: if you provide microdata, Google scrapes it, and you later decide to remove your sites from Google - is Google allowed to keep the microdata and continue to publish it?
Also, to quote from the Wikipedia article: An owner has the right to object to the copying of substantial parts of their database, even if data is extracted and reconstructed piecemeal
Because they didn't say "in the EU", and it not being copyright is not just a technicality. Copyright is about creative expression, and utilitarian collections of facts aren't.
They also didn't say "in the US". From context you can only assume "in some jurisdiction Google cares about".
> Copyright is about creative expression
That's not true, or at least a very US-centric view. The Berne Convention, the international standard for copyright, reads:
"[...] shall include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression, such as books, [...] works expressed by a process analogous to photography; works of applied art; illustrations, maps, plans, sketches and three-dimensional works relative to geography, topography, architecture or science."
"Collections of literary or artistic works such as encyclopaedias and anthologies which, by reason of the selection and arrangement of their contents, constitute intellectual creations shall be protected as such"
That's lots of things that are not exactly "creative expression" (even though exceptions for pure statements of fact do exist).
If there was no selection or you make the original selection irrelevant, while also giving your own arrangement, then there's no violation of copyright.
This doesn't provide any protection for the underlying facts.
Even more interestingly, we've still never yet resolved the question of why Google gets to lift your entire site's contents and re-serve them in arbitrary ways to their own profit in the first place. It's really just a thing that happens on the internet because it was happening on the internet before the lawyers got there. I've said before and still believe that if there was no such thing as a search engine and they were just invented today, they'd be annihilated in court as nothing but one big copyright violation.
robots.txt is a courtesy, not a legal obligation.
Granted, it's still a while away to get into that territory, I think most sites still profit from Google.
Well, I don't see the problem with Google providing a cache to save WP's bandwidth. I block ads anyway...
I'm really not sure how Google can be protected by Section 230 and at the same time control and publish so much directly. Last time I read an article on the topic, Google controlled 23% of the top 100 sites.
Neither the CDA, nor section 230 specifically, create the sort of publisher/platform dichotomy people seem to be hung up on.
And Section 230 does exactly the opposite of what people commonly think it does. It's actually right there, in the text:
> No provider or user of an interactive computer service shall be held liable on account of (a) any action voluntarily taken in good faith to restrict access to or availability of material that the provider or user considers to be obscene, lewd, lascivious, filthy, excessively violent, harassing, or otherwise objectionable [...]
That seems really easy to understand: you can delete nazi propaganda, porn, bad jokes, or just random user content from your platform without running the risk of thereby assuming liability for the rest.
And the problem is that even if someone finally comes in and shuts those actors down, Google kept all the profit from the malicious activity. In order to incentivize Google to police its ad platform, we need to implement a requirement which seizes all revenue from malicious advertising, retroactively, when a malicious account is flagged/reported.
If Google is losing revenue on allowing bad actors on their ad platform, they'll be incentivized to quickly respond to reports and remove them so that legitimate ads, which they make money on, can have those ad slots.
Have you published data about this anywhere, like a list of reported links that were ignored?
I really doubt it works this way. There are 3 assumptions here; for your scenario to play out, people must pass the following funnel:
1- The person notices the malware
2- The person associates the connection between the ad and the malware.
3- After making the connection, they install a non-dummy ad blocker. Dummy ad blockers like the one by Eyeo whitelist Google's ads while actually harming the competition. It benefits them! Note: if I look up "adblock" on Google, uBlock is only mentioned on page 2, and only because it's named as a competitor to Adblock in a ZDNet article. The whole first page is dedicated to the Eyeo plugin.
I'd say very few people will get through that funnel. My experience is that when my family and friends actually seek out help with their computers, they have let it go for years until the computer is a slow mess of malware, self installed spyware in the form of browser add ons and other crazy stuff.
I've actually known a person who buys a computer every couple of years when it 'gets slow', simply to avoid maintenance. The few people I know in real life who do use an ad blocker mostly use Adblock by Eyeo, simply because of the domain name and its ranking on Google.
But in the country I live in, the USA, landlords are liable for many things their tenants do.
A friend of mine is a landlord and he almost lost a house he owns because his tenant was cooking meth in it. I don't remember the exact details, but the liability was no joke.
P.S. I don't agree with this liability issue, I'm simply describing reality as it is.
So ocdtrekkies' point stands, imho. Google is directly profiting from shady activities through their services and therefore has no incentive to control or stop that behaviour. That's a tricky thing.
> Baker said Jumpshot’s data comes from 100 million devices worldwide, whose users have downloaded free security software from partner Avast. The devices include smartphones, laptops and tablets.
Even my fave from 10 years ago, AVG, seems user-hostile (try turning it off, it's not easy!). I'd hate to see what the others are doing.
Of course, lately, paying for an antivirus hasn't had the same value - Windows Defender, the much better browser sandbox, and the fact that you don't really download executables anymore have all reduced the risk of going without an antivirus.
I feel the same way about a lot of outbound sites on Google. There are a bunch of things I just don't want to go to another site for. Off the top of my head:
- Exchange rates. Although this one I find infuriating, because Google doesn't know how to correctly round exchange rates (as in, there's a market convention: major pairs are quoted to four decimal places). This manifests as, say, showing 2 decimal places for AUD/USD when they should be showing 4.
- Mortgage calculator
- Song lyrics
I'm glad these are in search results, typically in significantly better versions. And I don't think anyone who runs a site built around a basic formula for interest calculations has any right to complain about it.
Of course this will be painted as "where does it end?" but not every surface is a slippery slope.
Just look at the likes of Yelp, which complain about Google "stealing" their content. Well, Yelp is about one of the scummiest businesses out there. So I won't feel sorry for them, not now, not ever.
The one weird case here is AMP. I get Google's motivations: many companies develop terrible mobile sites that run badly or not at all, and AMP IS much faster, generally speaking. Yet it still seems so heavy-handed, with seemingly no opt-out (on the consumer or publisher side). I don't really understand why Google wants to die on this particular hill.
Now Google is using the same content and depriving them of the traffic.
That's pretty scummy in my opinion.
User: Hey Google, what's the monthly payment on a 20-year mortgage with 3.9% APR?
Google: Oh, MortgageSite would know, go there.
MortgageSite: Here's your calculator.
User: Okay, let me put in the numbers ... awesome, thanks! Oh, neat, they can hook me up with lenders. Let me take a look.
MortgageSite: Thanks for the referral, Google!
Today, it's more like:
Google: Oh, that would be this much: ... . Here are some sites where you can dig deeper.
User: Um, okay, might be worth a look. Let's try MortgageSite.
MortgageSite: uhhhhhhh hold on a second. Hey, see our BUY THIS PRODUCT mortgage calculator. AND THIS ONE
User: Uh, okay, I'll just type in--
MortgageSite: HEY! It looks like you're new to this site. Want to get on our mailing list?
User: You know what? Screw it.
MortgageSite: Confound you, Google, for stealing our traffic!
User: Hey Gateway, what's the monthly payment on a 20-year mortgage with 3.9 APR?
Gateway: Here's your calculator, already prefilled with the data taken from your query.
The problem with all these little sites is, besides bloat, that the "Oh, neat, they can hook me up with lenders. Let me take a look." ends up with user getting malware and/or scammed. The problem with Google is that they can hardly be trusted at this point to do things like this in the interest of users. They want to be the frontend through which you access the Internet.
I used the word "Gateway" in my example as a placeholder; my imaginary perfect world recognizes that things like "currency conversion", "song lyrics" or "mortgage calculator" are data, which should be separate from the frontend used to access it. I dream of the Internet where things like these are API-driven and do not involve loading anything other than what's requested - neither ads, nor "value-adds", and definitely not all the rest of the webpage surrounding a mortgage calculator, which is bloat obstructing requested functionality.
 - yes, even mortgage calculator; it's a mathematical model, an algorithm, and code is data.
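The mortgage calculator really is just a small algorithm. As a sketch of that point (the $500,000 principal is my own assumed figure, not something from the thread), here's the standard fixed-rate amortization formula:

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Standard fixed-rate amortization: M = P*r / (1 - (1+r)**-n),
    where r is the monthly rate and n the number of monthly payments."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

# A $500,000 loan at 3.9% APR over 20 years comes to roughly $3,004/month:
print(round(monthly_payment(500_000, 0.039, 20), 2))
```

A "Gateway" in the sense above would expose exactly this function behind an API, with no page load around it. (Real lenders complicate things with fixed/variable periods, fees and insurance, so treat this as the textbook model only.)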
You're dreaming of what wolfram alpha is trying to build.
Edit: funnily enough, they parsed the input slightly wrong but you can quickly get to your answer.
Google Assistant (aloud): $3,004
Seems pretty straightforward, no need for a web page.
Actual current results: "Sorry, I don't know how to help with that yet"
While, in the interest of "privacy", Google limits the amount of data that banks and others you interact with can get from any competing solution, Google will happily sell all of it through Ads Data Hub, which is a Google Cloud product. So it's not so much in the interest of privacy as in the interest of Google selling more cloud products.
User: I can afford a livable four walls and a roof from a small amount of savings, without needing to mortgage decades of my future. Brilliant.
The problem with that is that if a poor person can afford a house with cash, then a slightly richer person would just go, "wow, I can buy a really big house! Or two!", and suddenly there is no more space.
Maybe they wouldn't be allowed to buy two, because in "a perfect world", a place to live is a necessity, not a profit vector. Keeping housing supply limited to boost house prices, to appeal to house-owning voters and wealthy landlords, to extract six figures of money from normal people over decades, is nothing like "perfect".
Here I say "landlord exploitation, profiteering from necessities, houses left unlived in as investment vehicles for the international super rich, NIMBYism and entrenched residents voting against housing stock, AirBNB, broken zoning incentives" and then you say "FREE MARKET it's rich people's right to buy up everything and if you disagree you're dumb, free market is best market". And then we disagree to agree and both leave unhappy. Is that an approximately good summary?
We don't need to determine who gets which for this comment chain, we only need to question whether a 30 year mortgage should be the default way for a normal person to keep the rain off their bed, someone who isn't thinking of a beach condo in Malibu or a NYC penthouse, whether that is the best possible "perfect" world. I think it isn't.
Even then, mortgages aren't something Google can realistically calculate because lenders don't structure loans that simply. There is a fixed rate period, and a variable period after that.
First-time searchers may get suckered in by a personal finance site that relies on ads just as Google itself does, but after that, they'll be using the calculators that are on lenders' own sites when it's time to comparison shop.
Dear every web developer in the world who makes no effort to fight your employers' marketing department on this: I hate you.
let them deal with their own spam, then they'll understand.
Or various Five Eyes addresses.
I think I may make the effort to learn how to make a Firefox extension to automate this. Scrape "about us", "contact", etc. page in background for first email address, then autofill it into the email spam form. Boom, done.
In some places this is a growth-hacking/marketing tactic (that works); in others it isn't the marketing team but someone up high demanding it because they've seen it elsewhere. I'm still a marketer, but I've been in the latter situation, trying to fight the good fight.
But, this popup harassment is just going to turn away experienced users. Many of whom will be the educated, well-compensated demographic the site most wants to cultivate. And a subset of that group will have the ability to create or edit blockers, which will then be turned against the offending site. Thus making it more difficult for that company to reach the more desirable users.
That's what I would tell marketing. That wouldn't work for all companies. Some will be thrilled to get a boatload of inexperienced or naive users. But would be nice if the others would start to get worried about what they're losing with their bounce rate.
Take a closer look at the key phrase in my post: "makes no effort". If a web developer who's trying to pay the rent makes even a single comment to the marketing drones about how email signup forms might not be a good idea, and asks if they're sure they want to do it, then that web developer has my sympathy and not my hatred. But if they just cheerfully say "Yes sir!" when the marketing drones make that demand, then I hate them just as much as I hate the marketing drones, because at that point they are marketing drones.
MortgageSite: Please wait while we take 10 seconds to calculate an interest rate AND CHECK OUT THIS SPONSOR!
MortgageSite: Five more seconds. YOUR COMPUTER MIGHT HAVE A VIRUS!! DOWNLOAD THIS FREE SCANNER NOW!
User: Finally, now I’ll just move my cursor up and close this tab.
MortgageSite: OMFG YOURE GOING AWAY! Please come back and click on me more!!!
Clicking a Google search result takes seconds to load the link, while clicking a DDG link feels much faster. The additional tracking and redirections start to make things feel sluggish.
I remember when Google used to push content developers to be better: no paywalls, no overlays, faster response times, no scammy ads, no content farms.
Those days seem to be over. Google is constantly sending me to content behind paywalls and under pop up content.
Content farms were huge before Google started kicking them out in 2011. Have they made a comeback?
Google's latest attempt to keep the crap under control (AMP) is very unpopular on Hacker News, but I guess it shows they are trying to do something?
Have they not? Content marketing seems as alive as ever, if not more.
> Google's latest attempt to keep the crap under control (AMP) is very unpopular on Hacker News, but I guess it shows they are trying to do something?
Because we - myself and many of those other HNers disliking it - believe that with AMP, making web more performant and less crappy is only an excuse, and one that doesn't stand up to scrutiny.
FWIW: I agree that for a long time, the need to rank highly in Google's searches pushed content sites to be a better experience. And the equilibrium we've reached isn't as nice as it was 10 years ago. I just don't think "because Google" really captures the complexity here.
No, I don't blame them for hosting the content themselves for some queries. I never said that.
Just as Amazon holds some responsibility when I purchase a counterfeit item on their platform, it behooves both companies to have higher quality standards -- and both have the market strength to do it.
I found some instructions that says a website can opt out of Google's Featured Snippets with special HTML code:
<meta name="googlebot" content="nosnippet">
It's not a perfect analogy, because obviously people still hope for referrals from Google. But this idea that because something is published somewhere, it's okay, reminds me of Arthur Dent finding out that the notification about demolishing his house was in the third-subbasement of a local government building in a file cabinet marked "Beware of the Leopard." From rough memory; please forgive if I mis-remembered.
Also going forward you have to add this to your site or I will steal anything that isn't nailed down.
<meta name="fuzzz4lyfe" content="notheft">
I don't know if it's made any difference. Organic search traffic from them has slowly but steadily increased over the years.
Google now sometimes displays a snippet from my competitor's website for searches like "cheapest .io domain". The snippet seems pretty useless, as it doesn't include any registrars' names/links (and my competitor's price info is quite outdated).
In these cases, since the snippet is the 1st thing users see in SERP, and doesn't provide enough info to fully answer the question, I'd wager that my competitor is ultimately receiving the majority of clicks from these snippets.
Appreciate your datapoint. Also btw, when I "view source" the HTML of tld-list.com, I notice it has "nosnippet" inside a comment:
<meta name="googlebot" content="nosnippet">
IIRC, I eventually removed nosnippet because it caused google to not display microdata in SERP (see the "$25.99 to $99.80" in the above screenshot) that were desirable for my traffic. I instead replaced nosnippet with:
<meta name="robots" content="noarchive">
And this seemed to have the same effect as nosnippet, but with the added benefit of my microdata still being displayed in SERP.
Google "chancellor" from the UK and it'll suggest the question "who is the UK chancellor now", which expands to name the previous chancellor (an error Google attributes to gov.uk, which would have been updated in July with the new appointment). Click the link to Google the question, and this time the snippet attributed to the same source is up to date and gives the correct answer. But if I still need to load a new page to get the right answer, I'd prefer to go straight to the source page without being served wrong answers, wrongly attributed, first.
And the other suggested questions are also a rabbit hole of rubbish to fall down (the question "who was the previous chancellor?" answered by "previous chancellors have opted for whisky..." is my favourite combination), or they excerpt the wrong part of the page and then encourage you to search Google again for the same unhelpful excerpt, rather than deign to point you at the source, period.
Getting this stuff right across millions of queries and unstructured data of mixed quality is undoubtedly an incredibly hard technical problem but I think it's probably better to sometimes let users leave Google properties than feed them inaccurate answers inaccurately attributed...
I think we need to go back a little and think. When the internet started, if you wanted to monetize your website, and attract users, what would you do? There were no search engines. Nowadays, people claim they're at the mercy of Google, but it's more that their very existence and business model was enabled and made viable by Google. Since it seems, without Google, no one would find them and visit their website. In that sense, I find it hard to say Google is taking things away, it seems more like Google is giving and giving, sometimes, it gives less, and sometimes it gives more.
In the 90s, people were making websites because they wanted to make websites, and some of them were informational, and some of them were just whatever cool thing somebody decided to publish. Keeping track of your favorite sites was harder, but eventually browsers added bookmarks and it became a bit easier.
Okay, now how do you find all of that stuff? Three answers kind of sprung up: sites that are lists of sites, directories (Yahoo), and search engines. They kind of work, but finding good stuff is still hard, and while the best answer here is probably search, the indexes aren’t entirely comprehensive and the algorithms aren’t good at sorting good stuff from bad stuff yet; they’re too easily gamed.
Google came along and went with search, but they did two things differently: they built a comprehensive index and they had a good algorithm for sorting good stuff from bad stuff. This didn’t stop an entire industry from springing up to try and game it, but they made gaming their algorithm an increasingly expensive proposition, and just doing the right thing easier.
So now we have a good search engine with a comprehensive index, and the web has a different problem: as much stuff as there is, it turns out there’s not enough stuff about a lot of stuff, and this is obvious once you have a good enough search engine that will tell you whether there is enough stuff about your term. Well, as it turns out, it is already 2004 and people are just now figuring out how to make lots and lots of money with this internet and web stuff. Google has come out ahead of the pack as the obviously superior choice for searching all this stuff; their competitors get a lot better, but Google stays ahead of them, and they’re able to maintain a lead by simply being so much better there is no reason to stop using Google so long as they’re not evil.
They also figured something else out: people are looking for information, or the tools for getting that information. Who knows more about getting information than people whose job it is to find and surface the right information? And it turns out that some things you can’t find with just a webpage; you need other kinds of tools, like calculators. So they built them in, because at the end of the day, it’s the information that people actually want; they don’t need the debris that comes along with loading a full webpage.
tl;dr Google is an information services company and always has been. Characterizing them as a search engine or advertising company is a massive understatement and was always inaccurate.
I do queries like "2 tbsp in tsp" or "45 * 22" or "$25 CDN in USD" all the time and I don't think of them as a search.
Same goes for my Echo. I ask it all the time for today's weather or what time the Texas Rangers are playing or who is the starting pitcher for the Giants tonight. None of that feels like a search.
EDIT: added some sources (sorry, that's the best I could dig up in 5 minutes)
"How is Wolfram|Alpha's data checked?
We use a portfolio of automated and manual methods, including statistics, visualization, source cross-checking and expert review. With trillions of pieces of data, it's inevitable that there are still errors out there.
How is real-time data curated?
Wolfram|Alpha effectively checks real-time data (such as weather, earthquakes, market prices, etc.) against built-in criteria and models. If an unexpected deviation is found, Wolfram|Alpha will normally indicate it, for example by showing lines as dashed."
Genius doesn't own the copyright to the lyrics and screwing with the punctuation doesn't create a new work, so I'm not sure they have much of a case against anybody. Maybe they could come up with a ToS violation or CFAA case against LyricFind?
Genius had to invent a watermarking system for texts: https://betanews.com/2019/06/17/google-genius-com-lyrics/
I do empathize with the companies being scooped by Google, and so I am concerned about whether they remain economically viable and whether Google compensates them for the info it takes from them.
But in the mean time, it's a success for me when my search gives me the answer immediately instead of two or three rounds of indirection.
Are you saying that there's no excerpt at all above the search results, when you expected one; or that there is an excerpt, but it's not from the most appropriate/relevant source?
It's annoying when there's a page that has all the information you might need, but to find it, you first have to scroll past a bunch of Stack Overflow posts that are more suitable for dilettantes who prioritise trying a quick fix over learning best practices. But when you do make the effort to find the most promising link and your browser allows the search engine to record your choice, you at least get a vote regarding how the ranking should be adjusted. When users just look at the results but don't interact with them any further, that feedback vanishes.
A win for Google isn't necessarily a win for Google users.
If you have been to a local news site recently, you might agree that they are similarly "scummy," overrun with banners, popups and native ads.
no dark patterns, loads quickly, uses Fathom for analytics so no tracking. There are affiliate links to LendingTree, but they're just text links and I tried to keep the UX unobtrusive.
also, have an android app that just hit 5,000 installs since being launched in mid-April: https://play.google.com/store/apps/details?id=io.mortgagecal...
currently ranking #3 for 'mortgage calculator' on google play. pretty proud of that! beating out Zillow and Quicken's mortgage calculators.
For example, I was searching for a lease-termination letter template. There are sites dedicated to documents, letters, forms and templates, and many of them are pretty spammy or messy. Then I found a page on a real estate portal. They had just what I needed, without superfluous ads and with a nice UX. It's in their interest to help people do this, because those people will probably search for their next apartment there. They are, of course, quick to offer additional services on top, like finding movers. But providing this template is at the edge of what they actually do - a marketplace for real estate. They only get money later, so the side experience should be as smooth and painless as possible.
Another example is a tax/salary calculator on a job board. Or... most what Google is doing to some extent, but with different trade-offs.
I know it's nothing new, but it was interesting for me to notice it recently so directly.
I thought AMP was opt-in for publishers? As in, you have to implement the front-end spec and pick a CDN to cache with.
It is but Google will only put you in the carousel at the top of the search if you use AMP. That's a significant enough driver of traffic where publishers can't afford to lose it.
we had an emergency AMP project after our traffic dropped 35% due to a smaller competitor implementing AMP and google rewarding them by boosting them in the search results.
So unless you can survive getting bumped off of the first page of results, not implementing AMP isn't optional.
I now use https://github.com/bentasker/RemoveAMP, but that's only possible because I'm lucky enough to have a Jailbroken iPhone. :/
I guess you'd call it an "extension"? It feels weird to use that term, as it implies (to me) that there's some sort of extension framework in place. There isn't, it's just code injection.
Yeah extension does sound weird.
Every link aggregator that indexes your AMP page will cache it, including Bing and Baidu. That's what makes safe prerendering possible.
If the right investments were made, every software company's product could be a feature of Microsoft Windows, and Microsoft Windows could be a feature of Intel CPUs.
A lyrics website recently put fake lyrics on their site to prove Google was copying, and sure enough the fake lyrics showed up a couple weeks later.
Google isn't just taking content from "evil" companies like Yelp, they're doing it to everybody.
Job search, shopping, song lyrics, news, and who knows what else, is all being somewhat blatantly lifted. And nobody can stop it, because blocking Google is a death knell to any site.
Granted, but the comment you replied to was about the difference between a feature and a company.
I'd rather just type "CAD to USD", or "BTC to USD", or any other combination into my address bar than type xe.com and load a bunch of libraries, including Facebook Connect.
Ghostery reports 6 trackers blocked.
Furthermore, it took 3.45 seconds to load xe.com in Incognito, while Google took only 1.1.
Or retirement calculator: https://www.daveramsey.com/smartvestor/investment-calculator
New versions of Chromium will begin hiding the full URL entirely, probably so that more and more marketing/targeting related UTM parameters can be jammed in without the user ever even knowing.
Pay to play baby if you want to show up
Monopolization of cultural information is antisocial.
I set up custom search keywords in every browser I use for this, among many other custom searches. In that particular case I query xe.com from an address bar search (not affiliated with them I've just always used the site) as xe <amount> for my most common currency conversion and others like xeyen for Japanese rates, xegbp for British Pound, etc. Takes me right to the page.
Even for various Google services I set them up to save unnecessary clicks.
I generally search for "1000 USD in AUD" as a workaround.
(1000 * incorrectly_rounded) / 1000 == incorrectly_rounded
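Spelling that out: scaling an already-rounded rate recovers nothing, but the "1000 USD in AUD" workaround helps anyway, because Google converts at full precision and only rounds the *result*. A quick demonstration with a made-up AUD/USD-style rate:

```python
true_rate = 0.6573815            # hypothetical full-precision rate
rounded_2dp = round(true_rate, 2)  # the 2-decimal figure Google displays

# Multiplying a rounded rate by 1000 and dividing back adds no precision:
assert (1000 * rounded_2dp) / 1000 == rounded_2dp

# But converting 1000 units first, then rounding the result to 2 decimals,
# carries three more digits of the underlying rate through the display:
amount = round(1000 * true_rate, 2)  # 657.38, what the search result shows
recovered = amount / 1000            # ~0.65738, vs the displayed 0.66
print(recovered)
```

So the workaround only fails if the engine rounds the rate before converting; whether Google does that internally is exactly the complaint above.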
If I want to look up an actor I'll just add 'IMDB' to the end and get a much better experience on the site itself than google could ever offer in their search page.
Exactly. Why go to those sites when you can get the same experience without leaving Google? And it's not like Google isn't serving the ads on both Google and those "scummy" websites anyway, and no one is going to out-dark-pattern Google.