One of the obnoxious things about SEO is that if one person is doing it everybody has to do it. It's not necessarily enough to simply offer a better product at a better price. Luckily Google does try to reduce the effect of SEO. I notice for instance that StackExchange almost always beats out Expert Sex Change links these days.
Ultimately Google's ability to quantify both quality and relevance will converge on actual quality and relevance. In other words, the way to "game" the system and practice SEO will be to create a site that is actually relevant with high-quality content. But we're not there yet, and who knows how close we'll actually get.
- make something your audience wants
- make content your audience wants to share
- "earn" links by creating useful resources & content
Essentially, the way to get great rankings in Google is by actually having a website that people want to go to and is relevant to their interests. Trying to game the system is much harder these days, and honestly for the most part isn't worth the effort.
When so much business is decided by a search engine, the plumbing company with people writing "10 crazy plumbing stories you won't believe" beats the plumbers who are out on call each day. Further, if time-on-page is a Google factor, the site with a slideshow or listicle is going to top the one with efficient info and contact details.
Side note. Ideally, your plumber's service shouldn't have to be "googled". It should be in some sort of directory, and we shouldn't be using Google for that. The problem is that people are so very reliant on Google to find everything for them that they don't know any other place to do it. And no reasonable competitor (that I know of) has come in to provide such a directory service.
Most "yellow-page" sites I've seen are giant data-dumps of next-to-useless and out-dated pieces of information. No wonder people "google it".
Minor anecdote. Just the other day, I searched for the known name of a website/company (can't remember which). And the first hit was their website, but just above it was an AdWords ad for them. Not thinking, I accidentally clicked the AdWords version. That was $1 right there taken from their advertising budget because of a broken web (I'd argue), and simply given to Google.
The set of people whose wants should be satisfied is much smaller than the set of people who want.
To say that people's wants are bad for them, or that they should not get what they want, is a paternalistic concept that I don't care for, nor find particularly actionable.
Or, say, of concepts of value from ecology and systems theory.
But if you want to have conversations based on your Glory, by all means do.
I sincerely appreciate your letting us know that's your intent so as to avoid considerable wasting of time.
To get back to the original subject, reasonable people can perhaps debate whether the explosion in clickbait articles is "good" or "bad" in whatever sense, but it doesn't follow from that that "value" lies solely in satisfying people's most immediate wants.
Arguments against heroin are invariably rooted in negative externalities. I have never heard anyone argue that there is no demand for heroin.
We eat high-calorie foods because our brain tells us it's good even though we're a little fat. We fuck each other because our brain tells us it's good, while we use contraceptives. We go out and try to be successful because our brain tells us it's good even though our base needs are already provided for.
Sure you can say that we're tricking our brain, but only so far as you can say everything we've done since we discovered conscious thought and developed agency is tricking the brain. Hence why I find pointing it out in a specific case a little silly. "Hey you're only doing that because it feels good!" Well yeah duh? Same reason any of us are doing anything.
Most people who use heroin do not become addicted. The addiction rate for heroin is 23%, according to the National Institute on Drug Abuse. Note that this is the overall addiction rate, not the percentage of heroin users who are currently addicted, as most heroin addicts do not remain addicted.
In other words: The value of heroin for most users is the high, not that it satisfies an addiction.
I'm guessing by "negative externalities" you mean "people who object to having their property stolen by desperate junkies, or who would rather not get caught up in the violent fallout from drug gang wars."
How do you propose to put a value on that?
The negative externalities you mention are much more a result of prohibition than of the drug itself. But under current law I'll give you that they're real, though simple policy fixes would largely eliminate them.
The negative externalities to which I'm referring are primarily the addictive nature constraining the users' future choice, and the chance of an early death resulting from overdose (though that is arguably a result of prohibition as well).
> An externality is the cost or benefit that affects a party who did not choose to incur that cost or benefit.
See Stallman's opinion on this (I know, an extremist view, but I think he has a point):
"Words to avoid (or use with care)": http://www.gnu.org/philosophy/words-to-avoid.en.html#Content
> If you want to describe a feeling of comfort and satisfaction, by all means say you are “content,” but using the word as a noun to describe publications and works of authorship adopts an attitude you might rather avoid: it treats them as a commodity whose purpose is to fill a box and make money. In effect, it disparages the works themselves. If you don't agree with that attitude, you can call them “works” or “publications.”
I think there may be a correlation between this negative phenomenon of gaming pagerank and referring to articles and pictures as "content", as if they were second-class citizens of the web.
You put "earn" in quotation marks, because these aren't organically earned except in a tiny fraction of cases. Most of them are planted.
I'm not sure whether that is a good thing or not, but it would make sense that the sophistication of the AI used to game Google moves at about the same pace as Google's ability to stop people from gaming its algorithms.
My guess is Google doesn't do it because it costs too many cycles. Counting backlinks, supplemented by some very basic NLP, is very much cheaper and easier.
The irony is that I wonder if people would pay for - or at least not mind - guided and prompted personal search if it produced highly relevant results.
The sad truth is that the current sneaky lumping of broad demographic guessing with search history with backlink counting and a bit of NLP/timeline voodoo produces mediocre results for many kinds of searches. (I've just spent a very frustrating 15 minutes trying to find out if the raw datasets from KIC 8462852 are available online. Usually my search fu is pretty good, but I couldn't get a definitive answer.)
I'm not sure to what extent Google's sales model relies on this. It's much easier to sell advertising if you don't offer an SLA, because it becomes the customer's fault if the service doesn't provide high quality results.
A more effective service would increase ad buyer confidence and the price of niche ad sales would increase, but maybe not by enough to compensate for the loss of more generic ad sales overall.
What has been Google's saving grace has been their skill at updating their target, attempting to turn the chaos and effort from SEO towards improvement in content. They have made an impossible problem into a mostly-solvable control systems problem.
Brad Templeton made the observation way early on that his humor mailing list went to over ten million people. If people sent him a penny if they laughed, and only 1% did, he would make $30,000 a month or $360,000 a year just for curating jokes. Scale + pennies can add up to big dollars on an individual level.
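A quick sanity check of those numbers (the mailing frequency isn't stated above; one joke per day and a 30-day month are assumptions that make the figures line up):

```python
# Back-of-the-envelope check of the joke-mailing-list numbers.
# Assumed, not stated in the comment: one mailing per day, 30-day months.
subscribers = 10_000_000
responders = subscribers // 100                   # 1% send a penny
cents_per_mailing = responders * 1                # one cent each
dollars_per_mailing = cents_per_mailing // 100    # $1,000 per joke

dollars_per_month = dollars_per_mailing * 30      # $30,000 a month
dollars_per_year = dollars_per_month * 12         # $360,000 a year

print(dollars_per_mailing, dollars_per_month, dollars_per_year)
```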
PageRank, AdSense, and AdWords all fed this mechanism with a way to turn a small amount of work by an individual into disproportionate returns. It really has been no surprise that the web spam problem became much, much worse than the email spam problem. There was a better built-in payment mechanism.
It's not nearly enough to offer a better product at a better price. Our company, which was objectively superior to the bad imitators that followed us, was eviscerated by a competitor with a pedigree as a professional spammer -- his product rarely worked and we were constantly getting refugees once they were able to wade through the massive amounts of spam that promoted his solution and find some genuine information. It didn't help that he bought off niche webmasters to delete any mention of our service from their forums and only allow mention of his service.
I have accepted it as a reality that "SEO consultants" (read: professional spammers) are a necessity if your site is going to get anywhere. Note we've always been SEO optimized, meaning our site was always search engine friendly and it contained a great deal more relevant content than our competitor's site. But because our competitor enlisted his network of spam websites to manufacture backlinks and actively sought to exclude mention of my product from the internet by paying off webmasters, he performed much better.
People don't know and generally don't care if your product is better or worse. If you have a product that looks like it's functioning, that's all the development that needs to be done as long as it half-works 5-10% of the time. The rest of the money needs to go into spa--err, sorry, "internet marketing".
True, but they very often don't beat out the numerous websites that have taken open StackExchange content (or content from other, more specialized forums) and rehosted it on a new domain. This behavior should be easy to kill: Google should treat it as blacklistable and provide an easy way to report any site doing it. With the number of StackExchange users out there, these sites would be blacklisted within hours of launch, and before long people would simply stop doing it. But for some reason it is allowed to continue.
1) copy-paste from any random site into a StackExchange comment
2) report the original site for scraping StackExchange
What does this comment refer to?
Raise your hand if you want to go back to AltaVista/AskJeeves.
With Google, I feel like I have to fight with it. If I'm searching for something obscure, or perhaps a word that is misspelled on purpose, it thinks it knows better what I'm looking for. It also often returns results without the word that I searched for, and often ignores it when I prefix it with + or put it in quotes.
You can also click on the "Search tools" button and select "Verbatim" from the "All results" dropdown. This causes the search to only perform exact matches: https://support.google.com/websearch/answer/142143?hl=en.
And yes, I also spend a lot of time fighting Google's inference rules. I remember doing a search for Biber at one point, and it asked if I meant Bieber instead.
However, 5 minutes looking through my Google Search History, and I don't see any examples where non-verbatim results would have been useful, so hmm.
e.g. "brakes NEAR ford NEAR (problem OR issue)" would bring back results with "brake issues with fords..." as opposed to AND where all words simply appear on the page.
No other search engine at the time offered anything near this power. The amount of crap eliminated by a properly constructed query was breathtaking.
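For anyone who never used it, the NEAR semantics can be approximated with a simple word-distance check. This is only a sketch; the 10-word window is an assumption, since the exact distance AltaVista used isn't documented here:

```python
import re

# Rough emulation of an AltaVista-style NEAR operator: true when the
# two terms occur within `window` words of each other.
# The 10-word default is an assumption, not AltaVista's documented value.
def near(text, a, b, window=10):
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

doc = "Common brake issues with Fords and how to fix them"
print(near(doc, "brake", "fords"))   # True: the terms are 3 words apart
```

A real engine would also stem ("brakes" vs "brake") and index positions ahead of time rather than scanning, but the matching idea is the same.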
PageRank was one part of the reason we have a search as powerful as Google, the use of links to assign an authority score. Another big part was the use of link context -- which isn't part of PageRank but part of the overall search algorithm at Google.
The trouble started when Google revealed those PageRank scores to the public for any page. That's not something it had to do in order to use PageRank as part of its algorithm. But in doing so, it fueled an explosion in link spam.
But I will actually claim that PageRank itself is a problem not just because it is so gamed but also because web page authors link to things they find via Google, creating a positive feedback loop that undermines the very premise of the PageRank algorithm.
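For reference, the PageRank score being discussed is roughly the stationary weight a random surfer gives each page, computed by repeatedly redistributing score along links. A toy power-iteration sketch (illustrative four-page graph and the standard 0.85 damping factor; nothing like Google's production system):

```python
# Minimal PageRank power iteration over a toy link graph.
# The graph and the 50-iteration cutoff are illustrative choices.
links = {                      # page -> pages it links to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
pages = list(links)
n = len(pages)
d = 0.85                       # standard damping factor
rank = {p: 1.0 / n for p in pages}

for _ in range(50):            # iterate to (approximate) convergence
    new = {p: (1 - d) / n for p in pages}
    for p, outs in links.items():
        share = rank[p] / len(outs)   # each page splits its rank evenly
        for q in outs:
            new[q] += d * share
    rank = new

# "c" collects the most inbound weight, so it ranks highest.
print(sorted(rank, key=rank.get, reverse=True))
```

The feedback loop described above follows directly: if authors link to whatever already ranks highly, inbound links stop being an independent signal and the top pages entrench themselves.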
Search engine research has stagnated because of Google's dominance, and Google itself is not motivated to change, much as Microsoft was unwilling to evolve its cash cow. Rather than innovate in search, it spends most of its resources on ways to shore up its dominance (Google+, Android), on fighting the very thing its search engine feeds (Internet garbage), and on strengthening its advertising business.
Do you remember when Yahoo was a website about wrestling?
Or when having a good section in DMOZ was important?
I can also remember the moment when I discovered Google. It was reading this article back in 2000 (I could have sworn it was 1998):
If you want to travel back in time to the internet as it was when Google appeared, give it a read.
For any gripes I may have about Google's search engine (like the fact I can never seem to easily relocate this article when I want to refer to it), it definitely solved more problems than it created.
What? Really? I was around in 1995, and I don't recall that...
I'd like to see a hybrid of the high-speed algorithmic scanning of the pages that we see now combined with an army of human reviewers, including super-reviewers who are recognized industry experts, who periodically rate indexed content based on quality. Backlinks and other derived consensus measurements should be given far less weight and a combined algorithmic and human quality rating should be of at least equal importance.
So I want to go back to the days of a manually curated web index, combined with the technology needed to make that scale out over billions of web pages.
I don't think Google have too much faith in their algorithm - they know it's flawed. But it's the least worst algorithm anyone has come up with, and adding human tweaks leaves it subject to subjective bias.
Yeah, I don't think it's a bad idea. I don't have the funding to start it, of course, and VCs crap their pants at the thought of anything that has overhead, so it's probably a non-starter.
>nor that recognized industry experts want to spend the majority of their time reading a huge amount of content and rating it for pennies an hour.
The recognized experts would be paid more than pennies an hour, and they wouldn't need to spend the majority of their time reviewing content. They'd be "super-reviewers", so their opinions would hold a lot of weight. It'd be a way for them to make some extra money without a lot of overhead, something they'd do occasionally for an hour here and there. Honestly the main thing we'd be looking for from these people is information about the cutting edge; things that are trending that we haven't picked up yet, things that are new and thus don't have a lot of consensus markers but are still worth attention, and information about the perception of the content within the industry. That classification can be used to inform on a variety of axes that could be good search parameters. We'd need to make sure we got opposing industry leaders so that the index didn't become solely representative of a single viewpoint.
Normal reviewers in a position analogous to a news reporter are more affordable, more consistent, and can classify the majority of content for a sector fine. Maybe Yahoo! could take their niche content mills and reassign the staff to rate pages in their index, then maybe their search will get somewhere.
Then you'd have MTurk style reviewers who provide the bulk of the content rankings and just give back a few basic pieces of info. These are the people that would be working for $2-$3/hr or less, at their convenience.
All of this is on top of a more traditional automated ranking algorithm that would use consensus markers and computer-perceptible quality markers to rank content. There's not necessarily an obligation that every page is sampled and reviewed by a human.
It'd be great if we could get good traffic data too; we'd be able to see where people are actually going instead of just what they put links back to.
>But it's the least worst algorithm anyone has come up with, and adding human tweaks leaves it subject to subjective bias.
The bias is there regardless, it's just filtered through different parameters. This is inescapable. In general, not just in algorithm and computer design, we need less faith in cold systems and more faith in human intervention and judgment.
Yes, you have to be aware that any process is subject to gaming, manipulation, or bias, but I think affording sufficient room for human opinion and circumstantial judgment as very highly-weighted inputs prevents most of the egregious failures caused by runaway systems.
But this article is so far up its own ass.
> Ever gotten a crappy email asking for links? Blame PageRank.
Never mind that web rings were around long before Google and used the same tactics.
> Ever had garbage comments with link drops? Blame PageRank.
There are way more reasons spammers exist than just boosting PageRank.
The author is acting like a) Google had less of an influence on the web before PageRank was public information and b) the web was somehow better both back then and before Google existed. There will always be people who want to game search engine results, regardless of how much information they know about their own standing, and the web was pretty much un-navigable pre-Google.
The article isn't about Google's influence on the web, as a whole. Google has had a huge influence on the web in many ways, from making it easier for people to locate information to sites considering how to speed up their content, to become mobile-friendly or to use secure connections, because Google rewards such things with ranking boosts.
The article was specifically about PageRank's influence on the web, in terms of link brokering and link spam. Before Google released PageRank scores, some of this happened. It would have happened even if scores had never been released, because it was well-known that Google leveraged links and thus, links had value.
But PageRank scores were an accelerant. They allowed people to use Google's own scores to assign value to pages, value that could be translated into monetary value. It really did reshape the link economy, to the degree that we had a court case with a First Amendment ruling on Google's search results (amazing, when you think about it), as well as an entirely new standard to restrict the credit links could pass: nofollow.
I doubt Google anticipated this. Showing the scores, as the article explains, was meant as an incentive for Google Toolbar users -- "Hey, enable this feature, and we'll show you how valuable a page is deemed to be." Google's gain, of course, was that anyone enabling this sent their browsing patterns back to Google, so it better understood what was happening on the web outside of its own properties.
The unintended consequence was that PageRank scores fueled an explosion in link buying and selling, as well as link spam.
Also, I didn't say the web was better before Google. It had plenty of problems, though it wasn't "pretty much un-navigable pre-Google," as you say. Many people used many of the search engines that were bigger than Google successfully for years. If it were really that bad, by the time Google came along, people would have given up on the web.
Google, of course, was a huge improvement in search and for the web as a whole. The article wasn't that Google was bad for the web. It really was just focusing on one aspect that didn't help the web, how releasing PageRank scores ironically fueled some of the spam Google has to fight (and which it fights well) as well as the spam third-parties have to deal with.
It's nice of you to say, but Google has many very smart people who spend all their time thinking about search. I would be surprised if they failed to anticipate this outcome. Maybe they thought it was worth the cost, especially as a company that values openness.
If you "never used the web before Google," how would you know this? I suppose you might have read about it. I did use the web before Google, and I don't remember web rings back then. There simply was no reason to do so, before Google started ranking pages based on the links.
I do remember lots of BS meta keywords.
Also, I do consider reading about something to be a valid way to learn about things you didn't directly witness, so it's strange that you would discount this.
> There simply was no reason to do so, before Google started ranking pages based on the links.
Webrings were useful for people who found a website interesting and wanted to visit other similar websites. The incentive to be in a webring was that you could get more exposure. Not to mention that the 90s were full of trendy things like this: "under construction" gifs, "valid HTML" buttons, etc. Web rings were one of those "clever" things you could add to a website.
With Tripod, Geocities, Angelfire, and the like it was fairly easy to get a really basic page up, typically with a bunch of links to pages that you checked on regularly, and might be of interest to others.
At least that's how it was for me. I think the rings I was part of included Terragen, POV-Ray, and Star Wars, and I can recall getting a couple people started with basic (and of course very ugly, with that same star field background for the SW-related pages) for people that I met in the various groups.
EDIT: And don't forget the 'made with notepad' icons. Or recommending 800x600 or 1024x768 as the best resolution to view a site.
I do consider first hand memories a more reliable contribution than hearsay (reading about it), but there is some contribution in repeating received wisdom too.
Unless they were some other kinds of web rings we're talking about.
We get PageRank SEO spam from time to time, and it's pretty annoying. I have the tools to take care of it within 5 minutes every day, but I do worry that if we grow to a certain point it may no longer be possible for me to handle the problem alone.
I'm sure many other sites have similar problems with comment spam, and I'd love to hear some advice on how to deal with this from sites that have the same problem.
Right now our main lines of defense are a reCAPTCHA (our last remaining third-party embed, ironically sending to Google user data I'd rather not share, to deal with a problem Google largely created) and a daily update of an IP blacklist we get from Stop Forum Spam.
I tried to do some Bayesian classification, but didn't make much progress unfortunately. And nofollow really isn't an option for me, as it would involve me manipulating other people's web sites and I don't want to do that.
I've said I would cross that line if I ever ran into a major security issue, but so far that hasn't been the case. I've even decided against web page auto-optimization à la ngx_pagespeed for this reason.
And honestly, let's be real: even if I add nofollow and Google stops using PageRank, the spammers will probably keep at it, because in the end they're shady PageRank scam artists fooling gullible people into sending them money. Their customers don't understand they're paying for a botnet plus an army of underpaid PageRank spammers, and it's basically impossible to fix that with education.
It would be excellent if Google gave me an API to report PageRank spam. For all the money they've made on PageRank, it would be nice if they could put some of it toward helping us deal with this; it would definitely help improve their search results too.
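If anyone wants to pick the Bayesian classification idea back up, here's roughly where you'd start: a bare-bones multinomial naive Bayes sketch with made-up training comments (a real filter needs a sizable labeled corpus, tokenization beyond whitespace splitting, and prior tuning):

```python
import math
from collections import Counter

# Bare-bones multinomial naive Bayes for comment spam.
# Training data is illustrative only; a real filter needs a corpus.
spam = ["buy cheap links now", "cheap pagerank boost links", "casino links cheap"]
ham = ["great article thanks", "i disagree with the author", "thanks for the writeup"]

def train(docs):
    counts = Counter(w for d in docs for w in d.split())
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam)
ham_counts, ham_total = train(ham)
vocab = set(spam_counts) | set(ham_counts)

def score(text, counts, total):
    # Log-likelihood with add-one (Laplace) smoothing for unseen words.
    return sum(
        math.log((counts[w] + 1) / (total + len(vocab)))
        for w in text.split()
    )

def is_spam(text):
    # Equal class priors assumed, so compare likelihoods directly.
    return score(text, spam_counts, spam_total) > score(text, ham_counts, ham_total)

print(is_spam("cheap links here"))       # True
print(is_spam("thanks great article"))   # False
```

Where this tends to fall down in practice is training data: comment spam mutates quickly, so the classifier needs continuous retraining on freshly reported spam, which is exactly the feedback loop a small site struggles to staff.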
Did you read the comment?
Back in 2003 I wrote:
"PageRank stopped working really well when people began to understand how PageRank worked. The act of Google trying to "understand" the web caused the web itself to change."
It's amazing that it took this long.
A friend mentioned that to me recently and it was really eye-opening. Once you start looking, the pattern is everywhere.
And, the solution looks roughly like "weigh established authority to the point where it trumps relevance".
(I still offer Ad Limiter if you'd like to trim Google's in-house search result content down to a manageable level.)
Google isn't interested in protecting us web users from SEO, but in protecting itself (or more specifically its core ad business). After all, why would you pay for Google ads if you could just get free traffic from Google via SEO techniques? So SEO is the logical competitor to Google.
Edit: Changed "on airbnb" to "about airbnb"
(Because the extension works on a JS level, it needs to wait for the results page to load before it can strip out what you want to hide. Too often I'd see the results page begin drawing quickly, with a couple w3schools results at the top of the results, and my eyes would scan down to find the first non-w3schools result; then a second later, in the middle of finding the first such result, the w3schools results would be removed, and the non-w3schools result which I'd finally found and was about to click on has moved.)
The Google search bubble is powerful and can be harmful in some ways but once you learn of its existence and are careful about what results you click, it will work for you in a great way.
I am still uncomfortable with the amount of stuff Google knows about me. I sometimes try ddg or even yahoo or bing but they're not as good.
For the use case you described it would work. You could also simply use the search operator site:airbnb.com
You can still infer the approximate rank of a page by where it places relative to other pages, when searching for relevant keywords. Someone wanting to place ahead of the competition still has a function for measuring how well they are doing in SEO.
Therefore, the data is no longer open and power is now more concentrated: those who know someone at Google can find out their PageRank score; the 99.999...% of the rest of the world cannot.
Every system can be gamed. Every system where money can be made WILL be gamed. It's a predator-prey relationship.
The way this article was written made it sound like Google Search was a bane when it arrived. And sure, it was the worst search engine at the time, except for all the others that had been invented up until then.
When Google arrived, it was a huge advance in search. It offered an obvious improvement in relevancy, which is why so many serious searchers switched to it from AltaVista and then users of other search engines moved over.
Nothing I wrote suggested that Google was bad, didn't offer great relevancy or anything like that.
My story is about what happened when Google revealed PageRank scores for pages across the web. That fueled an explosion in link buying and selling. It allowed people to attach Google's own score to a page, a value if you will that Google itself placed on those pages, which made it easier to then assign a monetary value.
In turn, that led to many of the woes the web as a whole has to deal with today: understanding how to use nofollow to block links, to stay in Google's good graces; spam mail pitching links and trying to buy links; link spam.
I'm sure we'd have had some of this even without PageRank scores ever having been revealed. Perhaps it would have been as much, even. After all, it was well-known that Google was leveraging links as part of its ranking algorithm. The market would have been there.
But I do think that releasing the PageRank scores accelerated that market faster than it would have grown otherwise.
Back to the gaming -- again, it feels like you're reading stuff I didn't actually write. I'm certainly not saying that Google itself introduced the ability for people to try and game search engines. That was happening even before Google existed. Of course, Google initially thought it was immune. In 1998, Sergey Brin even said this on a panel that I moderated:
"Google’s slightly different in that we never ban anybody, and we don’t really believe in spam in the sense that there’s no mechanism for removing people from our index. The fundamental concept we use is, you know, is this page relevant to the search? And, you know, some pages which, you know, they may almost never appear on the search results page because they’re just not that relevant."
Google soon changed its view and introduced extensive spam fighting efforts. Those were inevitable. As you say, it was prey that would attract predators. And even with the link selling, it has done an admirable job fighting off the spam. It's not always perfect, but it's a very robust system.
Nevertheless, the spam attempts will continue regardless of whether Google actually blunts them because, as the article explained, there are simply so many people with misconceptions that they'll chase anything anyway. PageRank scores fed into this, that's all.
> My story is about what happened when Google revealed PageRank scores for pages across the web.
And I'd assert that people already knew if they were number one in the search results, or not. And that metric continues to be the main thing they pay attention to. Well, that and their traffic numbers from Google. My point being, we all knew Google was using links to rank, and the search result rank was visible just by doing a search on a few of your synonyms and adjacent terms, market, brands, trademarks, etc. The battle over spamming links was inevitable, whether they revealed PageRank numbers or not.
> I'm sure we'd have had some of this even without PageRank scores ever having been revealed. Perhaps it would have been as much, even. After all, it was well-known that Google was leveraging links as part of its ranking algorithm. The market would have been there. But I do think that releasing the PageRank scores accelerated market faster than it would have done otherwise.
I can agree with that, but that's not the tone that I get from your article, at all.
The tone I get is that Google created this monster, visible PageRank score, and those crappy emails, link drops, and need to use nofollow, are uniquely Google's fault, and it all could have been prevented if they hadn't ruined the web in 2000 by making it visible.
> Google initially thought it was immune.
Your quote from Sergey doesn't imply to me that he thought they were immune. It tells me that the mechanism they intended to use to fight spam would be to reduce its rank so low that "...they may almost never appear on the search results page..."
You may think that's a pedantic difference, but I see it as a meaningful difference. You can't claim that we're all immune to measles, mumps, polio... But we've reduced the incidence to an incredibly low level, here in the Western world.
It seems the most likely reason for Groovy's strange behavior on Tiobe.
The value of "ch" is a checksum you have to precalculate.