Split Testing May Cause Google to Accidentally Think You’ve Been Compromised (webdesigncompany.net)
23 points by melvinram 1116 days ago | 23 comments

There's a simpler explanation, which is that the site really was hacked as we claimed. Here's an example cached page that we crawled on May 2nd: http://webcache.googleusercontent.com/search?q=cache:zl9t7DQ...

I count the name of one drug repeated more than 100 times.

I've been quite clear that there's nothing wrong with A/B testing. In fact, less than a month ago I tweeted that "A/B testing can be really helpful" and included a link to best practices: https://twitter.com/#!/mattcutts/statuses/191658511149711360


That is not the same website: that is model1.webdesigncompany.net, which seems to be some kind of lorem-ipsum design or functionality demonstration. That site does not seem to be linked from www.webdesigncompany.net (based on a complete recursive wget of that site) and is not hosted on the same server as www.webdesigncompany.net: the only thing it shares with www.webdesigncompany.net is the domain name. It is also not entirely clear that it actually has been hacked. Is this kind of circumstantial evidence really relevant for Google?


Hi saurik, here's what I can tell from home on a Sunday night:

- multiple pages on the domain were definitely hacked, and stayed hacked for weeks.

- we detected hacked pages on the site both manually and algorithmically based on our hacked site classifier.

- we sent a message via the webmaster console on May 5th to alert the site owner so they'd have a heads-up.

- it looks like this has nothing at all to do with A/B testing (which as I've said before is perfectly fine to do).

That's what I know with near 100% confidence.

Here's what I believe, but won't be able to confirm until tomorrow when I can talk to a few folks. I think the domain was hacked in multiple ways. I found the hacked pages on model1.webdesigncompany.net just from doing site: searches. And you're right that those are auxiliary pages, not related to the main part of the site. But I suspect the core site was also hacked, based on looking at the manual action that was submitted. I'll be happy to talk to the relevant webspam people to ask for more details.


Hi Matt,

Looks like the subdomain you referenced did have a hack, but the main site didn't. I had done a Google fetchbot request to check beforehand. Regardless, I can see how having hacked subdomains might trigger the warning.

Based on your comment, should we understand that split testing, where it actually redirects to a few different separate pages, is an okay form of testing that won't trigger the "this site was compromised" warning?


melvinram, that's correct. Split testing is a perfectly fine form of testing and doesn't have anything to do with the "this site was compromised" warning. That's the high-order bit.

If you want low-level/detailed info, we discuss this topic on our support page at http://support.google.com/websiteoptimizer/bin/answer.py?hl=... . Hope that helps.


Thanks for the clarification.


P.S. Looks like the robots.txt now blocks /wp-includes/ . If you remove that block, we should be able to crawl the relevant files and verify that they're clean (or more accurately, 404) now, which would let the system auto-revoke the hacked label.
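
For anyone who wants to check the effect of a rule like that locally, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt contents, the example.com host and the file name are all made up for illustration; I haven't seen the site's actual file.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so nothing is fetched over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt with the block in place.
BLOCKED = "User-agent: *\nDisallow: /wp-includes/\n"
# Hypothetical robots.txt after removing the block (an empty Disallow allows everything).
UNBLOCKED = "User-agent: *\nDisallow:\n"

# Hypothetical URL standing in for one of the cleaned-up files.
URL = "http://www.example.com/wp-includes/some-file.php"

if __name__ == "__main__":
    print("with block:   ", can_crawl(BLOCKED, URL))
    print("without block:", can_crawl(UNBLOCKED, URL))
```

While the Disallow rule is present, a well-behaved crawler never requests the file, so it has no way to see that the file now returns a 404.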


Updated it.

PS: Thanks again for your help on this.


The analysis here makes sense. It is likely that the redirection test tripped Google's URL cloaking detection algorithm [1]. I've not used redirects for split testing before; instead, I use client-side or server-side variations. Redirects don't make sense from a page speed and UX standpoint, which is why you don't see them advocated for use in split testing.

One thing that rang alarm bells, and that you and readers should be aware of, is this statement:

> The VWO javascript automatically redirected equal portions of the traffic to the test pages.

If you are not randomly assigning visitors to a group and are instead just putting odds and evens in separate groups, then when you run more than one test at once you will synchronize those tests and generate incorrect results.
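
To make the synchronization problem concrete, here is a minimal sketch. It is not how VWO works (I don't know its internals); the function names and the two test names ("headline" and "button") are invented for the example. It compares odd/even bucketing with hashing the visitor together with the test name, and measures how often two simultaneous tests put the same visitor in the same bucket.

```python
import hashlib


def parity_bucket(visitor_number: int) -> str:
    """Odds-and-evens assignment: every test gives a visitor the same answer."""
    return "A" if visitor_number % 2 == 0 else "B"


def hashed_bucket(visitor_number: int, test_name: str) -> str:
    """Hash the visitor together with the test name so tests are independent."""
    digest = hashlib.sha256(f"{test_name}:{visitor_number}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def agreement_rate(assign_first, assign_second, visitor_count: int = 10_000) -> float:
    """Fraction of visitors who land in the same bucket in both tests."""
    same = sum(assign_first(n) == assign_second(n) for n in range(visitor_count))
    return same / visitor_count


if __name__ == "__main__":
    synced = agreement_rate(parity_bucket, parity_bucket)
    independent = agreement_rate(
        lambda n: hashed_bucket(n, "headline"),
        lambda n: hashed_bucket(n, "button"),
    )
    print(f"odds/evens agreement: {synced:.2f}")
    print(f"hashed agreement:     {independent:.2f}")
```

With odds and evens the two tests agree for every visitor, so everyone who saw headline B also saw button B and you cannot tell which change moved the numbers. With per-test hashing the agreement sits near one half, which is what you want from independent tests.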

[1] http://support.google.com/webmasters/bin/answer.py?hl=en&...


That was a simplification on my end. I'm sure VWO does randomly assign the traffic to the variations. I just wanted to get across that part of the traffic went to page A and part of it went to page B.


If you were using VWO why not have a single landing page with different content?

Also, why did you segment the traffic during the test (ignoring traffic that didn't match your criteria) instead of doing it at the analysis stage? Is this a limitation of VWO?

I haven't used VWO; rather, I have rolled my own solutions, and I found that segmenting traffic during analysis allowed for a great deal of discovery and gave indications for further testing. What if one of your test pages converted better among people who searched using a phrase that didn't indicate an intention to buy?


I typically use the client-side variations approach but didn't in this instance because the two pages were created separately and were pretty different in terms of css and it was just easier to keep them separate.

Moving forward, I'll do the extra work to make VWO swap content on the same page and adjust the css appropriately.

Segmenting traffic after the test is something I wish VWO would allow me to do but that is currently a limitation of their system.


I usually robots.txt any big text variations anyway. A short delay in indexing isn't going to do any major harm and that way you get the right version cached.


This makes sense if you are doing a one-time split test, but not if you are doing ongoing tests akin to the scheme described in the article I link to below (which comes up on HN from time to time). The problem being that in these cases you never really stop testing: instead, you always assign some random proportion of incoming traffic to your various buckets based on your current statistical certainty.


This not only has the advantage mentioned where you limit your regret for making a bad decision on too little evidence, but with some modifications to the algorithm (such as discounting old evidence from your inference) you can also deal with situations where the world in which you are running the different variations is itself changing over time, or where different types of users coming from different browsers may have different behaviors.
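
For the curious, here is a rough sketch of what I mean. It is one possible version (Thompson sampling over Beta posteriors with exponential discounting), not necessarily the exact scheme from the article; the class name, the 0.99 discount and the conversion rates in the simulation are all assumptions picked for illustration.

```python
import random


class DiscountedThompsonSampler:
    """Pick variations in proportion to how likely each is to be the best."""

    def __init__(self, variations, discount=0.99, seed=None):
        self.discount = discount
        self.rng = random.Random(seed)
        self.successes = {v: 0.0 for v in variations}
        self.failures = {v: 0.0 for v in variations}

    def choose(self):
        # Draw one plausible conversion rate per variation from its
        # Beta(1 + successes, 1 + failures) posterior and show the highest draw.
        draws = {
            v: self.rng.betavariate(1.0 + self.successes[v], 1.0 + self.failures[v])
            for v in self.successes
        }
        return max(draws, key=draws.get)

    def record(self, variation, converted):
        # Discount all old evidence first, so the sampler can follow a changing world.
        for v in self.successes:
            self.successes[v] *= self.discount
            self.failures[v] *= self.discount
        if converted:
            self.successes[variation] += 1.0
        else:
            self.failures[variation] += 1.0


def simulate(true_rates, rounds=2000, seed=0):
    """Run the sampler against made-up conversion rates; return impressions per variation."""
    sampler = DiscountedThompsonSampler(list(true_rates), discount=0.99, seed=seed)
    world = random.Random(seed + 1)
    shown = {v: 0 for v in true_rates}
    for _ in range(rounds):
        variation = sampler.choose()
        shown[variation] += 1
        sampler.record(variation, world.random() < true_rates[variation])
    return shown


if __name__ == "__main__":
    print(simulate({"A": 0.05, "B": 0.5}))
```

Because evidence for every variation decays on each round, a losing variation never drops to zero traffic: its posterior slowly widens again and it gets retried now and then, which is exactly why the test never really ends.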

From the perspective of someone searching for your site, however, the results might be nearly identical: even if you have massive changes to the phrasing (having versions written in various regional dialects of English, for example), the resulting page will still largely be identical, possibly sentence-for-sentence (though not word-for-word). In this case, you will still want to get one of those variations indexed, and it might not matter which one.

As described, of course, this is both "cloaking" and not: the site is not treating GoogleBot differently than any other visitor on purpose; it is simply going to end up doing that because GoogleBot has different behavior on the site than a normal user, so it will learn and optimize (possibly somewhat randomly or uselessly) from this and end up treating that user differently than one coming from Internet Explorer (which may very well represent a different demographic of user).

It sadly does not seem like this advanced usage of testing is allowed by Google.


If your site looks different to Google than to other users, even for a legitimate reason, why should Google index your site based on what it saw? And it can't index your site based on what it didn't see, so what's left? Either make your site look the same to Google and the public, or get your link juice in a way that doesn't depend on site content.


It's really despicable that Google has decided there is a single way that the Internet must be used, and that if you don't abide by their rules they can use their near-monopoly on both search (discovery) and advertising (revenue) to entirely exclude you from it. We are now living in a world where Google has managed to replace the URL bar for 99% of users with a search box (whether or not the URL bar actually is a search box, which it now actually is in many browsers).

Hell, I myself am guilty of not bothering to type in URLs anymore (yes, just like those poor people who used to search "Facebook login" and one day got highly confused when an article about logging in to Facebook that happened to support Facebook Connect to leave comments ranked higher): I just search for things like "cdnetworks portal" and "1and1 client login"; I even, and I shit you not, do a Google search using the Chrome OmniBar for "google images", as I don't remember what the URL for that part of Google is.

[edit: As adgar correctly points out below, the following paragraph is a misunderstanding of the mechanism that was used by this website to implement this split-test algorithm. However, it seems fairly obvious to me that the mechanism used to implement the split test does not matter. Meanwhile, the reason I went in this direction is due to Matt Cutts specifically stating the trade-off in the below paragraph on the Google webmaster video series; it is not because I assumed something from this article's conclusions.]

Seriously... there is /nothing wrong/ with a site choosing to A/B test people by returning different results from the server, and yet Google insists that doing so is somehow harmful and that it would be better to wait entire extra round-trips to do client-side testing using JavaScript, a process that in the end is not only a worse experience for the user and a more complex and less secure mechanism for the developer but has /the exact same fundamental behavior that Google claims is evil/.

It is /exceptionally/ irritating as their rules may have some (highly arguable) philosophical purity, but in the majority of cases leads to a /worse result/. For example, it would be /much more correct/ for sites like Hacker News to mark that the comment/title at the top of each page "is what search engines are allowed to see and index" and that the rest of the comments below it are "ancillary content that absolutely must not be indexed". Otherwise, when you search for a comment using Google, you find every single point along the tree that connects from that comment up to the root of the page, as the comment is present on all of them.

I found myself thinking about these issues a lot recently while working on writing some custom forum software, and even went and skimmed through every single Google web masters video that Matt Cutts put out, and the end result simply made me angry: I was finding myself purposely designing worse things so that they could be "indexed better", and when I'd look at what I was doing and go "this is nuts: is Google really that important?" I'd have to sigh and sadly remind myself "yes, it probably is".


lol, read the article


> Seriously... there is /nothing wrong/ with a site choosing to A/B test people by returning different results from the server, and yet Google insists that doing so is somehow harmful and that it would be better to wait entire extra round-trips to do client-side testing using JavaScript, a process that in the end is not only a worse experience for the user and a more complex and less secure mechanism for the developer but has /the exact same fundamental behavior that Google claims is evil/.

Halfway through your rant, you clearly demonstrate that you did not bother to adequately understand the linked article before ranting about Google, throwing words around like "evil" and "despicable" without even knowing what you are ranting about, like so very many well-intentioned but woefully ignorant commenters in today's technorati.

FTA, emphasis mine:

> [The split-page approach used] is sort of a hybrid of Client-side and Server-side variations. Here’s how it works. Let’s say you want to test your landing page at yourdomain.com/landinag-page. You would create additional pages and using javascript, the visitor would be redirected to one of the pages.

You seem to have completely misunderstood what the author was saying. Naturally, since you are both misinformed and critical of Google, that makes you the most highly upvoted comment on today's Google Hacker News thread!


Can you please explain to me how this oversight on my behalf makes the situation better? Does the fact that Google is also disallowing client-side A/B testing of this fashion cause it all to make sense? Seriously: you can tell me I misunderstood the article (which I will happily admit I skimmed: I read the various paragraphs as "different common mechanisms for implementing split tests", one per paragraph), but can you honestly tell me that this misunderstanding would have changed anything about my rant, excepting possibly making Google come off even worse?


^ This is the reason why I went in that direction, by the way. This is a video from Matt Cutts, one of the many that I sat through during a massive five-hour MattCutts-and-GoogleWebmasterHelp-a-thon that I forced myself through a week or two ago. I am hotlinking to 3:23 into the video as this video answers multiple questions (as is common on the MattCutts channel, as opposed to on GoogleWebmasterHelp, where each question tends to get its own video).

In this question's answer, it is clearly stated that Google may very well consider the same page returning different loads of the page for purposes of A/B testing as "cloaking", and that webmasters should instead use client-side mechanisms to perform these tests. If tests /are/ done, it is claimed that the webmaster should only do so on areas of the site that are not being indexed (which may involve explicitly telling Google to stop indexing that part of your site).


Your rant doesn't make sense because your starting point "Google is evil and despicable for making rules for the Internet" is incoherent.

Both Google's action and Google's inaction "make rules for the Internet". In an alternate universe where Google didn't implement this penalty, attacks using this kind of client-side redirection could easily be common and a serious problem. In that universe, alternate-saurik could come to Hacker News and complain that "Google is evil and despicable for not clamping down on these client-side hacks that are almost always used by attackers, not legitimate developers."

Google has to make choices. Those choices are going to feel like "rules for the Internet" for many people they influence. That's unavoidable. If you're going to criticize Google for this sort of issue, you need to focus on the details of those choices. In this case, your lack of attention to detail makes it clear you're not making a credible case.


In my rant I provided a specific example of how Google's rules regarding content cloaking make it difficult to search or build sites like Hacker News (and will now further point out that Yahoo provides that feature); I have also provided evidence that Google disrecommends doing A/B testing using some mechanisms as it may be considered cloaking, and that webmasters should first deindex their content from Google.

I can even, if you demand, find other videos from Google (also from Matt Cutts) that show that the heavily-hedged suggestion in this video to serve different content from the server (even small snippets of text, such as titles or breadcrumbs) that is not based on IP address (and thereby might not be stable for GoogleBot) is also highly dangerous and can lead to Google believing that you are cloaking.

I therefore take issue with the fact that, once I misunderstood this specific article, that somehow means that all of the other research that I've done on this matter, and even the arguments that I state in my rant (where the specific paragraph that is currently under contention I am backing up with evidence outside of this single article, which is from someone none of us have ever heard of and for all we know could be lying about what Google did anyway) are now void.


(Also, if you spent any time at all to look into who I am, what I do, or the things that I stand for, you'd realize that your appeal to alternate-saurik in a world where Google built an open system that was capable of being abused doesn't really have much basis, and is honestly a little insulting.)


Hi saurik, that's one of the earliest videos I ever taped, back in 2006: http://www.mattcutts.com/blog/more-seo-answers-on-video/ I had to move that video over from video.google.com, which is why the YouTube date appears more recent.

I should probably make a new video or blog post about A/B testing, but I can try to summarize here. In the original video, I said:

- it's nice if you can A/B test on a part of your site that Google doesn't see. Then it's a moot point for us.

- failing that, it's helpful to do something server-side.

- don't do anything special for Googlebot.

The state of A/B testing has evolved quite a bit in the last six years though. If I were making a fresh video, I'd say: A/B testing is a perfectly fine thing to do. You can do A/B testing via either server-side or client-side technology. In both cases, don't do anything special or different for Googlebot. Treat Googlebot just like any other user and don't hard-code our user-agent or IP address. If you have any other questions, refer to http://support.google.com/websiteoptimizer/bin/answer.py?hl=... or http://www.youtube.com/watch?v=QHtnfOgp65Q to review best practices and avoid actions that search engines could potentially view as cloaking.
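
As a rough illustration of "treat Googlebot just like any other user" on the server side, here is a minimal sketch. It is not Google's or any vendor's code; the function names, the visitor_id cookie and the "landing-page" test name are all hypothetical. The point is that the handler receives the user-agent but never looks at it when choosing a variation.

```python
import hashlib
import uuid

VARIATIONS = ("A", "B")


def assign_variation(visitor_id: str, test_name: str = "landing-page") -> str:
    """Map a visitor to a variation using only the visitor ID and test name."""
    digest = hashlib.sha256(f"{test_name}:{visitor_id}".encode("utf-8")).digest()
    return VARIATIONS[digest[0] % len(VARIATIONS)]


def handle_request(cookies: dict, user_agent: str) -> tuple:
    """Return (visitor_id, variation) for one request.

    user_agent is accepted but deliberately never inspected: crawlers and
    browsers go through exactly the same assignment path.
    """
    # Reuse the ID from the (hypothetical) cookie, or mint one for a new visitor.
    visitor_id = cookies.get("visitor_id") or uuid.uuid4().hex
    return visitor_id, assign_variation(visitor_id)


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    browser = "Mozilla/5.0 (Windows NT 6.1)"
    print(handle_request({"visitor_id": "abc123"}, bot))
    print(handle_request({"visitor_id": "abc123"}, browser))
```

A visitor without the cookie gets a fresh random ID on every request, so it may see either variation from one visit to the next, the same as any human who clears cookies.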

