What the SEO? – The shady world of small business SEO (dangoldin.com)
72 points by dangoldin on Oct 20, 2013 | 48 comments

Hey Dan, it's annoying when someone scrapes your website, and it's even more annoying that this site tried to claim your Mom's images with Getty. But as far as I can tell, the site that you mentioned isn't having any success at all. For all the work that this spammer put into copying your Mom's site, it doesn't look like they're getting even a single digit number of visitors from Google. So they may be annoying, but their attempts at spamming didn't do them any good at all.

Feel free to do a DMCA request, but I'm already passing this to my team as a spam report and we'll dig into it. I don't know how Getty deals with scrapers though, so you'll need to look into that on your side.

I made a video with advice about dealing with scraping a few years ago that might be useful for you: https://www.youtube.com/watch?v=5CosWAVLCZg


Thanks for the response - I guess my mom's site is good enough to ward off a cheap attack. Thanks for passing this on to the team - that host has a ton more sites similar to my mom's so anything that can be done to help the original owners would be awesome. It sucks worse for people who aren't familiar with the web and just have to deal with the consequences of someone else's actions. Anything Google can do to help is appreciated and I suspect this problem will only get more difficult.

I'll pass that video on to my mom as well. Thanks again.

No worries at all. It was a pretty big network of spam domains, even though each individual domain wasn't really having any impact.

Hey Matt,

Can you pass along the scraper's details to the Adsense team too... I would think that account should be shut down.

"... it doesn't look like they're getting even a single digit number of visitors from Google."

So? What if that is not the goal?

Is it possible that some users might not type their searches in a Google-sponsored search bar or a Google-controlled browser's "omni" bar?

Is it possible that some websites might get traffic from their domain names alone, because these names are descriptive? No search engine needed: Type your search into the address bar and add .com.

I believe this is called "direct navigation". Presumably, these sites generate small scale (but targeted) traffic on a site by site basis that might only be significant when aggregated across many sites. Contrast this with the immense traffic received by a search engine acting as a portal to the rest of the web, for which each site must compete... by purchasing ads or placement or "SEO" services.

Interestingly, there is no AdSense on the imposter site mentioned in the article. The only outbound link is a search form pointing to https://www.google.com. Thus this imposter site is not a landing page for Do Re Mi but for Google.

Ok, this guy is using a couple of methods off BlackHatWorld (Probably the biggest SEO forum around if you haven't heard of it)

So essentially this guy is building this site, ranking it in Google using the churn 'n burn method, then once it's ranking well he'll contact local businesses saying he can link to them in return for a fee.

Even though the domain is 6 months old, if you look at ahrefs this guy has 33k backlinks to the homepage. Google had a major SEO update on October 4th which essentially fucked up their algorithm more than it already was. This churn 'n burn method means that you spam out a zillion backlinks and you will rank really well until the next Google update comes along (which is around every 3-4 months). You could do this in the previous Google update (Penguin 2.1) and I thought Google would fix this but they just made it easier. This is a good example of the method: http://www.blackhatworld.com/blackhat-seo/black-hat-seo/5769... (This was one of the guys that essentially figured out this method from what I can tell.)

So with this new Google update, they are getting rid of some of the spammy sites, but more spam sites have just taken their places in the rankings. Now they are focusing on the amount of content on your site rather than the backlinks. So now 10 year old sites with hundreds of legitimate, high authority backlinks are getting outranked by spammy sites with 14 backlinks and 300 gibberish articles.

I've also been reading that with local Google searches, you get results from completely different countries. For example, people in the UK were reporting that they were getting results from sites in Russia, Argentina and New Zealand instead of UK results. I don't think this happens as much in America from what I know.

So I'm guessing Google is eventually going to iron out these problems, but it's a battleground at the moment.

If you do get outranked by this guy, make a video with your keywords in the title, same with a Facebook and LinkedIn page and then buy a bunch of backlinks off Fiverr. They should outrank pretty much any other website in a matter of days.

So the descriptive domainname is probably only to influence Google's ranking methodology? The domainname itself being a factor in the inferred "relevancy" of the content.

Your explanation of this scam shows some of the flaws in relying on Google's methodology and ranking system to (re)"organize the world's information".

Imagine if the world's libraries worked this way.

The flaws should not simply be opportunity for scammers, they should be a signal to entrepreneurial innovators.

In the same way that the flaws in first generation search engines like AltaVista (e.g., blurring of the line between commercial and non-commercial content in search results) were a signal to the entrepreneurs who created Google.

Great post.

I'm going to take this opportunity to suggest people check out Duck Duck Go. That is all.

What does Duck Duck Go do to mitigate this sort of SEO attack?

A DMCA takedown notice is the best legal way to take down the website. If you want to get shady too you could also use tactics like using automated black hat tools and generating lots of suspicious (to google) backlinks pointing to www.bestnewjerseyartsschool.com. Far from helping them, it will tell google that they aren't worth ranking that high. Of course, it's a last resort. Copying content and duplicating it all over the web is a form of SEO attack. Now you just have to fight back.

Of course, it would still be worth it to get your own content rewritten at least once before you hit them back.

I know people won't agree with this radical solution but for those black hat guys the web is a war zone. Good luck.

Thanks. I'm helping my mom file it.

Turns out that they actually filed a DMCA request against my mom's site for the photos she took. That's some chutzpah.

Do you have a copy of the DMCA request from chillingeffects.org? I think pretty much every DMCA request that Google receives has a copy sent there for public view. It should at least have their name or company name though they may have faked it.

"If you want to get shady too you could also use tactics like using automated black hat tools and generate lots of suspicious (to google) backlinks pointing to www.bestnewjerseyartsschool.com. Far from helping them, it will tell google that they aren't worth ranking that high."

I'd do some research before embarking on this. Remember, they copied your mom's website, they have duplicate content. I wouldn't be surprised if Google's crawler already knows that. So if you make their site look shady, your mom's site may be hurt in SEO rankings for being guilty by association. Ironic, I know. Again, I am not sure if this could be the case, but you may want to get an expert's opinion.

That is precisely why I wrote it would be worthwhile to have existing content rewritten before striking back. Just in case.

I thought there were penalties for malicious use of the DMCA.

Filing a false DMCA takedown opens the filer up to civil damages (assuming you want to pursue them). It's a shame there's no criminal penalty.


I'm actually confirming it was a DMCA request. I might have jumped to that conclusion and it may have been something else just trying to extort money from her. I'll update once I find out.

So I was wrong when I said it was a DMCA request. It was actually Getty images telling her she was using an image without licensing it. When she asked which image they sent her a link to the rip-off site since they assumed it's her. Now she just needs to tell them it's not her site and she'd be glad to have them issue a DMCA request.

A DMCA takedown notice might work; however, in my experience, a successful notice often just results in the website being moved to another jurisdiction. Good luck with sending a DMCA takedown notice to a site hosted on a server in Bulgaria by a hosting company in Panama and a domain registrar in Thailand. .com domains are in theory under American jurisdiction but domain registrars cannot be made responsible for content under 'their' domain names …

If Google doesn't remove the site from the SERPs, you can also serve the spammer's hosting company/ISP with a DMCA, then they'll be forced to take the site offline.

Yeah but they'll just neg-seo you back, better. Your domain will then be trashed.

Are you actually losing any queries to this site or seeing a drop in your traffic? Google is very smart, they probably know your site is more trustworthy and had the content first. If the person is hosting a bunch of similarly crappy pages on the same IP and/or the same host, Google knows about it.

Any suitably popular site is going to have all sorts of people ripping off its content. Google knows this, and works very hard to deal with it. I just searched for a bunch of things related to your site and the legitimate pages appear to be winning them all, and I don't even see the bad page showing up in the top 10.

If the site is sapping traffic, file a DMCA against their hosting provider (though there are some shady web-hosts who won't do anything about it). Obviously if they are filing a DMCA request against you, defend yourself. But just because somebody is ripping off your content doesn't necessarily mean you need to do anything.

Based on the Google analytics the traffic's steady but who's to say that this won't happen in the future? I think the fact that this behavior is allowed to happen is the problem. Even if this adversely affects one real business it should be fixed.

To note - my mom found out about this site when she received a note telling her she needs to pay close to $1000 to license the photo that was ripped off from her site. Otherwise she wouldn't even have found out about this.

> To note - my mom found out about this site when she received a note telling her she needs to pay close to a $1000 to license the photo that was ripped off from her site.

To me this smells like the real motive behind this. It's not about trashing your SEO, or stealing traffic, but rather to scare those small website owners into paying some kind of 'fee'.

It's pretty safe to assume lots of small websites for local businesses use various photos they found online or downloaded from Google images, rather than their own content or images they bothered paying license for. Copying those images onto a seemingly legitimate site and then filing a claim for license payment seems like a pretty good strategy to milk money out of innocent site owners.

I'd be curious to see the note your mum received and what kind of language or tactics it's using.

So I just dove into it and that notice was actually from Getty images which claimed she was using an unlicensed photo. When she asked which photo they replied with a photo from the rip-off site which she has no control over. Getty is actually responding correctly in this case but falsely targeting my mom since they assume it's her site. Now if only they can issue a DMCA notice to that other site..

Google is rolling out a new (almost completely rewritten, according to them) algorithm right now, but at the moment it's kind of easy to edit the timestamp and make Google Search believe that the copied content was posted first...

It certainly could keep happening in the future, and it may have happened before too! The web is an open platform, and that has pros and cons. One of the cons is that it is very easy to publish whatever you want, such as copies of somebody else's site or content.

However, tricking Google is another matter. The site I used to work for has had probably hundreds of other sites steal content from it, some of them really egregiously (i.e. mirroring the entire site). It didn't matter though, because our site had a better reputation and had the content first. For example, let's compare your site to that other domain:

1) Your site was registered over 4 years ago. The knock-off site was registered 6 months ago.

2) Your site is hosted by a reputable web-host, Linode. Theirs isn't (or at least the whois for their IP range doesn't look reputable).

3) You had the content first.

4) Your site has some reputable incoming links, theirs doesn't.

5) Their site's IP is hosting tons of other sites, yours isn't (I'm assuming)

You are obviously the more reputable site, and that is why you are going to rank and they aren't.

The best defense is just to look more reputable than your competition --- the easiest way to do this is to be more reputable than them!

Three quick things that I noticed you might want to improve:

1. Change the http://doremi-nj.com --> http://www.doremi-nj.com redirect from a 302 to a 301. A few incoming links are coming to the non-www version of your site, and theoretically a 301 is better for passing SEO juice than a 302. (In simple cases like this I'm sure Google figures it out anyways, but might as well help them).

2. Make sure you are giving Googlebot a sitemap with links to all your content to help them crawl it first. There isn't one in your robots.txt, so I'm not sure if you have one. You can just upload this directly into Google Webmaster Tools as well.

3. Have better meta descriptions. For example for the search "guitar lessons livingston nj", you come up 2nd. Very good! But, the snippet for your page is "Do-Re-Mi School of Music & the Arts93A South Livingston Avenue,Livingston, NJ973-758-1500 We offer instruction in Acoustic and Electric Guitar and Bass, ...". This is not a good snippet. For your main 20-30 landing pages (which you can find in GA) write custom meta descriptions that are going to look good in the Google snippet. This will not only directly help your traffic, but it is also a good reputation indication! Human written meta descriptions look more human than robotic ones.
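
For the first suggestion above, here is a rough sketch of what a permanent redirect could look like. It assumes the site is served through a Python WSGI stack, which the thread does not say; `redirect_to_www` and `site` are made-up names for illustration, and only the hostname comes from the comment.

```python
CANONICAL_HOST = "www.doremi-nj.com"  # hostname taken from the comment above


def redirect_to_www(app):
    """WSGI middleware (hypothetical): send every other host to CANONICAL_HOST."""
    def wrapper(environ, start_response):
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        if host != CANONICAL_HOST:
            path = environ.get("PATH_INFO", "") or "/"
            query = environ.get("QUERY_STRING", "")
            location = "http://" + CANONICAL_HOST + path
            if query:
                location += "?" + query
            # 301 marks the move as permanent; a 302 only signals a temporary one.
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)
    return wrapper


def site(environ, start_response):
    # Stand-in for the real site (assumption, for illustration only).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


application = redirect_to_www(site)
```

On most hosts the same thing would be done in the web server's own configuration rather than in application code; the point is only the status line.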

Thanks for the advice. I'll work with my mom on getting these done.

If Google could detect this behavior, wouldn't they kick the culprit out of AdSense?

Being part of Adsense and ranking for search terms are completely different things. And even if everything were integrated, the bar for kicking somebody out of Adsense would be much higher than demoting somebody in the rankings. Kicking people out of Adsense also is much more permanent, because Google has all of the person's information so they can easily block future attempts to sign up if they kick you out.

The #1 thing Google is concerned about for search is webspam. The #1 thing they are concerned with for Adsense is click fraud. So in this situation, we probably have a) a site which gets little traffic because google search has detected it is webspam and b) a site which makes little money because it gets little traffic, and c) doesn't engage in click fraud. Therefore, it still is a member of Adsense, but gets no search traffic in google search.

Ads and Search are kept very far apart from each other organizationally, so things that search knows about a site may not always get over the firewall to Ads. This is an unfortunate consequence of the sort of Church/State separation we've set up that is in most ways very beneficial.

Purportedly, Google has recently started banning YouTube accounts that are associated through common linkage in Adsense to other YouTube accounts which were banned. So this firewall you mention is not as strong as it may have once been.

Youtube doesn't belong organizationally to Search AFAIK.

Not always - they let dodgy "helper" sites that charge a fee for things available for free from government sites (e.g. passport applications) back in after they added some tiny text to their site.


I see the google fanbois are out in force

Help your Mom refresh her site content completely. Rewrite everything and reorganize it in a fresh new way. Make sure that you include some duplicate links for the same pages and do proper canonicalization to indicate to Google which is the authoritative link. https://support.google.com/webmasters/answer/139066?hl=en
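
A quick way to check which URL a page declares as authoritative is to look for its `<link rel="canonical">` tag. This is only a sketch using Python's standard library; `find_canonical` is a made-up helper name and the example URL is a placeholder, not one of the real pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Placeholder page: two URLs may serve this content, but it names one of them.
page = (
    '<html><head>'
    '<link rel="canonical" href="http://www.example.com/lessons/piano">'
    '</head><body>Piano lessons</body></html>'
)
print(find_canonical(page))
```

If duplicate URLs for the same page all report the same canonical URL, the canonicalization is at least declared consistently.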

In the process of rewriting, change some of the key URLs, i.e. some of the pages that link to a lot of other pages on your site, and submit those changes to Google for recrawling http://googlewebmastercentral.blogspot.co.uk/2011/08/submit-...

This will help defeat a copycat who has munged up their timestamps because Google will now believe that your mom's site is the original.

Don't rewrite too much, i.e. no need to change every word, but just try to improve every page, make it a bit clearer, restructure the content to put the most important info first using journalistic pyramid style writing. Change some of the wording in headings and titles. Never make a change just for change's sake, but always make changes when they improve the content, and help the target audience for that page. You could even test a few pages on actual target audience members and ask them if they think the rewrite is better and clearer.

Once you go through all of this, keep up the resubmissions to Google when key pages change, for instance the site map, index pages, portal pages, and so on.

They will no doubt copy your site again, but it will do them no good.

P.S. if you want to try an underhanded attack on the people doing this, don't go to a blackhat link spam site. Instead contact the site owners of all the real sites, and offer to help them for free, in making their site copy-proof. If you find anyone who actually paid money, urge them to report the fraud to police. Keep hunting for the people behind this, and when you find them, send a copy of all the official fraud complaints from all over the country to their local police department. That way you keep the moral high ground.

What if for whatever reason your website goes down for a bit.... you go on holiday not realizing because everyone needs a holiday for a couple weeks.. come back oh no.. will the scrapers now take your place as the originals? That is the question.

Not trying to be snarky, but what happened to "Don't write for search engines, just worry about the human visitor". Sure Google might get confused but I doubt your regular visitors will confuse that spam site with yours..

Well if people are searching for piano lessons and end up on that site instead of my mom's they might just give up rather than try to find the real site since they most likely will assume that's the site itself. Google's responsible for helping the human visitors get to where they need to go.

take your link to the fake site down from your page here! You're just upranking them and giving them potential traffic by giving them in-links.

Yea - I have the rel="nofollow". Do you know if that's enough? Obviously I don't want to give them any extra juice.
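
If you want to double-check that every link to a given host really carries the attribute, something like this sketch would do it. `followed_links` is a made-up helper name and the hostnames are placeholders; it only checks the markup and says nothing about how any search engine treats the link.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FollowedLinkFinder(HTMLParser):
    """Collects links to one host that lack rel="nofollow"."""

    def __init__(self, host):
        super().__init__()
        self.host = host.lower()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        link_host = (urlparse(href).hostname or "").lower()
        rels = (attr.get("rel") or "").lower().split()
        if link_host == self.host and "nofollow" not in rels:
            self.followed.append(href)


def followed_links(html, host):
    """Return hrefs pointing at `host` that are missing rel="nofollow"."""
    finder = FollowedLinkFinder(host)
    finder.feed(html)
    return finder.followed


# Placeholder markup: one nofollow link, one plain link, one unrelated link.
post = (
    '<a rel="nofollow" href="http://spam.example/">copy</a>'
    '<a href="http://spam.example/page">copy again</a>'
    '<a href="http://other.example/">unrelated</a>'
)
print(followed_links(post, "spam.example"))
```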

You're probably OK but I wouldn't link them at all. Anyone who pretends to know the secret sauce of Google or any other search engine is either lying or is a high-level engineer working specifically on search at said search engine.

just what he said. Most blogs on controversies deliberately don't link to the bad guys' site (you can always google them, right?) just so as not to give them traffic or any likely benefit.

I removed the links but kept the text. I want to make it easy for people to compare the two without actually giving them the benefit. Surprisingly difficult.

That's enough.

the title tag of this is misleading - this is some sort of scammer, not a "small business SEO company."

-1 for being mobile hostile by disabling pinchzoom. What value could there possibly be in disabling pinchzoom?

Hmm. That was not the intent. Let me take a look and see if I can fix the css.

Thanks for letting me know.
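
For anyone checking their own pages: one common cause of disabled pinch zoom is the viewport meta tag rather than the CSS itself, for example `user-scalable=no` or `maximum-scale=1`. That is an assumption here, since the thread doesn't say what this page used. A small sketch that flags such a viewport string (`blocks_zoom` is a made-up helper name):

```python
def parse_viewport(content):
    """Split a viewport string like 'width=device-width, user-scalable=no' into a dict."""
    settings = {}
    for part in content.replace(";", ",").split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            settings[key.strip().lower()] = value.strip().lower()
    return settings


def blocks_zoom(content):
    """Return True if the viewport settings would typically prevent pinch zoom."""
    settings = parse_viewport(content)
    if settings.get("user-scalable") in ("no", "0"):
        return True
    try:
        # A maximum scale of 1 or less leaves no room to zoom in.
        return float(settings.get("maximum-scale", "10")) <= 1.0
    except ValueError:
        return False


print(blocks_zoom("width=device-width, initial-scale=1, user-scalable=no"))
print(blocks_zoom("width=device-width, initial-scale=1"))
```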
