

Mahalo's disappearing noindex tags - dazz
http://smackdown.blogsblogsblogs.com/2010/03/08/jason-calacanis-makes-matt-cutts-a-liar/

======
omarchowdhury
When will we stop hearing about Calacanis and Mahalo? The only thing I take
away from the whole thing is how effective manipulating search engines can be.
And I already knew this. From blackhat SEOs.

The only thing the blackhats haven't taught me is how to get into the Google
index and stay there - how are you doing it, Calacanis?

~~~
jasonmcalacanis
This is getting a little silly I agree. I guess the SEOs are not going to stop
so we're all going to have to suffer as they create spam pages in Mahalo and
then point to them and say "Calacanis allows spam!"

Update:

1\. Did we have short pages in Mahalo? Yes.

2\. How did they get there? Our users start topic pages and don't finish them.

3\. Were they getting traffic? Not really... like < 6% of our traffic was from
short pages. We know this because we've now deleted them or greatly improved
them.

4\. Why does this keep coming up with the SEOs? Because I called SEO bullsh#$t
back in 2005 when someone asked me at a conference why Engadget and Joystiq
were doing so well with SEO.

5\. Do you still think SEO is bullshit? No, I don't. Back then I didn't know
what SEO was and assumed it was spamming google so I said, "oh, that's bs."
This was true at the time. At Weblogs Inc. we did zero SEO and we had amazing
search traffic. Why? Well, it was later explained to me that we made a lot of
content (20 posts a day across 50 blogs = a lot of posts), we got a lot of
comments on them and we used keywords in the titles of the posts before anyone
else used them. Today I don't think SEO is BS. I think SEO is an essential
part of building any startup just like building a viral loop, doing PR or
hiring great people.

6\. Are you sorry to the SEO community for hurting their feelings? Yes, I'm
really really sorry.

7\. I found a page that has no content on it? yes, that will happen from time
to time... when we find them we delete, redirect or expand them. These will be
in the < 1% of our corpus range. If you do find one email me jason at
mahalo.com.

8\. don't you have tools to do that? yes and no. we're building them. we need
to balance turning off pages people are planning on building out vs. ones that
are abandoned. So, you will see nofollow on pages with no content, and no
follow on pages with < 100 words if they are older than X days, etc. This is
all being built now.

That's it... there is nothing bad going on here. We're good people doing good
work. Hundreds of people are making real money to take care of their families
at Mahalo and I wish you guys would stop attacking them.

If you're pissed at me take it out on me, leave Mahalo out of it.

.... or don't. Who cares. This is so biased and absurd right now I don't think
anyone is really taking it seriously any more. Aaron had a handful of good points
last month, but now folks are just doing this to linkbait for their SEO sites.

great... trust me, this is making my experience reading HN annoying too. If HN
wants to filter out "calacanis" and "mahalo" from HN i'm all for it. Clearly
the system here is being abused.

~~~
mvandemar
Jason, you are actually full of crap. For starters, you are mistaken, or
pretending not to know, as to why this keeps happening to you. You brought
this on yourself, and on Mahalo.com, by unsubtle spamming and boasting about
it, and then being rude to boot. You can whine all you want about the
perceived injustice of people calling you on your crap, but the fact remains
it is your actions that are getting called out, nothing more and nothing less.

This isn't about a small handful of pages that don't have content on them, and
if you read the posts you were rebutting instead of trying so damn hard to
make yourself sound better then you might actually have information you could
use to improve your site.

The vast majority of the pages on Mahalo are empty pages devoid of human
generated content. It is not like I am straining my ass off to find them,
either. Click on this link, scroll down the page, click through to 20-30 pages
(the ones with all lower case titles demonstrate what I am talking about most
often, focus on those), and LOOK AT YOUR WEBSITE:

[http://www.google.com/search?num=100&hl=en&safe=off&...](http://www.google.com/search?num=100&hl=en&safe=off&q=site:mahalo.com&start=300&sa=N)

In order to appease Matt Cutts, who gave you the courtesy of warning you
instead of banning you outright, you did in fact add noindex to all of those
pages. 1 day later, after he saw that it was there, you removed it again.

Pretending not to know what is being talked about and making shit up in an
attempt to garner pity for being picked on is _not_ helping your cause any.

~~~
jasonmcalacanis
as I've said, all these pages are being deleted, redirected or built out.

also, the no index thing is going to be on all pages under 100 original words.
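
A minimal sketch of how that rule could look, assuming a hypothetical page object with `text` and `createdAt` fields. The 100-word threshold comes from the comment above; the grace period (`maxAgeDays`, standing in for the "X days" mentioned earlier) and every function name here are illustrative assumptions, not Mahalo's actual code.

```javascript
// Hypothetical helper: count whitespace-separated words in a page's text.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Decide whether a page should carry a robots noindex tag.
// Assumptions: empty pages are always noindexed; short pages (< minWords)
// are noindexed only once they are older than maxAgeDays, so that pages
// still being built out get a grace period.
function shouldNoindex(page, now, minWords = 100, maxAgeDays = 30) {
  const words = countWords(page.text);
  if (words === 0) return true;
  const ageDays = (now - page.createdAt) / (24 * 60 * 60 * 1000);
  return words < minWords && ageDays > maxAgeDays;
}

// Build the meta tag to emit in the page <head>, or an empty string.
function robotsMetaTag(page, now) {
  return shouldNoindex(page, now)
    ? '<meta name="robots" content="noindex">'
    : "";
}
```

The grace period is the part that addresses the "owned by users" concern: a fresh stub stays indexable until it has had time to be built out.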

we are working as fast as we can to resolve all these stubs, but many are
owned by users and we don't want to just nuke them.... so, we're busy at work.

thanks for the feedback....

~~~
niyazpk
Awwwww..this is so cute, Jason!

I have a mental picture of you and your engineers trying hard for months in
vain to add no-index to all those pages.

Don't worry any more. I am here. Here is the code. Copy paste and everything
will be fine:

    
    
      // Pseudocode: flag short content as noindex.
      if (content.length < 100) {
          content.setNoIndex(true);
      }
    

Let me know if there is anything that needs to be explained. (I can talk to
your engineers if needed)

Peace and Love

------
alain94040
Shame on Google algorithms for not automatically being able to detect a page
without content.

Over the years, my view of Google's ranking algorithms went from dream-like
(they are so smart they understand everything) to infant-like (just a
bunch of keyword counters).

Obviously there _has_ to be huge complexity involved, so my current view must
be pessimistic, but still...

Just one (unrelated) example: search for "leader in online used cars" in
Google. Does every human understand what I'm looking for? Here's search result
#5: "Indian holy leader and BA stewardess arrested over prostitution".

~~~
scott_s
Google is a tool, and like all tools, you need to learn how to use it. Two of
your keywords are unnecessary, and I think that caused the problem. First, you
don't need "leader." If you're looking for the top of something, you already
get that information from the ordering of results. Second, "online" is
redundant, since anything you look for will probably be "online." (And I
suppose you don't need "in," but I don't think Google used it anyway.)

So then you're just left with "used cars." That result page looks reasonable
to me.

~~~
alain94040
You make my point better than I could! You just explained to me that my query,
which every human understood perfectly, is stupidly understood as "used cars",
so Google is a basic keyword algorithm and this still doesn't explain result
#5.

Really, I'm smart enough to understand your explanation of how the tool works.
The problem is that I'm in the 99th percentile in the field of "computer
keyword understanding" over the general Internet population. So what does that
say about the tool?

~~~
froo
>So what does that say about the tool?

That there is probably room in the market for more search engines - perhaps
specifically ones that specialise in translating only human queries and
returning results rather than trying to be more general like Google is.

Like almost all professions that utilise tools in some way, sometimes you just
need more specialised tools. A chef wouldn't be that great if he (or she) only
had knives to cook with.

------
raganwald
Who is Jason Calacanis and why is this particular bit of SEO notable? What
aspect of this should "gratify my intellectual curiosity"?

~~~
jcromartie
Jason Calacanis is a guy who runs a sweatshop building a website you've
probably never been to and never will because you're a smart person who
doesn't fall for spam.

He also has made a career out of being a professional asshole and making fun
of "lifestyle businesses" that make products that people like and pay for. He
thinks that if you're not a workaholic and want to enjoy your short time here
on Earth then you shouldn't be working with startups.

~~~
josefresco
Keep your personal opinions out of this debate, please.

This is who Jason is for those who do not know why his 'bad behavior' is so
watched and talked about: <http://en.wikipedia.org/wiki/Jason_Calacanis>

~~~
Perceval
I think it's reprehensible that parent has been downvoted. _Ad hominem_ is
_ad hominem_, no matter how deserving we think the target might be.
Great-grandparent poster asked a question, ostensibly seeking an informative
answer. What they got was grandparent's uninformative tirade. While it
certainly represents how people feel about Calacanis, it does not in any way
answer great-grandparent's question, it is not informative, and it undermines
the norm of civil discourse that we should expect on HN.

Refresher: <http://www.paulgraham.com/disagree.html>

~~~
scott_s
_Ad hominem_ is not a synonym for _insult_. It is used exclusively for when
someone attacks the speaker, and not the speaker's arguments. In this case,
there were no arguments. The question was "Who is X?" Someone answered who X
is. That answer was harsh, but also consistent with everything that I have
seen.

I agree its civility is borderline. But it's not an _ad hominem_ attack
because no argument was being made.

------
houseabsolute
Posted this over on that blog, but I thought it might be interesting to HN
readers too:

I also noticed this snippet of code:

    
    
      if(url.indexOf("google.")!=-1||url.indexOf("search.yahoo")!=-1||url.indexOf("search.live")!=-1||url.indexOf("bing.")!=-1){
        if($("#adsense_2719325").parents("#right-column").length > 0 && !isImage){
          $("#adsense_2719325").addClass("afc_wide");
        }
        $("#adsense_2719325").html(ads);
        $("#ads-section-2719325").show();
      } else {
        $("#ads-section-2719325").hide();
      }
    

Does that mean that he basically shows different content to people coming from
search engines? Is that kosher?

~~~
vaksel
showing/hiding ads is perfectly fine.

What you can't do is use that to keyword stuff your site.

~~~
prawn
On one of my sites, I used AdSense channels to learn more about the CTR from
those arriving from Google vs regular users of the site. The CTR from incoming
Google traffic was 10x higher, so I show ads to that traffic but not to the
regulars - keeps them happier.
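
A rough sketch of that kind of referrer check, for anyone curious. The host fragments are the ones from the snippet quoted above; the function names are made up for illustration, and matching on the parsed hostname rather than the whole URL is a choice made here, not something the original snippet does.

```javascript
// Host fragments taken from the snippet quoted earlier in this thread.
const SEARCH_HOST_PATTERNS = ["google.", "search.yahoo", "search.live", "bing."];

// Return true when the referrer's hostname looks like a search engine.
// Only the hostname is inspected, so a query string that merely
// mentions "google." does not count as search traffic.
function isSearchReferrer(referrer) {
  if (!referrer) return false;
  let host;
  try {
    host = new URL(referrer).hostname;
  } catch (e) {
    return false; // malformed referrer: treat as a regular visit
  }
  return SEARCH_HOST_PATTERNS.some((pattern) => host.includes(pattern));
}

// Show ads to visitors arriving from search, hide them for everyone else.
function shouldShowAds(referrer) {
  return isSearchReferrer(referrer);
}
```

In a browser you would call `shouldShowAds(document.referrer)`. The substring match is still loose (any hostname containing "google." passes), so treat it as a heuristic.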

------
aresant
In Jason's most recent subscriber email he absolutely THROWS Mark Zuckerberg
under the bus for lacking integrity and riding on the edge to grow his
business.

If there’s ever been a case of the pot calling the kettle black . . .

------
chaosmachine
I'm sure it's just a bug, and they're working to fix it real soon now... ;)

~~~
die_sekte
I am just waiting for Calacanis to show up here and say just that.

I have never seen anybody else lie that openly and get away with it.

~~~
froo
>I am just waiting for Calacanis to show up here and say just that.

Let's not forget he'll use his obligatory "Peace n love" signature...

------
epi0Bauqu
FWIW, these pages have been removed from Duck Duck Go for quite some time.

~~~
icey
Did you have to manually cull results from Mahalo, or did your algorithm
discard them automatically?

~~~
epi0Bauqu
Some automatic, but to be safe I put a manual block on them, which I do for
big "useless" sites I keep (or users keep) seeing again and again.
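
For illustration only, a manual block like this can be as simple as filtering results against a domain list. This is a sketch under assumptions: the result shape (`{ url }`), the blocklist contents, and the function names are all hypothetical, and it says nothing about how Duck Duck Go actually implements it.

```javascript
// Hypothetical manual blocklist of domains judged "useless".
const BLOCKED_DOMAINS = new Set(["mahalo.com"]);

// True when the URL's host is a blocked domain or one of its subdomains.
function isBlocked(url, blocked = BLOCKED_DOMAINS) {
  let host;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch (e) {
    return false; // unparseable URLs are left for other filters to handle
  }
  for (const domain of blocked) {
    if (host === domain || host.endsWith("." + domain)) return true;
  }
  return false;
}

// Drop every result whose URL falls under a blocked domain.
function filterResults(results, blocked = BLOCKED_DOMAINS) {
  return results.filter((result) => !isBlocked(result.url, blocked));
}
```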

~~~
icey
Seeing as you operate a search engine, what's your take on this whole thing?
Do you feel this all truly lies in a gray area, or that there is a clear-cut
way things ought to be in this situation?

As an aside, your comment about blocking sites manually for your users just
crystallized the value of DuckDuckGo (and by extension other smaller search
engines) to me. There is definite value in a search engine that has some
opinions about what kind of results are considered quality results. There's
probably a name for this; but "curated search engine" or "opinionated search
engine" don't seem right.

It's kind of like the difference between a micro-brew and Google's Budweiser.

~~~
epi0Bauqu
It's all about perceived user quality. Google has guidelines, but if you read
closely they qualify them with "we can do whatever we want." They keep making
the call that keeping Mahalo in is better for whatever reason, perhaps because
they think it will be bad PR (censorship?), or they'll get a lot of complaints
(where is mahalo.com?) or maybe just because they don't see data that it is
bad for users now (site metrics).

I think what angers people legitimately is twofold. First, Mahalo seems to be
doing stuff that gets other people banned. Second, they're seemingly being
really sketchy about it.

Personally, I think they are on the whole useless pages and am happy to block
them. There is certainly a subset of useful pages, and I'd consider doing a
tighter integration where I could promote just those pages in our Zero-click
Info boxes. Same goes for something like Yahoo! Answers. Most of their pages
are useless, but they have a subset that are really useful.

To your latter point, I couldn't agree more. I'm much more aggressive at
delisting what I deem to be "useless" sites. Google can't do this because
they'd get too much flak for it, e.g. censorship.

------
ryanb
I think this whole thing would have blown over if not for Jason's incredible
ego and his "I can do whatever I want" response to the whole issue. Pissing
off the geeks = bad idea.

------
euroclydon
Mahalo is starting to remind me of that company (was it Canadian?) that was
making all that money off of pay-per-click arbitrage between two search/ad
networks, and then one day, Google cut them off cold turkey. Who were they?

~~~
OmarIsmail
GeoSign.

This is the thing. The huge spotlight that guys like Aaron Wall are pointing
at Mahalo right now is, I'm sure, making Calacanis and Co sweat bullets. For
big cases like this, it's often a matter of managing perceptions and Google
may definitely take action if the story becomes big enough.

The thing is, there are quite a few sites that really skirt the edge of what
is acceptable. When these sites/companies are somewhat larger with lots of
employees and contractors, well the effect of 'blacklisting' can affect A LOT
of people's lives in a very bad way.

For the independent blackhatter, some spam side-site they spin off isn't their
livelihood and won't ruin them if banned. Mahalo (and ilk) isn't just a
technical matter. Of course if this story gets bigger and Google doesn't do
something about it, then this sets a bad precedent for the spam-community to
try and game Google. Sure Google will figure ways to deal with it, but if
you're the guy dealing with the spam... well, you'd just rather it not exist
from the beginning.

I'm really interested to see how this all plays out. From Google's side,
Mahalo's side, etc. It's riveting stuff actually.

------
utnick
The spam page example given in the article is:
<http://www.mahalo.com/aaronwall>

Question: how does somebody end up at this spam page? Google seems to do a
good job of filtering stuff like this. It wasn't in the first 10 or so pages
of search results for "aaron wall"

~~~
byrneseyeview
First: I highly recommend Aaron Wall's rank-checker tool:

<http://tools.seobook.com/>

Second: the idea is not to rank for every page. But the marginal cost of one
more page is basically zero. So if even 1% of them rank on page one, the
return on investment can be pretty high.

Calacanis is in a tough position, to be honest. He can dial down the quality
and automatically get more revenue for no extra money. But he can't reduce the
capital expenditure he made to create this system in the first place.

It's sort of like being a janitor, but having a great connection for huge
quantities of black tar heroin. You could be a really stand-up guy, but after
a while, all the easy money gets tempting...

------
tomh-
I guess as long as you bring in a lot of money for Google they will look the
other way. I'm just wondering if their anti spam team is aware of articles
like this and why they don't respond? This is really a clear case of spam and
it's been going on for ages.

------
jaxc
If Mahalo is alleged to be a scraper site then just let Google handle it.

If you think about it, if Google delists the site then that's probably their
business model gone, because Google is like 60% of the search market in the US
and higher elsewhere. So if Google delists them and takes away their AdSense
then they are in trouble.

So let them taunt Google all they like and then see what happens when the
Google Dragon wakes up because it won't be pretty.

(Edited for typos and readability)

~~~
byrneseyeview
Aaron is basically acting as their Ombudsman right now. Google will probably
lose some serious revenue from that, because they get a big cut of what Mahalo
generates. If Google lost that ad revenue for some other reason, the person
responsible might well be fired for it.

When Aaron Wall hounds them over it, they'll be more likely to make the right
decision.

~~~
jfarmer
Google does something like $18Bn per year in revenue and of that about
$750MM-$1Bn is from ads on non-Google-run sites.

So, I think they care about Mahalo as a source of revenue about >< that much.
They're probably more worried about possible anti-trust actions if they just
drop Mahalo -- I bet that would shave more off their market cap than losing
Mahalo as a customer.

~~~
lawrence
This is incorrect. AdSense was 31% of their revenues in Q409. They did more
than $2B in AdSense revenue in Q4 alone:

[http://investor.google.com/releases/2009Q4_google_earnings.h...](http://investor.google.com/releases/2009Q4_google_earnings.html)

~~~
jfarmer
I was working from this:
[http://www.businessinsider.com/chart-of-the-day-in-case-you-...](http://www.businessinsider.com/chart-of-the-day-in-case-you-had-any-doubts-about-where-googles-revenue-comes-from-2010-2)

Those numbers are net sales, not gross revenue, so cost of revenue (revenue
splits, traffic acquisition costs, etc.) is factored out.

------
supadog
Froo is right, there is only one reason why Jason currently gets away with
spamming Google and that is because both Google and Mahalo are Sequoia funded.
Someone senior at Sequoia has told someone senior at Google that Mahalo will
clean their act up so please don't ban them yet.

------
greenlblue
Can I ask why everyone cares so much about Mahalo? Up to a week ago I didn't
even know "human powered search" existed and the consensus seems to be it's an
epic fail.

------
froo
It's the SEO equivalent of a shell game

------
shareme
I wonder what would happen if Jason just turned off noindex tags on odd days?
Clearly turning it off and on gets more blogs talking about it each time.

------
benatlas
I thought it was Matt's job description to noindex pages? Oh, I get it now, he
delegates, let the patients run the asylum...

~~~
sp332
<http://en.wikipedia.org/wiki/Noindex>

------
dustingetz
he's acting in his incentive, who the hell cares. read: i don't understand why
anybody cares.

