Also, stating that such pages only amount to less than 1% of your revenue is in no way a justification of the content theft you're committing. You're cowardly sidestepping the issue. What have you got to say about the actual content theft, rebranding of such content as Mahalo's, and knocking down the original authors of that content by skipping on the credits and outranking them?
You got caught, dude.
Well, when doing a site search in Google for
site:mahalo.com "Links Powered by Google"
there are 553,000 pages indexed in Google which are using scraped search result content (with optimized page titles) to help pull in traffic.
and keep in mind that is just links from Google...there are also chunks of content from Google blog search, Twitter, and other sources (images, videos, news) on those pages
he is full of ____ if he is trying to get anyone to buy that doing the above is responsible for less than 1% of their traffic when Compete.com shows their search referral traffic as being ~ 60% of their referrals
It is not just a few (thousand) 100% auto-generated (experiment/stub/zebra/spam) pages that have scraped content on them...the above search shows Google estimates over a half million pages in their index contain content from their own search index...total regurgitation of 3rd party content :D
And let's not forget that 1.) he is using people's optimized page titles as content on his pages 2.) search traffic monetizes better via ads than other traffic forms...especially the search traffic that lands on a page for some random longtail keyword made up by arbitrarily combining chunks of 3rd party content mixed together and re-aggregated. 3.) in addition, there is a $0 editorial cost to scraping these millions and millions of content snippets and re-displaying them. 4.) he is making at least 5 figures a day from that content scraping...with 100% certainty.
his 1% remark is just another form of misinformation. nothing new there!
So it's most likely a big cash cow for Google.
Which do you think it's better for them to send a user to? Mahalo powered by Adsense or __________ powered by Tribal Fusion?
Oh and just to clarify I'm not saying Google is doing anything evil. They don't adjust their algos to help push Adsense sites higher. But when an adsense site reaches the #1 position, they do appreciate the extra revenue.
Reputation is worth billions to companies like Google. Google doesn't give a fuck about Mahalo.
SEVEN powered by Google areas.
It's not a perfect system, people can't put multiple search boxes in there.
However, that page will never rank well in a search engine (unless by a fluke). In order to rank well you really need to have 500+ original words.
We're in the process of moving all pages to that standard. It's really a self-regulating thing: if our contributors make short pages they never rank and never make money. They get frustrated and we teach them how to make longer pages and some day they may rank.
... it's really not a problem, and the truth is we rank for three things well:
1. video game walkthroughs (typically 2-10,000 words!)
2. how to articles (typically 800 to 5,000 words)
3. question & answer pages (typically 300 to 10,000 words),
Isn't this basic SEO (and I'm no expert)? Build original content and you might rank. Build short pages and you don't rank.
All pages start short (just like wikipedia stubs do), and over time we make them longer. that's the normal process.
Let me know how many need to be listed for you to see the pattern, and I will prove you factually incorrect once more.
Your 'community' includes robots, according to the evidence Aaron collected.
What's keeping them ?
Am I the only one who simply can't stand Jason Calacanis?
He's often propped up as some sort of guru/authority/etc. of start-ups and the Web in general, and I just don't see it at all. I've never read any words of his and felt like a smarter or more knowledgeable person afterward; I only ever see rather mundane platitudes.
Perhaps I just haven't read the right pieces of his? If anyone believes this might be the case, please consider responding to this post with a link or two. I'm seriously very baffled by his image (and to be honest, I don't think highly of Mahalo as a concept, for various reasons I won't detail here unless someone is interested in them).
That's cruel. Let me rephrase: This isn't a new field. It's a series of new fields, all developing in tandem. Web design, web coding, web marketing, are all necessarily only a couple of decades old. The tools we're using are younger than that. Two years ago we were still debating how to align a webpage vertically. We've only begun to have real web typography in the same period of time. And programming is similarly new, and we still don't have good web marketing. But those of us who're invested in the web don't like to think about that, so we convince ourselves that we're as advanced and sophisticated as anything else. To do that, we lower the bar.
Celebrities in any field tend to balance skill and marketability. Rarely is somebody both ultrafamous and cutting-edge. So in a world where nobody is really that famous and nobody is really that talented, our famous icons are doubly pathetic. Even our best icons, like Zeldman and Fried, are bright but not really operating on a new plane. (I think there are some exceptions, like Dean Allen, but Dean isn't the size of Calacanis.)
That's why the brightest talents in our sphere come from other mediums. I think the best designers I know are all twee and twenty and annoying as fuck. For all their immaturity, that's the first generation that could grow up immersed in all the aspects of the business. But admitting their superiority is admitting just how fucking clueless we all are, and that wouldn't be as fun for the community.
Calacanis is a particular jerk, though. I never got why we like him here. Mahalo's been a joke from launch day.
(Sorry if there were spelling errors. I wrote this on an iPod.)
One of the rules of this site is that you ought to write things that you would feel comfortable saying to someone's face. And often it's very practical because all sorts of people turn up to read stuff here. Including Jason, it appears.
That said, I get a bad vibe too: why is everything he is associated with seemingly "Jason's this" or "Jason's that"? Generally, a startup is known first and foremost as the company, not as "Alexis Ohanian's Reddit" or "Sergey Brin's Google", but you always find that guy's name in the same phrase as Mahalo. I can't quite place my finger on why this bugs me, but perhaps it's something to do with building something up as a free standing entity versus lending your famous name to something.
I think it's this hypocrisy that's caused sentiment to turn ugly more than the (depressingly common) behavior of his company.
You know, despite being a total noob here, I still thought I had that one checked off on my list of things to watch for...until of course Jason himself showed up on the board (assuming it's really him, of course). I really did not see that coming. But what can I do now other than apologize and hope he wasn't offended? Sorry, Jason. :) No real defamation intended, just wanted to open a discussion on the validity of the opinions you propagate.
seriously, no big deal. I mix it up, so I expect that folks will do the same with me from time to time.
Thanks again, David.
He can be annoying at times, but I love the way he tells it like it is. Every episode, entrepreneurs call in and he helps us by dishing out no BS advice. Good advice. The kind you need to hear when you are too close to the problem to understand what to do. He doesn't patronize (well maybe just a little), but instead, he seems genuinely interested in helping other entrepreneurs.
Yes, his self promotion can get tiring, but hey, the guy is a marketing genius that we can all learn from. Check out the podcast.
After all, without people building the content/handling things, Weblogs, Inc. wouldn't have been such a hot property that sold to AOL for however much money it ended up selling for.
AOL was mainly buying into the trend of blogging / lower cost content production models when they bought Weblogs, Inc. They didn't buy a huge cash generating company.
Jason was bragging about having about a million a year in revenues (revenues, not profits) with some absurdly large number of writers (like 30). And part of that cash flow was selling text links that flow PageRank to online casinos...over 100 of them carpet-bombed his network of blogs.
And (ironically) back then he was also selling links to scraper sites
(maybe that is where the Mahalo idea came from)?
Actually, we didn't know we were selling text ads to people selling page rank back in 2005... just like Tim O'Reilly didn't. We both turned those ads off when we found out.
so, we actually did the RIGHT THING and you're still trying to paint me as some evil spammer.
to this day Joystiq, Hackaday, Engadget, TechCrunch50, This Week in Tech, This Week in Startups, Open Angel Forum, Blippy, Gowalla, GDGT and dozens of other brands/projects I've been involved with/invested in are doing great work.
You're making me into this black hat who is trying to cheat... it's not how I operate. I try to make and support great products and improve them consistently.
A lot of folks I know like the brands I'm involved with and I'm very proud of those brands.
Have I made mistakes over the last 15 years of building brands? of course... but i'm always honest and engage folks who point them out. in fact, i thank them.... like i've done here.
your attacking me for what you think are my weakest points just gets my team focused on fixing them--and for that I thank you!
Honestly, thank you! if there are things we can do better we want to do them. We're in it for the long-haul. We want to make a great brand and we don't want to do it by cutting corners.
I know you were hurt when I said "seo is bullshit" a million years ago, but back then that is what i thought. I didn't know anything about SEO back then and I really don't know much about it now. i was wrong when i said it was bullshit and I have said that it was an off the cuff, uninformed statement many times.
what is it going to take to make you happy?
Well I am still in the same company I was back when you called my industry scum (while you were selling links to shady casinos). As far as I am concerned your opportunistic jumps between companies are all part of the same general trend / strategy. You still have not changed as a person.
"i was wrong when i said it was bullshit and I have said that it was an off the cuff, uninformed statement many times."
Yes but the difference is you stand on a podium at conferences and yell to the press when you slag it off.
When you say it has value you do it on a twitter post or a quick blog post that does not even hit your homepage. You are not quite as loud in this case.
I would say to make it right you would probably need to (at a minimum)...
- Direct link out to the sites you are scraping content from. FIX THE ISSUE.
- Make an in-depth post on your blog highlighting how important SEO is to all web based businesses. Describe how you gamed social media sites and worked media and nepotistic angles to build links for Mahalo. Also describe the bait and switch public relations and attack bait angles you used to build links. Better yet...make this an in-depth case study!
- Make sure a copy of that post lands in the inboxes of your media email list and your Jason email list.
- Issue a press release announcing the above blog post.
If you do all of those then there is a pretty strong chance I will think your apology is sincere. Anything less and I will realize that this is once more another round of posturing.
I'll probably get voted down for this, but I think you're crossing a line here. I agree with some of your opinions, but these kind of rants against a person in particular and some kind of personal dislike you have for them really hurts your case. It's just petty bullshit and it reveals personal bias that really clouds the issue.
I made a simple, uninformed comment five years ago during a Q&A session when someone asked me about SEO. Back then when we were doing Engadget and Joystiq and Hackaday we didn't do ANY SEO and our opinion of SEO was it was a waste of time.
I've personally learned from great SEOs like you and Michael Grey how to do white hat stuff, and what the best practices are.
Our goal at Mahalo is to produce great content, and yes, rank for that content when we have a GREAT page in search engines.
We don't want to rank for lower quality pages or pages that are being built out. Your anger towards me is misplaced I think... i was joking saying SEO is bullshit and it's not my fault that everyone takes my one-liners and turns them into gospel.
I will make a concerted effort to rehabilitate the damage I've done with my flippant comments. Honestly, I will. Even willing to do a joint press release with you and have you on This Week in Startups to discuss...
... provided you keep giving me all this great free advice! :-p
all the best,
However, I would give myself a modest amount of credit for finding verticals a little earlier than average (silicon alley, blogs, human-powered search, etc), as well as marketing, branding and product design.
I never asked to be an expert... I just state my opinion and folks can respond to it. Sometimes i'm right, many times I'm wrong.
I don't claim to be smarter than anyone... I just go to work every day and try to do a little better each day.
I'm not your biggest fan, but can everybody piling on here read this and take it to heart? The best thing about the Internet is that you can say anything whenever you want and learn as you go. Just because Jason's more famous than you are doesn't mean he expects his words to be taken as canon, so if he apologizes for saying something and backtracks it's probably less slimy than it is genuine people-not-being-perfectly-consistent-always.
I notice people are piling on the downvotes right now, which is a shitty way to handle a debate. Even if you disagree with people, vote based on whether they're contributing to the conversation. When the guy who made Mahalo comes in, he's contributing something rare and unique and if you pound it into the dirt you're making us all look like immature jerks.
I am just calling a spade a spade.
I have never met the guy in person...I just think he should hold himself to a higher standard. If he is going to trash an entire industry, he should educate himself on best practices within it rather than claiming ignorance while building a deceptive business model.
While Jason goes out in public and says one thing, based on the examples in Aaron's blog post he does just the opposite in practice.
Sites like Mahalo that auto generate content and scrape content are increasingly becoming a problem online as they clutter the search results and hurt the publishers they scrape and steal from. Google is doing worse than turning a blind eye to this as they actually are encouraging these very large "content providers" to create content for filling ad space (see the Wired article on Demand Media).
With SEO being an industry that has a negative perception by many, those involved in the industry feel the need to defend their profession.
In the above metaphor, the artists are the bloggers whose content Mahalo is using. The radio station ripping off the artist is Mahalo. The Federal Communication Commission is like Google, who is allowing all this to continue because the radio station is giving them a cut from the advertising revenue.
Hope this helps make it a little more clear why what they are doing is wrong, needed to get exposed and needs to get fixed.
This is like Tiger Woods claiming he desired privacy for his family deeds (after he had gone outside his own family). if you want the public relations when you are hyping your own company and/or trashing other people's livelihoods then you accept the greater responsibility.
you can't just choose to take all the benefit and fall back on that "I am no expert" crap when you get caught in a lie.
"if he apologizes for saying something and backtracks"
That is the big problem. The apology is insincere.
He is not doing any real backtracking or shifting of strategy...just repeating a fake apology WITHOUT addressing the issues that were brought up.
MOST CRUCIALLY putting nofollow on the links to the sites he is "borrowing" content from.
How can you call Mahalo a search engine when the majority of the traffic you get comes from other search engines? How is Mahalo any different than About.com?
The original SEO book post and HN discussion are here:
EDITED: Added correct link (thanks icey!)
I think this is what you meant to link to: http://news.ycombinator.com/item?id=1073723
Mahalo is a great money printing machine, but come on, those are pages designed to do one thing - rip content quickly, monetize, and SPAM google.
We can't tell whether Jason is misleading us about the proportion of scrape-generated pages on Mahalo without access to any Mahalo page statistics.
I'm not for or against Jason on this matter, I'm just saying that we have no data on which to base any conclusions. It's possible he's telling the truth.
It would be interesting for someone to take up the challenge of creating a small web app that finds all Mahalo URLs, heuristically examines them for spamminess and generates some statistics.
Not that such an app would be of any particular long term use, but it might be interesting nonetheless.
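For what it's worth, the per-page part of such an app could be quite small. The following is only a rough sketch, not anything Mahalo or Google actually uses: it counts visible words and checks what share of a page's links carry rel="nofollow". The 500-word cutoff is borrowed from the "500+ original words" figure mentioned elsewhere in this thread; the 50% nofollow cutoff and the names `page_stats` and `looks_thin` are made up for illustration. It also cannot tell original words from copied ones, which is the hard part.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collects a visible word count and link counts for one HTML page."""

    def __init__(self):
        super().__init__()
        self.words = 0
        self.links = 0
        self.nofollow_links = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            attr_map = dict(attrs)
            if attr_map.get("href"):
                self.links += 1
                rel_tokens = (attr_map.get("rel") or "").lower().split()
                if "nofollow" in rel_tokens:
                    self.nofollow_links += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count whitespace-separated tokens outside script/style blocks.
        if not self._skip_depth:
            self.words += len(data.split())


def page_stats(html):
    """Parse one page's HTML and return its PageStats."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    return parser


def looks_thin(html, min_words=500, max_nofollow_share=0.5):
    """Hypothetical heuristic: short page, or mostly nofollowed links."""
    stats = page_stats(html)
    share = stats.nofollow_links / stats.links if stats.links else 0.0
    return stats.words < min_words or share > max_nofollow_share
```

Running `looks_thin` over a crawled sample of URLs and counting the results would give the kind of statistics asked for above, though collecting that sample is a separate problem this sketch does not touch.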
Internet users would benefit more from this than pulling the plug from google.cn.
... the truth is having short content pages indexed works against you--that's why i no indexed them to begin with.
Again, I'm not as big of an expert on SEO as Aaron, but I think there is something called "page rank sculpting" in which you push your site's page rank to high-quality pages and no-index the ones that are shorter in terms of original content. We did that because one of our people read about it on a blog.
We're probably holding ourselves back having removed this and we are putting it back on because we didn't realize it was off.
That is why I thanked Aaron.
Is the page rank sculpting thing not a good idea? I thought this was a fairly certain thing: only index the best pages.
Scraping 3rd party content (without permission) and then putting nofollow on the links is bogus.
But you keep avoiding that issue because you realize what you are doing, and your goal is to cash in on others content for as long as you possibly can.
Nice diversion attempt though!
Of the pages indexed in Google that are of this type of spam in nature, Mahalo accounts for less than .1% .. I think this is more Google's issue than Mahalo's.
Huh? I'm not sure which way I stand on this whole mess, but the fact that he makes up a tiny portion of the "spam" doesn't seem relevant? Is a petty thief less bad because he accounts for less than .00001% of theft in the world?
To me it doesn't seem to help his credibility.
1) "Spreading our PageRank" is BS. More pages == more total PageRank. Wikipedia's summary isn't half bad: http://en.wikipedia.org/wiki/PageRank
2) Remove the nofollow's from the attribution links to content you've scraped. Anything else is just plain rude.
I think this is a question we all, other [slightly petty] arguments aside, want to hear an answer to. I doubt we will though - care to prove us wrong Jason?
so, i think you have the issue confused.
Also, on #1 I think most SEOs would say you're wrong. More pages does NOT equal more page rank. It's the EXACT opposite.
MORE QUALITY PAGES and more QUALITY links = better SEO from what the top SEOs have told me. I hope this helps with your site.
1) "MORE QUALITY PAGES and more QUALITY links = better SEO from what the top SEOs have told me. "
For this purpose, indexed by Google counts as "quality." Please read the PageRank paper, and you'll quickly see that more pages in the index == more PageRank.
That doesn't always mean putting crap pages into Google is a good idea, but strictly speaking, it will raise the total amount of PageRank your site has. Seriously, read the paper yourself.
2) "but we don't scrape content. our search results do have abstracts but they are smaller than Google's, and in some cases they actually are google's."
Perhaps scrape is the wrong term. But you absolutely do include other people's content and nofollow it, for example all of the images on this page:
NoFollow is meant for links to content which you don't want to "vouch" for. If you've included their content in your page, I think you can vouch for it.
Maybe an honest oversight, maybe not. Either way, you know about it now, so fix it.
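On point 1), the claim is easy to check on a toy example. The sketch below assumes the un-normalized formula as it is usually quoted from the original paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), with d = 0.85. The link graph is entirely invented: one external page linking to a site's home page, and the home page linking back and to each of its subpages. The function names are hypothetical, and what Google runs today is certainly more involved than this.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a link graph.

    `links` maps each page to the list of pages it links to. This sketch
    assumes every page has at least one outlink and every target is a key.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for src, targets in links.items():
            share = d * pr[src] / len(targets)
            for target in targets:
                new[target] += share
        pr = new
    return pr


def build_web(subpages):
    """Toy web: one external page plus a site with a home page and subpages."""
    links = {"external": ["home"], "home": ["external"]}
    for i in range(subpages):
        name = f"sub{i}"
        links["home"].append(name)
        links[name] = ["home"]
    return links


def site_total(subpages):
    """Sum of PageRank over the site's own pages (everything but 'external')."""
    pr = pagerank(build_web(subpages))
    return sum(value for page, value in pr.items() if page != "external")


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "subpages -> site total", round(site_total(n), 3))
```

Under this formula the scores over the whole graph sum to the number of pages, so every page added to the index brings roughly one more unit with it, and the site's total rises as subpages are added. That is a statement about the total only. It says nothing about whether any individual page ranks better, which is the caveat already made above about crap pages.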
keep in mind all of your content is creative commons licensed...so if you keep up your scrape and nofollow attribution game, you are going to be in for some surprises where you feel the effects of what you are doing to others.
your choice...either fix the nofollow anti-attribution issue yourself or have the market fix it for you.
you can also tell how insincere he was in that his recommendation never hit the homepage of his site...he didn't want certain people to see it
I'm also willing to jump into this mud pit and get my ass kicked by you guys. :-)