Hacker News
Maddox on Link-Baiting Aggregation Sites (thebestpageintheuniverse.net)
271 points by asianexpress on May 23, 2012 | 100 comments

This is the state of the Internet under the DMCA--anyone can post or repost anything, even if they don't own it, and it is the burden of the original content owner to find each instance of unauthorized use and request it be taken down.

Part of Maddox's post addresses the frustration this can cause among people who create original content for a living. It seems to me that this frustration drives a lot of the worst impulses of groups like the RIAA and MPAA.

It also seems likely to me that there could be good technological solutions to the problem, that don't require lawsuits and crazy new laws. However, there is no incentive for people to develop these technological solutions. Instead the financial incentives (in this case, ad revenue) drive tech folks to build stupid sites like Ranker.com.

Acknowledged: I'm only addressing part of the story here...the friendly "you might like this!" emails are ridiculous. It's probably some 20 year old making 8 dollars an hour sending them...the 21st century equivalent of the telemarketer.

I definitely agree with you -- but I do think it's a double-edged sword. On one hand a lot of unauthorized content gets posted and people get frustrated because their original content is ripped/duplicated (with other people sometimes taking credit!), but on the other hand, the ease with which people can post stuff also leads to greater exposure.

Any extra steps to post content can mean the difference between something going viral and something remaining unseen in the dark corners of the Internet. I don't know if this is a good comparison, but this reminds me of the piracy study[1] that said piracy was beneficial for sales. These lists are a way of "pirating"/distributing content, though it seems without any real gain to the original creator due to lack of attribution (that's where the comparison definitely breaks down). I suppose the two would be more similar if you could somehow watermark the content so that it points back at the creator.

Obviously, this method is really annoying for a lot of us, but the fact that it works so well and generates all that traffic makes it a lot like all that Viagra spam we get -- people keep clicking! I don't know if there's any amount of technology that will help people gain common sense.

[1] http://torrentfreak.com/bittorrent-piracy-boosts-music-sales...

Honestly things are changing. When I make a rageface comic for /r/f7u12 (don't judge me) I really don't care if it gets reposted to 7gag, 4chan, whatever. I'd much prefer assumed copyleft than assumed copyright.

You indicate that the DMCA isn't powerful enough when in fact it also happens to be too powerful. It is a bit of a blunt instrument and is frequently used incorrectly for censorship and not just rights protection.

A good "technical" solution might be a search engine that harshly penalizes these types of articles and the domains that host them.

I just signed up for Ranker to figure out the email situation for Maddox.

- They send through mailgun.net.

- Mailgun.net doesn't include a List-Unsubscribe header, which means that spam reports from Google Apps/Gmail won't get back to the sender.

- If no unsubscribe link is included, then CAN-SPAM is not being followed either.

So if you wanted to unsubscribe from this, the only way to do it is mail the list owner directly (in this case, Nicole).


- Mailgun needs to implement list-unsubscribe if they want to get back spam reports from ISPs that don't support ARF/JMRP.

- Mailgun should ensure their customers adhere to CAN-SPAM, which requires a working "opt-out" link.
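For anyone wondering what the two fixes above look like in an actual message, here is a minimal sketch using Python's standard `email` package. The List-Unsubscribe header itself is defined in RFC 2369; the function name, addresses, and URL below are hypothetical and have nothing to do with how Mailgun or Ranker build their mail.

```python
from email.message import EmailMessage


def build_list_message(sender, recipient, subject, body,
                       unsubscribe_url, unsubscribe_mailto):
    """Build a bulk message with a List-Unsubscribe header and a visible opt-out link.

    All parameter names are illustrative; this is not any provider's API.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # RFC 2369 format: one or more angle-bracketed URIs, separated by commas.
    msg["List-Unsubscribe"] = f"<mailto:{unsubscribe_mailto}>, <{unsubscribe_url}>"
    # Also put the opt-out link where a human reader will see it.
    msg.set_content(f"{body}\n\nTo stop receiving these emails: {unsubscribe_url}\n")
    return msg


# Hypothetical usage with placeholder addresses.
example = build_list_message(
    sender="list@example.com",
    recipient="reader@example.org",
    subject="You might like this!",
    body="Here is a list you did not ask for.",
    unsubscribe_url="https://example.com/unsub?id=123",
    unsubscribe_mailto="unsub@example.com",
)
```

The header gives mail clients a machine-readable way to opt out, and the link in the body gives the recipient one.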

It was satisfying to have this pattern defined by someone. If there is content akin to peanut butter cups and dollar-menu meals, these aggregation sites are the best fit. And, like a peanut butter cup left at my keyboard, when I happen upon one, I will digest each, and the consequences are mentally similar: indigestion of the mind, malaise, unease. And the uneasy insight that I'll just do it again and again.

I thought Maddox had "retired". After reading his book, like five hundred years ago, I sort of forgot him. I'm glad he is still the same old in-your-face writer that he is.

He does make a good point. Cheap content is a big problem these days. Cheap content that can be easily copied by scrapers, that is. Maybe this will force content providers to change the medium?

I thought he gave up and became an annual april fools person.

This pretty much sums up what's going on with "Chance."


That was an April fools joke.

April fools post!

Not to defend ranker.com, but maybe maddox@xmission.com really isn't in their database.

It's possible that someone is automatically forwarding a different address to maddox's inbox.

It'd be worth examining the SMTP header of one of these messages and tracing back the ownership of each relay.
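A rough sketch of that header check, using Python's standard `email` package. The sample message, hostnames, and IP addresses are made up for illustration; a real trace would start from the raw source of one of the actual emails.

```python
from email import message_from_string


def relay_chain(raw_message):
    """Return the Received headers in the order the message travelled (oldest hop first)."""
    msg = message_from_string(raw_message)
    # Each relay prepends its own Received header, so the raw order is newest first.
    hops = msg.get_all("Received", [])
    # Collapse the folding whitespace so each hop is a single line.
    return [" ".join(hop.split()) for hop in reversed(hops)]


# Made-up message with two hops; example.* names and 192.0.2.x are placeholders.
SAMPLE_MESSAGE = (
    "Received: from mx.example.net (mx.example.net [192.0.2.10])\n"
    "\tby inbox.example.org with ESMTP; Wed, 23 May 2012 10:00:02 -0600\n"
    "Received: from app.example.com (app.example.com [192.0.2.1])\n"
    "\tby mx.example.net with ESMTP; Wed, 23 May 2012 10:00:01 -0600\n"
    "From: sender@example.com\n"
    "To: someone@example.org\n"
    "Subject: You might like this!\n"
    "\n"
    "Body text.\n"
)

for hop in relay_chain(SAMPLE_MESSAGE):
    print(hop)
```

Only the headers added by relays you trust are reliable; anything earlier in the chain can be forged by the sender.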

That's not actually his problem. They could easily add an unsubscribe link and make everyone's lives easier, but that would screw up the faux familiarity.

In fact, if this is a US company, I'm reasonably confident those messages are not CAN-SPAM compliant.

The text of the personal reply makes this possibility unlikely:

  I'd really hate to lose the "Best Page In The Universe",
  maybe we can find something a bit more suited to your liking.
Obviously it's not just a random email address that someone forwarded on to Maddox if the site name is specifically mentioned.

Biot, meet formmail. Formmail, meet biot. So it goes.

It may well be that, but that doesn't make it okay. They shouldn't be building an email database by scraping, which is almost certainly what happened.

In a vaguely related note: I recently had a situation with an organisation that kept bulk emailing a plug for its conference and would ignore requests to eliminate all addresses using [mydomain].com from their database (they were sending email to aliases as well as actual mailboxes).

> It's similar to the copout line included in email marketing when companies know their contact lists are spurious, but they want to err on the side of self-interest by emailing you anyway: "if you received this email by accident, please unsubscribe by using the unsubscribe button."

That's not a cop out line. For some websites, yes, it very well could be, but there is a legitimate purpose to it as well. Anybody can go on a website and enter "maddox@xmission.com" in the email input box for a subscription to SpamMePlease.com, even people who aren't Maddox. Putting that line in the email is a way of saying "you might not have done this, so here's a quick link you can use to tell us to fuck off."

You can argue that they should use a "confirm this subscription" email instead, but the two options aren't that dissimilar when you think about it.

Which is why double opt-in is the only way to run a list if you truly care about marketing with permission.
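A minimal sketch of what double opt-in means in practice, assuming an in-memory list and an HMAC-signed confirmation token. The class, the secret, and the method names are all hypothetical; a real list would persist state and expire tokens.

```python
import hashlib
import hmac

# Hypothetical secret; a real deployment would load this from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def confirmation_token(email):
    """Derive the token that goes into the single 'click here to confirm' email."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()


class MailingList:
    def __init__(self):
        self.pending = set()    # addresses typed into the form, not yet confirmed
        self.confirmed = set()  # the only addresses that ever receive list mail

    def request_subscription(self, email):
        """Record the request and return the token for the confirmation email."""
        email = email.strip().lower()
        self.pending.add(email)
        return confirmation_token(email)

    def confirm(self, email, token):
        """Move an address to the confirmed set only if the token matches."""
        email = email.strip().lower()
        expected = confirmation_token(email)
        # Constant-time comparison of the supplied and expected tokens.
        matches = hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))
        if email in self.pending and matches:
            self.pending.discard(email)
            self.confirmed.add(email)
            return True
        return False
```

Anyone can still type someone else's address into the form, but that address only ever gets the one confirmation message unless its owner clicks the link.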

Respectfully, I think you are entirely wrong: the two options are VERY dissimilar. In one case, you send them a single "Click here to confirm" email -- and maybe a single follow-up a few days later.

In the other case, you're continually marketing to them and putting the burden on the recipient to opt-out, usually in tiny text at the bottom of the email.

and on top of that, the last thing you want to do for many actual spam lists is to click their per-address unique tracking tag which tells them that your address is valid.

If you've already started communicating with them I guess it's probably reasonable to consider it 'blown' anyway, but I'm always extremely dubious of opt-out/unsub links for that reason.

Sure they are. I have a common name for a gmail address (early adopter). I get emails for every Steve, Shirley, Shane, Stephanie, Sam, etc in the world who have my last name and a gmail account. I also get emails from every mailing list these people have signed up for that do not use confirmation emails. The email address has become unusable because of that alone.

>> I have a common name for a gmail address (early adopter). I get emails for every Steve, Shirley, Shane, Stephanie, Sam, etc in the world who have my last name

I do, too.

Have fun with it?

I get emails for every "J" name with my same last name. At times I respond and get into a whole lot of mischief.

Yes, it can be tempting, especially the chain mail senders who ignore my "you have the wrong email address" pleas.

I have a fairly unique first name and managed to get it @gmail.com and the amount of legitimate, misaddressed email I get is staggering. 5-10 per week.

The unsubscribe link is how spammers know that the autogenerated email address they are sending to is read by a human being. It makes the address more valuable when sold to another spammer. That one spammer may stop sending you messages from the same brand, to avoid landing in blacklists, but his other brands and his affiliates will be sending you more.

The US is a bit different from the rest of the world since US law (the Yes-You-Can-Spam act) requires that US-based spammers include and respect an unsubscribe link. Neither requirement is actively enforced, so whether one can expect an unsubscribe link to work remains an open question.

Oh god, I love seeing Maddox beat this drum.

Calling out HuffPo was awesome too.

I'd rather see him call out HuffPo for promoting dangerously idiotic quack bullshit.


I think your link is broken.

Cracked.com has mastered the shitty 'listicle'[1].



about the spam: I miss the days of god-tier trolls SPEWS.

Cracked at least makes a modicum of effort -- they have a stable of columnists who are more focused on being funny or informative than being eye-grabbing (Seanbaby probably being the most well known).

9gag and FunnyJunk are a whole different level of content farming.

> Cracked at least makes a modicum of effort

In particular, Cracked at least makes a pretty decent attempt to make their (mostly original!) editorial content the focus of their articles. The "listicles" that Maddox is really focusing his ire at here generally have little to no original content at all -- the bulk of their posts are either "borrowed" images or scraped/stolen content.

This. Cracked was my first thought when reading the post, but Cracked has a load of writers and is one of the sites that gets ripped off often.

Honestly, I consistently find Cracked 'listicles' more insightful than I expect. It's pretty impressive, for what they are.

Yes, I was wrong about cracked and I was unfairly harsh.

Yeah, cracked.com actually treat their lists as editorial content -- "Articles are workshopped, researched, written and edited."[1]

Needless to say, they don't take kindly to people aggregating their content on other sites.


I want Maddox to riff on those insipid ads that go like: "Single mom in Podunk, OH discovers 1 silly trick to make wrinkles disappear, and doctors HATE her!" So much potential for comedy there.

You should send that to him as a suggestion. He loves it when his readers suggest things they'd like him to write about.

Don't make Maddox do all the work. Post a list of 7 funny comments on your own site and I bet Maddox would want to link to it.

Or he'd copy and paste your mail and rip you a new one on his site.

oddly, hardly anyone realizes that Ranker.com is actually one of the most successful semantic web plays...

I've never even heard of ranker.com before. I must not be searching for terms they rank for.

Say what?

They use data from Freebase (the little brother of the Google Knowledge Graph) to help users populate the lists.

They get 8+ million page views on a good week.

Fortunately Google's Panda has dramatically slowed down the growth for sites like Ranker. Google applies an extremely heavy penalty for using content from sites like Freebase and Wikipedia. If those penalties didn't exist, there's little question Ranker would be a top 1k site.

Other lame content theft sites like Mahalo have been practically put out of their misery.

Ranker ~is~ a top 1k site.

Personally I've looked over the data for a number of successes and failures and I don't believe the conventional story about Panda.

Adding the data in this article to everything else I know about Ranker just confirms my model of what works and what doesn't.

No, they are not, at least not according to major reporters of such stats. What measure are you pointing to in order to indicate they are a top 1k site?

they ranked at #992 in Quantcast last week; they've got the Quantcast tracking code too, so the traffic numbers aren't crazy

I actually don't think they do; they're directly measured by quantcast. If you go to


and select page views, week, and global they still peak at 4mm. Which is nothing to laugh at.

There is a gmail plugin for sending canned responses. For repeat spam offenders I create a filter that automatically deletes the email and responds with a canned "Stop spamming me. Your original email was deleted and never read." I never have to deal with the spam and they have to deal with mine.

That works great if the spammer hasn't forged the return address. Many spam emails contain forged headers and a working link to the scam website in the body.

A few years ago some #$^%er sent a few bazillion porn dvd spam emails with one of my domains in the header. An amazing number of admins/folks like yourself actually reply to a forged address.

That's assuming your replies end up in a mailbox they actually monitor, though.

Typically these repeat spammers are individuals who manage some community email list (think apartment association or university student org) and don't know how to do anything but add people and copy/paste into the to field. For these people it works great.

> Tools to automatically check whether or not submitted content was original would be trivial to make. Yet they don't exist because it's not in these sites' best interest to stop accepting stolen bullshit.

How is a tool supposed to know who has permission to do what? Yes, you can use heuristics to make guesses. But you're making guesses, you don't actually know anything.

He's talking about originality, not permission. An automated tool could fairly reliably determine how much content was already posted elsewhere.
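As a rough illustration of the kind of check being described, here is a sketch that compares a submission against already-known text using overlapping word sequences (shingles). The function names and the 5-word window are assumptions for the example, and nothing here reflects how any site mentioned in this thread works. It measures overlap only; it says nothing about permission.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word sequences in text (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than the window counts as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap_ratio(submission, known_text, k=5):
    """Fraction of the submission's shingles that also appear in known_text."""
    sub = shingles(submission, k)
    if not sub:
        return 0.0
    return len(sub & shingles(known_text, k)) / len(sub)


# Hypothetical usage: a verbatim copy scores 1.0, unrelated text scores 0.0.
known = "the quick brown fox jumps over the lazy dog near the river bank today"
copied_score = overlap_ratio(known, known)
fresh_score = overlap_ratio("an entirely different sentence written from scratch by hand", known)
```

A site would still need a corpus of previously published text to compare against, which is the hard part this sketch leaves out.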

Permission is what's required. I get that he hates that, but that doesn't make him right.

He's using the number of other copies as a heuristic. But that heuristic fails when people want to spread their content, which is far less uncommon than he believes.

Permission is what's required to keep from getting sued, except that the DMCA keeps you from getting sued when it's user-submitted content. Permission isn't necessary at all. Maddox is promoting originality because it's a component of having a quality site. I'm sure he's aware that most of the people running those sites don't care about quality and is making an issue out of it because he wants other people to dislike the sites.

I've linked to terrible list posts just for easy references. I'm sorry for making the web a worse place :-\ I'll be much more diligent from this point on.

I know this is more like a sub point of the article, but I'm amazed at how poorly people deal with spam.

That's because it's illegal in most municipalities to douse spammers in gasoline and light them on fire.

Well if you think about it, the more popular you are, the more spam you have to deal with, especially if they get a hold of your main email address. I don't know what it's like to have as large an audience as Maddox has (or had?), but even as the CTO of a small startup, I have to deal with a decent amount of spam that makes it to my inbox despite Gmail filters. Sometimes mixed in among those emails is an important one, unfortunately.

Though yeah, in this case I probably would have just filtered this particular list haha

Being an angry asshole is kind of Maddox's schtick

Blech, all-caps large bold Arial just screams “I DON’T KNOW WHAT HELVETICA IS.”

Love that a Maddox page hit HN front page, though. It’s like a blast from the past, but new.

Edit to make this comment slightly more acceptable on HN: CSS protip: If you want to show Helvetica to OS X users but not Windows users who are lucky enough to have it installed because Windows renders Helvetica like shit, use this font stack:

  font-family: "Helvetica Neue", Arial,
    Helvetica, sans-serif;

Yes, yes, 100 times yes. I'm so sick of the shallow, mindless BS these sites repost constantly. Great post, great details and investigation. Thanks

I thought this was going to be about Mashable. :(

I read Maddox as Madoff, and thought this was written by Madoff about how SEO is like a pyramid scheme :(

Damn, that was a very long and hard to read Maddox article. He's losing his charm. Or growing up, or something like that.

Even more likely, you're growing up.

Maddox was funny when I was 12. I'm 22 now.

This article is not intended to be funny.

Your reply is funny though. Very.

The older folks are going to get a good laugh at you.

"I'm 22 now."


Being younger is hilarious!

I know, right? when I was 3 I didn't understand the hilarity of it all. But now I'm almost 7 and I know it all.

That article was kind of serious. He still writes articles in his old school style http://thebestpageintheuniverse.net/c.cgi?u=onions

He was here 10 years ago, he was the king then, he's still around in 2012 and he still rocks. Whichever way you want to look at it, 99% of the people dissing the quality of his work and noting how he's lost the magic don't have the talent to judge him anyway.

Yelling and swearing and saying "shitbitch" just isn't funny now that I'm not 13 any more.

Nobody has mentioned this yet, but if HN doesn't, who will? So, here goes: the graph makes no sense! Overlaying the reading time and attention span graphs only makes sense if they use the same scales, but then why would attention span decrease as the number of list items increases? Shouldn't it be a simple horizontal line?

Maybe it means "remaining attention span"?

For some reason whenever I read Maddox's posts I picture him as Spider Jerusalem from Transmetropolitan

finding it ironic that i found this article on HN, a content aggregation site...;)

HN is a different beast. We link to others' content instead of rehosting and claiming credit (and ad revenue) for it, and much of the value of HN comes from the original content we post in the comment threads. This site doesn't even vaguely resemble spammy content rehosting sites.

I was being a bit cheeky. I agree that HN is different from some of the blatantly awful aggregation sites. HN uses only headlines, gives attribution, etc. I don't buy the community argument though - lots of sites take linked headlines, snippets and degraded images and package it up with a comment system. Aggregation - community driven, algorithmic, etc. - is still aggregation.

this is basically 8 out of every 10 Huffington Post articles

Apparently it's perfectly okay for him to post lists though?


Did you even read the entire article? The link you posted uses fair-use images and 100% original writing. It has nothing to do with what he is complaining about.

He certainly whines very impressively and profanely. The comedy is in his article having contributed about as much value as any given content aggregator; it's little more than a big bitch fest.

That seems like needless hyperbole, if not a poor attempt at snark. There's plenty of content in there.

I've happily wasted many hours on his site. And his book is awesome, too.

I agree linkbait lists are garbage, but the writing is awful and full of profanities, and the author goes on and on about the linkbait content he supposedly despises (complete with pictures).

I was going to say it all comes off as gradeschool, but little kids have it much more together.

This is what Maddox does (and has been doing since 1996).

Wow, I remember reading similar "insight" about top 10 lists back when Digg was the thing. Somehow I'm not surprised this website looks like it was made in 1993.

That's because it was made in the 90's. Maddox has been writing for ages [1], and he keeps the site "shitty" as a sign of protest [2].

[1] http://www.thebestpageintheuniverse.net/c.cgi?u=archive

[2] http://www.thebestpageintheuniverse.net/c.cgi?u=faq

Ironic that this article is in itself linked..

Not really; it's linked here, not copied wholesale here.

Personally, you can't blame people for doing their jobs.

You can however blame them for repeat spamming you. Blaming the list is just dumb.

> "Personally, you can't blame people for doing their jobs."

Woah. Why not? Why can I not blame someone for doing a job that's a net negative to society? Why can I not blame someone for doing a job that's purely parasitic and creates absolutely nothing of value?

By that logic, I can't blame dealers for pushing crack to kids - because they're just doing their jobs. Or, more legally, I can't blame used car salesmen for being lying, swindling cheats, because they're "just doing their jobs".

because life is a rigged game and some people came into this world with a poor hand. if you blame ppl for doing their jobs, you have to blame God for creating an unfair world.

You can blame people for taking and performing a job that requires unethical behavior, and if they don't perform it in accordance with the laws about unsolicited commercial email, it's also their fault.

I don't think the Nuremberg defense is sufficient to excuse spammers.

Sometimes it's amusing watching people Godwin their own arguments and not even realize it. Other times, it's just sad.

I'm reminded of Randall, Dante, and the roofer discussing the "independent contractors" who were killed on the second Death Star. I have to agree with the roofer -- like those contractors, spammers knew what they were getting into.
