Is Tumblr a bot fest? (svenduplic.com)
131 points by ofca on Oct 22, 2011 | 33 comments

From what I've heard (I know a couple of guys that work there), Tumblr is working on the spam and has made large amounts of progress.

I made a couple posts about this earlier this week. My site has nearly 15k Tumblr followers, so as a result, I see some of this stuff more acutely than most people.

http://shortformblog.tumblr.com/post/11645079360/tumblr-like... (on a couple of methods I've been seeing a lot)

http://shortformblog.tumblr.com/post/11654489531/tumblr-fake... (on how the empty profiles have a payload)

I think a big part of the problem is that there's an easy-to-exploit black-hat SEO technique that many "like" spammers have been using. (To put it simply: You can be guaranteed that the phrase "liked this" shows up on most Tumblr pages.) Since I implemented the technique I mentioned in the first post (which I admit isn't entirely desirable, because it also blocks some relevant content), my Tumblr spam has gone down significantly.

Also, note how, in that first post, I've gotten some spam notes in the past few days. It's because it contains the phrase "liked this" right in the post. I have the search analytics to prove it.

The author may not realise how big tumblr is with teens. It's huge, in my community far bigger than Twitter.

Most use it to repost, like and consume. It's the same with Twitter, which got so big in the UK because of celebrities. A lot of people use it to follow celebrities and maybe sometimes tweet them or friends.

A lot of people that fit the profiles described are lurkers. Content consumers, but not creators. That's a pretty common thing on most social networks.

Seeing the same as well. If I were a conspiracy-theory sort of guy, I might think of it as a strategy to build confidence in beginning writers so that they create more and more content, which eventually makes their blog substantial. Or it's just spam.

Tumblr isn't a bot fest necessarily, but the fact is that any site with a large amount of traffic will generally become a target for bots. It's almost a compliment to a website, but at the same time it can be a nuisance. The Tumblr bots I'm familiar with follow by category and tags, not necessarily post content. The reason the author's blog gets autofollowed is likely because he is in a lucrative niche.

My stats

- Member since 25/6/2009

- 7,500 posts

- follow ~ 400 people


- post mostly photography / design / fashion

- bots seen - < 5

[ same username there :) ]

Just for others' reference, the Tumblr mentioned is NSFW.

I have a small Tumblr and I notice a lot of action from one particular spam bot. Sometimes my posts are liked or reblogged 60+ times by different accounts whose sites share the same template and advertisements. I've flagged them as spam through Tumblr's built-in system, and it seems to have decreased.

(As I see it) Tumblr is/was kind of a clean 4chan: some memes and internet traditions, even some real-world ones, may get started there, or start in NY and get exposure there. Same with Twitter, same with anywhere you call good one year and bot-infested the next.

We've been testing a new startup with a hacked-together Tumblr, to simulate the app we're building. Because of this we've been monitoring our Tumblr traffic & interactions more closely than most casual bloggers.

We have 300 followers right now, and I'm going to guess about 15% of our Likes are SPAM. A problem, but not completely overrunning the system. They also seem to be using Tumblr tag pages to find articles to like. Articles tagged with common product keywords like handbags, shoes, or a brand name get much more SPAM Like activity than other posts.

Tumblr does seem to be hiding the entries in the Dashboard for some of these, indicating that they have some sort of system in place for isolating spammers.

I've not seen this on my tumblr...

I also have not seen this on my tumblr blog. I've only stumbled across one or two bot-like blogs.

I get them ALL the time. At least 1-2 likes from bots per post.

It depends on what tags I use; iPad is a popular one, obviously.

Also, sometimes I'll see a like of an ancient post with no rhyme or reason. I'll agree though that a good percentage of likes / reblogs are spam / bots. They also often have ads on the blog, so whoever is running it will get credit when you visit the blog to see who just liked / followed / reblogged you.

#triathlon mostly, as it's just my blog for talking about triathlon-related things.

One friend of mine runs a Tumblr on design and gets over 15,000 pageviews a month. He says that he gets 3 to 10 followers/likes a day that are spam. He blocks all of them to avoid linking his site to websites with plenty of ads or porn. It seems something is going on, and it needs to be sorted out immediately.

I think sites like quantcast, compete, google analytics etc tend to not include bot traffic b/c bots often don't evaluate javascript, and those tracking services often use toolbars/ajax requests that bots don't have.

So, pointing to pageview graphs that track ajax requests / toolbars and assuming that it's mostly bots is likely to be a false assumption b/c bots are likely not included in those pageview sums.

The "Like" and "Follow" buttons on Tumblr require JavaScript, probably to make CSRF harder. A bot that automatically likes posts would need to evaluate JS.

Liking a post on tumblr is just a POST request; it doesn't require Javascript. You can scrape the page (for the content id and authentication key) and submit the POST request to like the content without ever running Javascript.
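To illustrate the point about not needing JavaScript, here is a minimal sketch of pulling hidden form fields out of static HTML with Python's standard library and encoding them as a POST body. The markup, the `/like` action, and the field names `post_id` and `form_key` are hypothetical stand-ins, not Tumblr's actual page structure, and nothing here is sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and "name" in attr:
            self.fields[attr["name"]] = attr.get("value") or ""


# Hypothetical markup: the real page structure and field names are assumptions.
SAMPLE_PAGE = """
<form method="post" action="/like">
  <input type="hidden" name="post_id" value="12345">
  <input type="hidden" name="form_key" value="abc123">
  <input type="submit" value="Like">
</form>
"""


def build_form_body(html):
    """Return the URL-encoded body a plain form submission would carry."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return urlencode(parser.fields)


body = build_form_body(SAMPLE_PAGE)
print(body)
```

Everything the form needs is present in the served HTML, so a client that never runs a script can still assemble the same request body. That is why requiring JavaScript in the browser UI does not, by itself, keep automated clients out.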

I'd be more interested in knowing how Tumblr instantly knows just from my email address that I'm interested in following software developers. I didn't share that with them. I hardly use that gmail address for anything outside of google groups and github.

> I hardly use that gmail address for anything outside of google groups and github.

Tumblr probably has an "invite friends" feature where it offers to suck up your address book. Maybe a few software developers you corresponded with in the past used that feature.

edit: I didn't consider that they may have just randomly picked any category and getting "Developers" was random. It just surprised me, out of ~40 categories.

I've seen bots on Tumblr but far less than I've seen on the likes of Twitter. If you want to see a true botfest look no further than Tagged.

Would it be outlandish for investors to request a one-time site-wide captcha?

Captchas won't really stop them, it just adds expense and time. Many black hat tools integrate captcha breaking services. Cost is somewhere around $2 per 1000.
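A quick back-of-the-envelope sketch of what that rate implies, assuming the roughly $2 per 1000 figure above and one solve per bot account for a one-time captcha. The account counts are made-up illustrations, not data about Tumblr.

```python
COST_PER_1000_SOLVES_USD = 2.00  # rough figure quoted in the comment above


def captcha_cost(solves, cost_per_1000=COST_PER_1000_SOLVES_USD):
    """Cost in USD of paying a solving service for `solves` captchas."""
    return solves * cost_per_1000 / 1000


# Hypothetical bot-account counts; a one-time captcha means one solve each.
for accounts in (1_000, 10_000, 100_000):
    print(f"{accounts:>7} accounts -> ${captcha_cost(accounts):,.2f}")
```

At that price a one-time site-wide captcha costs a spammer a fraction of a cent per account, so it is a speed bump, not a barrier.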

Twitter is a botfest too: post anything with the phrase SEO in it and you get a bunch of followers with no icon, no tweets and no value. This is where Facebook has a big edge. The focus on real identity and the crackdown on API use have kept Facebook much cleaner than the others.

I don't think it's the focus on real identity for facebook, so much as the segregation of the network. Since most status updates aren't public by default, you can't use the equivalent of search.twitter.com to find all recent people posting about a keyword. This has good effects (no spam every time you mention 'iPad'), but also bad effects (no easy way to find people you don't know talking about topics you care about).

It is not just SEO. If you post anything with the word iPad or iPhone you get a bunch of spam replies (like six or seven), all promising you will win a free iPad. It has gotten so bad that I had to resort to writing it in l33t speak.

All you have to do is read the description of trending to see how people will try to game Twitter:

"Twitter's Trending Topics algorithm identifies topics that are immediately popular, rather than topics that have been popular for a while or on a daily basis, to help people discover the "most breaking" news stories from across the world."

If you want to up your followers, toss a profanity in every tweet. It's that simple.

Agree, FB really does have an edge on this but then again, what Twitter & Tumblr have above FB for me is the discovery of new content / new "friends".

Disagree; it still exists on Facebook very heavily. You probably just don't see it in your circle of friends. So yes, Facebook has an edge in the sense that they keep the hordes of dead accounts, spam bots, etc. away from you.

Heh, just tried mentioning SEO in a tweet, and even then I only attracted two followers. Twitter seems to have the bot situation under control.


But Tumblr has already made it
