It is extraordinarily odd to me that digg would create fake accounts to promote sites. This is a very inefficient (and public) way to tweak an algorithm that they have full control over.
The only way this makes sense is if it was done by someone without access to the algorithm. Either an external voting ring, or someone in the company who isn't supposed to be doing this.
The explanation could be very simple. Obviously Digg's algorithm is extremely important to the company, and Digg had put together a long QA and approval process to ensure its integrity. A change to the algorithm may require 5 people to sign off, lots of analytics, gradual deployment, etc.
In the meantime, they ran into some problems with the ranking and, knowing their own algorithms, chose to game it (or tweak it) externally rather than internally because they couldn't afford to go through the official process. The intention would be that whatever they did would eventually be incorporated back into the main algorithm.
Unfortunately, some low-ranking QAs were assigned to execute the hack, and they were oblivious to the negative perception such a maneuver would create.
And another reason for doing it in the first place might have been to artificially inflate digg counts across the board to mask the lack of site activity after the exodus of so many users.
This is plausible, but here's the problem I have with that.
If you look at the data, these accounts are not digging only preferred publishers.
They are Digging mostly preferred publishers, but also occasionally random blog-spam (e.g. buzzll.com).
If Digg was behind this, why digg random blog-spam as well?
No, if Digg was behind this, I'd expect they would make the accounts realistic by simply taking what an average user Diggs and replicating it a few times, perhaps with only preferred publisher diggs.
Any first-year CS student could come up with a script to create plausible user names, at least better than those in question.
The Digg team is not the best, but I have slightly more faith in them than this.
What I think we have here is someone who figured out that in order to slip in diggs for blog-spam, they had to digg some "top stories".
Let's think about this for a moment.
Let's say you're writing a Digg spam-bot.
You know in order to make your account look good you need to Digg some top-stories.
The easiest way to do that is pick some stories you know will be top stories...stuff that's always on the front page...e.g. Digg preferred publishers.
Let's say you didn't do this.
In this instance, you'd have a bunch of accounts digging only the same small set of blog-spam.
They would obviously stick out as a set, all digging only the same small group of stories.
But, wait...
Now they are not digging only random small stories...they are Digging top stories...so now they are part of a huge set of "normal" users, and their digging patterns look more like a normal user.
I think what we have here is simply a spammer who found a hole in Digg's automated account detection.
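To make the "part of a huge set of normal users" point concrete, here is a minimal sketch using Jaccard similarity between sets of dugg stories. The story IDs, the account sets, and the choice of Jaccard as the metric are all my own assumptions for illustration; nothing here reflects how Digg's detection actually works.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical story IDs, made up for this example.
front_page = {"top1", "top2", "top3", "top4"}
spam = {"spam1", "spam2"}

normal_user = front_page | {"misc1"}   # a typical user diggs mostly front-page stories
bare_bot = set(spam)                   # a bot that diggs only the blog-spam
padded_bot = spam | front_page         # the same bot after digging "top stories" too

# The bare bot shares nothing with a normal user, so it is easy to isolate.
bare_overlap = jaccard(bare_bot, normal_user)

# The padded bot overlaps heavily with normal users and blends in.
padded_overlap = jaccard(padded_bot, normal_user)

print(f"bare bot vs normal user:   {bare_overlap:.2f}")
print(f"padded bot vs normal user: {padded_overlap:.2f}")
```

Any detector that looks for accounts with little overlap with the general population would catch the bare bot and miss the padded one, which is the hole being described.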
It could also have been an unofficial handshake deal where Digg agrees to overlook the 3rd party gaming the system.
They gain plausible deniability, can maintain the integrity of their brand and the trust of users. Or at least that's how they would like it to happen should someone catch on.
My thoughts exactly. It doesn't make sense to me that Digg would use such a public and labour-intensive method of promoting their partners' content. I'd like to see data on what percentage of their partners were affected by this. It could be that someone created the accounts then offered Digg's partners increased visibility in exchange for cash, suspecting that Digg wouldn't look too closely at gaming that favoured its customers. Even then, LtGenPanda does state in his update that the activity stopped entirely after his email to Digg, but before he made the information public. Further, they didn't simply ban the accounts, as banned accounts can still digg submissions and those accounts ceased all digging shortly after his email was sent.
It could be that Digg disabled the accounts, but if so, why not delete them? Unless they have a shadowban feature that they're not telling anyone about, the timing and manner by which the accounts ceased their operations is highly suspicious.
>It is extraordinarily odd to me that digg would create fake accounts to promote sites
Agreed.
The fact that it was primarily promoted publishing partners doesn't implicate Digg at all: Many of these spam accounts first seek to establish an algorithmic legacy by submitting and/or digging and promoting sites that they think will do well (see virtually every "Today I learned that Costco sells caskets/survival food kits" post), and that would be my suspicion here.
Eventually once that legacy is established they move over to the actual promotion pimping.
It's possible to get the registration times for these accounts from the Digg API. I did that and they were registered in large blocks starting 10/16/2010 10:03:05 and ending 10/17/2010 13:37:59.
The blocks (ordered by the registration time) have registrations averaging a few minutes apart with hours between blocks. IMHO these could easily have been generated manually.
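If anyone wants to repeat the block analysis, here is a rough sketch of how registration times could be grouped. Only the first and last timestamps below come from what I reported; the ones in between are invented for the example, and the 30-minute gap threshold is an arbitrary assumption. Fetching the times from the Digg API is not shown.

```python
from datetime import datetime, timedelta

def split_into_blocks(timestamps, max_gap=timedelta(minutes=30)):
    """Group timestamps into blocks; a gap larger than max_gap starts a new block."""
    blocks = []
    for ts in sorted(timestamps):
        if blocks and ts - blocks[-1][-1] <= max_gap:
            blocks[-1].append(ts)
        else:
            blocks.append([ts])
    return blocks

def mean_gap_seconds(block):
    """Average spacing between consecutive registrations in one block."""
    if len(block) < 2:
        return None
    gaps = [(b - a).total_seconds() for a, b in zip(block, block[1:])]
    return sum(gaps) / len(gaps)

# Illustrative data: only the first and last values are from the thread.
sample = [
    datetime(2010, 10, 16, 10, 3, 5),
    datetime(2010, 10, 16, 10, 6, 10),
    datetime(2010, 10, 16, 10, 9, 40),
    datetime(2010, 10, 16, 15, 20, 0),
    datetime(2010, 10, 16, 15, 24, 30),
    datetime(2010, 10, 17, 13, 33, 0),
    datetime(2010, 10, 17, 13, 37, 59),
]

blocks = split_into_blocks(sample)
for block in blocks:
    print(block[0], len(block), mean_gap_seconds(block))
```

Gaps of a few minutes inside a block, with hours between blocks, are what you would expect from a person signing up accounts by hand in sittings rather than from a script.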
That's what happens when you depend solely on a user community for everything. You can't control it and the sooner digg realizes that and deals with it, the better off they will be.
They've created a monster and are trying to control it by cheating.
You would think that they could have come up with more creative names than ddX and such... If you are going to create shills, at least come up with some better names and vary the account creation dates. No wonder they are having such problems as a business.
That's interesting, but it's nowhere near conclusive.
Digg just had a major update and is still in a state of transition; we have no idea what sort of banning, shadow-banning, temp-banning etc. procedures are in place.
(unless you are a Digg employee, in which case please do share)
I think the conclusion that Digg is behind this is tenuous. The evidence presented is:
1. When told about these accounts the activity on these accounts stopped.
2. The OP claims: "On a technical side: Digg can only ban accounts but cannot stop accounts from digging. So, if this was from some exterior group, digg would have only banned them as they cannot stop them from digging."
1 can easily be explained by recognizing that Digg may have decided to temporarily put a hold on the accounts once informed of the odd voting patterns.
2 is bogus. It's their site and they can do whatever they want. It should be easy to keep accounts live but stop their diggs from counting, for example.
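As a rough illustration of "keep accounts live but stop their diggs from counting", a site could simply filter votes at tally time. This is a hypothetical sketch: the function, the account names, and the idea of a muted set are made up here, and I have no knowledge of how Digg actually tallies diggs.

```python
def counted_diggs(voters, muted_accounts):
    """Return how many diggs count toward a story's score.

    voters: iterable of account names that dugg the story.
    muted_accounts: accounts whose diggs are silently ignored.
    """
    unique_voters = set(voters)  # one digg per account
    return len(unique_voters - set(muted_accounts))

# Made-up account names for the example.
story_voters = ["alice", "bob", "dd1", "dd2", "dd3"]
muted = {"dd1", "dd2", "dd3"}

print(counted_diggs(story_voters, muted))   # muted accounts are ignored
print(counted_diggs(story_voters, set()))   # nothing muted, every digg counts
```

The muted accounts can still log in and digg; their votes just never reach the score, which is the behaviour the OP claims is impossible.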
It strikes me (see my other comment in this thread) that the accounts could have been created by people outside Digg manually (they are only roughly 100 accounts and they weren't created quickly) and then used for this purpose.
While it's true that digg can do anything they want on their site, pragmatically speaking their development process may not support creating and deploying the change you're thinking of in a matter of minutes or hours.
An alternate explanation is that the bogus accounts are external, and digg blacklisted the IPs associated with those accounts upon learning of this. Most larger websites will have a fast way to blacklist IPs on their load balancers or the like simply to avoid buggy robots, spammers and trivial DoS attacks.
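For what it's worth, the kind of fast IP blacklist I mean is conceptually tiny. Here is a minimal sketch using Python's standard `ipaddress` module; the class name is hypothetical, the addresses are from the reserved documentation ranges, and a real deployment would live in the load balancer config rather than in application code.

```python
import ipaddress

class IpBlocklist:
    """Hypothetical in-memory blocklist of single IPs and CIDR ranges."""

    def __init__(self, cidrs):
        # A bare address such as "198.51.100.7" is treated as a /32 network.
        self.networks = [ipaddress.ip_network(c) for c in cidrs]

    def is_blocked(self, addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in self.networks)

# Documentation-range addresses, used purely as placeholders.
blocklist = IpBlocklist(["203.0.113.0/24", "198.51.100.7"])

for addr in ["203.0.113.45", "198.51.100.7", "198.51.100.8"]:
    print(addr, "blocked" if blocklist.is_blocked(addr) else "allowed")
```

A change like this takes effect immediately and needs no change to the ranking algorithm, which would fit the accounts going quiet within hours.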
Agreed. The OP represents himself as one with intimate knowledge of Digg's internal policies. It's their site, and it's naive to think that Digg can "only ban accounts".
Author needs a detailed rundown of Digg's publishing partner program and how it works for this piece to be complete.
Otherwise, this is a great scoop. As someone who runs a bootstrapped, ramen profitable niche social news site, I have to chuckle at the irony. The Reddit founders have admitted publicly on many occasions that they used sock puppets to bootstrap a community on their site. Meanwhile, Digg is having to fire up sock puppets to survive after raising $40 MM. Ouch.
Also funny, the article says Kevin Rose's posts were only getting 30 diggs these days. Were we getting a glimpse behind the curtain of the true size of Digg's active community there? To put it in perspective, Windy Citizen does about 70-100k visits/month. Our hottest stories each week will crack 20-25 votes.
When you own a site and you change it you are not gaming it, you are tweaking it. You don't actually have to game it, you own the thing. If digg wants to use their own batch of sockpuppets to influence the front page to their partners' benefit they are free to do so, the responsibility, the upsides and the downsides are all theirs to take home.
This is like saying google is 'gaming' the search results by having their people influence the rankings / search result pages by changing things in a way that benefits google.
What would be more surprising is if digg would not try to modify the homepage to benefit themselves and their partners (assuming all these parties are in fact digg partners, for which I see no proof). They're a commercial entity, I do not have high expectations that digg would do what they could to make a fair rendering of what their audience thought was best.
Digg has been gamed beyond the point of recovery by voting rings and 'power diggers'; seeing the people behind digg join in the fray is absolutely no surprise to me.
Other sites call it editorial control, on HN stuff gets banned, or flagged. No algorithm has ever been able to survive in the wild without some form of human supervision anyway, in that sense digg is no different.
Whatever the motivation for the changes, I hope they'll succeed in getting the runaway train under control and in getting rid of the feeling that digg is being run by a couple of individuals gaming the system. But if their main 'tweak' is to join in the game in a way that makes them look like a voting ring, then I doubt that will be the case, and digg will lose visitors faster than they have up to now.
An alternative (and simpler) explanation for all this data that the author uncovered is that this is just a real (external) voting ring on Digg, possibly one that is for rent, which happens to benefit those parties that are already using digg for promotional purposes. Maybe digg turned a blind eye to this happening.
Bottom line is it is their site, and they can run it any way they want to including the boosting of stories they think ought to be boosted. That's not going to work as a long term strategy, but their 'publishing partners' may find it worth enough money that digg comes out ahead. It would appear to be a 'slash-and-burn' strategy though.
The individual profiles I looked at all have large numbers of diggs and no comments or submissions, so it is a voting ring for sure. Now it needs to be figured out who runs it, and if it is the digg people, they should explain why they did it this way rather than in a more direct one.
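The pattern I checked for by hand (lots of diggs, zero comments, zero submissions) is easy to express as a filter. This is only a sketch: the `Profile` fields, the sample numbers, and the 100-digg threshold are assumptions of mine, not real data from the accounts in question.

```python
from dataclasses import dataclass

@dataclass
class Profile:
    name: str
    diggs: int
    comments: int
    submissions: int

def looks_vote_only(profile, min_diggs=100):
    """Flag accounts that digg heavily but never comment or submit."""
    return (
        profile.diggs >= min_diggs
        and profile.comments == 0
        and profile.submissions == 0
    )

# Invented profiles for illustration.
profiles = [
    Profile("dd1", diggs=850, comments=0, submissions=0),
    Profile("regular_reader", diggs=300, comments=12, submissions=2),
    Profile("new_lurker", diggs=3, comments=0, submissions=0),
]

flagged = [p.name for p in profiles if looks_vote_only(p)]
print(flagged)
```

A filter like this only finds candidates; it says nothing about who is behind the accounts.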
I agree that it is their site and they can do what they want, but I think the main issue people have is that they pretend that they are open and for their users. It is one thing to do what you do and not be ashamed of it, but to try and pull the wool over your users' eyes and make them out to be stupid is just wrong.
I don't use Digg, I haven't for years, because it was too heavily gamed and not enjoyable to use anymore.
Your comment about Google though, they have to keep the trust of the users and the web as a whole. If they were to game their system for their own gain (that is artificially influence results to improve their own standings) and people found out or could prove it, Google would be in trouble. I am not saying they don't or haven't, just that with Google the trust is important because people have more at stake than on Digg; a lot of them have their livelihoods tied to it.
> Your comment about Google though, they have to keep the trust of the users and the web as a whole. If they were to game their system for their own gain (that is artificially influence results to improve their own standings) and people found out or could prove it, Google would be in trouble.
Don't tell me you really believe that all those sites that happen to be google affiliated really have that much standing in their algorithmic view of the web. Google has been artificially inflating the ranking of their own sites for a long, long time.
For instance, not that I'm in the habit of searching for celebrities, but type in paris hilton in the google search bar.
The first result is a link to google images, followed by half a page of sample images from images.google.com, followed by wikipedia (ok, that's a good one) and then a bunch of youtube stuff.
I'm sure they're all 'relevant results' but I highly doubt those image links are really the most relevant results on the subject.
The wikipedia link should have been #1, the images and youtube results probably don't even belong on the first page (unless you search for 'images').
This sort of thing will happen with a great many subjects, but with a high frequency query like 'paris hilton' the effect must be enormous.
I'm not a Google fan, but I don't believe that they special-case "affiliated" sites in search results. That would be an idiotic move on their part. Why risk undermining their cash cow (search) to boost marginal sources of revenue?
Honestly, your claim sounds like most conspiracy theories. It completely falls apart when you assume the participants are even mildly logical and rational.
Here's what's going on there: Google figures, from your query, that your intent is probably shopping. And those are the top 4 results from the shopping corpus. If you click on the "Shopping results for..." link, you can see the full shopping corpus.
I don't really know how the product corpus works, but AFAIK Google is not being paid anything for those results. They're compiled from a bunch of commercial sites across the web. Google gets paid when you click on the ads, which are clearly marked at the top and right hand column.
Here's your chance to make it better: what are the queries that turn up too many Made-For-Adsense pages, and what results should they be returning instead?
Well... in my field about half of the top sites which are returned for the query "Learn Chinese" (or something similar) are thinly-disguised AdSense spam or content-free marketing landing pages of little use to anyone actually hoping to learn Chinese. They exist almost exclusively to serve AdSense. Sample URLs: elanguageschool.net, clearchinese.com, instantspeakchinese.com, minmm.com, mychineselearning.com. Or try pretty much any site on page #2.
My own Chinese learning site is Popup Chinese (http://popupchinese.com). We have fantastic testimonials, tens of thousands of users, and links from tons of places, but we're still invisible in Google after more than two years. The Skritter (http://skritter.com) guys (who post here occasionally) should also be listed far above where they are - they've got a great product and it's insane that they're drowned out by this sort of flotsam. Ditto for NCIKU.com.
Everyone I know in our industry views Google with complete cynicism. Quality sites that won't play black hat SEO are at a major disadvantage to much spammier marketing sites and link-farms. All that being said, it is nice to see someone at Google is at least asking the question seriously. And given what you are up against... good luck.
Being a master at your own system doesn't mean they are artificially influencing it. It is hard to avoid, and I don't doubt that they know more and use it to their advantage, but I don't think they are doing it in a way that impacts their competitors like you think.
Search for 'search engine' or 'engines' and you will find that Google isn't first; in fact it is 4th, and the link isn't to the main page but to their custom search engine. There are other cases where this exists, but I guess it isn't the best example, just a simple one.
So an external voting ring was created to submit and vote up articles that were created by digg's advertisers? Sorry, but I don't buy it.
If digg was behind this, then their actions are unethical. It essentially amounts to treating advertisements as content. It's saying "hey, the community likes this" when the truth is "someone paid us a boatload of money for this". Granted, the sanctity of digg's voting system had been violated long before this. But that still doesn't make it right.
"If digg wants to use their own batch of sockpuppets to influence the front page to their partners benefits they are free to do so, the responsibility, the upsides and the downsides are all theirs to take home."
The downside is that someone identifies and calls them on it, which is purportedly what is happening here (though the analysis and conclusions are flawed, in my opinion). I'm not sure what you're getting at.
"This is like saying google is 'gaming' the search results by having their people influence the rankings / search result pages by changing things in a way that benefits google."
But... people do that sort of analysis of the search engines all the time, trying to find patterns where they betray user trust for selfish reasons. Search engines tread a dangerous realm where they can lose user trust (and usage) if they abuse their position.
"Bottom line is it is their site, and they can run it any way they want to"
I don't know why, but this sort of comment always bothers me.
Yes, we all know it's their site and they can do what they want. It's also the attention and trust of the community, and they can do what they want, including choosing to move on, distrust, question, etc.
Wow - if this is true, which I am inclined to believe though admittedly I haven't reviewed the raw data myself, I expect Digg will begin hemorrhaging even more users. This flies in the face of everything (I thought) they stood for: allowing the users to decide which news is newsworthy.
Why are people still upset about this? By now it should be clear to everybody that digg shifted to a more publisher based article listing. Either that's fine with you or you go somewhere else. Why all the complaining?
From a technical perspective, though, I wonder why digg would add all these bot accounts, creating clutter on the site, when they could easily achieve the same effect with some background tweaking.
People are upset because Digg cultivated an audience of rabidly anti-commercial poor teenagers, and rabidly anti-commercial poor teenagers do not like learning that they are advertising inventory for a company with tens of millions in revenue.
There is a lesson here regarding audience selection.
It's different and important because the author of the post is implying it's Digg people gaming their own system rather than people like LtGenPanda and MrBabyMan doing it from the outside.
"I am going to wait until 6:34 CST, that is 1 hr from when digg got a first chance to read it. If by then, they do not give me a reasonable time to wait, I will be going ahead and make this link public."
> The only way this makes sense is if it was done by someone without access to the algorithm. Either an external voting ring, or someone in the company who isn't supposed to be doing this.
Even that doesn't make much sense.