"@cdixon open sourcing it would be an enormous pain in the ass. selling it not as bad but yahoo infrastructure and auth problematic."
Besides that, open sourcing delicious wouldn't solve the problem - someone still has to host it, and maintain it, and import several million people's bookmark collections into it. That takes significant time and money. Having the code is just a small part of it.
A smarter thing to do would be to campaign for the (public) data to be released as a huge data dump, ready for people to run their own analysis on (an Amazon Public Data Set for example). This too has plenty of problems though - under what terms should that data be licensed? People's bookmarks belong to them - would they be happy with their contributions being released as part of a massive data set for anyone (including sploggers) to do anything they liked with it?
Since users can already import/export their bookmarks, the only support Yahoo needs to provide is keeping the import API open for a little longer after the site is shut down. Of course this assumes that users would want to migrate their data to a replacement service.
Or, for that matter, that users would share their tags with researchers who build on social tagging data:
Here is a (scraped?) delicious data set, for those who are interested: http://arvindn.livejournal.com/116137.html
Open sourcing it wouldn't solve the discovery/hosting part, but I still think they should do it.
I'm sure it wouldn't be "easy," but it wouldn't be unprecedented.
Yahoo! folks should also take a look at dataliberation.org for best practices on getting data out (also a Google joint).
(Nope, I don't work for Google :D, just think they've done some good things in this space.)
A bit of data I'd like to get out of delicious that I can't right now is the social/relationship stuff. It really helped w/ discovery.
Though you're totally right: link stuff is thorough.
Sadly corporate responsibility is not the norm, though it should be.
It's the tens of thousands of man hours that have been spent creating one of the best indexes with careful tags for millions of pages. The data's what makes Delicious matter.
And, if you open-source the data, I see people crying foul over privacy.
But having >1,000,000 people with freshly tagged and exported links in a standard format seems to provide an opportunity to those who think they can do better.
One request for whoever that is: can you add the 'sort my links by popularity' feature that Yahoo! never developed?
wget --quiet --http-user=`head -n 1 ~/.deliciousrc` --http-password=`tail -n 1 ~/.deliciousrc` -O ~/docs/delicious.xml http://del.icio.us/api/posts/all
I run this out of cron nightly and back up my bookmarks in a git repo, along with my other dotfiles. I really appreciate services like delicious, but I don't trust them.
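For anyone who wants to do something with a nightly backup like that, here is a minimal Python sketch for reading it. It assumes the `posts/all` response is a `<posts>` element containing `<post>` elements with `href`, `description`, `tag` (space-separated) and `time` attributes. That layout is my recollection of the export, not something stated in this thread, so check it against your own file. `parse_export` and `SAMPLE` are made-up names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the attribute names are an assumption about the export format.
SAMPLE = """<posts user="example">
  <post href="http://example.com/a" description="Page A" tag="rails jquery" time="2010-12-16T10:00:00Z"/>
  <post href="http://example.com/b" description="Page B" tag="cooking" time="2010-12-15T09:00:00Z"/>
</posts>"""

def parse_export(xml_text):
    """Return one dict per <post> element, with the tag string split into a list."""
    root = ET.fromstring(xml_text)
    bookmarks = []
    for post in root.iter("post"):
        bookmarks.append({
            "url": post.get("href", ""),
            "title": post.get("description", ""),
            "tags": post.get("tag", "").split(),
            "time": post.get("time", ""),
        })
    return bookmarks

if __name__ == "__main__":
    for bookmark in parse_export(SAMPLE):
        print(bookmark["url"], bookmark["tags"])
```

To run it on a real backup, read the file (for example `~/docs/delicious.xml`) and pass its contents to `parse_export`.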
I'm going to miss delicious's data set, though. Searching a really big link collection with tags was great - I found a really good Ethiopian cookbook, some good+obscure research papers, and lots of other stuff that way.
curl -n -o ~/docs/delicious.xml https://api.del.icio.us/v1/posts/all
machine api.del.icio.us login yourusername password yourpassword
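If you'd rather script this than call curl, here is a rough Python sketch of what the `-n` lookup amounts to: read the matching `machine` entry from a netrc file and turn it into an HTTP Basic `Authorization` value. `basic_auth_header` is a hypothetical helper name, and I'm assuming the server accepts Basic auth, which is what curl sends by default with these credentials.

```python
import base64
import netrc

def basic_auth_header(netrc_path, host):
    """Look up host in a netrc file and build an HTTP Basic Authorization value."""
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        raise KeyError(f"no entry for {host} in {netrc_path}")
    login, _account, password = entry
    # Basic auth is base64("login:password") prefixed with the scheme name.
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token
```

The returned string would go in the `Authorization` header of whatever HTTP client you use; the password stays out of the command line and out of your shell history.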
`curl https://user:pass@api.del.icio.us/v1/posts/recent` works just fine.
`curl https://user:pass@api.del.icio.us/v1/posts/all` does not.
Maybe a server-side problem?
What I'd like to see is people quickly rolling out tools that help us get our Delicious bookmarks into other sites more easily, e.g. into Google Bookmarks. That currently requires you to install the Google Toolbar and import them through that (ugh).
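One low-tech route is the Netscape bookmark file format, which most browsers and many bookmarking sites can import. Below is a minimal Python sketch that renders bookmarks into that format. The input shape (dicts with `url`, `title`, `tags`, `time`) and the name `to_netscape_html` are my own assumptions, and the `TAGS` attribute is a convention that, as far as I know, some importers honor and others ignore.

```python
import html
from datetime import datetime, timezone

def to_netscape_html(bookmarks):
    """Render bookmark dicts (url, title, tags, time) as a Netscape-style bookmark file."""
    lines = [
        "<!DOCTYPE NETSCAPE-Bookmark-file-1>",
        '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">',
        "<TITLE>Bookmarks</TITLE>",
        "<H1>Bookmarks</H1>",
        "<DL><p>",
    ]
    for b in bookmarks:
        # ADD_DATE is seconds since the epoch; the input time is assumed to be ISO 8601 UTC.
        when = datetime.strptime(b["time"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        lines.append(
            '<DT><A HREF="{}" ADD_DATE="{}" TAGS="{}">{}</A>'.format(
                html.escape(b["url"], quote=True),
                int(when.timestamp()),
                html.escape(",".join(b["tags"]), quote=True),
                html.escape(b["title"]),
            )
        )
    lines.append("</DL><p>")
    return "\n".join(lines) + "\n"
```

Write the returned string to a `.html` file and feed it to the target site's "import bookmarks" option, if it has one.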
A system that actually gets all the data from delicious into google bookmarks seems to be worthwhile.
It even worked more smoothly than the Google Toolbar's own importer, which complains about dupes (wrongly), and prefixes all the labels with "Tag:" for no good reason.
Will blog this ASAP.
As for two specific examples: Learning Ruby on Rails has been a long, slow struggle for me. There are a number of reasons for that, but one of the main ones is that I don't have many other programmers in my area that I can talk about it with. Although there are scores of blogs and tutorials on Rails, it's hard to know which articles are good, and nobody has the time to read them all. So if I was looking for a post on, say, integrating jQuery with Rails, I could do a quick search at http://www.delicious.com/tag/rails+jquery, and I'd find a bunch of articles that had been selected by individuals, for their own use. It was great for finding high-quality content.
Another example deals with niche interests. Delicious made it easy to suss out who is talking about $randominterest. Just by going to http://delicious.com/tag/randominterest, you can see who else is bookmarking it, who's writing about it, and so on. Especially when dealing with, say, fringe programming languages, or uncommon design details, Delicious makes it easy to find people.
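The `tag/rails+jquery` style of lookup is easy to reproduce over any pile of exported bookmarks. Here is a minimal Python sketch, assuming each bookmark is a dict with `url` and `tags` keys (my own shape, not anything Delicious defines): build an inverted index from tag to URLs, then intersect the sets for each tag in the query. I have no idea how Delicious implemented it; this is just the obvious way.

```python
from collections import defaultdict

def build_tag_index(bookmarks):
    """Map each lowercased tag to the set of URLs saved with it."""
    index = defaultdict(set)
    for b in bookmarks:
        for tag in b["tags"]:
            index[tag.lower()].add(b["url"])
    return index

def search(index, query):
    """Return URLs carrying every tag in a '+'-separated query such as 'rails+jquery'."""
    tags = [t.lower() for t in query.split("+") if t]
    if not tags:
        return set()
    # Copy the first set so the intersection never mutates the index.
    result = set(index.get(tags[0], set()))
    for tag in tags[1:]:
        result &= index.get(tag, set())
    return result
```

What this can't reproduce is the part that made the real thing useful: the index covered everybody's bookmarks, not just your own.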
I wonder if this could be implemented as an API of websites? (Or perhaps on behalf of websites?) Instead of one single site, you'd have a p2p network of sites using protocols similar to those used for digital cash schemes. Think: a distributed open marketplace of middlemen. A sort of Diaspora middle tier.
I have more than 2k entries there collected over the course of the last five years. I only bookmark sites that are for some reason important to me, and have some lasting value (e.g. I don't bookmark most HN submissions). I tag everything I save and some of my tags have special semantics that are probably only useful to me.
The difference between Delicious and news aggregators like Reddit / Digg / HN is that the rating of links is a side-effect, not a conscious action.
As a result, ranking on Delicious does not prefer yellow journalism, obscure interests or vanity displays; much like Google's ranking but small enough to not be a target for promotion through SEO.
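To make the "side-effect" point concrete, here is a minimal Python sketch, assuming the raw data is just (user, url) pairs recording who saved what. Nobody votes; the score of a link is simply how many distinct users kept it. The function names are made up, and this is not a claim about how Delicious actually ranked anything.

```python
from collections import Counter

def popularity(saves):
    """Count distinct users per URL from (user, url) pairs; duplicates by one user count once."""
    return Counter(url for _user, url in set(saves))

def sort_by_popularity(my_urls, counts):
    """Order one user's links by how many users saved them, most-saved first."""
    # Ties are broken alphabetically so the order is stable and predictable.
    return sorted(my_urls, key=lambda url: (-counts[url], url))
```

The same counts would also cover the "sort my links by popularity" wish mentioned elsewhere in this thread: pass your own URLs as `my_urls` and the site-wide counts as `counts`.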
In other words, the whole social aspect.
The main problem was that the RSS feed was limited to 15 entries.
This service just needs to stay alive.
More interestingly, I wonder how much Yahoo would sell it for?