IMDB data (imdb.com)
75 points by helwr on April 24, 2011 | 34 comments


Just a quiet reminder for anyone wanting to use this

Please refer to the copyright/license information listed in each file for instructions on allowed usage. The data is NOT FREE although it may be used for free in specific circumstances.

See http://www.imdb.com/help/show_article?conditions, which links to http://www.imdb.com/licensing/, which says:

Minimum Price: We offer data licensing packages that are customized to meet your needs with annual fees ranging from $15,000 to higher depending on the audience for the data and which data are being licensed. We are not able to offer any sort of data license for less than $15,000.


http://www.themoviedb.org/ is an IMDb clone, but free, open, and run by the community. XBMC can use it for movie lookups if you wish.


I've recently started developing a django app for syncing with tmdb's data https://bitbucket.org/zalew/django-tmdb/wiki/Home

just released the first commands and everything is in early dev, but it works.


I actually built a movie watchlist and todo list for myself using Django and tmdb. It's live, and the code's on github.

http://movies.tshaddox.com/

https://github.com/baddox/watchlist2

edit:

The code's probably not very "pluggable" or "reusable," since that wasn't a design goal :). It's hopefully readable. Feel free to recommend a film (just search for it on the right sidebar).


> The code's probably not very "pluggable" or "reusable," since that wasn't a design goal :)

cool :) I see we took different approaches.

my app's goal is to provide flexible ways of syncing with db and nothing else, at least for now.


I had forgotten how messy my tmdb API usage is. I'm using some guy's wrapper on pypi[1], as well as a manual fork (read: copy-paste-modify) of what I think is the same guy's github repo.

[1] http://pypi.python.org/pypi/themoviedb



I'm just building a Django app to sync with IMDb. Oh well...

I'll have a look at yours. I didn't know about tmdb.


Just checked themoviedb.org - but it does not seem like it has a lot of data. How many movies does it have and is it comparable to imdb in size?


Although someone already commented and I am late to the party on this one, I have yet to see XBMC with tmdb fail to find a movie I was watching. Sure, IMDB can be considered more "complete," but tmdb certainly has the popular mainstream titles.


No. I always point XBMC at IMDB. tmdb doesn't have nearly the breadth or depth of IMDB.


that really sucks considering that contributors on the net built most of imdb. they didn't have anything like a cc license for user contributions back then


I believe the CDDB/Gracenote issue was exactly this: CDDB built a massive database of CDs from user input, then sold out to another company, which promptly closed it up and made it commercial. My memory may be wrong here, however.


Yep, that's pretty much what happened.

Only IMDB is now owned by Amazon. You would have hoped that they would be a little more community spirited with the data given its origins.


It's this kind of crowdsourcing that I dislike. Mine the efforts of the people and keep the information to yourself. I'm mainly disappointed that because of this limited access, I can't have a better experience browsing the data on the web or in a mobile app. As it stands, both their site and the iOS app are slow, and the iOS app navigation just stinks.


I'm assuming most of the non-review/summary data is just fact. Presentation is copyrightable, but I don't think simple facts can be copyrighted that way.


In Australia at least, compilation of data is copyrightable, even where the individual sources may not be. Databases which you could argue are just fact but still restricted by copyright include telephone directories (Whitepages) and public transport timetabling information (which our state government controls rather tightly and won't even release to Google Transit).


Just an extra data point: public transit timetables for Western Australia's capital, Perth, are provided to Google, and any member of the public can download the same data from a website. Conditions attached but no questions asked.

http://www.transperth.wa.gov.au/TimetablesMaps/SpatialDataAc...


Don't worry. Google will cave eventually, then everyone will be praising the bureaucrats that stuck to their guns to get a windfall licensing deal out of Google!


The point is that the data should be available to the public under a license such as Creative Commons Attribution 2.5 (used for other government data). Governments aren't about making a profit and I doubt Google will pay a cent for this data anyway when other governments are giving it to them for free.


Read my post as high sarcasm...


We use it at the University in Oslo as a realistic database for practicing sql. It's great fun.

I remember one question. List the directors that have directed at least 20 movies and acted in all of them. This is fairly tricky, and returns a list of mostly explicit movie directors.

What's more fun though is a sparql endpoint. So you can query it and link to it on your own sites. I found this one on a quick google search (couldn't find the one I looked at before). http://www.linkedmdb.org/
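
For anyone curious how the director question above might look in SQL, here is a minimal sketch using Python's built-in sqlite3 module. The single `roles(person, movie, role)` table, the demo rows, and the function names are all assumptions made for illustration; the real IMDB dump and the course database are laid out differently. The idea is to left-join each directing credit to an acting credit by the same person on the same movie, then keep only people whose two movie counts match.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical roles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE roles (person TEXT, movie TEXT, role TEXT)")
    rows = [
        ("Ann", "M1", "director"), ("Ann", "M1", "actor"),
        ("Ann", "M2", "director"), ("Ann", "M2", "actor"),
        ("Bob", "M3", "director"), ("Bob", "M3", "actor"),
        ("Bob", "M4", "director"),  # directed M4 but did not act in it
        ("Cy", "M1", "actor"),      # actor only, never a director
    ]
    conn.executemany("INSERT INTO roles VALUES (?, ?, ?)", rows)
    return conn


def directors_acting_in_all(conn, min_movies=20):
    """Return people who directed >= min_movies movies and acted in every one."""
    query = """
        SELECT d.person
        FROM roles AS d
        LEFT JOIN roles AS a
               ON a.person = d.person
              AND a.movie = d.movie
              AND a.role = 'actor'
        WHERE d.role = 'director'
        GROUP BY d.person
        HAVING COUNT(DISTINCT d.movie) >= ?
           AND COUNT(DISTINCT d.movie) = COUNT(DISTINCT a.movie)
        ORDER BY d.person
    """
    return [row[0] for row in conn.execute(query, (min_movies,))]


if __name__ == "__main__":
    demo = build_demo_db()
    # The threshold is lowered to 2 because the demo data is tiny.
    print(directors_acting_in_all(demo, min_movies=2))
```

`COUNT(DISTINCT a.movie)` skips the NULLs produced by the left join, so a director with even one movie they did not act in ends up with mismatched counts and is filtered out.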


That sounds like a great example database for database classes! Something a student could wrap their head around to understand the dataset. I found that students new to database concepts often had trouble unless examples were given in terms of a dataset they could relate to, or whose relationships they could at least understand without the DB language.


At Stanford in CS107 we had to write an IMDB six degrees of separation, using this same database. Insert the names of 2 actors, and our app will calculate the degrees of separation between them. Kevin Bacon was highly connected :)
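
A rough sketch of how such a degrees-of-separation lookup can work, not necessarily how the CS107 assignment did it: treat actors as graph nodes, connect any two who share a movie, and run a breadth-first search. The `(actor, movie)` input format and both function names are assumptions for illustration.

```python
from collections import defaultdict, deque


def build_costar_graph(credits):
    """Build an actor -> set of co-stars map from (actor, movie) pairs."""
    cast = defaultdict(set)
    for actor, movie in credits:
        cast[movie].add(actor)
    graph = defaultdict(set)
    for actors in cast.values():
        for actor in actors:
            graph[actor] |= actors - {actor}
    return graph


def degrees_of_separation(graph, start, goal):
    """Return the fewest co-star hops from start to goal, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for costar in graph.get(actor, ()):
            if costar == goal:
                return dist + 1
            if costar not in seen:
                seen.add(costar)
                queue.append((costar, dist + 1))
    return None


if __name__ == "__main__":
    # Made-up credits, purely for demonstration.
    credits = [("A", "m1"), ("B", "m1"), ("B", "m2"), ("C", "m2"), ("D", "m3")]
    graph = build_costar_graph(credits)
    print(degrees_of_separation(graph, "A", "C"))
```

Breadth-first search visits actors in order of distance, so the first time the goal shows up the hop count is already minimal. On the full dataset the graph is large, so a real version would want a more compact representation than Python sets.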


Who doesn't love Kevin Bacon! What a great project. This would be applicable as an enhancement to projects like XBMC.


This is exactly how I learned SQL, too. Through http://sqlzoo.net/ . I agree it's a fun way to learn it.


Don't parse it all yourself. http://imdbpy.sourceforge.net/


Oh nice. Thanks for this. I downloaded the database for my own interest the other month, and after viewing the horrible format I gave up. I am going to check this out!


When was that page last updated? Look at the machines/OSs referenced on it: OS/2, Acorn, Amiga, Win 95/98/NT!

I haven't seen the string 'ftp' mentioned on the same page so many times in years.


Hard to say. The page and the files both report recent timestamps.

The wayback machine saw the page in '96, has the windows version appearing between '97 and '00, and after that pretty much the only changes have been to the list of ftp servers.


Most of the tools probably aren't updated much (or at all) and the system is only maintained for the distribution of the data files.


It would be so cool to make http://TheWikiGame.com for IMDB data, but I guess the license disallows it?


You may also be interested in IMDB history http://www.imdb.com/help/show_leaf?history


They recently did a re-design and screwed up the photo links on the pages of the people in the database (like Steven Spielberg). Kind of annoying.



