Scraping Roger Ebert’s reviews and finding his favorite movies on Amazon Prime (linisnil.com)
197 points by catwind7 on June 10, 2020 | 129 comments


If you need a more stateful version of requests:

    import requests

    # A Session keeps cookies (and pooled connections) across requests.
    session = requests.Session()
    # now use session like you would requests
    session.get("http://httpbin.org/cookies/set/name/value")
    print(session.get("http://httpbin.org/cookies").text)


oh I need to try that - I had this feeling that there was a more stateful version but for ..... some reason ... reaching for a new dependency felt easier at the time haha. Thanks


See also "Where to Find Roger Ebert’s Great Movies Streaming" [0] which has US listings for Netflix, Amazon Prime, Hulu, Disney Plus, Criterion Channel, Kanopy, HBO, Starz, and Showtime as of March 2020.

[0] https://www.rogerebert.com/features/where-to-find-roger-eber...


I guess this list of movies applies to people who live in the US. Available movies differ substantially between countries. With Netflix it's easy: it doesn't even show the movies you can't watch. With Amazon you have to click through every movie you are interested in to see the screen that says the movie is not available in your region. There are even series where only one or two episodes are unavailable.


It is the fact that we have a relatively agreed-upon canon of great films (with immense re-watch value over the years) that keeps me torrenting instead of streaming. Sure, you could search from one streaming site to another for a given film, but the film may be either completely unavailable, or it may be offered one moment and then mysteriously removed the next. Meanwhile, some torrent communities are run by ardent cinephiles and they have all of these films, and once you have downloaded one, you can go back to it whenever you want.


You can also solve the “what streaming service” issue by buying the movie on DVD or Blu Ray allowing you to go back to it whenever you want. Additionally if you’re ok with DRM iTunes has most stuff.

Useful tip for finding which streaming service a film is on: google “film name streaming” and google lists the answers in a box at the top of the results.


googling for "film name streaming" doesn't work: most of the results are specific to the USA instance of Netflix/Prime, and not all content is available globally. Most of the websites that list content are spammy looking.


I put my country name on the end of my query. It is a bit hit or miss though.


I'm not trying to promote piracy but an HD with hundreds of my favorite movies on it is 1000x better than hundreds of Blu-Rays and DVDs. Some Blu-Rays have 3-6 minutes of un-skippable ads. Even if your player overrides the "not skippable" part you still have to click through 3 to 10 segments to get the movie playing. Seeking on Blu-Ray is much worse. And just the hassle of taking them out of their case, loading them, unloading them, putting them back in their case, just sucks. I have a few hundred DVDs and Blu-Rays. I haven't put one into a player in > 7 yrs.

DVDs are also pretty poor quality now-a-days being interlaced and ripping either is slow and painful. It's funny how DVDs looked so good at one point. Now they look pretty poor on a 60inch 4k TV.

And, since we're talking about mostly old movies: I recently tried to purchase, as a gift, 8 famous award-winning USA movies from the 30s-40s as DVDs or Blu-Rays, and most of them are out of print. One claims to be in print but was out of stock according to Amazon. A few you could get for ridiculous prices out of some collector's store. A movie from 1940 is 80 yrs old. It really shouldn't cost $60; in fact, arguably those movies should be public domain, but I'm not crossing my fingers it will ever happen.


yeah, do not overlook ebay. a hint with regard to using ebay is to search across a week or two to get an idea of availability and what a good price is.


I find Justwatch.com very useful.


The problem with recommending that people buy the DVD/Blu-Ray is that the agreed-upon canon of great films stretches into the many hundreds of titles (Ebert’s list at 364 titles is only an abridged canon.) Therefore, for someone seeking a cultural education, buying the physical releases would run into the thousands and thousands of dollars. It would be a challenge even for North Americans or Western Europeans to amass a suitable collection, let alone people in lower-income countries. Consequently, a person can only realistically learn about great cinema by streaming from multiple subscription sites, or torrenting, and people disappointed by the former ought to remember that the latter exists.


You don't have to buy them all at once; you can't watch them all at once. If you buy old DVDs and Blu-rays, you can get a lot of movies very cheap. And if you cancel whatever streaming service(s) you had, the savings go a long way.

I'm much more worried about the day when Blu-rays and DVDs won't be made anymore, which might be pretty soon. Also, storing hundreds of discs is a big chore. I rip everything, but that is beyond the abilities of most and the equipment is becoming even more niche than the playback equipment. (Some people don't even know their Playstation can play DVD and Blu-ray movies!)


If you have some local friends with taste, you can each buy one disc a month, and watch them together (when that becomes available again) or just trade. Doesn't get you as much variety as fast, but it also makes it easy to rewatch things since you're not subject to catalog changes or roughing up by the mpaa.


This doesn't sound like a realistic solution. Firstly, few people are fortunate to have local friends with the same tastes – high culture is infamously a niche thing. Secondly, one might want to rewatch certain great films every 5-10 years of one’s life (because our own life experiences can deepen our appreciation of great art), but 1) is one still going to be friends with the same exact cinephiles decade after decade, when people naturally lose touch with friends as the years go by, and 2) is one even still going to have a functional player for physical releases?


What?

Srsly?

I had a simple DVD Collection of over 100. If you buy 1-4 DVDs per Week, you already have min. 100 dvds after 2 years.

It is neither expensive nor difficult to do.


We’re not talking about mass-market movies, the DVDs of which you can pick up at your local hypermarket for just a few $currency_units. We’re talking about the "great movies". These are often only available in rather expensive editions that are rarely discounted substantially, such as the Criterion Collection in the English-speaking world. Not only are these editions expensive in terms of their list price, but in most of the world they are not available locally, and if you wish to order them, you must pay substantial shipping fees and deal with customs hassles.

Also, your 100 DVDs is nothing when the canon of great films in the opinion of most film scholars would easily approach a thousand titles.


> rather expensive editions that are rarely discounted substantially, such as the Criterion Collection

Criterion's web store (https://www.criterion.com/shop) has sales all the time. There's one on right now through June 15 that gives you 30% off all titles. I picked up two titles through a 50% off sale last month. Subscribe to their email list or follow them on social media and they'll give you a heads up when the next one is coming.


Criterion’s web store won’t ship abroad, because Criterion’s licensing agreement with the rights owners stipulates that they will only sell to North American consumers. So, sales there are little help to cinephiles elsewhere.


I have checked a few movies on their site and all movies i checked randomly, were available on amazon.


Can you please elaborate what movies i can't get at amazon?

I'm serious.

I'm only aware of this issue for specific smaller docus which haven't been aired much and not much interest exists. I had to buy that from different countries etc.


You can sub to one streaming service and never run out of great films to watch (in part because titles rotate, not in spite of, but that's beside the point). Justifying piracy because, for example, you're disappointed that Netflix is missing one great film, is disingenuous.


You misunderstand my point. It’s not that a person might be looking just for "great films to watch", in which case any streaming site might satisfy them. It is that a person might be looking for the great films, that entire canon of films which scholars hold to be important. Netflix is never going to have them all regardless of how much they rotate.


That doesn't really affect my point. No streaming service will meet your unreasonable demands (for many reasons, first of all is that there is not an objectively agreed upon list of the great films), and therefore justifying piracy because of it is disingenuous.


That a service should have it all is not an unreasonable demand, and it is peculiar that you chide someone for "justifying piracy". After all, the HN crowd is generally very sympathetic to Library Genesis, which is aiming to gradually contain all books on all subjects, and therefore the curious reader can conveniently and at no cost learn about classic 20th-century literature, copyright be damned. Someone who wants the same solid cultural education with regard to films will not be served very well by the commercial offerings compared to torrent communities, but a cultural education is more precious than anyone’s claim to copyright.

As for "there is not an objectively agreed upon list of the great films", there may not be one single list, but as a broad consensus one can take the overlapping suggestions of the critics who contribute to the Sight and Sound poll, the winners at Cannes and Venice, those films and directors who have been celebrated year after year in Cahiers du Cinéma, and so forth.


> That a service should have it all is not an unreasonable demand

Yes it is, especially when you said that it is a requirement for a service to have a certain list of movies, and then admitted that no such list exists. Therefore, a service can never meet your stated requirements (and therefore, you've created a self-fulfilling framework to justify your piracy of certain movies). If you're playing so fast and loose with what constitutes a list of great films, then you can subscribe to the Criterion Collection and call it a day.

> it is peculiar that you chide someone for "justifying piracy"

You said piracy of certain films was ok because a service that lacked them didn't meet your requirements, requirements that could never be met. I'm saying it's wrong to knowingly create unrealistic requirements and justify piracy based on that.

If you're going to change your argument to say that certain films should be free to all, due to certain significance to the public, that's a different stance (and one I'm more sympathetic to).


Have you looked at Netflix's film catalog lately? I'd guess there are under 300 great films. The rest is filler garbage they get for cheap.


Netflix is just an example, but 300 movies is too. So consider:

How long does it take to watch all the great films on a streaming service (not end to end, I mean as far as watching movies fits into your weekly schedule)?

A streaming service will have a different set of movies by then. Yes, some movies will have left the platform, but some will have been added. Some will have remained too!

If you're going to come back and say you've reached the end, you've watched everything you wanted to and were still left wanting, first off, congratulations. Second, I would argue that most regular streaming services are not meant for (or marketed toward) a cinephile such as yourself.


There's something to be said for watching stuff that's _not_ in the canon, because sometimes you find a probably objectively mediocre movie that means a lot to you personally and would never end up on any list of great films.


And how could an ardent cinephile enter such torrent communities may I ask?


What is the best way to find such movie torrenting sites in 2020?


Thank you for doing this. I wish "top critics" was a search option on Prime (but their Roku interface is just abhorrent).

There was a time when I hoped Netflix would do something similar: click here to see your favorite critic's film list. But since Netflix has lost the streaming rights to soooo many films compared to when they first started streaming ca. 2007, this is not doable. (Even DVD rights in most cases: films that are in my "Movies You Rated" queue from 2003 are no longer in Netflix's library.)


They have a really poor UX imo. The categories are seemingly arbitrary, and they have niche and seasonal ones that don’t make sense. During the December holidays they didn’t have a holiday themed category until near holiday end.

Further, they make “innovation” like auto play when nobody asked for this, and without any way to stop it (unless you go to desktop and disable it deep in your settings).

When searching you will receive similar results for titles they used to have... well why not state “are you searching for x? It’s no longer available, here are some recommendations” so the user isn’t wasting time wondering if the search feature is just “bad”.

In the world of constant, unannounced, and live experimenting on huge user bases a little messaging goes a long way (with the comparison being nothing)


I'm surprised not to see more people adding their favorite movies. Just sticking to things semi-related: I may have to go watch "My Man Godfrey" again, but at least for me "The Thin Man" and its 5 sequels stuck with me, whereas I watched "My Man Godfrey" years ago and don't remember any details of it. (Same lead actor, and he pretty much always plays the same type of person.)

"The Man Who Shot Liberty Valance" is a good movie but if you want some other westerns starring James Stewart I can recommend "Destry Rides Again", which is not super serious but I found it thoroughly entertaining. Also "Broken Arrow" (1950), (not the 90s John Woo movie, which is a fun popcorn movie).

Seeing "The Good, the Bad, and the Ugly" on the list I recently saw "The Outlaw Josey Wales" which I enjoyed (also a Western with Clint Eastwood)

Of the movies on that list I've seen, the one I'd recommend the most is "Woman in the Dunes". I don't want to overhype it. I've seen most of the movies on the list and they are all great stories, but "Woman in the Dunes" might give you something new to think about, whereas the others are mostly just great entertainment.


In addition to “Woman in the Dunes”, I'd also nominate “The Gospel According to Saint Matthew”. And, in fact, “It's a Wonderful Life” for those who haven't seen it (it's a much more grown up and dark film than you may have been led to believe).


Other good Jimmy Stewart westerns are "The Man from Laramie" and "Winchester 73"


There used to be a site called ClerkDogs.com that probably had the best movie recommendation system I've used. You started off naming a few movies you liked, and it would provide a list of movies you'd also probably like, and it was very accurate.

From what I remember, the database was cataloged and maintained by actual humans, and not some algorithm following behavior patterns.


Yep, jinni.com did the exact same thing, and I loved it. Unfortunately, it seems like B2C just didn't work financially so they switched to an entirely B2B model to help providers with their recommendation engines and no longer have their data accessible to end-users.


It’s a shame that the economics of recommendation engines doesn’t seem to work very well in the B2C space. Good rec. engines can be very useful.


https://www.criticker.com/

This does almost exactly what you describe. I really like it. Not human curated but matches your preferences with other humans with similar preferences to give you recs.


Instead of using Python, here is a solution that only requires sh, curl, sed, sort, uniq and grep.

This solution uses a generous 87s delay to retrieve the Amazon pages. There are 328 films listed as "great movies" on rogerebert.com. As such, the script, named "1.sh", needs 8h to complete, e.g., the time while you are at work or sleeping. No cookies, no state, no problems.

   Usage: sh 1.sh > 1.html
Open 1.html in a browser and it shows whether each "great movie" is available as Prime Video or whether it is only available in some other format, such as Blu-ray, DVD, Multi-format, Hardcover. A link to the item on Amazon is provided.

   #!/bin/sh

   curl -HUser-Agent: -H'Accept: application/json' --compressed 'https://www.rogerebert.com/great-movies/page/[1-16]?utf8=%E2%9C%93&filters%5Btitle%5D=&sort%5Border%5D=newest&filters%5Byears%5D%5B%5D=1914&filters%5Byears%5D%5B%5D=2020&filters%5Bstar_rating%5D%5B%5D=0.0&filters%5Bstar_rating%5D%5B%5D=4.0&filters%5Bno_stars%5D=1'|grep -o "/reviews/great-movie-[^\\]*"|sed 's/.reviews.great-movie-//'|sort|uniq|while read x;do y=$(echo $x|sed 's/-/+/g');echo $x;curl -s --compressed -HUser-Agent: https://www.amazon.com/s/?k=$y 2>/dev/null|grep -m1 -C4 a-link-normal.a-text-bold;sleep 87;done|sed '/^[^< ]/s/.*/@&,/;1s|.*|<base href=https://www.amazon.com />&|;s/ *//;/^$/d;/^[@<]/!s|$|</a>|;1s/@//;s/@/<br>/'
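For anyone who finds the one-liner hard to read, here is a rough Python sketch of two of its steps: turning a review path into an Amazon search URL (the two `sed` calls inside the loop) and working out the run time from the delay. The function names are made up for illustration, and nothing here contacts either site.

```python
# Hypothetical helper names; they mirror steps of the shell pipeline above.
AMAZON_SEARCH = "https://www.amazon.com/s/?k="  # same search URL the script uses
PREFIX = "/reviews/great-movie-"


def slug_from_path(path: str) -> str:
    """Strip the '/reviews/great-movie-' prefix, like the first sed call."""
    return path[len(PREFIX):] if path.startswith(PREFIX) else path


def amazon_search_url(slug: str) -> str:
    """Turn hyphens into '+' to form the query, like sed 's/-/+/g'."""
    return AMAZON_SEARCH + slug.replace("-", "+")


def estimated_hours(film_count: int, delay_seconds: int) -> float:
    """Time spent sleeping between requests, in hours."""
    return film_count * delay_seconds / 3600


if __name__ == "__main__":
    slug = slug_from_path("/reviews/great-movie-moonstruck-1987")
    print(amazon_search_url(slug))
    print(round(estimated_hours(328, 87), 1))  # 328 films at 87s each
```

With 328 films and an 87s delay this comes to just under 8 hours, which matches the figure quoted above.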


Roger Ebert had an unexpected impact on me in my 20s. In my late teens, I started religiously reading his reviews, and this continued until his death. I've never really been a "movie buff," but Ebert's witty prose and pointed technical critique reminded me of Scalia (whom I also loved reading). Thank you for putting this list together!


The tragedy of Roger Ebert is that, although he had a vast knowledge of great cinema, he established a niche where he was expected to mainly write about ephemeral popular films. His time was mostly spent on Hollywood blockbusters and popcorn, and he was appearing in media where he could not go into any great depth due to space limitations, or because that would be a turnoff for his mass audience.

So, only in a few Ebert productions like his "Great Movies" books can one get a sense of the films that really mattered to the man and to art. Compare this to a critic like Richard Brody, who in his career has been fortunate to focus entirely on art cinema (though of course Brody’s net worth is probably an order of magnitude less than Ebert’s was).


I've only read a few Ebert reviews, enjoyed them and happened to be looking for a book to read, so I appreciate the recommendation!

edit: The introduction alone makes the book worth reading:

> Of all the arts, movies are the most powerful aid to empathy, and good ones make us into better people. Not many of them are very good.


I briefly had a review gig, highly amateur, that was even compensated. Ebert had an influence on me in at least two ways which immediately spring to mind.

First, he has biases. Go and read his review of Leon (aka The Professional) and you will know what his gut reaction will be to something like Moretz's role in Kick-Ass. He didn't seem to pay much attention during "genre" flicks and a couple of his reviews show that. However, he displayed his biases clearly. You could account for them. He did not dress them up in academic language or try to justify them in a manner that was beyond any reproach. I could mentally just adjust for what I would get out of his reviews of most horror films. Because people have biases, an open reckoning of them is as fair as you can get.

Second, reviews are to be written on multiple levels, each accounting for a different audience. What I call the TV Guide review consists of stars granted, a few actors mentioned, and a very brief subject summary. At another level, he writes for people who want to know, "Given what I care about, would I want to spend my time and money seeing this film?" At a third level, he talks to people who have already seen the movie, people who have the ability to engage with the language of film and debate its merits. Each of these levels is valuable. Each has its own particular craft and requisite skills, not to be evaded no matter what the subject matter.

I sometimes disagreed with his ratings but I rarely found a review which did not add something to my experience of a movie.


Besides being a well respected film critic and incredibly knowledgeable about film, he was also just a great writer. I read his reviews every week simply for the pleasure of reading them, even if I wasn't interested in the films he reviewed.


I enjoyed Ebert when he teamed with Siskel in the early days before they became great friends, and they'd argue. Their debates were better than the movies :-)


As someone who loves both movies and video games, it was disappointing to hear his thoughts that video games aren't art. Did he ever retract those views?


Not exactly retract, but he did write[1] about how he felt his initial statement was not completely fair, and at the end concludes with saying that it's hard to come up with a definition of art that would exclude video games but include other popular forms of media accepted as art, and that for gamers, they could possibly have experienced something that, to them, makes games equivalent to art.

[1] - https://www.rogerebert.com/roger-ebert/okay-kids-play-on-my-...


Last few years, I have liked reviews by Berardinelli. If I remember correctly he has an engineering background (and I think has designed the site himself... not as polished as some others). Not great prose/writing style. But reviews are good.

https://www.reelviews.net


Author here.

The list (both on my post + the google sheet) should be correct now - I underestimated how many different releases of the SAME movie title there are ...

Thanks all for catching the mistakes.


FYI - the "Review URL" links at the bottom don't seem to be working properly.

e.g. for "Moonstruck", the link leads to:

"www.linisnil.com/articles/scraping-roger-ebert-reviews-and-amazon/www.rogerebert.com/reviews/great-movie-moonstruck-1987"

instead of:

"www.rogerebert.com/reviews/great-movie-moonstruck-1987"


hey thanks for catching that - turns out leaving out the protocol in markdown makes it a relative url. fixed.
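A small sketch of why that happens, using Python's standard `urllib.parse`: a link target with no scheme is resolved against the page it appears on, just as a browser would do. The page URL below (including `https` and the trailing slash) is an assumption for illustration, and `with_scheme` is a hypothetical helper, not the author's fix.

```python
from urllib.parse import urljoin, urlsplit

# Assumed page URL; the scheme and trailing slash are guesses for illustration.
PAGE = "https://www.linisnil.com/articles/scraping-roger-ebert-reviews-and-amazon/"


def resolve(href: str, base: str = PAGE) -> str:
    """Resolve a link target against the page URL, as a browser does."""
    return urljoin(base, href)


def with_scheme(href: str, default: str = "https") -> str:
    """Hypothetical fix-up: prepend a scheme when the link has none."""
    return href if urlsplit(href).scheme else f"{default}://{href}"


if __name__ == "__main__":
    bare = "www.rogerebert.com/reviews/great-movie-moonstruck-1987"
    print(resolve(bare))               # ends up as a path under the article
    print(resolve(with_scheme(bare)))  # stays an absolute rogerebert.com URL
```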


I miss Roger Ebert.

I honestly watch fewer films since his death.

Maybe this list will change that.

Also "Spirited Away" is on the list but doesn't seem to be included with prime.


Spirited Away is now on HBO Max if you're looking for it


updated the list. Thanks for the catch! sorry about that. I'm using an `in` keyword which just checks substrings ... that I meant to change. Woops
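To illustrate the kind of false positive a substring check produces (the author's actual code isn't shown here, so the direction of the `in` test and the helper names are assumptions), compare it with an equality check on normalized titles. The example titles are ones other commenters reported as mislinked.

```python
import re


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def substring_match(wanted: str, listing: str) -> bool:
    """Loose check: true whenever the listing title appears inside the wanted title."""
    return normalize(listing) in normalize(wanted)


def exact_match(wanted: str, listing: str) -> bool:
    """Stricter check: titles must be equal after normalization."""
    return normalize(wanted) == normalize(listing)


if __name__ == "__main__":
    for wanted, listing in [
        ("The Battle of Algiers", "Algiers"),
        ("The Bicycle Thief", "The Bicycle"),
        ("Spirited Away", "spirited away"),
    ]:
        print(wanted, "|", listing, "|",
              substring_match(wanted, listing), exact_match(wanted, listing))
```

Exact matching trades false positives for false negatives (subtitles, alternate titles), so comparing the release year as well is probably still worthwhile.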


I’d like to see this for more services. Amazon Prime has IMHO the worst UI/UX I have used in a streaming service.


Agreed. What’s most annoying is how they deal with geo blocked content. They don’t tell you it’s blocked until you start watching it. So you browse the catalog for five minutes, finally find a movie to watch, press play, and then get an error message.


Or how they sometimes won't allow you to watch the original English version because you're not in an English speaking country. Even if your language is set to English (of which around 1/4 is still not translated). All movie previews are in native language which necessitates a visit to YT or IMDB. Even non-prime content is affected, just recently I paid 2.99 to watch some cheesy 90s movie with my friend who had always wanted to see this "classic". The syncing was abysmal and resolution 480p i.e. 720x480 i.e. DVD.

After we were done, I noticed there were a handful of HD torrents in various languages. Reminded me immediately of Gabe Newell's thoughts on pirating games: "Piracy is almost always a service problem and not a pricing problem".


Self-promotion: A number of years ago I made a little website which links you to a random review written by Roger Ebert. It isn't the cleanest of implementations, but it did what I needed it to do when I built it.

http://randomebert.com/


Side note: one excellent film that Roger Ebert didn't review is Life Itself, the documentary about his life that came out right after he died. It's full of joy and heartbreaking at times, but it really solidified his place as the most universally relatable movie critic in my mind.


thank you for this. We will be checking this out!


FWIW the link for "The Bicycle Thief" (1949) links to "The Bicycle" (2015). Great list though. It would be interesting to see the thing done for the AFI and BFI top 100. (Although I suspect that most movies on the AFI are probably already on Ebert's list.)


I was just seeing someone else's tutorial on scraping Amazon prices. They also ran into an issue where they needed to scrape twice instead of once. Not sure that it's the same issue you're facing but I thought I might drop my two cents.


I actually did notice that issue, even with using a stateful client like mechanize. Sometimes I had to scrape > 5 times in order to get through the "anti robot" page.

Other times, I get no issue at all. It's weird - maybe they're doing some pattern matching on request metadata on their end?
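One common way to cope with an intermittent block page is to retry with an increasing delay. The sketch below is generic and makes no claim about how Amazon decides what to block: `fetch` stands in for whatever client call is in use (mechanize, requests, ...), and the "robot check" marker in the usage example is a made-up placeholder.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    fetch: Callable[[], str],
    looks_blocked: Callable[[str], bool],
    attempts: int = 6,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch() until the page no longer looks like a block page.

    Waits base_delay, then 2x, 4x, ... between attempts. Returns the page
    text, or None if every attempt looked blocked.
    """
    for attempt in range(attempts):
        page = fetch()
        if not looks_blocked(page):
            return page
        if attempt < attempts - 1:  # no point sleeping after the last try
            sleep(base_delay * 2 ** attempt)
    return None


if __name__ == "__main__":
    # Fake fetcher: blocked twice, then succeeds. No network access involved.
    pages = iter(["robot check", "robot check", "<html>results</html>"])
    waits = []
    result = fetch_with_retries(
        fetch=lambda: next(pages),
        looks_blocked=lambda text: "robot check" in text,
        sleep=waits.append,  # record the delays instead of actually sleeping
    )
    print(result, waits)
```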


"The Man Who Shot Liberty Valance"

IMO, John Ford's best movie, hands down.

Unfortunately, it is not actually available on Prime without "CBS All Access" [Edit: ah, I see that this is not just "Included with prime", but all movies]


Prime just recently introduced channels, which dramatically increased the amount of content available, but each is $3.99. I personally love this because I’d rather have the option to subscribe to Smithsonian content or MGMs back catalog using the same streaming service I already use, rather than paying to use 10 different or getting stuck using the small list of (mostly old, TV movie, or B movie) content on just prime or Netflix.

So this is a feature, not a bug. Mostly because I’ve accepted how backwards the movie industry is going to be with copyright. But still having the option is the less evil.

That said, Maybe the UI can make this more obvious?


First time surprised me. I can see it being useful when you have the channel. Or as a separate tab when you want to possibly add a channel.

It kinda taints the other content. You ask internal.. Is it really available without subscribing? to channels you have access to. Then you finally click and they have season 1 and 3 and 5 of some show. Terrible UI. Feels like you get so much less than you do.

With netflix it feels like you could just fall into a series immediately.


I typically use Google to search which streaming service has a movie, and for The Man Who Shot Liberty Valance, Prime says "Subscribed" underneath it.


Prime has had channels for almost five years now.


Although it's not listed in this post because it's a rental, I feel like it deserves a mention nonetheless. "Beyond the Valley of the Dolls" is a 1970 cult classic. It was directed by Russ Meyer and written by Roger Ebert. There's a link to it on Roger's site, and it's available to stream from Amazon for cheap:

https://www.rogerebert.com/reviews/beyond-the-valley-of-the-...


When you said you were faking a proper agent with `requests`, do you mean you were setting the headers to look like a browser, as in here?: https://stackoverflow.com/questions/27652543/how-to-use-pyth...

That was going to be my suggestion for how to get around the anti-robot responses.


No agent at all is required. I got past the anti-robot response using no user-agent header and a simple delay.


yeah so I have a user-agent generator that uses a random (valid) agent for each request.

that, combined with cookies enabled, still didn't seem enough to get around the anti-robot barrier 100% of the time. 20% of the time, I get stopped.

My guess is that they must do some kind of pattern matching on their end that kicks for some sets of requests but not others? But maybe it's not that sophisticated and I'm just missing something else in my headers
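For reference, the setup described here (a random User-Agent per request plus a cookie jar) might look roughly like this using only the standard library rather than `requests` or mechanize. The User-Agent strings are illustrative examples and the function names are hypothetical; whether this is enough to avoid the anti-robot page is exactly the open question in this sub-thread.

```python
import random
import urllib.request
from http.cookiejar import CookieJar

# Illustrative User-Agent strings only; a real pool would need maintaining.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0",
]


def build_opener_with_cookies() -> urllib.request.OpenerDirector:
    """Opener that stores cookies and sends them back on later requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )


def build_request(url: str) -> urllib.request.Request:
    """Request carrying a randomly chosen User-Agent header."""
    headers = {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
    }
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    opener = build_opener_with_cookies()
    request = build_request("https://www.amazon.com/s/?k=moonstruck+1987")
    # Inspect the headers only; opener.open(request) would do the real fetch.
    print(request.header_items())
```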


Reading Ebert's Great Movies site was hugely formative for me. I also love that he's low-key one of the best writers about addiction and recovery. Side note: A colleague who worked at the Sun-Times when Ebert also did recently told me about how whenever management threatened cuts, he'd come into the newsroom to throw his weight around against it. Even when he was getting to be in ill health. Much respect.


I think this is referring to his list of "Movies You Must See Before You Die". On that note, I should mention "Gates of Heaven", which is part of that list. A compelling documentary on animal cemeteries, by Errol Morris.

https://www.rogerebert.com/reviews/great-movie-gates-of-heav...


Self-promotion (of sorts), but I (with some friends) have been watching lots of movies and keeping globally ordered rankings on github here: https://github.com/chillee/movierankings

I find globally ordered rankings of movies to be an interesting exercise of consistency.


Folks interested in this topic might like my new side-project: get an email when your favorite director releases a new movie - https://directoralerts.website


Minor instruction to make it work for me:

    pip3 install BeautifulSoup
    pip3 install mechanize
    mkdir data

I guess the script could check for the data directory itself, but I'm not sure about the pip3 part. Python isn't like R, and I'm not sure whether it can pip3 install a package when it doesn't exist ....


Only three movies in the past forty years, none past '87, is surprising. Is Prime that full of older content or did Ebert's recommendations just stop coming from major studios?


Using https://www.rogerebert.com/great-movies you can filter by date


Thanks, but that doesn't explain why Prime doesn't have any released after 87.


I'm having a hard time coming up with the data to confirm this theory but 1986/1987 seems to me to be the peak of both VHS sales and the bottom of VCR prices.

I assume from that many studios had to come up with their first licensing schemes for movies "for home use" (contrary to movie theater and broadcast use), which could potentially still apply to (and restrict) streaming.

Once again, this is my personal hypothesis (and I'd be happy to see some data to support or contradict it).


1988 was the year of the Writers Guild strike, so while your exacts may be mistaken you're probably right that the licensing scheme is involved.


Presumably because contracts with studios won't let them stream those.

Why won't they get streamed? 1980 is 40 years ago, and movies pushing 40+ years old aren't in the day-to-day zeitgeist and aren't getting a lot of searches. Newer movies still catch eyeballs occasionally. I'd bet '87 is where the long tail drops off noticeably.


It's been a while since I've looked at Prime's catalog, but is it really exclusives plus forty-year-old content? I understand why those distribution rights are more valuable, but that is also why Amazon would desire them more.


I cancelled my Amazon prime subscription, because of the ridiculous things they did in France.

And following the dark pattern path to unsubscribe. I'm happy that I did it.

Fyi, I'm from Belgium


I didn't hear about this. Can you give a quick summary of what actions Amazon took in France that you are referring to?


Useful information but the results look like garbage in Chrome and IE. The multi-line titles all run together and would greatly benefit from a table outline.


Safari too. I just discovered that modern user agents don't include table borders by default in their stylesheets -- at least, choosing Develop -> Disable Styles didn't make the table borders appear.

I'm having flashbacks to the chiseled borders of Netscape.


there's a gdoc spreadsheet link at the end


Are these free on Amazon or is this just an advertisement for Amazon Prime (or does everyone have a subscription but me?)


Not an advertisement for amazon prime :) These are free with prime, not free for non-subscribers

I'm actually going to be cancelling my prime membership, which is another reason I wanted to see what I could watch for "free".


Everyone had a subscription at one point; not so sure anymore.


If the author reads this, the link in the Google sheet to the Battle of Algiers goes to Algiers, a totally different movie.


i'm fixing this list ... just realized a few bugs in the code. sigh


Don't mean to undermine this guy's screen-scraping adventure... but if you want to use something that will tell you all the streaming services that have the list of movies, you can use:

https://letterboxd.com/dvideostor/list/roger-eberts-great-mo...

You can look at each movie to see what streaming service it's on one at a time for free.

If you have a pro paid account, you can even do:

https://letterboxd.com/dvideostor/list/roger-eberts-great-mo...

Which shows that there are 39 movies in Amazon Prime US from Ebert's "Great Movies," not 21 like this guy's spreadsheet says.

To be fair, the exercise was to scrape the reference sources... so it might just need some refinement.

I still need to double-check whether both lists are correct, though; I only confirmed the totals.

Full disclosure: That letterboxd list is not mine, I just found it


FWIW, I screen scraped rogerebert.com and copied all of his ratings and an excerpt of every review to letterboxd:

https://letterboxd.com/re2/

Just the great movies:

https://letterboxd.com/re2/tag/great-movie/films/by/release-...

You can then filter those by streaming service, but you need a pro account. Looks like 38 movies:

https://letterboxd.com/re2/tag/great-movie/films/on/amazon-p...

https://ibb.co/KFSj9jg

Apparently I missed the Buster Keaton movies:

https://www.rogerebert.com/reviews/great-movie-the-films-of-...

https://letterboxd.com/director/buster-keaton/

But that means 39 isn't quite right either since Ebert is saying all of Buster Keaton's films are great.

Anyway, the scraping was easy. The harder problem was parsing the html reviews (even with BeautifulSoup, the html is a mess), and then matching the reviews on Ebert's site to the correct movie, which I did via queries to tmdb and a lot of heuristics. There's nearly 8000 reviews and many have wrong years, bad titles, etc on rogerebert.com. It was a fun spare time project for a couple weeks.


Nice! Yeah, I wish letterboxd was free somehow without ads and they made their beta api public.

Yeah, I bet there's not a great standard for normalizing or correcting titles, or for distinguishing when a movie was made from when it was released, plus translations and imports.

Good work.


I ended up using the director and cast that are listed for most reviews on Ebert's site for matching the right movie. Even that required some tricks due to spelling errors or differences in how names were listed. I then flagged any matches that weren't unique or where the title wasn't similar enough for me to manually review. I think I only had to double-check about a hundred or so.
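
For anyone curious, here is a rough sketch of that kind of matching. It is not the actual code used: it assumes each candidate from a TMDB-style lookup is a plain dict with hypothetical `title` and `director` keys, uses `difflib` from the standard library for title similarity, and the 0.8 threshold is an arbitrary guess.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse everything except letters and digits to single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def title_similarity(a, b):
    """Similarity ratio between two titles, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_review(review, candidates, min_similarity=0.8):
    """Return (best_candidate, needs_manual_review).

    `review` and each candidate are dicts with hypothetical
    "title" and "director" keys (an assumption for this sketch).
    """
    if not candidates:
        return None, True

    # Prefer candidates whose director matches the review's director.
    same_director = [
        c for c in candidates
        if normalize(c["director"]) == normalize(review["director"])
    ]
    pool = same_director or candidates

    scored = sorted(
        pool,
        key=lambda c: title_similarity(review["title"], c["title"]),
        reverse=True,
    )
    best = scored[0]
    best_score = title_similarity(review["title"], best["title"])

    # Flag when the runner-up also looks like a plausible match.
    ambiguous = (
        len(scored) > 1
        and title_similarity(review["title"], scored[1]["title"]) >= min_similarity
    )
    needs_review = best_score < min_similarity or ambiguous or not same_director
    return best, needs_review
```

Anything that comes back with `needs_manual_review` set to `True` would go into the pile to check by hand.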

I didn't use the letterboxd api. Instead, I generated csv files for the letterboxd importer. I then did a csv export from their site I could reconcile to look for import errors.

Trivia: Ebert reviewed a few adult films which I couldn't import to letterboxd because the site officially doesn't allow those.

BTW, it's only $19/year for an account. I have my own account I pay for which follows the re2 account. That way I can easily see any of the re2 reviews for movies in my own watchlist.


Yep, I caved and got an account last month too


I've added Buster Keaton's silent films from 1920-1929 since those are the ones Ebert is referencing in his review. That brings the total number of Great Movies available on Prime up to 43.


"Which shows that there are 39 movies in Amazon Prime US from Ebert's "Great Movies," not 21 like this guy's spreadsheet says."

I could be wrong, I am not a Prime Video user, but the result I got was that there are 217 movies in Prime Video from Ebert's great movies.

    links -dump 1.html | grep -c Prime.Video

Instructions on how to generate 1.html are here: https://news.ycombinator.com/item?id=23508182


A weekend project idea for geeks like me who like films but feel like they have already watched everything. I find that imdb ratings have a high (but not 1.0) correlation with me liking the movie (provided I like the genre). You can still download from imdb flat files that contain all movies, ratings, as well as cast/directors/producers/writers. Stick that in a database, with a basic UI to hide movies you have already watched. And you can make a good personal recommendation engine for movies you didn’t suspect exist. The power of this approach is that imdb is pretty much an exhaustive catalog.


I tend to like movies/shows on Rotten Tomatoes, where the critics like the movie less than the other users, e.g. The Orville https://www.rottentomatoes.com/tv/orville

P.S.: And audience score should roughly be above 70.


I tend to prefer RT over IMDB as a starting point because the former assesses whether a review is generally positive or negative. It gives me an idea of whether a movie/show is worth looking into. If even 70% of reviews are positive then I'll drill down and skim the actual reviews to get an idea of whether it's something I will personally like. If it's something like 5% of reviews that are positive, I probably keep looking.

IMDB always seems to skew mid-high since they're averaging out lots of "star" ratings. Seeing 3.5 stars doesn't tell me much. But even if 90% of the reviews were "nothing amazing, but fun to watch" then there's a chance it might be fun to watch.
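
To put made-up numbers on that: the scores below are invented, and the 6-out-of-10 cutoff for a "positive" review is my own assumption, not how Rotten Tomatoes actually classifies reviews. The point is only that two titles with nearly the same average can have very different shares of positive reviews.

```python
def percent_positive(scores, cutoff=6.0):
    """Share of reviews at or above the cutoff, as a percentage."""
    positive = sum(1 for s in scores if s >= cutoff)
    return 100.0 * positive / len(scores)


def mean_score(scores):
    """Plain arithmetic mean of the review scores."""
    return sum(scores) / len(scores)


# Invented data: nine "nothing amazing, but fun to watch" reviews and one pan.
crowd_pleaser = [6.5] * 9 + [4.0]
# Invented data: half the reviewers love it, half hate it.
divisive = [10.0] * 5 + [2.0] * 5

for name, scores in [("crowd_pleaser", crowd_pleaser), ("divisive", divisive)]:
    print(name, round(mean_score(scores), 2), percent_positive(scores))
```

The averages land within a quarter point of each other, while the positive share is 90% for one and 50% for the other.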

Obviously it's just a tool for sorting and surfacing options, but it can be useful. I probably won't depend on a review aggregator to decide whether or not to go see a movie in the theater, but they are helpful when I see a movie/show pop up on Netflix or Amazon and I want to see if the general consensus is "worth a watch" or "effing terrible".


I completely agree with this. Whenever critics' and users' assessments of a movie or a TV show diverge, it is almost universally the critics who end up being wrong (in my subjective view).

They sometimes over-rate things people in their position cannot be seen disliking because it's about Societal Importance or some such thing that has little to do with actual quality. And conversely, they sometimes ignore or under-rate things because they cannot be seen to overly praise a work that's criticised for things having little or nothing to do with actual quality.

Thing is, I don't need to be culturally influenced, have my outdated views updated to match what's currently considered mainstream, my privilege checked, etc. I just want to watch a good fucking movie and decide for myself whether it contains a message that changes my mind on a topic.

For example, critics and users disagree on The Shape of Water[0] (users are right, it's very mediocre). They also disagree on Green Book[1] (users are right again, it's a great film even if it doesn't tick all the wokeness and political correctness checkboxes).

[0] https://www.metacritic.com/movie/the-shape-of-water

[1] https://www.metacritic.com/movie/green-book


> "For example, critics and users disagree on The Shape of Water[0] (users are right, it's very mediocre). They also disagree on Green Book[1] (users are right again, it's a great film even if it doesn't tick all the wokeness and political correctness checkboxes)."

no, the green book is awful, a particularly bad example of the feel-good movie, never once inviting the viewer to get lost in a believable world, instead inviting the viewer to question every directorial decision made. it was fake and pretentious at the same time, and completely safe around race relations.

the shape of water was not perfect, but better. not the best example of interracial relations (metaphorically), but gentle, revealing, quirky, ambient, and unpretentious.


I've generally found in life that my own preferences are more in line with the non-professional crowds than any self-anointed critics or other experts.


I'm probably the opposite but it's not really surprising that there's a divergence between the two. The box office makes it pretty clear that big budget popcorn movies/summer blockbusters/etc. are very popular with mainstream audiences. Whereas people who feel a calling to be film critics (or critics of any sort of culture) tend to be a bit dismissive of at least cookie-cutter flavors of blockbusters (e.g. Transformers, most modern superhero movies, etc.) and more appreciative of indie and other smaller releases even if they aren't full-on fans of European and Asian art house fare.

Even Ebert, who was a relatively "everyman" sort of critic, definitely had a serious film critic persona if you've ever listened to any of his DVD commentaries.


If you watch enough movies, the cookie-cutter type gets boring. If you watch once in a while, the cookie-cutter formula can be reassuring: whatever leisure time you have, you will get a predictable experience.


I think you touch on something else as well. (And this also applies to other formats like reading.) Some people just want entertainment/escapism. Others are going to (also) appreciate craft, to be challenged, etc.


So, critics are for the movie-privileged? Seems to be an upper-class thing if one can watch a lot of movies to the point that things become "boring".


I've watched at least 500 movies and probably 300 full TV series.

It does get repetitive. It is boring. It does become a job.

Critics are elites.

Same thing happens in music. You get critics rating albums in magazines. Some people rely on them but most listen to the radio and pick songs they like based on that.


I tend to find that the movies I really like have a strongly bimodal distribution of comments on IMDB - some love and some hate.

e.g. "Blood Machines" - which I can see why a lot of people dislike it but I think it is awesome.

Edit: Probably not a great idea to search for it if you have a problem with nipples.


Do the movies you really dislike have the same distribution of comments? :)

I do think that there is something to media that creates a "love it or hate it" response. It's usually trying something new (Blood Machines certainly was).

p.s.: I saw Blood Machines when it premiered on Shudder & really enjoyed it.


Good question - I checked IMDB for Force Majeure and it does seem to have that "love it or loathe it" distribution - I tend to agree with the comment about it being "mind-numbingly pretentious drivel" (which I guess some people would apply to Blood Machines).


> I find that imdb ratings have a high (but not 1.0) correlation with me liking the movie

You could track your own personal rating of the movies, then regress these ratings against a variety of external rating sources (IMDB, ebert, etc) to see which is the most predictive.
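
A minimal sketch of the idea, with one simplification: instead of a full regression it computes a plain Pearson correlation between your ratings and each source separately, which is enough to rank the sources. The dict layout (`{title: rating}`) and the minimum of three shared titles are assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def rank_sources(my_ratings, sources, min_shared=3):
    """Rank rating sources by how well they track my own ratings.

    my_ratings: {title: rating}
    sources:    {source_name: {title: rating}}
    Returns a list of (source_name, correlation), best first.
    """
    results = []
    for name, ratings in sources.items():
        shared = sorted(set(my_ratings) & set(ratings))
        if len(shared) < min_shared:
            continue  # too few titles in common to say anything
        mine = [my_ratings[t] for t in shared]
        theirs = [ratings[t] for t in shared]
        results.append((name, pearson(mine, theirs)))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

The different rating scales (stars, percentages, out of 10) don't matter here, since correlation is unaffected by linear rescaling.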


Along the same lines, MovieLens has been an ongoing project since 1997. The data is also available: https://grouplens.org/datasets/movielens/


>You can still download from imdb flat files that contain all movies, ratings, as well as cast/directors/producers/writers.

What's the easiest way to do this?


https://datasets.imdbws.com/

It's actually harder to find these days if you go through their developer portal, which was recently revamped around their official paid API.
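
Once you have the files, loading them into SQLite takes only a few lines. This is a sketch under some assumptions: it uses `title.basics.tsv.gz` and `title.ratings.tsv.gz` from that page, relies on the column names `tconst`, `titleType`, `primaryTitle`, `startYear`, `averageRating` and `numVotes`, and treats `\N` as the missing-value marker. The table layout and function names are my own.

```python
import csv
import gzip
import sqlite3

MISSING = "\\N"  # marker the dataset uses for missing values


def read_tsv(path):
    """Yield one dict per row from a gzip-compressed tab-separated file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row


def build_db(basics_rows, ratings_rows, db_path=":memory:"):
    """Load movie titles and ratings into SQLite and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE movies (tconst TEXT PRIMARY KEY, title TEXT, year INTEGER)"
    )
    conn.execute(
        "CREATE TABLE ratings (tconst TEXT PRIMARY KEY, rating REAL, votes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO movies VALUES (?, ?, ?)",
        (
            (
                r["tconst"],
                r["primaryTitle"],
                None if r["startYear"] == MISSING else int(r["startYear"]),
            )
            for r in basics_rows
            if r["titleType"] == "movie"  # skip series, episodes, shorts, etc.
        ),
    )
    conn.executemany(
        "INSERT INTO ratings VALUES (?, ?, ?)",
        (
            (r["tconst"], float(r["averageRating"]), int(r["numVotes"]))
            for r in ratings_rows
        ),
    )
    conn.commit()
    return conn


def top_movies(conn, min_votes=1000, limit=10):
    """Highest-rated movies with at least `min_votes` votes."""
    query = (
        "SELECT m.title, m.year, r.rating FROM movies m "
        "JOIN ratings r USING (tconst) "
        "WHERE r.votes >= ? ORDER BY r.rating DESC LIMIT ?"
    )
    return conn.execute(query, (min_votes, limit)).fetchall()
```

Usage would be something like `conn = build_db(read_tsv("title.basics.tsv.gz"), read_tsv("title.ratings.tsv.gz"), "imdb.db")` followed by `top_movies(conn)`. A "watched" table to hide things you've already seen is an easy addition from there.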


A long time ago, you used to be able to download flat files of things like soundtracks and connections. Apparently that is gone.


Assuming it's the same one I used in the past, https://imdbpy.github.io/ effectively does this, putting the files into a local SQLite database for you.


On a tangent: I dream of a web where whenever there are sets of items (eg eberts-great-movies, and movies-on-amazon) you can easily apply set operations (like intersection) on them (so if ‘n’ stands for ‘intersection’, eg eberts-great-movies n movies-on-amazon).

So, in effect if you’re on a site that deals with a set of items, like the amazon prime movies, you can tell the browser to intersect this set with a different set at another URL.

I understand that doing this would require the right ‘infrastructure’ to be in place.
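
Without that infrastructure, the do-it-yourself version is just a set intersection over normalized titles. A toy sketch, where both lists are placeholder data rather than real catalog contents, and where matching on title alone is an assumption that will misfire on remakes and same-named films:

```python
def normalize(title):
    """Lowercase and collapse punctuation/whitespace so near-identical titles compare equal."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def intersect_titles(left, right):
    """Titles from `left` (original spelling, original order) that also appear in `right`."""
    right_keys = {normalize(t) for t in right}
    return [t for t in left if normalize(t) in right_keys]


# Placeholder lists standing in for eberts-great-movies and movies-on-amazon.
eberts_great_movies = ["The Battle of Algiers", "Casablanca", "Ikiru"]
movies_on_amazon = ["casablanca", "Algiers", "IKIRU "]

print(intersect_titles(eberts_great_movies, movies_on_amazon))
```

A more robust version would key on a shared identifier (an IMDb or TMDB id, say) instead of the title string.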


Ironically the pirate movie scene has tools like this[1][2], downloading new stuff based on certain rules is completely automated.

All the paid options hold on to their APIs way too tightly for anything like this to be viable.

I'd love to have a service that would cross-reference my queue of movies and TV shows and tell me which service has the most matches so that I'd know which one to pay for each month.

[1] https://radarr.video [2] https://couchpota.to
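
If the catalog data were available, the cross-referencing itself would be trivial. A sketch assuming you somehow already have each service's catalog as a list of titles (which is exactly the part nobody exposes); the service names in the usage below are placeholders:

```python
def rank_services(watchlist, catalogs):
    """Rank services by how many watchlist titles they carry.

    watchlist: iterable of titles
    catalogs:  {service_name: iterable of titles}
    Returns a list of (service_name, matching_titles), best first;
    ties are broken alphabetically by service name.
    """
    wanted = set(watchlist)
    matches = {
        name: sorted(wanted & set(titles)) for name, titles in catalogs.items()
    }
    return sorted(matches.items(), key=lambda item: (-len(item[1]), item[0]))
```

Calling `rank_services(my_queue, {"service_a": [...], "service_b": [...]})` and taking the first entry tells you which subscription to pay for this month.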


I don't know why there's no good way to know what's streaming on a certain service. Seems like the only way is through third party sites that are probably made up of people manually adding to the list.

If everyone is going to make their own streaming service then there needs to be a standard interface to make them collectively easier to use. A "guide", if you will.


Unfortunately their profit incentive is the opposite: to train you to watch what they choose to recommend, rather than take a step back and look at what isn't there. See the hollowing of the Netflix catalog.


There was briefly an MPAA backed service for this whose name is escaping me, which I used quite a bit. Unfortunately it fairly quickly devolved into only giving links to services where you had to pay per-movie to watch.


This so-called semantic web was all the rage but somewhere between then and now walled gardens screwed it all up.



