
Finding Free Food with Python - jamesbvaughan
http://jamesbvaughan.com/python-twilio-scraping/
======
dheera
So "back in the day" when I was an undergrad at MIT I signed up for every
mailing list I could get myself onto and then trained a Bayesian spam filter
to recognize free food emails. I threw all free-food e-mails in a freshly-
untrained spam box and sure enough, after a few hundred e-mails it kept
putting free food e-mails in one folder which was extremely convenient.
Spamicity (i.e. free-food-icity) for phrases like "bring spoon" and "thesis
defense" was particularly high.
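The spamicity idea can be sketched with a tiny word-frequency model (a toy example, not dheera's actual filter; the two corpora below are made up):

```python
from collections import Counter

def spamicity(food_emails, other_emails):
    """Per-word P(free-food | word), estimated from two labeled corpora."""
    food = Counter(w for e in food_emails for w in e.lower().split())
    other = Counter(w for e in other_emails for w in e.lower().split())
    scores = {}
    for word in set(food) | set(other):
        f = food[word] / len(food_emails)    # rate in free-food mail
        o = other[word] / len(other_emails)  # rate in everything else
        scores[word] = f / (f + o)
    return scores

scores = spamicity(
    ["free pizza after the thesis defense bring spoon",
     "leftover bagels in the lounge"],
    ["problem set due friday",
     "lecture moved to room 32"],
)
# "pizza" only ever appears in free-food mail, so its score is 1.0
```

A real Bayesian filter would also smooth the counts so words seen in only one corpus don't score exactly 0 or 1.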

~~~
douche
I don't know how it was implemented, but there was a similar free food
blitz[1] list at Dartmouth in my day. I swear, just catering the campus-funded
student organizations' events must have floated half the restaurants in town.
Most days, you could eat like a hobbit.

[1] Dartmouth is a weird place and for the longest time ran a unique homegrown
emailish/im protocol called BlitzMail, with an entire vocabulary and culture
associated with it.

~~~
dheera
Interesting. Read a little about BlitzMail. MIT had Zephyr as an IM system
which was kind of fun.

I miss the days of those old IM systems that I could write my own processing
pipelines and UIs for. It's sad that in these walled-garden days of WhatsApp,
WeChat et al. there is no way for me to programmatically access my own
incoming messages.

------
danso
I've been wanting to make my students do, as a side project, a free-food
searcher for Stanford. Wouldn't have to be hard at all: Download (or even
wget) the CS/Engineering calendar, do the minimal scraping needed to figure
out if the words "food", "meal", "dinner", "will be served", or "refreshments"
appear anywhere in the text, then return True.

That's the main thing; the rest is just gravy: returning the event serialized
as JSON/CSV (event name, date/time, URL) to be used in a web app or
notification system -- so a simple web scraper can lead to exploring web dev
(even just a simple Flask app) or fun APIs like AWS SNS and Twilio. You could
even fit in a good "cache invalidation is a hard problem" lesson.
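The core of that assignment might look something like this (a sketch: the event structure and URLs are invented, and a real version would first scrape the calendar HTML with something like BeautifulSoup):

```python
import json
import re

# the keyword test described above
FOOD = re.compile(r"food|meal|dinner|will be served|refreshments", re.I)

def find_food_events(events):
    """Keep events whose description mentions food; serialize the essentials."""
    return [
        {"name": e["name"], "time": e["time"], "url": e["url"]}
        for e in events
        if FOOD.search(e["description"])
    ]

events = [
    {"name": "Thesis defense", "time": "2017-05-01 16:00",
     "url": "http://example.edu/events/1",
     "description": "Refreshments will be served."},
    {"name": "Faculty meeting", "time": "2017-05-02 10:00",
     "url": "http://example.edu/events/2",
     "description": "Agenda attached."},
]
print(json.dumps(find_food_events(events)))
```

The JSON output is then ready to feed into a Flask view or an SNS/Twilio notification.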

I never get around to assigning it because most people don't think of it as
"serious work". Also, I'm afraid the CS dept. will obfuscate their calendar if
random people start showing up to things for free food. But I keep telling
students, the only/best way to learn coding is to do something that directly
affects your life or your bottom line. It's the best way to put constraints on
a project, i.e. think of things as the MVP that improves your life.

I learned Ruby and Ruby on Rails much faster than I had any right to, when my
new job in NYC required it. I practiced not by writing Ruby on the job, but
writing Ruby to scrape Craigslist apartment listings and feeding them into a
spreadsheet.

I've thought about creating "personal data/programming projects"...in which
the data comes from the student. Such as the SQLite that stores their
Chrome/Firefox/Safari history. Or the parseable HTML dump that Facebook gives
you when you request your records. Or your Twitter data dump. Or your Google
search history.

But I've been hesitant to do this. Partly because there's no guarantee that
every student uses Google or Facebook or has an Instagram. And partly because
I'm deeply paranoid students (especially those who are novices about
programming and operating systems) will accidentally upload or otherwise
expose this sensitive data dump.

~~~
Abundnce10
_I learned Ruby and Ruby on Rails much faster than I had any right to, when my
new job in NYC required it. I practiced not by writing Ruby on the job, but
writing Ruby to scrape Craigslist apartment listings and feeding them into a
spreadsheet._

I started teaching myself computer programming in 2011/2012: I bought a bunch
of Head First books published by O'Reilly, watched the entry-level CS video
lectures that Stanford provided for free, participated in the first few
Udacity courses, and I tried a couple Coursera classes. But I think I learned
the most (in the shortest amount of time) when I went through your online
book, 'The Bastards Book of Ruby'[0].

I loved the real-world examples you used: fetching Tweets through the Twitter
API, web scraping with Nokogiri, and manipulating images with
ImageMagick/RMagick are a few that stick out to me.

I'm sure you're doing great things at Stanford but I'm also confident that you
could come out with a new book (covering the topics you spoke of above) that
would help motivate people who are on the fence on whether or not they should
start/continue learning computer programming.

I'd be happy to collaborate on something but I would argue that Python would
be a better language to use than Ruby. Here are some of the topics I'd like to
share with people (which I've found to be useful in my career/side-projects):
reading data from CSV/Excel files (xlrd[1]), fetching data from APIs
(requests[2]), web scraping (BeautifulSoup[3], Selenium[4]), connecting to a
SQL database within a Python script (psycopg[5]), complex mathematical
computations (numpy[6]), and downloading videos/metadata from YouTube
(youtube-dl[7]) are a few that come to mind.

[0] [http://ruby.bastardsbook.com/](http://ruby.bastardsbook.com/)

[1] [https://github.com/python-excel/xlrd](https://github.com/python-excel/xlrd)

[2] [http://docs.python-requests.org/en/master/](http://docs.python-requests.org/en/master/)

[3] [https://www.crummy.com/software/BeautifulSoup/](https://www.crummy.com/software/BeautifulSoup/)

[4] [http://selenium-python.readthedocs.io/](http://selenium-python.readthedocs.io/)

[5] [http://initd.org/psycopg/](http://initd.org/psycopg/)

[6] [http://www.numpy.org/](http://www.numpy.org/)

[7] [https://rg3.github.io/youtube-dl/](https://rg3.github.io/youtube-dl/)

~~~
danso
Thanks for the kind comment. I have been thinking of putting together a Python
and SQL book since I teach those primarily and no current book fits my needs.
I hope it will have the same appeal, except that I'm a much better programmer
these days :)

------
TeMPOraL
I wonder how those "free food" promotions make sense, business-wise. As this
article shows, by running such a promotion you hit a completely different (and
useless for you) client base - people just waiting for the promotions. This
feels similar to the way most people seem to use Groupon - they're interested
in using whatever's currently on big discount, and they won't be coming back
to a place at its regular pricing.

~~~
bryanrasmussen
I guess I fit this profile, only I feel guilty about it. So if I do like it I
will come back. Really, it only gets me back if it's something I would do
frequently enough to use the service. For example, a deal for a hotel or a
massage I will probably only use once and not return, but a deal for a
restaurant I end up liking will get me back.

~~~
TeMPOraL
I'm not necessarily saying you should feel guilty about this; companies invent
many ways to screw you over, so I'd say a small amount of exploiting the very
rules they set up is in order. The point is, some of those promotional
strategies have obvious problems which have already been demonstrated in the
past, so I'm not sure why people keep implementing them.

------
guaka
Finding free food with PHP:
[http://trashwiki.org/en/Main_Page](http://trashwiki.org/en/Main_Page)

~~~
NTripleOne
Not sure why you're being downvoted, I laughed - and I'm a PHP developer.

------
DerJacques
Great read that shows how web scraping can easily be utilised in a meaningful
way.

However, doesn't this script either send out the same text very often, or
potentially send it out too late (by e.g. only letting the cronjob run every 6
hours)?

I assume that time is of the essence in this situation. Some sort of log on
sent texts would surely be helpful.
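One way to sketch that log: a plain file of already-notified promo IDs, checked before each send. (This is an illustration, not the author's actual fix; `send_text` stands in for the Twilio call and the filename is made up.)

```python
import os

SEEN_FILE = "sent_promos.txt"  # hypothetical log of promos already texted

def already_sent(promo_id):
    """True if this promo ID is already in the log file."""
    if not os.path.exists(SEEN_FILE):
        return False
    with open(SEEN_FILE) as f:
        return promo_id in f.read().splitlines()

def notify_once(promo_id, send_text):
    """Send at most one text per promo, however often the cron job runs."""
    if already_sent(promo_id):
        return False
    send_text(promo_id)
    with open(SEEN_FILE, "a") as f:
        f.write(promo_id + "\n")
    return True
```

Running the cron job every few minutes then becomes safe: repeat sightings of the same promo are silently skipped.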

~~~
jamesbvaughan
Right, that's a very good point! That actually got very annoying when I first
deployed this script, but I ended up adding a kind of logging to it. I wanted
to keep the script in the post as simple as possible, but here is the one I'm
actually using:
[https://gist.github.com/jamesbvaughan/4c501fc99acb75852756a4...](https://gist.github.com/jamesbvaughan/4c501fc99acb75852756a4d1dfc8ca3d)

------
tedmiston
Cool project. I do something similar for the daily free ebook from Packt. Also
for when new episodes of Silicon Valley are posted.

Tiny detail - you can make the regex match case-insensitive and avoid the call
to lower() for every string:

    re.match(pattern, string, re.I)

~~~
douche
Do you have that Packt script handy?

There's a lot of junk that gets put up there, but also some gems.
Unfortunately it seems like the gems are always on the days I don't check
it...

~~~
tedmiston
So today I noticed that they announced additional free ebooks on Twitter
besides the daily one. I'm thinking of writing custom code now that subscribes
to their tweets and filters for mentions of "free". May be doable with IFTTT.
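The filtering step could be as simple as a word-boundary regex over the tweet text (the tweets below are invented; the real ones would come from the Twitter API or an IFTTT trigger):

```python
import re

# \b keeps "Freelance" etc. from matching; re.I covers "FREE"/"Free"
FREE = re.compile(r"\bfree\b", re.I)

def free_ebook_tweets(tweets):
    """Keep only tweets that mention 'free' as a whole word."""
    return [t for t in tweets if FREE.search(t)]

tweets = [
    "Today's FREE eBook: Learning Python",
    "New release: Mastering Docker, 2nd Edition",
    "Freelance tips for developers",  # should not match
]
```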

------
t0mk
I wonder how difficult it would be to do this with cloud services, I mean to
scrape voucher codes for free credit - for instance, when Digital Ocean
announces some promo and you can get 10 USD worth of credit.

There are some sites that allegedly publish coupons, but I feel like a dummy
scrolling through those; they're full of ads and crap.

What would be the proper channels to scrape for promo codes of the cloud
providers? Twitter feeds, something else?

~~~
nl
_What would be the proper channels to scrape for promo codes of the cloud
providers? Twitter feeds, something else?_

Affiliate programs.

------
canremember
If you're interested in free food, I recommend checking out
www.freefoodguy.com. He's a blogger whose sole mission in life seems to be
finding free food deals and sharing them via his email list.

------
tobilarscheid
Regarding free food you could also try natched:

[https://natched.com](https://natched.com)

Disclaimer: I helped build this.

------
akeck
This automation will increase the consumption of free food offers by removing
friction. Restaurants will not make money on their promotions, since the
script users will only consume the free food in their local market.
Restaurants will stop using free food promotions. ;-D

------
abbiya
[https://gist.github.com/mseshachalam/b57907a37763532917fc2ca...](https://gist.github.com/mseshachalam/b57907a37763532917fc2ca79e4d5b77)

A Cloudflare captcha is blocking me from testing this.

~~~
applecrazy
Try changing the user agent string of the requests library to something more
legit, like a Chrome user agent string. Makes most websites less suspicious of
your traffic.
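With the requests library that amounts to passing a headers dict (the UA string below is just an example of a browser-like value, not a magic one):

```python
import requests

# a browser-like User-Agent instead of requests' default "python-requests/x.y"
UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0 Safari/537.36")

# Preparing the request shows the header without hitting the network;
# in a real script you'd just call requests.get(url, headers={"User-Agent": UA}).
req = requests.Request("GET", "https://example.com",
                       headers={"User-Agent": UA}).prepare()
```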

------
chris_chan_
Haha, very nice. I think this might work with
[https://www.groupon.com/](https://www.groupon.com/) too.

------
robertcorey
I dunno how well this would work in practice; in my experience, signing up for
promotional emails results in tons of junk, with actual deals only offered
occasionally. It'd be hard to parse out the real value.

------
ing33k
Postmates uses Cloudflare, and it may sometimes show a captcha page.

------
gandutraveler
New User = Free Food

------
masthead
This is so easy, I wonder how it made it all the way up to the front page.

I write these kinds of mini-scripts all the time.

~~~
ramkarthikk
It could be because the author took the time to give a detailed explanation of
how he implemented it. It shows a beginner programmer that they could automate
simple tasks like this easily, and demonstrates a way to approach it.

~~~
djaychela
Absolutely - I spend a fair bit of time on here, but I'm only a beginning
programmer, using Python. This is exactly the sort of thing that inspires me,
and also it's well written in terms of understanding for someone of my level.
Obviously it's trivial to 99% of HN readers at a technical level, but I think
there's room for this sort of thing.

I teach for a living (music technology), and it's incredible how badly a lot
of things are explained (in all fields). Clear explanation and worked
examples, combined with appropriate progression in difficulty is what makes
for good learning, and I think this is a good example.

~~~
jamesbvaughan
Thanks, that makes me happy to hear! My goal with posts like this is to do
exactly that, and I hope to post more things like it in the future.

