
Show HN: MechanicalSoup, Python library for automating interaction with websites - mattme
https://github.com/hickford/MechanicalSoup
======
flexd
An alternative to this is RoboBrowser [1], which is also based on requests +
BeautifulSoup4 and seems a lot more mature.

[1]
[http://robobrowser.readthedocs.org/en/latest/readme.html](http://robobrowser.readthedocs.org/en/latest/readme.html)

~~~
mattme
Haha! If I'd known, I would have used that myself.

~~~
mattme
Still, it proves the idea is a good one!

------
wc-
Mechanize has been a core part of quite a few of my projects lately, and the
fact that it hasn't been modified in over 2 years has been very worrisome.

There are lots of edge cases out there on websites. Mechanize has built up
years of fixes and workarounds for these; I hope that MechanicalSoup can learn
from them the easy way rather than making the same mistakes again.

I also hope that this repo grows into a bigger community of support, not just
one person contributing (who could leave / get bored at any time). Looking
forward to following this!

~~~
danso
Er...that's just the Python Mechanize, right? Ruby's Mechanize has been
regularly updated and patched, though I can't say I've used it to the extent
that I've run into infuriating edge cases:
[https://github.com/sparklemotion/mechanize](https://github.com/sparklemotion/mechanize)

~~~
wc-
Right, Python's mechanize at
[https://github.com/jjlee/mechanize](https://github.com/jjlee/mechanize) seems
to be pretty stagnant.

------
lazerwalker
I personally find Capybara[0] to be the happy medium for web scraping, if
Python isn't a hard requirement. It has a simple API, like MechanicalSoup, but
it can also easily be configured to use Selenium, node-webkit, or any other
browser you want for full proper JS evaluation.

[0][http://github.com/jnicklas/capybara](http://github.com/jnicklas/capybara)

~~~
actionscripted
Unfortunately there isn't a Capybara library for Python. There are comparable
packages like Lettuce and now MechanicalSoup (to a certain extent).

------
Deusdies
This is fantastic. I've used python mechanize in some very large projects and
it was very frustrating - their lack of documentation and, well, the fact that
it's complete "abandonware".

I've had the mechanize repository cloned for a year now, planning to do
something with it, but never got around to it. Looks like MechanicalSoup just
got themselves a new contributor!

------
diminoten
Sell me MechanicalSoup over Selenium.

~~~
wc-
Selenium seems very heavy-weight to me (granted, I have only used Selenium
Server). If you don't need to interpret JavaScript after a page loads, then you
might be able to use mechanize. In my experience I've gotten better
performance and easier development with mechanize than with Selenium or
other "full" headless browsers. Different tools for different jobs, I suppose;
I just tend to go for the smallest tool first.

edit: replace mechanize with mechanicalsoup in the above paragraph, they are
aiming to solve the same problems in the same way.
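
To illustrate the "no JavaScript needed" case described above, here is a rough
sketch of what this family of tools does with a form: read the HTML, collect
the fields, and build the request a browser would send. It uses only Python's
standard library and is not the actual API of mechanize or MechanicalSoup; the
names `FormParser` and `build_submission`, the sample HTML and the example.com
URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect action, method and <input> fields of the first <form> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and "name" in attrs:
            # Keep pre-filled values (e.g. hidden CSRF tokens) as defaults.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms on the page


def build_submission(page_url, html, overrides):
    """Return (method, absolute_url, urlencoded_body) for the first form."""
    parser = FormParser()
    parser.feed(html)
    data = {**parser.fields, **overrides}
    return parser.method, urljoin(page_url, parser.action), urlencode(data)


# Made-up page: no network access or JavaScript is involved.
sample = """
<form action="/login" method="post">
  <input type="hidden" name="csrf" value="abc123">
  <input type="text" name="user">
  <input type="password" name="password">
</form>
"""
print(*build_submission("http://example.com/start", sample,
                        {"user": "alice", "password": "s3cret"}))
# prints: post http://example.com/login csrf=abc123&user=alice&password=s3cret
```

Anything the page only fills in via script would be missing from `fields`,
which is the point at which a full browser such as Selenium becomes necessary.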

~~~
dkhar
> Selenium seems very heavy-weight to me

Fair point. Perhaps try Sulfur?

~~~
mhluongo
I started googling, excited about discovering a new library, only to realize
what you'd done there -_-

~~~
mherrmann
Try "Helium web automation" ;-)

------
goorpyguy
Does it have a JavaScript engine? Because we had to abandon
BeautifulSoup/Mechanize over this a couple of years ago and switch to HTMLUnit
(Java).

------
jdnier
There's not a lot to it so far (a single class, three tests). I wonder if the
author has a road map for the project.

------
webmaven
How does MechanicalSoup (or RoboBrowser, for that matter; this is the first
I've heard of either) compare to Scrapy?
[http://scrapy.org/](http://scrapy.org/)

------
rhgraysonii
Is there any documentation on a roadmap for the project going forward? Would
love to send some PRs your way :)

------
supsep
This is exactly what I was looking for for my next project. I was trying to do
this with Node.js to no avail. Thanks!

------
jpd750
Pretty cool, thanks for sharing!

------
volent
Would that be an equivalent to Casper.js?

