Hacker News new | past | comments | ask | show | jobs | submit login

Does anyone have a good strategy for archiving and downloading the textbooks from aimath.org? They are all excellently formatted in HTML, but I am not certain what the best way to get the complete book would be.



When I have books in form of webpages I normally write a small crawler in python, extract the text div with beautifulsoup, add <hn> tags for chapter names and throw them all together in html form. Add a cover image and combine everything with pandoc.

Nothing fancy but works reliable in an automated fashion




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: