Does anyone have a good strategy for archiving and downloading the textbooks from aimath.org? They are all excellently formatted in HTML, but I am not certain what the best way to get the complete book would be.
When I have books in the form of web pages, I normally write a small crawler in Python, extract the text div with BeautifulSoup, add <hn> tags for the chapter names, and combine everything into a single HTML file. Then I add a cover image and convert the result with pandoc.
Nothing fancy, but it works reliably in an automated fashion.
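A rough sketch of the extract-and-combine step, in case it helps. It is not tuned to aimath.org: the div id "content" is a guess you would have to replace after inspecting the real pages, the names DivExtractor and build_book are made up for this example, and it uses the standard-library html.parser rather than BeautifulSoup so that it runs without installing anything. Fetching the pages is left out; it takes page HTML that has already been downloaded.

```python
from html import escape
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collect the inner HTML of the first <div> whose id matches div_id."""

    def __init__(self, div_id):
        super().__init__()  # convert_charrefs=True: text arrives unescaped
        self.div_id = div_id
        self.depth = 0      # <div> nesting depth inside the target; 0 = outside
        self.done = False   # set once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            # Still looking for the target div; ignore everything else.
            if tag == "div" and dict(attrs).get("id") == self.div_id:
                self.depth = 1
            return
        if tag == "div":
            self.depth += 1
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self.depth and not self.done:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or self.done:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True  # closing tag of the target div itself
                return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth and not self.done:
            # Re-escape, because the parser handed us unescaped text.
            self.parts.append(escape(data, quote=False))


def build_book(title, chapters, div_id="content"):
    """Combine (chapter_title, page_html) pairs into one HTML document."""
    sections = []
    for chapter_title, page_html in chapters:
        parser = DivExtractor(div_id)
        parser.feed(page_html)
        parser.close()
        body = "".join(parser.parts).strip()
        # One heading per chapter so pandoc can build a table of contents.
        sections.append(f"<h1>{escape(chapter_title)}</h1>\n{body}")
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n<body>\n"
        + "\n".join(sections)
        + "\n</body></html>\n"
    )
```

Write the returned string to a file such as book.html, then something like `pandoc book.html -o book.epub --epub-cover-image=cover.jpg` should give you an EPUB with a cover (the file names are placeholders). For the crawling part, urllib.request.urlopen or the requests package would do; be polite and add a delay between requests.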