Ask HN: Scripts/commands for extracting URL article text? (links -dump but)
1 point by WCityMike 3 months ago | 4 comments
I'd like to have a Unix script that generates a text file, named after the page title, containing the article text neatly formatted.
This seems to me to be something that would be so commonly desired by people that it would've been done and done and done a hundred times over by now, but I haven't found the magic search terms to dig up people's creations.
I imagine it starts with "links -dump", but then there's using the title as the filename, and removing the padded left margin, wrapping the text, and removing all the excess linkage.
I'm an amateur beginner when it comes to shell scripting, Python, etc. I can Google well and usually follow a script's or program's logic, but I don't have the commands memorized.
Is this exotic enough that people haven't done it, or, as I suspect, does this already exist and I'm just not finding it? Much obliged for any help.
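
For illustration, here is a minimal sketch of the steps listed above: take the page title as the filename, drop link targets and page chrome, and re-wrap the text with no left margin. It does not use "links -dump"; it swaps in Python 3's standard library (html.parser, textwrap, urllib) so it runs without extra installs. The names TextExtractor, extract, safe_filename, format_article and save_article are hypothetical, and the lists of tags to skip or treat as paragraphs are assumptions that real pages will often defeat, so treat this as a starting point rather than a finished tool.

```python
import re
import textwrap
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect the <title> text and the visible text of block-level elements."""

    # Assumption: these elements rarely hold article text, so skip them.
    SKIP = {"script", "style", "nav", "header", "footer", "aside"}
    # Assumption: each of these elements starts and ends a paragraph.
    BLOCK = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote", "pre"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.paragraphs = []
        self._buf = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag in self.BLOCK:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag in self.BLOCK:
            self.flush()

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title += data
        else:
            # Link text arrives here as plain data; the href is never stored.
            self._buf.append(data)

    def flush(self):
        """Close the current paragraph, collapsing runs of whitespace."""
        text = " ".join("".join(self._buf).split())
        if text:
            self.paragraphs.append(text)
        self._buf = []


def extract(html):
    """Return (title, list of paragraphs) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # pick up any text after the last block element
    return " ".join(parser.title.split()), parser.paragraphs


def safe_filename(title, default="untitled"):
    """Turn a page title into a conservative .txt filename."""
    name = re.sub(r"[^\w\- ]+", "", title).strip()
    name = re.sub(r"\s+", "_", name)
    return (name or default)[:100] + ".txt"


def format_article(title, paragraphs, width=72):
    """Wrap each paragraph to `width` columns, separated by blank lines."""
    wrapped = [textwrap.fill(p, width=width) for p in paragraphs]
    return title + "\n\n" + "\n\n".join(wrapped) + "\n"


def save_article(url):
    """Fetch `url`, write the formatted text, and return the filename used."""
    with urlopen(url) as resp:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        html = resp.read().decode("utf-8", errors="replace")
    title, paragraphs = extract(html)
    filename = safe_filename(title)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(format_article(title, paragraphs))
    return filename
```

Calling save_article with a page's URL should write a file such as Some_Page_Title.txt to the current directory. Pages that keep their article text outside the assumed paragraph tags, or that build it with JavaScript, will need a dedicated readability-style extractor instead.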