
Hmm. Now I'm thinking that I might end up using your idea (scraping the dark web) and using something like httrack[0] to do exactly that: structure.

[0] https://en.wikipedia.org/wiki/HTTrack




I once tried HTTrack, but it did too much magic under the hood and was hard to work with. As dumb as wget is (that blacklist bug is over 12 years old now!), it is at least understandable.


Thanks for saving me the headache :)



