Hacker News

We've built an automated scraper and data-ingestion ETL that crawls search engines for medical PDFs. We'll definitely need to add compliance and verification checks to make sure we aren't crawling any protected documents.
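One baseline compliance check for a crawler like this is respecting robots.txt before fetching anything. A minimal sketch in Python, assuming a single-process pipeline; the function name `allowed_to_fetch` and the user-agent string are hypothetical, not from the project described above:

```python
from urllib.robotparser import RobotFileParser

def allowed_to_fetch(robots_txt: str, url: str, agent: str = "med-pdf-bot") -> bool:
    """Return True if the given robots.txt would permit `agent` to fetch `url`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

# Hypothetical robots.txt served by a site the crawler hits:
robots = """User-agent: *
Disallow: /private/
"""

print(allowed_to_fetch(robots, "https://example.org/papers/a.pdf"))   # True
print(allowed_to_fetch(robots, "https://example.org/private/b.pdf"))  # False
```

This only covers robots.txt; verifying that a document isn't behind a paywall or otherwise access-restricted would need additional checks (licensing metadata, HTTP auth status, etc.).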



While I appreciate the sources being available, it seems like you're not just linking to where you got them; you're linking to a static cache of your own, which makes you a primary redistribution source. I would presume it's only a matter of time until the copyright lawsuits happen.


If he’s getting lawsuits, it probably means he’s getting a lot of users, and that’s actually a problem he wants to have eventually.



