
Yes, it's one of the most popular feature requests. We don't support auth yet, but it's on our shortlist and we hope to have it ready soon.



How are you going to do it without knowing the actual authentication key(s)? I don't trust anyone enough to give my credentials away, so unless the site being scraped has some sort of OAuth support, how are you going to get any data?

Of course, if this were an offline or self-hosted product, that would solve the auth problem instantly.


Would there be any way to fake the beginning of an OAuth session with Facebook, Google, or any other OAuth-authenticated site? Kind of like replaying cookies to hijack sessions?


Proxying the web page makes it very difficult to actually authenticate on Facebook's or Google's site through the proxied page without first rewriting most of the JavaScript and hijacking their Ajax calls on the fly.

The approach I took was to hijack the cookies from the browser, via the browser extension, once the user has signed in on e.g. Facebook.
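
A minimal sketch of the cookie-replay idea in Python, assuming the extension exports the cookies as a list of name/value pairs. The cookie names, values, and URL are hypothetical, and the request is only built, not sent; this is not the extension's actual code.

```python
from urllib.request import Request


def build_cookie_header(cookies):
    """Join exported cookie name/value pairs into one Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def authenticated_request(url, cookies):
    """Build (but do not send) a request that replays the captured cookies."""
    req = Request(url)
    req.add_header("Cookie", build_cookie_header(cookies))
    return req


# Hypothetical cookies, shaped as an extension might export them after login.
exported = [
    {"name": "session_id", "value": "abc123"},
    {"name": "user", "value": "42"},
]

req = authenticated_request("https://example.com/profile", exported)
```

Passing `req` to `urllib.request.urlopen` would send it. Whether the site accepts the replayed session depends on how it binds cookies to the client, so treat this as an illustration only.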

The route of proxying the website does, in fact, do away with the need to install any external third-party libraries.

The browser extension I built, coupled with the web service it's integrated with, does allow scraping of logged-in pages from Facebook, Google, and LinkedIn as well.

https://chrome.google.com/webstore/detail/krakeio/ofncgcgajh...


Hah, I've been working on this recently with Facebook, on a TV set-top-box. It was painful and I ended up giving up. xd_arbiter.php is the key, I think.


I hope that the lite plan will feature auth handling. I can't imagine the service being useful in most cases without it.


We're working on auth... it's the most requested feature at the moment. And we're still in beta, so all usage is free.


Glad to hear it. I was just saying that I think auth should be included as a basic feature in all paid plans.


I'm wondering how you will be able to cope with the numerous ways CSRF protection gets implemented.
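
For the common case where the token sits in a hidden form field, one possible approach is to fetch the form first and echo its hidden inputs back with the submission. A rough sketch in Python under that assumption; the field name `csrf_token` and the sample HTML are made up, and header-based or JavaScript-generated tokens would need different handling.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None when written without "=value".
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return the hidden form fields found in an HTML document."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden


# Hypothetical login form containing a CSRF token.
sample = (
    '<form method="post">'
    '<input type="hidden" name="csrf_token" value="tok-123">'
    '<input type="text" name="q">'
    "</form>"
)

fields = extract_hidden_fields(sample)
```

The returned dictionary can be merged into the POST body alongside the user-supplied fields, using the same session that fetched the form.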



