Ask HN: What are the most challenging pages to scrape? - gbajson
======
randomerr
Try a PWA-based one. Since they load in segments and cache a lot, you'll have
fun:

[https://pwa.rocks/](https://pwa.rocks/) \- Look at the business and news
webpages

Also look for websites that use AMP. They're even more fragmented than PWA
pages. Below is an article about Airbnb using AMP and iframes:

[https://medium.com/swlh/how-airbnb-is-putting-amp-at-the-
cor...](https://medium.com/swlh/how-airbnb-is-putting-amp-at-the-core-of-its-
digital-strategy-d6b9cf1fc0ad)

[https://www.ampproject.org/](https://www.ampproject.org/)
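
If you want to check whether a page you already fetched is AMP, here is a rough sketch using only the Python standard library. It relies on two markers: AMP documents carry an `amp` (or `⚡`) attribute on the `<html>` tag, and regular pages usually point to their AMP twin with `<link rel="amphtml">`. The names `AmpDetector` and `detect_amp` are made up for this example, and it only inspects static HTML, so anything injected later by JavaScript will be missed.

```python
from html.parser import HTMLParser


class AmpDetector(HTMLParser):
    """Collects AMP markers from an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.is_amp = False   # True if <html amp> or <html ⚡> was seen
        self.amp_url = None   # href of <link rel="amphtml">, if any

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html" and ("amp" in attr_map or "⚡" in attr_map):
            self.is_amp = True
        elif tag == "link" and (attr_map.get("rel") or "").lower() == "amphtml":
            self.amp_url = attr_map.get("href")


def detect_amp(html_text):
    """Return (is_amp_page, url_of_amp_version_or_None)."""
    detector = AmpDetector()
    detector.feed(html_text)
    return detector.is_amp, detector.amp_url
```

Feed it the response body from whatever HTTP client you use. If `amp_url` comes back set, the AMP version is often the easier one to parse, since its markup is restricted.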

------
gbajson
I am looking for pages which are difficult to scrape, to see the applied
techniques, and to learn how to bypass them.

