Hacker News

> CSS/XPath are very fragile. You most likely will be changing them in the future.

Genuinely curious what the alternative is



I've been researching this, but it's not clear whether the problem is painful for enough businesses to justify further investment.

I often feel like web scraping is treated as a commodity by people who don't understand any of the inherent technological complexities and challenges.

Very discouraging field to be in, especially when people claim to have pain but are unwilling to pay very much for it or show appreciation for the effort that goes into it.

edit: thanks for the downvotes. perfect illustration of how innovation is punished and unrewarded in this field.


How many downvotes did you get? A few may have just been because your response is vague and doesn't really answer the question. But that doesn't mean a downvote should occur, of course.

Otherwise complaining about downvotes is no good either. Some will downvote because of that.


> especially when people claim to have pain but are unwilling to pay very much for it or show appreciation for the effort that goes into it.

Welcome to pretty much every profession in the world.


FYI, I only downvoted you after you complained about downvotes.


I only downvoted because there was no alternative offered, just complaining about how underappreciated scraper creators are.

> perfect illustration of how innovation is punished

I see no innovation in your post.

The complaining about downvotes was just the icing on the cake, cementing the downvote.


For unstructured data, one option is applying NLP.

The other alternative is parsing schema.org schemas or other structured markup.
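
To make the schema.org option concrete, here is a rough sketch, assuming the page embeds its data as JSON-LD in a `<script type="application/ld+json">` tag (many sites do, but not all, and some use microdata or RDFa instead). The sample HTML and the names `JsonLdExtractor` and `extract_jsonld` are made up for illustration; only the Python standard library is used.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks instead of failing the whole page
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_jsonld(html):
    """Return every JSON-LD item found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical page: the CSS class is meaningless, but the markup is stable.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body><div class="x9f3">Example Widget</div></body></html>
"""

products = [
    item for item in extract_jsonld(SAMPLE_HTML)
    if isinstance(item, dict) and item.get("@type") == "Product"
]
```

The point is that nothing here depends on class names or DOM position, so a redesign that keeps the structured data should not break it. It does nothing for pages that publish no such markup.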



