

Ask HN: Would you pay money to do a regex on the whole internet? - Ellipsis753

Sometimes I'd really like to see all the pages that contain a certain regular expression pattern.

Sadly I can't seem to find a service that offers this for a vaguely manageable price.

So I wondered if there's any demand for this. Do you ever want to do a Google search with regex and, equally important, would you pay to have a download of all the pages that matched your regex?

What would you use it for? Would a text-only search be enough, or would you be likely to want to use it to match HTML code?
======
bayonetz
God yes! I am tired of writing scrapers and crawlers myself when most of the
time, I'd like to just do some prototyping with a regex in a search box
against the internets. This could seriously accelerate the prototyping loop.
Traditional search finds the most popular / trusted / authoritative.
When I am crawling, usually this is not what I want. I want to get constrained
information (usually text) from everywhere including the long tail.

This reminds me of SPARQL. SPARQL can do this kind of thing theoretically but
go try using a SPARQL endpoint...go on, I dare you. What a terrible
experience. This contrast then reminds me of how I'd rather use JSON instead
of XML to serialize in about 99% of cases. Or Python instead of Java to
prototype with. Basically, <Hacker-Scruffy-Thing> instead of <Mega-Structured-
Thing>.

Yes, please bring us the internet regexes.

~~~
Ellipsis753
Out of interest, how much would you be willing to pay per search, and would
you be doing searches for text only or HTML code?

Finally, is it important to you that the data be very fresh? For example, is a
month-old copy of the internet enough for your searching needs?

~~~
bayonetz
I'd like a regular search page to use first for query development and then
once I'd figured out my queries, an API would be good for automation. Maybe
make the search page free and charge for the API? Cost-wise for the API, the
kinds of pricing offered by services like Firebase, Parse, Readability, etc.
seem too high. I think APIs cost too much in general though. Something along
those lines would be reasonable.

------
jgeorge
It was a sad day when the net outgrew Kibo.

