
Ask HN: Best tool for technical web crawler data (entire internet) - ghawkescs
I am looking for a tool or search engine for technical web-crawler results.  I want to know which sites are using certain products and/or plugins.  I have seen nerdydata.com, but I am looking for other options.
======
xstartup
We are using publicwww for ad scraping (finding which ad network a website
uses). It works well!

Our product is very simple:

a) Scrape ads b) Display ads c) Sell subscriptions to customers.

Advertisers pay to spy on competitors' ads. Surprisingly, it's quite easy to
build if you know how to scrape and implement a search on top of it.

Scraping costs us $10 per customer, and we sell it for $159 per month.
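The "scrape and implement a search on top of it" step boils down to fingerprinting pages against known ad-network script URLs. A minimal sketch in Python, assuming illustrative signatures (the domains below are examples of well-known ad-network script hosts, not the fingerprint list any of these services actually uses):

```python
import re

# Illustrative fingerprints only -- a real product would maintain a much
# larger, curated mapping of ad-network script signatures.
AD_NETWORK_SIGNATURES = {
    "Google AdSense": re.compile(r"pagead2\.googlesyndication\.com"),
    "Taboola": re.compile(r"cdn\.taboola\.com"),
}

def detect_ad_networks(html):
    """Return the ad networks whose script URLs appear in the page HTML."""
    return [name for name, pat in AD_NETWORK_SIGNATURES.items()
            if pat.search(html)]

sample = ('<script async src="https://pagead2.googlesyndication.com'
          '/pagead/js/adsbygoogle.js"></script>')
print(detect_ad_networks(sample))  # ['Google AdSense']
```

Indexing the detection results per domain into a search engine is then what customers query against.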

------
mgliwka
Besides nerdydata, there’s also the similar publicwww.com for custom searches.

Then there are the usual suspects with predefined technology definitions
(builtwith, similartech, datanyze).

Last but not least, I’m also building such a product and will be launching
next month.

I would love to discuss your needs and see if we can accommodate them - you
can reach me at mg@locatetech.io.

~~~
veggies_4us
I have tried publicwww, but over 80% of the results were duplicate domains,
dead links, or pages that didn't contain my search term.

I'm glad I only wasted $49 and didn't opt for the annual license.

------
mtmail
[https://builtwith.com/](https://builtwith.com/) is in the same space.

~~~
ghawkescs
I'm getting security warnings for the site in all browsers; guessing
something is wrong right now?

------
veggies_4us
They don't advertise it well, but nerdydata's custom reports had some
excellent results... way more than their regular search website shows.

They were also able to customize our search and extract specific data from the
page. Pretty advanced stuff.

~~~
ghawkescs
That sounds promising, can they run a custom report across their entire index?
I'm trying to get a count of competing products to gauge market size and
interest.

~~~
veggies_4us
They gave us the option of either running a report on their entire index, or
searching an explicit list of domains (which we didn't have).

Our use case was different, though: we wanted to find websites using a
particular JavaScript library, and then run a JavaScript function on all of
those sites to extract account information and specific HTML patterns.

But yes, their larger custom reports index did the trick, would recommend.
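The pattern-extraction half of the use case above (find pages embedding a library, then pull the account identifier out of the embed snippet) can be sketched with a regex in Python. This is a hypothetical illustration using the classic Google Analytics `ga('create', ...)` snippet; the actual library and fields extracted in the custom reports would differ:

```python
import re

# Hypothetical example: detect a Google Analytics tag and extract the
# account ID from the page source. A real report would cover many
# libraries and patterns, not just this one.
GA_SNIPPET = re.compile(r"ga\('create',\s*'(UA-\d{4,10}-\d{1,4})'")

def extract_ga_accounts(html):
    """Return Google Analytics account IDs referenced in the HTML."""
    return GA_SNIPPET.findall(html)

page = "<script>ga('create', 'UA-12345678-1', 'auto');</script>"
print(extract_ga_accounts(page))  # ['UA-12345678-1']
```

Running something like this across every page in a crawl index is essentially what a "custom report" service automates.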

~~~
ghawkescs
Great to know, thank you. Was the pricing reasonable?

