
Show HN: Discover the tech stack for any website - ishansgupta
https://sitestacks.com/?ref=nyc
======
dna_polymerase
So today I learnt that Amazon.com is actually hosted on Google Cloud Platform,
uses Google Domains as well as 1&1 Hosting, alongside Adobe TagManager and
Google Analytics. Who could have known? \s

Something is really off there, what do you do to get these results?
[https://sitestacks.com/amazon.com](https://sitestacks.com/amazon.com)

~~~
bootcat
wow !! unsure if that's actually true.

~~~
igetspam
It's at least partially true:

This is valid:
[https://mail.google.com/a/amazon.com](https://mail.google.com/a/amazon.com)

This is not valid:
[https://mail.google.com/a/thisisafakedomain.cx](https://mail.google.com/a/thisisafakedomain.cx)

All that means is that they at least used Google Apps at one time. I have a
number of abandoned accounts that show as valid even though the domains are
dead and the accounts are no longer used.

The same type of domain enumeration doesn't work for Drive.

------
wolfgang42
Interesting, but the results seem kind of misleading. For example,
[https://sitestacks.com/linestarve.com](https://sitestacks.com/linestarve.com)
says I'm using "NameCheap Web Hosting" -- they're my registrar, but I run my
own HTTP server for that domain. Meanwhile
[https://sitestacks.com/analytics.bitmash.io](https://sitestacks.com/analytics.bitmash.io)
reports that it's built with Piwik (true, but I would expect that to appear on
a page _using_ Piwik, which it doesn't), but for some reason leaves out all
mention of webserver, DNS hosting, and so on.

~~~
ggiaco
Thanks for the feedback. Sounds like NameCheap should be picked up as a
registrar; we'll take a look!

On the second point, we should clarify that the data is meant to be
domain-wide (except for the real-time mapper for domains not already in our
universe - there we only look at the first page so we can return data quickly).

~~~
wolfgang42
What do you mean by 'domain-wide'? Is it supposed to pick up everything on the
domain and its subdomains?

------
scaryclam
Had a quick go and was pretty disappointed. Most of this is just having a look
at the third party js loaded. Other things are wild guesses based on DNS,
which has nothing to do with the website but are either out of date or reside
in unassociated parts of the business. The tech stack detection was almost
entirely wrong :(

------
afloatboat
In order for me to actually use your service it would need to be rolled into a
browser extension. This needs to be at your fingertips when you need it; right
now I need to copy-paste the URL into a new tab. I use Wappalyzer [1] a couple of
times a day and love the browser integration.

[1] [https://wappalyzer.com/](https://wappalyzer.com/)

~~~
ggiaco
Got you covered:
[https://www.producthunt.com/posts/sitestacks-for-chrome](https://www.producthunt.com/posts/sitestacks-for-chrome)

------
manigandham
These tools never work well, usually a weak combination of scraping the html,
JS tags and checking DNS entries. Siftery tried something similar with a bunch
of VC money raised and is equally bad.
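
To make the "scraping JS tags" part concrete, here is a minimal sketch of how
such a detector might work. It is not how any of these services actually do
it; the `FINGERPRINTS` table and the function names are made up for
illustration. It also shows why the approach is weak: it only sees scripts
referenced on the one page it is given.

```python
from html.parser import HTMLParser

# Hypothetical fingerprints: substring of a script URL -> product name.
FINGERPRINTS = {
    "google-analytics.com": "Google Analytics",
    "googletagmanager.com": "Google Tag Manager",
    "piwik.js": "Piwik",
}


class ScriptCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are ignored
                self.sources.append(src)


def detect_from_html(html):
    """Return the sorted product names whose fingerprint matches a script URL."""
    collector = ScriptCollector()
    collector.feed(html)
    found = set()
    for src in collector.sources:
        for needle, product in FINGERPRINTS.items():
            if needle in src.lower():
                found.add(product)
    return sorted(found)
```

Anything rendered server-side or loaded from a renamed or self-hosted script
never shows up in a check like this.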

The only decent one is [http://builtwith.com](http://builtwith.com) which also
happens to be a great 1-man business.

~~~
thephyber
From the website:

> This is a limited profile. To see additional products added by employees
> and vendors, check out Siftery's profile.

Both sites use the same domain provider and the same SSL cert provider. I'd
say with medium confidence they are probably 2 products of the same company.

~~~
manigandham
Looks like that's corroborated by the other comments on this page, which makes
it even worse.

------
bootcat
doesn't [http://www.builtwith.com/](http://www.builtwith.com/) do this already
and have so much data too ?

~~~
ggiaco
Yeah, and they're not the only provider you can point to. Some competition
pushes everyone to do better?

Here's my pitch for why you might want to use SiteStacks with a browser
extension:

Lightweight: The extension is only 25 KB (mostly images). The one-click
technology lookup runs entirely on our servers. No content insertions or
background processes to slow down your browsing. It's like your favorite
search engine.

Secure: SiteStacks doesn't download any of your browsing data - only the
active tab URL is passed along.

Great product coverage: SiteStacks is supported by Siftery and its library of
over 40,000 products. SiteStacks includes data for some products that isn't
publicly available anywhere else.

Best-in-class accuracy: The data on SiteStacks benefits from validation from
the awesome Siftery community. This built-in constant feedback loop helps us
identify data collection methods that are yielding bad data and ultimately
promotes best-in-class data accuracy.

------
edoceo
Dupe from 20 days ago

[https://news.ycombinator.com/item?id=15249136](https://news.ycombinator.com/item?id=15249136)

------
ishansgupta
SiteStacks can find the technology used at any domain, including a set of
roughly 700,000 that we’re regularly checking.

What makes the dataset unique is the combination of programmatic data (code
breadcrumbs, network requests, DNS, some NLP, etc.), but augmented by data
validated by users directly.

The user validated data is only available on Siftery (e.g. for
sitestacks.com/uber.com you have to follow the link through to
siftery.com/company/uber to see the full set), but all the programmatic
methods are improved by user-validated data (e.g. if a method yields too many
false positives, we bump it out).
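
As a rough illustration of that pruning idea (a sketch only, not the actual
pipeline; the method names, the thresholds, and the shape of the validation
data are all assumptions):

```python
from collections import defaultdict


def prune_methods(validations, max_rejected_share=0.5, min_samples=3):
    """Keep detection methods whose detections users mostly confirm.

    validations: iterable of (method_name, was_correct) pairs, one per
    user-validated detection. A method is dropped only when it has at least
    min_samples validations and more than max_rejected_share of them were
    rejected by users.
    """
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for method, was_correct in validations:
        totals[method] += 1
        if not was_correct:
            rejected[method] += 1

    kept = []
    for method, count in totals.items():
        if count >= min_samples and rejected[method] / count > max_rejected_share:
            continue  # too many false positives: bump the method out
        kept.append(method)
    return sorted(kept)
```

Methods with too few validations are kept rather than dropped, so a single
bad report does not remove a method outright.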

We think this approach helps create the most accurate dataset of its kind.
We’ve done some internal benchmarking and feel really good about it.

We’re looking for feedback on how this can be better, and open to partnering
with others who want to make use of this data for good.

~~~
adzicg
just tried this for our app, and it wrongly reported mandrill and flash (we’re
not using either of them). we used mandrill a few years ago, so this might be
some stale historical data, but the app never used flash.

~~~
ggiaco
What's the URL? We can report back exactly why it was picked up.

Even if we're wrong, it's exactly this kind of feedback loop that's built into
the product and ultimately helps make the data better for everyone else.

------
ishansgupta
I should also mention we have new browser extensions for chrome - [1] and
firefox - [2]

[1] [https://chrome.google.com/webstore/detail/sitestacks-
instant...](https://chrome.google.com/webstore/detail/sitestacks-instant-
tech-l/nnknmohbeolbkgeggkiaifelfkdlnfak)

[2] [https://addons.mozilla.org/en-
US/firefox/addon/sitestacks/re...](https://addons.mozilla.org/en-
US/firefox/addon/sitestacks/reviews/)

------
enitihas
Also see Whatruns which was on hn last month :
[https://news.ycombinator.com/item?id=15098028](https://news.ycombinator.com/item?id=15098028)

------
justboxing
> We’re looking for feedback on how this can be better, and open to partnering
> with others who want to make use of this data for good.

Responding to the feedback and data discrepancies mentioned here would be a
good start. The HN community here is testing this for you for free and
providing you with valuable feedback, and asking you questions that you need
to answer, if you want to make your product useful.

I don't see you (OP) responding to anyone. The 2 posts from you are both
promoting the site.

~~~
ayanb
We are looking into the data discrepancies, it might take a bit of time to get
through the individual errors. DNS + Registrar data seem to have brought on
false positives. We need to beef up on Front-end tech too. The real time
mapper needs work.

Humbling comments really, we have our work cut out.

------
Sreyanth
[https://sitestacks.com/products/g-suite-formerly-google-
apps...](https://sitestacks.com/products/g-suite-formerly-google-apps-for-
work)

Doesn't seem like it is accurate though. Seems like this is more of
crowdsourced data than automatically figuring out things.

------
deadghost
Ran it on [https://canpicker.com/](https://canpicker.com/). It's kind of cool
and accurate but I was expecting it to pick up stuff like react and maybe
finer grain details like individual libraries.

------
Doctor_Fegg
Passenger doesn't necessarily imply Rails. builtwith.com gets this wrong too.
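
A hedged sketch of what I mean: a Passenger header narrows down the app
server, not the framework, since Passenger also serves other Rack apps,
Python, and Node.js. The header matching and the candidate list below are
illustrative assumptions, not any tool's real rules.

```python
def candidate_frameworks(headers):
    """Return the frameworks a response's headers are consistent with.

    Returns a list of candidates rather than a single guess, because the
    app server alone does not identify the framework.
    """
    normalized = {key.lower(): value.lower() for key, value in headers.items()}
    signals = normalized.get("server", "") + " " + normalized.get("x-powered-by", "")
    if "passenger" in signals:
        return ["Rails", "Sinatra", "other Rack app", "Python (WSGI)", "Node.js"]
    return []
```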

~~~
ggiaco
Thanks for that piece of feedback. Team's looking into it.

------
whipoodle
I wish I could be more specific but I looked at this for the site I work on
and quite a bit of it is incorrect.

------
lozzo
I tried it with my website. It spotted Google Analytics (correct) and HTML5
(correct again).

I was curious to see if it would work out that I am running it on
Google App Engine, but it did not figure that out.

------
mceoin
Congrats on the launch. It looks like I can click on a tag (on hover, color
and cursor change), but when I do that nothing happens.

~~~
ggiaco
You're right. We'll fix that. Thanks!

------
israrkhan
It does not pick up React, and misreported a site as using Angular while it was
based on React.

------
schmidty
I'd like to see some ability to detect different CMSs.

~~~
ggiaco
Our focus has been more on business tech, but we're expanding coverage into
cms, libraries etc.

Here's some data we have on CMSs currently

[https://siftery.com/categories/content-management-system-
cms...](https://siftery.com/categories/content-management-system-cms/web-
content-management)
[https://sitestacks.com/products/wordpress](https://sitestacks.com/products/wordpress)
[https://siftery.com/wordpress](https://siftery.com/wordpress)

