

Show HN: My side project that searches the web for technologies - tectonic

Hello HN!

I've been working on http://underthesite.com for the last month and now think it is ready for some full strength HN feedback. What do you guys think? It crawls up to 10 pages of a given site while you wait, looking for community-provided CSS / XPath selectors and regular expressions.

Additionally, I'd like to appeal to you to submit matchers for technologies that you care about. Technologies are easy to add, so add your favorite jQuery plugins, analytics tools, client-side node.js wrappers, what have you. I'm going to be running a large crawl in the next few days and want to make sure to have a good selection of technologies ready. After the crawl, I'll do some analytics and then come back to you guys with some interesting breakdowns of technology usage across the web. So, please tell me what statistics you'd be interested in (top javascript framework, percent of sites using backbone.js, etc.) and submit your favorite technologies.

Thanks!
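To make the matcher idea concrete, here is a minimal sketch of how regular-expression matchers could be run over a site's crawled pages. It is an illustration only, not the site's actual code: the `Matcher` class, the `detect` function, and the two example patterns are all hypothetical, and the pages are assumed to have been fetched already. Only the 10-page limit comes from the description above.

```python
import re
from dataclasses import dataclass


@dataclass
class Matcher:
    """A hypothetical community-provided matcher for one technology."""
    technology: str
    pattern: str  # regular expression run against the raw page HTML


# The post says the crawler looks at up to 10 pages per site.
MAX_PAGES = 10

# Example matchers invented for this sketch, not the site's real ones.
MATCHERS = [
    Matcher("jQuery", r"<script[^>]+src=[\"'][^\"']*jquery[^\"']*\.js"),
    Matcher("Backbone.js", r"<script[^>]+src=[\"'][^\"']*backbone[^\"']*\.js"),
]


def detect(pages, matchers=MATCHERS):
    """Return the sorted names of technologies whose pattern matches
    at least one of the first MAX_PAGES pages (given as HTML strings)."""
    found = set()
    for html in pages[:MAX_PAGES]:
        for matcher in matchers:
            if re.search(matcher.pattern, html, re.IGNORECASE):
                found.add(matcher.technology)
    return sorted(found)
```

Calling `detect` with a list of HTML strings returns the technology names that matched; CSS and XPath matchers would need an HTML parser and are left out of this sketch.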
======
dyogenez
The custom matcher idea for technologies is really cool. I was thinking about
doing a similar thing, but as an open source project that people could
contribute matchers to. Your approach makes it a lot easier for people to
contribute.

I think there could be a problem, though: people might not want to contribute
when they can't use what they contribute outside of your system. Have you
considered extracting the matcher into a gem (or _your language equivalent_)
and allowing people to use the matcher? Might not be the best idea because it
opens up competition, but if it became the de facto tool for analyzing sites,
there would be a lot of trust back your way.

On a side note, if this was an open library, or you did have an API to pull
information, I'd contribute and definitely use it on a site I recently built (
<http://lineofthought.com/> ).

------
ZackOfAllTrades
The little guy in the bottom right corner gets the point across, but he is
creepy as hell. Change him into something less Japanese. Maybe a magician?
Pulling back the curtain to show all the magic? Or an engineer with a helmet
showing the gears?

~~~
tectonic
I'd be interested to hear other people's thoughts as well, since most of the
people who I have polled did not find him creepy.

~~~
ZackOfAllTrades
Since nobody has replied, I should probably give a better answer: This is
based on personal preference and tastes of course. The little guy just brings
with him connotations of Howl's Moving Castle and the like. It may not be a
bad thing directly, but I think there are better choices. The blinking doesn't
help much either.

When I think of the underpinnings of a site, I think of images that go along
with the Daft Punk branding/style. Things are sleek, shiny, and, despite all
the complexity involved, just work. I feel like that might be better branding
for this sort of website. <http://en.wikipedia.org/wiki/File:DaftAlive.jpeg>

I hope that helps explain my thinking better.

------
geddes
This is really great! On the Analytics end I'd add the following services:

Omniture (Adobe) Site Catalyst. This can usually be detected by looking for a
request to the *.2O7.net domain (though not always: some Site Catalyst
users CNAME their own tracking domain to 2O7.net, and in that case it's harder
to detect)

ChartBeat (static.chartbeat.com/js/chartbeat.js)

Some others: WebTrends, CoreMetrics, Hitbox, Performable
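As a rough sketch of the two checks described above, the following collects `src` URLs from `script` and `img` tags and tests their hosts. The function and class names are made up for illustration. It only sees URLs written into the static HTML, so requests issued later by JavaScript are missed, and it does not handle the CNAME case, where the tracking host is on the site's own domain.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class SrcCollector(HTMLParser):
    """Collects the src attribute of every script and img tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img"):
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def detect_analytics(html):
    """Return sorted names of analytics services referenced in the HTML."""
    collector = SrcCollector()
    collector.feed(html)
    found = set()
    for url in collector.urls:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        # Any subdomain of 2o7.net (hostnames are case-insensitive).
        if host.endswith(".2o7.net"):
            found.add("Omniture SiteCatalyst")
        # The ChartBeat script path mentioned above.
        if host == "static.chartbeat.com" and parsed.path == "/js/chartbeat.js":
            found.add("ChartBeat")
    return sorted(found)
```

Relative URLs such as `/js/app.js` have no hostname and are simply ignored.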

------
networkjester
Disqus doesn't use Disqus?

<http://underthesite.com/sites/disqus.com>

Ah ha! Never mind; their blog does!

<http://underthesite.com/sites/blog.disqus.com>

------
JoachimSchipper
Looks nice. You may want to consider moving it somewhere closer to
iterationlabs.com though, to maximize the SEO benefits. (I forgot whether you
need iterationlabs.com/underthesite or whether underthesite.iterationlabs.com
would work; look it up or consult someone who knows what (s)he's doing.)

~~~
tectonic
I'm sorry, I don't understand.

~~~
JoachimSchipper
As you probably know, Google thinks a page is more important if more sites
link to it. Similarly, I believe Google thinks "a bunch of pages" is more
important if more sites link to it; I forgot the exact meaning of "a bunch of
pages" in this context (and I _really_ don't have time right now), so you will
have to look into this yourself.

That is, if you want iterationlabs.com to rank in Google.

------
MatthewB
Looks good...but isn't this the same thing as builtwith.com and a few others?

~~~
tectonic
The other sites don't allow the community to add their own technologies and
matchers. My hope is that people who run or are passionate about particular
web technologies will add them to underthesite.com. It's a good way to spread
the word about your open source project.

I will be adding much deeper breakdowns about technologies soon.

------
adrianwaj
Now all you need is some charts, extrapolate some trends, and write some
interesting blog posts.

~~~
tectonic
That's the plan!

~~~
adrianwaj
You could also set up chatrooms or forums around technologies:
<https://github.com/chrismatthieu/CHATS.iO>

Other ideas:

- measure who has implemented what technology and when

- work out various combinations of technologies

- work out who would have the most expertise in an area

- try and create a hacker score for users: (function of: time started, site
traffic, simultaneous technologies, ??)

------
tectonic
Clickable: <http://underthesite.com>

