
Show HN: GitMap – Interactive map of top 10,000 repos clustered into ecosystems - oracleofnj
http://oracleofnj.github.io/gitmap/
======
andrewstuart2
This is really cool, and props for making it at least dynamic-ish by rendering
JSON data with D3.

That said, some of the clustering (or at least the clustering visualization)
seems strange, such as AngularJS within jQuery, PouchDB within React, EmberJS
within Rails, redis-doc being included within Go, oh-my-zsh within homebrew,
etc.

FWIW, there's also a bit more info available here:
[https://github.com/oracleofnj/gitmap](https://github.com/oracleofnj/gitmap)

~~~
oracleofnj
Thanks for the feedback! I have a few ideas for how the clustering could be
improved at the higher levels. The re-clustering step I'm doing right now is
pretty simplistic.

------
agentgt
For some other clustering algorithms (albeit based on terms and semantics, i.e.
document clustering, and not necessarily relationships), one might be interested
in Carrot2[1]. Carrot is neat because you can change/tune the clustering
algorithms on the fly, something I think would make this gitmap more
interesting.

It would be interesting to see document clustering applied to code (I have not
seen it but it seems it might even be more reliable than affinity). That being
said ... a lot of memory will be needed since most clustering algorithm
implementations need to run in memory :)

[1]: [http://project.carrot2.org/](http://project.carrot2.org/)

------
andy_ppp
Amazing to see visually that React, in such a short amount of time, is nearly
as large an ecosystem as jQuery.

Seems so difficult to believe, I question if I'm reading the data correctly!

The pace at which new things happen that you have to know is incredible.

~~~
karavelov
Yes, when you put all the
scala/akka/play/spray/netty/typesafe/typelevel/groovy/gradle/etc. projects
inside the React cluster - I don't see the connection.

~~~
kami8845
Well, React.js _was_ made by Facebook.

~~~
karavelov
And none of the other mentioned technologies was made by Facebook, so I don't
see the connection.

------
stared
By any chance could you post repos/stars data? (I guess downloading it would
be much faster than crawling it.)

In any case - thank you a lot for sharing the source code! :)

~~~
oracleofnj
I added a file "starcounts.json.gz" to the repository.

~~~
stared
Thanks, but I meant more "cached_repos.json.gz" and "cached_users.json.gz". It
took 2 nights of fetching and it is nowhere near there (~2100 crawled repos,
and users).

------
CiPHPerCoder
This is cool.

TIL paragonie/awesome-appsec is in the top 10,000, but other projects like
paragonie/random_compat aren't :(

------
mc808
It's kind of interesting that almost all of the top projects are programming
tools. Programmers building better programming tools to help build better
programming tools. I wonder what nice things we could have if we diverted half
of that effort into end-user projects with the tools we've already got.

~~~
andrewstuart2
I think that's simply an artifact of the fact that client projects using these
open source projects are often not themselves open sourced, or hosted on
GitHub.

------
masukomi
I really want to be able to run this for a specified repository.

------
wyclif
I was surprised to see a giant Jekyll cluster.

