
Analytics at GitHub - janerik
http://johnnunemaker.com/analytics-at-github/
======
phunge
Jay Kreps speaks the truth. His talk "Building LinkedIn's Real-time Data
Pipeline" is along the same lines as the Log blogpost mentioned here and is
also extremely informative.

------
fuziontech
Fantastic read. Concise and solid decision explanations. Thanks for writing!

Was there any other reason you chose kestrel over alternatives like kafka? Did
you test any others, or were you just that satisfied with kestrel?

~~~
jnunemaker
We chose kestrel mostly just from usage/familiarity. We've been satisfied with
it, but are currently researching/testing kafka.

~~~
nextplaylist
Are you guys using either of them elsewhere or just for analytics?

~~~
technoweenie
We use Kestrel for our internal hooks delivery system also (based on
jnunemaker's suggestion).

------
khaledh
Very good article. It aligns with our envisioned architecture for our next-gen
analytics platform.

So far our decision is to keep the raw events in Cassandra, and pre-aggregate
most data for fast reads. Just wondering about your decision to not store raw
events in Cassandra, to use raw files for that instead, and to use Cassandra
only for storing Hadoop analysis results. Do you think this decision may affect
you later if you ever decide to support real-time analytics?
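
For readers unfamiliar with the pre-aggregation idea mentioned above, here is a minimal sketch of rolling raw events up into hourly counts so that reads hit small summaries instead of the raw stream. The event shape (`name`, `ts`), the hourly bucket size, and the function names are all assumptions made for illustration; nothing here reflects the actual schema or storage layer of either system discussed in this thread.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_bucket(ts: float) -> str:
    """Truncate a Unix timestamp to its UTC hour, e.g. '1970-01-01T00'."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H")


def pre_aggregate(raw_events):
    """Roll raw events up into a (event name, hour bucket) -> count mapping."""
    counts = Counter()
    for event in raw_events:
        counts[(event["name"], hour_bucket(event["ts"]))] += 1
    return counts


# Hypothetical raw events; in practice these would come from the event store.
raw_events = [
    {"name": "page_view", "ts": 0},
    {"name": "page_view", "ts": 1800},
    {"name": "page_view", "ts": 3600},
    {"name": "signup", "ts": 10},
]

rollup = pre_aggregate(raw_events)
```

A dashboard would then read from `rollup` rather than scanning `raw_events`, while the raw events stay available for reprocessing if the aggregation logic changes.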

------
nicklovescode
As an aside, do you have any info on the visual software used to run the
charts? I'm guessing d3 is in there somewhere, but maybe not. I've struggled to
find a beautiful charting library and yours are beautiful!

~~~
calavera
we use d3 for all our charts.

~~~
nicklovescode
any chance of you guys open-sourcing them?

~~~
Caged
Most of our graphs are pretty stock d3 code tailored for specific datasets, so
I don't see much value in open sourcing them. Is there anything in particular
you're interested in?

~~~
jrpt
There's a need for a good charting library built on top of d3. Kind of like
Highcharts, in terms of usability, but free. d3 is powerful but not as easy to
use and customize as Highcharts.

~~~
middleman90
Can I suggest [http://www.sibdo.com](http://www.sibdo.com)? For individuals
it's free, and it's built on top of d3 with some extra functionality that
Highcharts doesn't have. You can even drag files directly onto the
visualizations and the data will render. Also a really nice UI for mobile.

~~~
sheff
Looking at the Sibdo pricing page, the pricing looks much higher than that of
the more established competitors: $95 a month for use on a SINGLE website,
with a confusing limitation to "50 users", whatever that means.

Not only that, the example graphs and charts look very basic.

~~~
middleman90
Good feedback, thanks.

It would be interesting to know what you mean by basic as we're a start-up and
would appreciate any feedback.

------
nickstinemates
> For any business, the process of collecting data, measuring performance,
> making changes, and reviewing if those changes were successful is really
> important.

This applies for any sort of goal/process/?, whether programmatic or personal.

Very cool story, I'm looking forward to additional features. We pull a _lot_
of data about Docker from GitHub that could be more readily available. We'd be
more than happy to discuss or beta any new features, if you're interested.

------
alexatkeplar
Nice to see lots of parallels to how we have architected things at Snowplow
(trackers -> collectors -> enrich -> storage -> analytics)

