

How We Work on Queries at GitHub - samlambert
http://samlambert.com/posts/how-we-work-on-queries-at-gitHub/

======
wldlyinaccurate
I really wish this went into more detail. You are notified when a query is
slow. You can EXPLAIN it in chat so everybody can see. What happens next? Are
slow queries treated as high priority? Is there any tooling around debugging
complex queries? Basically, what makes GitHub's process different to the
decades-old "grep the slow query log and run an EXPLAIN"?

~~~
technoweenie
> Basically, what makes GitHub's process different to the decades-old "grep
> the slow query log and run an EXPLAIN"?

Sam's chatops tools open this process up to a lot more people who otherwise
wouldn't be comfortable logging on to servers to access the slow query logs
(assuming that they even have access to the servers). It's a great way for app
developers to level up their SQL skills with help from more experienced coworkers.

The impact of slow queries determines their priority. How frequent are the
slow queries? Is it from a background job, or does it cause exceptions on
important pages or API calls?

I don't know if this process is streets ahead of what other companies have,
but it's made a hugely positive impact on our MySQL infrastructure.

------
revskill
Why doesn't GitHub open source their products/libraries regularly? There are
too few open source projects from GitHub on GitHub.

~~~
holman
We do!

[https://github.com/github](https://github.com/github)
[https://github.com/libgit2](https://github.com/libgit2)
[https://github.com/boxen](https://github.com/boxen)

When it's easy to extract, well-documented, and has a clear team of
maintainers, we try to open source. Sometimes it's difficult to nail one or
all of those bullet points, though.

~~~
joshmn
Haystack looks interesting ;)

~~~
NicoJuicy
You can't expect them to open source everything. I think it's awesome they
open sourced Hubot!

------
famousactress
So, maybe a toy example but I can see the query included a join. Curious how
you guys clone enough tables (and their keys) to troubleshoot things like
that? Seems like it gets a lot more complex than the example suggests pretty
quickly. Wondering if you have neat tools for that.

[Edit: Just noticed poster is author. Hi Sam and welcome to HN :)]

~~~
samlambert
Basically you can hit /mysql clone for any table and it will make its way to
an isolated db for the user running the command.

It would be cool to be able to pass the script a query and have it clone all
the tables.
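
A rough sketch of that idea, assuming a naive regex is good enough to find the tables: pull the names that follow FROM or JOIN out of the query and emit one clone command per table. The helper names `tables_in_query` and `clone_commands` are hypothetical, and the exact argument form of `/mysql clone` is an assumption; this is not GitHub's tooling.

```python
import re

# Naive pattern: an identifier (optionally backtick-quoted) right after FROM or JOIN.
# It misses comma joins and treats "db.table" as "db"; a real SQL parser would do better.
_TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+`?([A-Za-z_][A-Za-z0-9_]*)`?", re.IGNORECASE)


def tables_in_query(sql):
    """Return the table names found after FROM/JOIN, in first-seen order, without duplicates."""
    seen = []
    for name in _TABLE_RE.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def clone_commands(sql):
    """Build one hypothetical '/mysql clone <table>' chat command per table in the query."""
    return ["/mysql clone " + table for table in tables_in_query(sql)]


if __name__ == "__main__":
    query = "SELECT r.name FROM repositories r JOIN users u ON u.id = r.owner_id"
    for command in clone_commands(query):
        print(command)
    # prints:
    # /mysql clone repositories
    # /mysql clone users
```

Subqueries are handled only by accident (the inner FROM still matches), so treat the output as a starting list to review rather than a complete one.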

------
joshmn
Nice read Sam, thanks for the writeup.

Thought I'd let you know you have an error in your JSON - section.name
([http://i.imgur.com/CoHol0f.png](http://i.imgur.com/CoHol0f.png))

;]

~~~
samlambert
haha thank you :)

------
blaincate
from:
[http://ghtorrent.org/downloads.html](http://ghtorrent.org/downloads.html)

at least:

    4,151,457 repos
    2,480,478 users

popular repos:

twbs/bootstrap 40662

jquery/jquery 34633

joyent/node 34522

mbostock/d3 30247

h5bp/html5-boilerplate 28736

popular users:

visionmedia 10712

torvalds 9984

paulirish 5885

schacon 4431

mattt 4053

pjhyett 3732

src code:
[https://github.com/akuchlous/githublike](https://github.com/akuchlous/githublike)

------
nathantotten
Now, does anyone have the time to figure out how many users GitHub has in
their users table? :)
[http://dheera.net/projects/blur](http://dheera.net/projects/blur)

~~~
minimaxir
At least 3,815,207 users (who have performed some meaningful GitHub action).

Via BigQuery:

    SELECT COUNT(DISTINCT actor) FROM [githubarchive:github.timeline];
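
To make clear why this is a lower bound, here is a small sketch of the same aggregate on made-up data, using an in-memory SQLite table in place of the BigQuery dataset. The table layout, the event rows, and the function name `count_distinct_actors` are assumptions for illustration only. An account that never appears in the event timeline is simply not counted.

```python
import sqlite3


def count_distinct_actors(events):
    """Count unique actors in a list of (actor, event_type) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE timeline (actor TEXT, type TEXT)")
    conn.executemany("INSERT INTO timeline VALUES (?, ?)", events)
    (count,) = conn.execute("SELECT COUNT(DISTINCT actor) FROM timeline").fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    # Made-up rows: alice appears twice, so only two distinct actors exist.
    sample = [
        ("alice", "PushEvent"),
        ("bob", "WatchEvent"),
        ("alice", "IssuesEvent"),
    ]
    print(count_distinct_actors(sample))  # prints 2
```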

------
simonw
"Once we have decided if we want to modify our schema we can perform an
incremental rollout across our cluster. I will cover this more in another
post." \- looking forward to that.

~~~
techdebt5112
My guess is a thin wrapper around PTOSC (Percona Toolkit's pt-online-schema-change).

------
yRetsyM
Do all GitHub internal services use a dark colour scheme? Good way of determining
internal vs external, I suppose.

~~~
Caged
There are no hard rules on it. When I started working on it a couple of years
ago, I was fond of dark color schemes for monitoring interfaces.

------
ckluis
Awesome. I’m going to share this to see if we can do something similar where I
work. Neat idea.

------
albertoleal
Is Haystack open sourced anywhere?

~~~
samlambert
Unfortunately not. It is so closely tied to our applications.

~~~
possibilistic
Thanks for posting this! I like hearing about internal tooling. Is there more
on the query tagging? How do you guys bubble query annotations through the
stack?

I don't know if you're at liberty to discuss further, but how has it been
scaling a giant Rails app? Have there been any pushes to break it up into
smaller components? I.e., fast-moving stuff stays Rails, core infra moves to
something statically typed?

~~~
technoweenie
The query annotations show up as a MySQL comment next to the query. I don't
know if we have any automatic indexing of the annotations themselves.
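
For readers who haven't seen this pattern, a minimal sketch of tagging a query with a trailing SQL comment follows. The `key:value` layout, the tag names, and the function `annotate_query` are assumptions made up for illustration; the thread only says the annotations appear as a comment next to the query.

```python
import re


def annotate_query(sql, **tags):
    """Append tags to a SQL string as a trailing /* ... */ comment."""
    if not tags:
        return sql

    def clean(value):
        # Keep only word characters, '#', '.', and '-' so a tag value
        # can never contain '*/' and close the comment early.
        return re.sub(r"[^\w#.\-]", "", str(value))

    body = ",".join(key + ":" + clean(val) for key, val in sorted(tags.items()))
    # Drop a trailing semicolon so the comment stays inside the statement.
    return sql.rstrip().rstrip(";") + " /* " + body + " */"


if __name__ == "__main__":
    print(annotate_query("SELECT * FROM users WHERE id = 1;",
                         controller="users", action="show"))
    # prints: SELECT * FROM users WHERE id = 1 /* action:show,controller:users */
```

Because the comment travels with the statement, it shows up wherever the query text is logged, such as the slow query log, which makes it possible to trace a slow query back to the code that issued it.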

We try to stick to Ruby/Rails since so many people are comfortable in that
environment. We try to balance the desire to break pieces out with the fact
that it lowers the number of devs qualified to work on it.

