Show HN: Governments on GitHub (rilldata.com)
55 points by danthelion on June 9, 2023 | 31 comments



I have some doubts about the data: the government of France supposedly has 1 repo, but the French Statistics Office alone has 244 repos [0], and the "digital" team of the Prime Minister's services has at least 537 repos [1], and that's just off the top of my head.

[0] https://github.com/orgs/InseeFrLab/repositories + https://github.com/orgs/InseeFr/repositories

[1] https://github.com/orgs/etalab/repositories + https://github.com/betagouv


Thanks for flagging this! These were missing from the initial dataset I used as a starting point, but I can easily extend it myself.


Here is a better list of Norwegian government agencies on GitHub https://github.com/MikAoJk/norwegian-public-organizations/tr...


Surely there are many more that are not present in the data set. How do you plan to find them?


Something seems broken in the source data when I compare the organisations with the list that GitHub has crowdsourced: https://github.com/github/government.github.com/blob/gh-page...


Yep, the number of UK gov organisations seems very small compared to this list.


I don’t think open issues is a very useful metric, because an open issue may stick around for 10 years at low priority, so the number of open issues doesn’t really mean much. Maybe instead show issue velocity (how many issues are created and closed in each period) to show activity. Or show commits.
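A rough sketch of the issue velocity idea above, in Python. The input shape (dicts with ISO 8601 `created_at` and `closed_at` strings) and the function names are assumptions made for illustration, not how the dashboard actually computes anything.

```python
from collections import Counter
from datetime import datetime


def _month(timestamp):
    # fromisoformat rejects a trailing "Z" before Python 3.11, so normalise it.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.strftime("%Y-%m")


def issue_velocity(issues):
    """Count issues opened and closed per calendar month.

    `issues` is an iterable of dicts with a `created_at` string and an
    optional `closed_at` string (hypothetical shape, for illustration).
    """
    opened, closed = Counter(), Counter()
    for issue in issues:
        opened[_month(issue["created_at"])] += 1
        if issue.get("closed_at"):
            closed[_month(issue["closed_at"])] += 1
    months = sorted(set(opened) | set(closed))
    return {m: {"opened": opened[m], "closed": closed[m]} for m in months}


# Made-up sample: one issue left open since 2013, two handled recently.
sample_issues = [
    {"created_at": "2013-02-01T10:00:00Z", "closed_at": None},
    {"created_at": "2023-05-03T09:00:00Z", "closed_at": "2023-05-20T12:00:00Z"},
    {"created_at": "2023-05-28T09:00:00Z", "closed_at": "2023-06-02T08:00:00Z"},
]
velocity = issue_velocity(sample_issues)
```

The decade-old open issue only shows up in the month it was created, so it no longer inflates the current picture the way a raw open-issue count would.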


they're all useless vanity metrics that cannot offer any comparison to another organization or project

software that is complete also fails every vanity metric


It depends on what you’re measuring.

If you just want to see if a project is active then commits is useful.

If you want to see if there are multiple committers, then active people is useful.

I don’t think you should compare repos based on this metric, but knowing if a repo is active or not is something.
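To make the two signals above concrete, here is a minimal sketch that counts recent commits and distinct committers. The 90-day window, the `(author, timestamp)` input shape, and the function name are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def repo_activity(commits, now, window_days=90):
    """Summarise recent activity from (author, timestamp) commit pairs.

    The 90-day default window is an arbitrary choice, not a standard.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [(author, ts) for author, ts in commits if ts >= cutoff]
    return {
        "recent_commits": len(recent),
        "active_committers": len({author for author, _ in recent}),
        "is_active": bool(recent),
    }


# Made-up commit history for a single repository.
utc = timezone.utc
now = datetime(2023, 6, 9, tzinfo=utc)
commits = [
    ("alice", datetime(2023, 6, 1, tzinfo=utc)),
    ("bob", datetime(2023, 5, 15, tzinfo=utc)),
    ("alice", datetime(2023, 4, 20, tzinfo=utc)),
    ("carol", datetime(2020, 1, 1, tzinfo=utc)),
]
activity = repo_activity(commits, now)
```

This answers "is this repo alive?" for one repo at a time; it is not meant for ranking repos against each other.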


Yeah, I definitely agree, more sophisticated metrics are in the pipeline. Wanted to get an MVP out as quickly as possible.


No explanation of where this data is coming from or how it's being labeled? Your top table on the right-hand side shows seven "countries" but only one of the names in there is actually the name of a country (Norway).

It looks like you're just taking the top-level headers from that "who's using GitHub" page? Civic Hacker is one of those and is showing up as a "country" in your dataset, but clearly isn't. Looking at the first few tiles, it includes at least a few US cities, some US states, and Romania.


I can see how the "Country" name was confusing; thank you. Renamed it to "Government" for a bit more clarity.


It's a bit overarching. Central government is different from what happens in the regions. In the crowdsourced list for these on GitHub, the UK has UK Central Government and UK Councils; things are probably different for, say, the Ministry of Justice vs. some small town.


I don't quite understand what this dashboard is aiming at. What is the target objective: government participation in FOSS, being mentioned, projects being funded, or something else?


I created a dashboard to monitor government activity on GitHub


What is the source of your data? How do you define "government activity on GitHub"? Do you look at domain used in contact, affiliation, etc.?


This GitHub page was the starting point: https://government.github.com/community


From what I gather from this page it seems organisations need to be added by the community or organisation members. I know a few organisations that aren't on that list for my country like https://github.com/GouvernementFR, https://github.com/DISIC, https://github.com/ANSSI-FR and https://github.com/ansforge. No promises but I might make a PR tonight on the page you mentioned.


That would be great, scraping that site is the entrypoint of my data pipeline!


Just out of curiosity about how people actually access stuff, does scraping mean (as I'd expect I guess) the rendered website, or consuming the data files https://github.com/github/government.github.com/tree/gh-page... ?

Querying wikidata may also be useful https://github.com/github/government.github.com/issues/877

Added: note about your project at https://github.com/github/government.github.com/issues/1167


I missed the data files entirely and just extracted all the entities from the rendered website, then enriched the repositories with some extra info through the API (still ingesting data for a few repos, since I keep hitting the API limit).
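On the API limit: GitHub's REST API reports the remaining quota in the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers, and may send `retry-after` for secondary limits. A minimal sketch of deciding how long to pause between requests, assuming the headers arrive as a plain mapping with lower-case keys (the function name is made up for illustration):

```python
import time


def seconds_until_retry(headers, now=None):
    """Return how long to pause before the next request, in seconds.

    `headers` is a mapping of response headers with lower-case keys
    (an assumption; some HTTP clients preserve the original casing).
    """
    now = time.time() if now is None else now
    # An explicit retry-after value takes priority when present.
    if "retry-after" in headers:
        return float(headers["retry-after"])
    remaining = int(headers.get("x-ratelimit-remaining", 1))
    if remaining > 0:
        return 0.0
    # x-ratelimit-reset is a UTC epoch timestamp in seconds.
    reset_at = float(headers.get("x-ratelimit-reset", now))
    return max(0.0, reset_at - now)


# Made-up header values: quota exhausted, window resets 60 seconds from `now`.
exhausted = {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "1686300060"}
wait = seconds_until_retry(exhausted, now=1686300000)
```

Sleeping for the returned duration before the next call keeps an ingestion loop from burning requests that would only be rejected.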

The goal is to have a deeper level view of public government orgs, including a more granular view of contributors and how they interact with code.

Great idea about wikidata! That's another data rabbit-hole to climb down into.


Even from this page there appears to be a fair bit missing. E.g. one of the biggest UK gov orgs is @alphagov (~1.6k repos), which I can't seem to see on your dash?


For The Netherlands only a couple of organizations are listed there, lots are missing.


What's the aim?


As a state government employee I don’t understand what the purpose of this is. For example, my organization has hundreds of repositories but they are all hosted internally on TFS. We have zero repositories on GitHub. Is the goal to show what government code looks like, get community help for the code, accountability, showing your tax dollars at work, something else?


Great, but most of the Norwegian government agencies are missing. Here is a more up-to-date list https://github.com/MikAoJk/norwegian-public-organizations/tr...


Does anyone (or the author) know the name of the UI library the site is using?


The dashboard was made using [rill](https://github.com/rilldata/rill), which uses SvelteKit


Awesome, great work. Thanks!


Can I browse the repos from this page?


You can now, added the URL as a dimension!



