
Enigma Public: Broad collection of public data - merinid
https://public.enigma.com/
======
victor9000
Sorry to be that guy, but it would be great if that left-hand nav would
auto-collapse into a hamburger button on mobile. It's taking up sooooo much
space, and it shrinks the content to the point where it's unreadable.

~~~
awqrre
the layout is pretty bad on desktop too... (with the data that I tried to look
at, probably 75% of the page is blank:
[http://imgur.com/a/ecIKj](http://imgur.com/a/ecIKj))

~~~
danso
You bring up a good point, IMO: the "Open in Data Viewer" button is far too
obfuscated. If I hadn't used Enigma before, and hadn't already known what was
contained in a previously visited dataset, I too would have assumed that I had
hit a dead end, because the button was the last thing I noticed on the page.

The horizontally flowing layout is problematic overall, but placing that
"Explore Data" button much higher in the right-most column would be a decent
compromise: [http://imgur.com/a/lS7ks](http://imgur.com/a/lS7ks)

------
nl
This is really really good.

I follow the many emerging "collect lots of public data and make available"
services, and I think this is one of the better ones I've seen. The data looks
quite wide ranging too.

~~~
jackschultz
I always feel like gathering data and formatting correct data sets is the most
important part of data science. In classes where you're taught machine
learning, you're handed perfect data sets so you can learn the algorithms.
And then when you're analyzing data, you'll use a library that runs these
algorithms for you. Which means you'll spend the longest time making sure that
your data sets are correct and easily usable.

I love working on these types of projects. Things like scraping sports stats
(which may or may not be considered legal, but it's not like I'm trying to
make money from them), data from posted articles, music data, data about
locations of public restaurants, etc.

Finding some place that offers correct public data is fantastic to see.

------
Gys
Why is the 'United States' not below 'Governments' like all the other
countries?

~~~
skizm
Optimizing for their audience maybe? Similar to sites putting "United States"
at the top of a countries dropdown even though everything else is in
alphabetical order? Just a guess.

------
danso
I've interacted with Enigma folks throughout the past few years and have
always been impressed with their work and methodology. I've had friends who've
worked at other massive-public-data-gathering startups, and it sounds like a
tough business: collecting/cleaning data is hard, but having data alone isn't
a competitive edge. Don't know if Enigma will find success with public data
(though their offerings apparently go beyond data to enterprise platforms),
but I've been impressed at the scope of their collection and their ability to
wrangle data into a standardized structure.

Here's one example: Senate lobbying disclosures. Enigma has taken the original
XML data sources and created several flat tables (lobbyists, issues, reports)
that can be linked through foreign/primary keys:
[https://public.enigma.com/browse/lobbyists/09264ee1-792f-445...](https://public.enigma.com/browse/lobbyists/09264ee1-792f-4456-99ac-50a0232ef07f)
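To make the "flat tables linked by keys" idea concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table layout, column names, and sample rows are all made up for illustration; this is not Enigma's actual schema, just an assumption about what a reports/lobbyists/issues split could look like.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with three hypothetical flat tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reports (
            report_id  TEXT PRIMARY KEY,   -- hypothetical primary key
            registrant TEXT,
            year       INTEGER
        );
        CREATE TABLE lobbyists (
            lobbyist_id INTEGER PRIMARY KEY,
            report_id   TEXT REFERENCES reports(report_id),  -- foreign key
            name        TEXT
        );
        CREATE TABLE issues (
            issue_id  INTEGER PRIMARY KEY,
            report_id TEXT REFERENCES reports(report_id),    -- foreign key
            code      TEXT
        );
        """
    )
    # Made-up sample rows, not real disclosure data.
    conn.executemany(
        "INSERT INTO reports VALUES (?, ?, ?)",
        [("r1", "Example Firm A", 2016), ("r2", "Example Firm B", 2017)],
    )
    conn.executemany(
        "INSERT INTO lobbyists VALUES (?, ?, ?)",
        [(1, "r1", "Alice Example"), (2, "r1", "Bob Example"), (3, "r2", "Carol Example")],
    )
    conn.executemany(
        "INSERT INTO issues VALUES (?, ?, ?)",
        [(1, "r1", "TAX"), (2, "r2", "HCR"), (3, "r2", "TAX")],
    )
    return conn


def lobbyists_by_issue(conn, code):
    """Return (lobbyist name, registrant, year) rows for reports tagged with an issue code."""
    return conn.execute(
        """
        SELECT l.name, r.registrant, r.year
        FROM issues AS i
        JOIN reports AS r ON r.report_id = i.report_id
        JOIN lobbyists AS l ON l.report_id = r.report_id
        WHERE i.code = ?
        ORDER BY l.name
        """,
        (code,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in lobbyists_by_issue(conn, "TAX"):
        print(row)
```

The point is only that once the nested XML is split into one table per entity, a question like "who lobbied on this issue?" becomes a couple of joins on the shared report key.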

Here's what the raw material looks like:

[https://www.senate.gov/legislative/Public_Disclosure/LDA_rep...](https://www.senate.gov/legislative/Public_Disclosure/LDA_reports.htm)

Excerpt:
[https://gist.github.com/dannguyen/7588b8334f5c8954d2c2b13bc4...](https://gist.github.com/dannguyen/7588b8334f5c8954d2c2b13bc46cb292)

I've written my own scripts to clean up and organize this shitshow but it's
nice to have Enigma to double-check against, or even get ideas on how to
structure things. What's just as impressive to me is the work put into the
taxonomy of datasets, e.g. United States > U.S. Senate > Lobbying Reports.

For less data-savvy users, just having a Google-like simple search bar is
great for discovery of datasets that contain a term of interest:
[https://public.enigma.com/search/google](https://public.enigma.com/search/google)

Note: Enigma has offered this public data for free before, but you had to
sign up for an account to even browse the data. This public interface is much
nicer, especially for sending people links. Haven't tested out the export
functions or the quotas, but in the previous incarnation, free accounts got a
huge number of downloads a month.

------
bowmessage
Wow that's a lot of data! Is this free? I couldn't find pricing information.
How is it funded?

~~~
lanewinfield
It looks free (Creative Commons)—they have enterprise clients to keep it all
afloat.

"We release all of our data under a Creative Commons License for anyone in the
open-source or civic community to freely build upon and extend. We regularly
collaborate with hundreds of journalists, not-for-profits, governments, and
many other committed and curious people to help put data to good use. If you’d
like to help out or suggest a project, please get in touch.

For commercial applications of our data services, we have solutions for
enterprises of every size. Please contact us, and a member of our sales team
will help you find the best solution."

[https://www.enigma.com/public-data](https://www.enigma.com/public-data)

~~~
x1798DE
> under a Creative Commons License

Looks like specifically CC-BY-NC 4.0

------
gooddelta
It looks like Enigma dog-foods their own data infrastructure products
(Concourse / Assembly?). The public data thing inspired them to build better
tools for themselves and sell them. Big lesson in that: sometimes what you
learn along the way will be the most valuable.

------
tempodox
Is there a REST interface or do we only get a (bad) browser GUI?

~~~
maddalab
[https://docs.public.enigma.com/](https://docs.public.enigma.com/)

------
jhoechtl
What has happened to the initial idea of developing a platform for distributed
homomorphic encryption?

EDIT: Seems like I am confusing it with
[http://www.enigma.co/](http://www.enigma.co/)

