
Self-Hosted “SQL-Not-required” Data Analytics and BI Tool. Inspired by Looker - mariusandra
https://github.com/mariusandra/insights
======
tlrobinson
Shameless plug for Metabase, which is also open source and does not require
knowledge of SQL: [http://www.metabase.com/](http://www.metabase.com/)
[https://github.com/metabase/metabase](https://github.com/metabase/metabase)

Supports MySQL, PostgreSQL, MongoDB, Microsoft SQL Server, AWS Redshift,
Google BigQuery, Druid, H2, SQLite, Oracle, Crate, Google Analytics, Vertica.

~~~
nwenzel
Serious question even though as I read and reread it, I am aware it does sound
like snark...

How is this a sustainable product/company? (AKA How do you make money?)

Is love to dig in to what looks like an awesome product. But to justify the
time investment, it would be helpful to know how it will be around long term.

~~~
salsakran
We offer white label licenses for embedding, do a bit of hosting and are going
to be launching adjacent (paid) products.

(another person from Metabase)

------
golergka
I strongly feel that if a domain language is not significantly more complex
than the domain itself, then the person who is supposed to have domain
knowledge should be able to learn the language without much difficulty. I've
seen a lot of attempts to build 'simple' analytical systems without SQL, and
it works well for simple queries - but 'business people', when they're smart
and motivated, soon start asking more and more specific and detailed
questions, with additional conditions and joins - and it soon turns out that
SQL is, in fact, an appropriate instrument for complexity that they need.

Your examples seem pretty simple. What about a question like this: how did our
5-week cohort (users that at the time of the event were registered between 4
and 5 weeks) ARPU vary over time and over character class they chose at
registration? Is there a statistically significant corellation between
participating in any of the 20 custom events we've done in the last half a
year and user retention (which must be calculated relative to each users'
cohort)?

Of course, a lot of business analysts are moving away from SQL to R, but I
doubt that they're after simplicity.

~~~
mariusandra
Indeed my examples are simple. However this was the problem: there was no tool
around to easily answer those simple questions. Now there is... at least one
that fits my arbitrary definitions of what such a tool should do.

Hopefully having a tool to ask these simple queries will give rise to smarter
questions, which will either prompt the development of custom dashboards or
improvements in the tool itself.

I agree in theory that "business people", if and when they start asking these
questions, can and should learn SQL. However my experience seems to point the
other way. They'll ask the developer to do more and more reports, eventually
hiring someone for the "data analyst" role.

~~~
golergka
> eventually hiring someone for the "data analyst" role

Oh, I was presuming that a 'business person' and 'data analyst' is the same
person. However, usually 'data analyst's job isn't just to write SQL - he's a
person who understands the product, marketing, dives into data and researches
it, and comes back with actionable insights for the whole business.

And if you're looking for easier scenarios, Excel and it's pivot tables are
usually quite enough.

------
juandazapata
Hey, this is cool. Thanks for making it available for free to the world.

Note apart, I feel like knowing SQL is something that every developer should
be comfortable with. The amount of magic that an ORM does, will tend to zero
the more SQL you know. It's weird that we as developers don't know how to
query our main data repository, don't you think so?

Once again, thanks for making this available, I feel like this would be useful
for a lot of people.

~~~
mariusandra
Hi, I totally agree that every developer should know SQL. Even if you're a
COBOL developer.

The point of this tool is to remove the developer as a bottleneck for when
business people want to ask questions.

I will eventually add a "generated SQL" view where you can see how the sausage
is made if needed... but I assume most of the time it'll stay hidden.

------
bigger_cheese
Looks interesting.

A few suggestions for useful features: Time Series functionality would be
handy (Exponentially weighted moving average, centered moving average etc).

Aggregations across sliding intervals (by shift / by day/week/month/ Finical
year etc). Being able to quickly answer questions like What is the production
for week to date, current shift etc. How about previous week/shift etc.

One other comment I will give is I find SQL to be useful as a protection
against product lock-in. It's essentially the lowest common denominator
language for dealing with data extraction/transformation etc. Our business has
been burnt before by having our reporting/BI written in a tool specific
language it made migrating away from the product very difficult. Ever since
I've very much been of the mindset to do as much using SQL as possible.

~~~
mariusandra
Hi,

We already support the most basic sliding intervals: day, week, month,
quarter, year. Custom ones would be convenient, I agree.

Check the menu when you hover the header of the column "week confirmed_at"
here: [http://insights-
demo.mariusandra.com/explorer?columns=Order....](http://insights-
demo.mariusandra.com/explorer?columns=Order.confirmed_at!week%2COrder.user.country.name%2COrder.id!!count&sort=-Order.confirmed_at!day&treeState=Order%2COrder.user%2COrder.user.country&graphTimeFilter=last-365&facetsColumn=Order.user.country.name&facetsCount=6&graphCumulative=false&percentages=false&filter%5B0%5D=Order.confirmed%3Dequals%3Atrue)

The plan is to add a view called "generated SQL", where you can see and
copy/export the raw query that went into generating the data. This way you can
always validate the data and get some "insight" into how it's fetched.

------
huy
Wanted to share about my BI startup here for those who're interested too:
holistics.io - we're a small, self-funded BI startup in Asia (Singapore).
Started 2 years ago and have customers around the globe, from small startups
to big unicorns.

[https://www.holistics.io](https://www.holistics.io)

We're not free or open-source, but a lot more affordable than other
competitors. Our pricing starts with $50/mo for 5 users. We have a few
customers who moved to us from open-source alternatives or other commercial
competitors.

We support PostgreSQL, MySQL, SQL Server, Redshift, Greenplum, BigQuery,
Presto, etc.

If you try it out let us know you're from HN and your first 2 months are on
us.

------
grizzles
I'd like to build something in this area that used NLU
([https://en.wikipedia.org/wiki/Natural_language_understanding](https://en.wikipedia.org/wiki/Natural_language_understanding))
to let the user ask an actual question, after they have marked up their
database schema. That would be a fun product to build.

~~~
mariusandra
Go for it :)

------
mrw34
For anyone interested in running a Docker/CentOS instance:
[https://gist.github.com/mrw34/d60fa30a08c864286e819647fc0eaa...](https://gist.github.com/mrw34/d60fa30a08c864286e819647fc0eaac2)

------
reilly3000
Does it support Redshift?

~~~
mariusandra
I haven't tried, but if it doesn't, it shouldn't be hard to add.

So far I have only tried with PostgreSQL (full support) and SQLite (some date
functions break), however any SQL dialect should work fine... except maybe
when it comes to the date functions, which everyone has implemented
differently.

~~~
ProblemFactory
RedShift is very close to PostgreSQL (it even uses PostgreSQL drivers to
connect to their clusters from programming languages or data IDEs).

A sneakier difference is query latency. RedShift is scalable for queries that
scan through very large amounts of data - but there is overhead for each
query. One complicated JOIN can be faster than 50 small queries that ORMs and
web apps like to make.

