
Show HN: Open-Source Business Intelligence for BigQuery – Looker Alternative - akalitenya
https://mprove.io
======
segah
@akalitenya are you even in the clear with this? Some of the Old LookML syntax
is an exact copy.

But more importantly, the challenge for any such tool is to go beyond use by
2-3 people. At 2-3 people anything will work. Where BI tools (open source and
closed source) struggle is scale: having all the right features for,
essentially, a group of users who actually don't know how to work with data
(did I just say that aloud?). Chartio caps at 20 people. RJ capped at 50-100
(and later became Stitch for that reason). We haven't seen where Metabase
caps, but I bet it is in a similar range. Very few BI products have actually
surpassed 100 users at target installations. And beyond 1,000 is a real
challenge that only a few, and even then with a lot of assistance, can support:
Tableau, Looker, MicroStrategy, maybe Birst, maybe Domo.

Also, a combination of BI with LookML is a complicated product. During my days
at Looker, we were handling 50+ bugs / week, and filing 1,000+ tickets. Every
day we were filing over 100 new feature requests.

So with all that, the question is, is it really worth the struggle? What's the
end vision for supporting this? Why should someone who implements BI for a
living bet on this product?

~~~
akalitenya
> Some of the Old LookML syntax is an exact copy.

I came across the online Looker demo a few years ago when I was looking for a
business intelligence tool for another project.

Looker is closed source, so I could not see what algorithm it uses to build
queries.

I kept those LookML features that I understood and liked. However, in some
places LookML is confusing (for example, references and aliases), so I
implemented those differently.

Later, Looker stopped using YAML.

> Very few BI products have actually surpassed 100 users at target
> installations. And beyond 1,000 is a real challenge that only a few, and
> even then with a lot of assistance, can support: Tableau, Looker,
> MicroStrategy, maybe Birst, maybe Domo.

Scaling to that size is not a top priority right now.

> Also, a combination of BI with LookML is a complicated product.

You should have told me this a couple of years ago. It seemed pretty simple.

> So with all that, the question is, is it really worth the struggle?

Yes, if people use it.

> What's the end vision for supporting this?

In the future, maybe the "skinny" or "thin" option mentioned here:
[https://medium.com/open-consensus/2-open-core-definition-
exa...](https://medium.com/open-consensus/2-open-core-definition-examples-
tradeoffs-e4d0c044da7c)

> Why should someone who implements BI for a living bet on this product?

If you cannot afford Looker (like me) and want to use a similar product.

~~~
endlessvoid94
I used Looker for years and have always wanted a product priced less for the
enterprise and more for developers. I even prototyped a similar tool myself.

Congratulations on the launch. I will be using mprove and will give you
feedback along the way.

EDIT: I would love to get in touch with you and learn more about what your
plans are. Possibly collaborate if you'd be open to that. I'm
dpaola2@gmail.com -- I couldn't find your contact info anywhere!

~~~
akalitenya
Thank you, I sent you an email. I am open to any suggestions -
akalitenya@mprove.io.

------
danpalmer
The headlining feature seems to be that it has a dark and a light theme. This
isn’t very encouraging for the rest of the product. I’m sure there’s a lot of
good stuff here, but themes aren’t important enough to be the first feature
mentioned.

~~~
morenoh149
+1 it should not be the first feature listed, bump that down

------
sandGorgon
This is super cool and I would pay for this. BigQuery is a cheap alternative
to a lot of the mobile analytics tools.

Quick point however: why do you need a new database? You can use a table
inside BigQuery itself. It seriously reduces the dependencies required.

~~~
akalitenya
Mprove creates permanent derived tables in BigQuery if the user wants it.

But you can't use BigQuery for OLTP.

~~~
sandGorgon
No - I'm talking about MySQL being needed as a dependency.

[https://github.com/mprove-
io/mprove/blob/master/deploy/docke...](https://github.com/mprove-
io/mprove/blob/master/deploy/docker/ce-prod/docker-compose.yml)

Can you not use the BigQuery database itself? Create tables for your internal
use instead of MySQL?

~~~
akalitenya
The MySQL image is specified in the docker-compose file you mentioned. It is
used for the internal data of the Mprove application (users, projects,
members, etc.). Each user action in the web client (Angular) can initiate
several queries to this database through a backend request. Two kinds of
delay are critical here: network latency, so you need to keep the database as
close as possible to your server side, and read / write delays for queries
such as finding a user, setting a username, creating a member, etc.

~~~
sandGorgon
Hmm... I would prefer to not have it. In production, managing database
persistence is very hard. Especially when you go down the Kubernetes road.

I would take higher latency, but avoid pulling in a whole database
infrastructure.

Plus a huge number of us use PostgreSQL, so that becomes another mess. I would
strongly urge you to do this on the same BigQuery database that you would
connect to anyways.

~~~
akalitenya
BigQuery exists only in Google Cloud. It is a columnar database like
Redshift. It is not designed for fast processing of the small queries that
are necessary to support the operation of any application. Its main feature is
that it can scale the execution of heavy analytical queries across 10,000
nodes transparently for the end user. Users have no control over BigQuery
instances. It is a very cost-effective cloud analytical database as a service,
to be used as a centralized data warehouse for a company of any size.

------
vgt
BigQuery PM here. Nice work!

Here's a recently built Graphite connector as well:

[https://twitter.com/vadimska/status/1112816503055843330](https://twitter.com/vadimska/status/1112816503055843330)

------
siculars
Will this handle (repeatable) “record” (struct/array) data types natively?

/disclaimer: work in google cloud

~~~
akalitenya
Yes, it should. There is a special "unnest" parameter for fields in the
BlockML reference:
[https://mprove.io/docs/blockml/fields/dimension](https://mprove.io/docs/blockml/fields/dimension)
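
As a rough illustration of what unnesting a repeated record means, here is a
plain Python sketch. It is not BlockML or BigQuery code, and the `unnest`
helper and the sample `orders` rows are made up for the example. Each element
of the repeated field becomes its own row, with the parent columns copied
alongside it, much like a `CROSS JOIN UNNEST(...)` in BigQuery SQL.

```python
def unnest(rows, field):
    """Flatten a repeated record field into one output row per element.

    Hypothetical helper for illustration only. Rows whose repeated field
    is empty produce no output rows, as with a CROSS JOIN UNNEST.
    """
    flat = []
    for row in rows:
        # Copy every parent column except the repeated field itself.
        parent = {k: v for k, v in row.items() if k != field}
        for element in row[field]:
            # Prefix child columns with the field name to avoid clashes.
            child = {f"{field}.{k}": v for k, v in element.items()}
            flat.append({**parent, **child})
    return flat


# Invented sample data: one order with two items, one order with none.
orders = [
    {"order_id": 1, "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}]},
    {"order_id": 2, "items": []},
]

for flat_row in unnest(orders, "items"):
    print(flat_row)
```

Note that the parent row repeats once per element, which is exactly the kind
of duplication that makes plain sums over parent columns unsafe.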

~~~
segah
wow, impressive! how about symmetric aggregates (e.g. being able to do correct
summation/aggregation on numeric values despite a one_to_many join)?

~~~
akalitenya
Yes, look at this page:
[https://mprove.io/docs/blockml/fields/measure](https://mprove.io/docs/blockml/fields/measure).
You need to specify the measure "type" and the "sql_key" that will be used to
avoid counting duplicates.
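
To see why a key is needed at all, here is a small sketch of the join fan-out
problem. It uses Python's built-in `sqlite3` module rather than BigQuery, the
`cohorts` and `users` tables and their values are invented, and the `DISTINCT`
subquery is just one simple way to deduplicate by key, not necessarily what
Mprove generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cohorts (id INTEGER PRIMARY KEY, population INTEGER);
    CREATE TABLE users (id INTEGER PRIMARY KEY, cohort_id INTEGER);
    INSERT INTO cohorts VALUES (1, 100), (2, 250);
    INSERT INTO users VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 2);
""")

# Naive sum: each cohort row is repeated once per matching user,
# so population is counted 2 times for cohort 1 and 3 times for cohort 2.
naive_total = conn.execute("""
    SELECT SUM(c.population)
    FROM cohorts AS c
    JOIN users AS u ON u.cohort_id = c.id
""").fetchone()[0]

# Keyed sum: collapse the joined rows back to one row per cohort id
# before summing, so each population is counted exactly once.
keyed_total = conn.execute("""
    SELECT SUM(population) FROM (
        SELECT DISTINCT c.id, c.population
        FROM cohorts AS c
        JOIN users AS u ON u.cohort_id = c.id
    )
""").fetchone()[0]

print(naive_total)  # 950
print(keyed_total)  # 350
```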

~~~
infinite8s
How do the underlying queries look for symmetric aggregates? Sadly SQL never
supported the ability to compute an aggregate based on the unique values of
another column.

~~~
akalitenya
Mprove does it the way Looker used to:

    
    
       CREATE TEMPORARY FUNCTION mprove_array_sum(ar ARRAY<STRING>) AS
      ((SELECT SUM(CAST(REGEXP_EXTRACT(val, '\\|\\|(\\-?\\d+(?:.\\d+)?)$') AS FLOAT64)) FROM UNNEST(ar) as val));
    

...

    
    
       SELECT COALESCE(mprove_array_sum(ARRAY_AGG(DISTINCT CONCAT(CONCAT(CAST(a.id AS STRING), '||'), CAST(a.population AS STRING)))), 0) as a_cohort_size
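
What that query does can be imitated in a few lines of Python. This is only a
sketch of the idea, not Mprove's code: the function names are hypothetical,
and the regular expression is a Python rewrite of the one in the SQL (with the
decimal point escaped). Each row is turned into a `key||value` string, the
duplicates are dropped, and the numeric tail of each surviving string is
summed.

```python
import re

# Matches the numeric value that follows the last "||" in a tagged string.
_VALUE_AT_END = re.compile(r"\|\|(-?\d+(?:\.\d+)?)$")


def array_sum(tagged_values):
    """Sum the numeric tails of 'key||value' strings (hypothetical helper)."""
    total = 0.0
    for item in tagged_values:
        match = _VALUE_AT_END.search(item)
        if match:  # strings without a numeric tail are skipped
            total += float(match.group(1))
    return total


def symmetric_sum_via_tags(rows):
    """rows: iterable of (key, value) pairs, possibly duplicated by a join."""
    # The set plays the role of ARRAY_AGG(DISTINCT CONCAT(key, '||', value)).
    tagged = {f"{key}||{value}" for key, value in rows}
    return array_sum(tagged)


# Invented sample: key 1 appears twice and key 2 three times after a join.
joined_rows = [(1, 500), (1, 500), (2, 500), (2, 500), (2, 500)]
print(symmetric_sum_via_tags(joined_rows))
```

Because the key is part of the string, two different keys with the same value
are both kept, which a plain `SUM(DISTINCT value)` would get wrong.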
    

Recently, Looker began to do it differently, most likely to improve BigQuery
performance:

    
    
      COALESCE(ROUND(COALESCE(CAST( ( SUM(DISTINCT (CAST(ROUND(COALESCE(lesson_5_cohorts.population ,0)*(1/1000*1.0), 9) AS NUMERIC) + (cast(cast(concat('0x', substr(to_hex(md5(CAST(lesson_5_cohorts.id  AS STRING))), 1, 15)) as int64) as numeric) * 4294967296 + cast(cast(concat('0x', substr(to_hex(md5(CAST(lesson_5_cohorts.id  AS STRING))), 16, 8)) as int64) as numeric)) * 0.000000001 )) - SUM(DISTINCT (cast(cast(concat('0x', substr(to_hex(md5(CAST(lesson_5_cohorts.id  AS STRING))), 1, 15)) as int64) as numeric) * 4294967296 + cast(cast(concat('0x', substr(to_hex(md5(CAST(lesson_5_cohorts.id  AS STRING))), 16, 8)) as int64) as numeric)) * 0.000000001) )  / (1/1000*1.0) AS FLOAT64), 0), 6), 0) AS lesson_5_cohorts_m_sum_distinct
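
As far as it can be read from the SQL, the idea behind that expression seems
to be: add a large number derived from the MD5 of the key to each value, take
`SUM(DISTINCT ...)` of the result, then subtract the `SUM(DISTINCT ...)` of
the key-derived numbers alone. Below is a simplified Python sketch of that
idea. It uses integer fixed-point arithmetic with an assumed `SCALE` instead
of the NUMERIC scaling in the query, and `key_offset` and
`symmetric_sum_via_hash` are hypothetical names.

```python
import hashlib

SCALE = 10**6  # assumed fixed-point precision: six decimal places


def key_offset(key):
    """Large integer derived from the first 23 hex digits of MD5(key).

    Equals int(h[:15], 16) * 4294967296 + int(h[15:23], 16), the same
    two-part combination the SQL builds from substr(..., 1, 15) and
    substr(..., 16, 8).
    """
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest[:23], 16)


def symmetric_sum_via_hash(rows):
    """rows: iterable of (key, value) pairs, possibly duplicated by a join."""
    rows = list(rows)
    # SUM(DISTINCT value + offset(key)): duplicate rows collapse in the set.
    shifted = {round(value * SCALE) + key_offset(key) for key, value in rows}
    # SUM(DISTINCT offset(key)): removes the offsets that were added above.
    offsets = {key_offset(key) for key, _ in rows}
    return (sum(shifted) - sum(offsets)) / SCALE


# Invented sample: key 1 appears twice and key 2 three times after a join.
joined_rows = [(1, 100.0), (1, 100.0), (2, 250.5), (2, 250.5), (2, 250.5)]
print(symmetric_sum_via_hash(joined_rows))
```

The approach relies on the shifted values of different keys not colliding,
which is very unlikely with a 92-bit offset but not strictly guaranteed.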

------
verdverm
Link to the open source part?

I can't seem to find any on the site...

~~~
rahimnathwani
There's a github link at the top of the page:

[https://github.com/mprove-io/mprove](https://github.com/mprove-io/mprove)

~~~
verdverm
Oh, had to rotate my phone, does not show up in the menu on a smaller screen
layout.

~~~
pugworthy
Wow, very different menus on mobile based on orientation. Vertical has
“Pricing” and no open source mention. Horizontal gives a GitHub link and no
mention of pricing.

I assume there was a time when this was meant to be a non-open source project.

~~~
akalitenya
Thanks, I just fixed it.

------
yantra_ml
Do you support on-prem?

~~~
akalitenya
The source code is open. Anyone can deploy Mprove to their own server and use
it for free.

