
Show HN: SQueaLy – Fast track analytics for business - swapnil95
https://github.com/hashedin/squealy
======
thejosh
Good work with the README, actually explains exactly what the project is, with
a very clear outline and screenshots.

Well done.

~~~
swapnil95
Thanks Josh. Hope to get more feedbacks from you.

------
joshstrange
Looks cool but unless I can run this locally it's a non-starter for our
business. I opened an issue for clarification on how best to do that.

~~~
swapnil95
Hey, I have replied to your issue on github

------
timsayshey
Very nice -- looks like an open source Klipfolio alternative. Which would be
great because Klipfolio has some serious limitations.

~~~
dev2302
Yes, also with contrast to klipfolio, squealy lets you deploy the application
within your own heroku account which eliminates the need to share database
credentials with any third-party product.

------
sologoub
If I read the docs correctly on JinjaSQL, it essentially allows you
parameterize your queries, but doesn't let you build them dynamically like
SQLAlchemy would.

Can someone comment on what's the benefit of this, as opposed to using
parameterization and SQL-injection protections offered by say SQLAlchemy or
other similar tools?

~~~
ksri
Author of jinjasql. The whole point of jinjasql is to dynamically build
queries.

There comes a time when you need the power of sql, and an ORM gets in the way.
Think unions, sql functions, group by, more complex sql. Think dynamically
generated sql queries based on some data structure.

In such cases, you have to keep track of how many variables you are capturing
in the query, and then manually bind them.

Jinjasql is just a query generator. At the end of the day, it doesn't actually
execute the query. You take the query it generates, and the array of
parameters it gives - and then execute it using traditional means using bind
parameters.

This approach gives you the power of a template language to generate the
query. You can create reusable macros, conditionals and other features that a
template language provides.

~~~
sologoub
Thank you for your response! Very interested to understand if JinjaSQL could
have made the following easier?

I just prototyped a very expressive reporting API that takes database driven
configurations for the reporting columns (dimensions and metrics), validates
inputs and allows very expressive filtering (including partial string matches,
etc.) in Flask and SQLAlchemy.

Did not use the ORM part, but rather the query API:
[http://docs.sqlalchemy.org/en/latest/orm/query.html](http://docs.sqlalchemy.org/en/latest/orm/query.html)

Queries included a with statement, the main query had 4 nested subqueries (4
level cascade basically) for ranking, filtering and ordering.

The prototype supports queries like this: Give me best performing shows and
their stats in my top 5 countries by the number of viewers I have in each of
these top countries, but I want shows not broken down by country, but instead
by season and episode, including counts of uniques at each level. (Sorry don't
have a copy of the resulting SQL on this computer, but it's fairly long.)

Queries I designed up front, and reverse engineered into being dynamically
generated by SQLAlchemy query API. However, by not having a fixed template, I
can simply update my reporting table configurations and not have to worry
about updating the templates or the query structure.

When the user requests something that does not have top dimension and topN
results, I can just omit that part of the query and build valid syntax without
it.

All-in-all, the entire API is less than 1k lines of python.

~~~
ksri
Jinjasql is meant for reporting use cases, so yes - it should be able to help
you. However, when we made JinjaSQL, we didn't want to build a super-
expressive reporting API. Instead, for every report/chart we wanted to make,
we wrote a comparatively simpler query and used jinjasql to get it working.

If you have a single complex query that is generating all your reports based
on database configuration - I'm guessing you need the full power of python. So
while JinjaSQL could have save some effort, I'm not sure it'd save you a lot.

Hope that helps!

~~~
sologoub
Thanks!

The reports are all API driven. User specifies dimensions and metrics, as well
as filters, etc. The API returns the JSON response.

I can't predict what users will want to visualize in the future, so dedicated
templates for each of the visualizations are not practical for my use case.

The goal is to minimize dev work on the API and let front end config dictate
what columns the API returns.

~~~
ksri
Let's assume you have a python object "userquery" that has the user
specifications - including dimensions, metrics, filters and the cube to query.
I'm also assuming you have whitelisted all the parameters in this object per
your database model, otherwise you are open to security issues.

Here's how you could write that in jinjasql. I'm writing this out of memory,
and it has errors/issues - but I hope you get the idea.

    
    
        SELECT
        {% for metric in userquery.metrics %}
            {{ metric.id }} as '{{metric.description}}'
        {% endfor %}
        FROM {{ userquery.cube }}
        WHERE 
        {% for filter in userquery.filters %}
            {{ filter.key }} {{ filter.operator }} {{ filter.value }}
        {% endfor %}
        GROUP BY 
        {% for dimension in userquery.dimensions %}
            {{ dimension }}
        {% endfor %}
    

This way, your template doesn't have to map to exactly one visualization.

Now, another interesting this in the security aspect. You can also pass the
logged in user data model, and then append a where clause that restricts the
rows based on the logged in user id or his role or whatever else is your
application security model.

I'm not sure how much effort it would have saved you, but it does help us a
lot - mostly because we don't have to translate between SQL and the way SQL
ALchemy / ORM works.

~~~
sheeshkebab
Is that safe from sqli prospective? It would inject table name and all, which
I believe you can't bind...

~~~
ksri
You're right. That's why I said upfront - whitelist values in the userquery
object.

------
swayamjeet
Are you guys planning to give support to Athena? In general to other databases
also.

~~~
swapnil95
Yes SQueaLy does support Athena. As of now we support Athena, Redshift, Mysql,
Postgres and sqlite

Soon we will be adding support for Mssql and Cassandra also

------
kyyd
Pretty cool! We've been using Periscope Data at work and I've found the SQL ->
Charts way pretty amazing, but the cost of that program makes it impossible
for individuals to use. I'm psyched to try this out for my personal projects.

~~~
dev2302
> One of the co-developers here:

Thanks kyyd, Periscope Data was definitely on our minds while planning this
project. Please share your feedback when you use it.

------
richa100994
Interesting stuff . Is there any demo site available to play around with the
tool ?

~~~
dev2302
No there isn't one yet, but we are planning to have a marketing site which
will include the demo as well.

For now, all you need is a publicly accessible database and a free heroku
account. Squealy supports one-click deploy to heroku. You can deploy within
minutes and play around with the tool.

Please share your feedback with us once you use it.

------
mclemme
I'm working on something a bit similar, it's generating around 200 different
recurring reports at the moment.

Did you do anything to make it work with large datasets? Is the web-frontend
still responsive if a query returns 1 million rows for example?

~~~
swapnil95
On the web frontend, we paginate the reports and so it is supposed to be
responsive with large datasets. We tested this with around 10K rows, we'll
surely scale-up our tests for this.

BTW which charting library are you using for your reports?

~~~
mclemme
On the few places in the frontend where charts are use I am using chart.js,
which I'm happy with so far.

Most of the recurring reports simply spits out excel or csv files with no
graphs.

------
ponytech
How does it compare to metabase ?

~~~
auston
This was my question as well. Metabase is continuously getting better - why
would one divest themselves from metabase for this? What are the primary
advantages?

~~~
swapnil95
\- Fine-grained authorized REST APIs

\- Cron based scheduled emails embedded with reports

\- Auto-generated documentation for all the REST APIs using swagger

\- More than just SQL - SQueaLy uses JinjaSQL behind the scenes which gives
you power to use JINJA inside the SQL query.

You can refer the readme for more details

------
ashtavakra
Interesting. If someone here have used
[https://openbouquet.io/](https://openbouquet.io/) before - how is SQeaLy
different for Boquet?

~~~
swapnil95
I have not used Boquet but going through their documentation it looks like
they help developers to get rid of SQL by providing some Javascript plugin in
order to filter the reporting data from the database.

SQueaLy's approach is a bit different. It is rather focused on writing
powerful and dynamic SQL queries. SQueaLy uses JinjaSQL behind the scenes
which gives you the power to use JINJA inside the SQL query. For more details,
check out JinjaSQL here -
[https://github.com/hashedin/jinjasql](https://github.com/hashedin/jinjasql)

With JinjaSQL in the scene, you can access real-time variables related to

\- A logged-in user

\- Web session

\- URL query parameters

inside your SQL queries.

You can then use Jinja templating inside the SQL query to handle conditional
statements or create macros based on these real time variables

------
jordanthoms
Interesting, lots of solutions popping up in this space. We're using re:dash
at the moment, quite happy though this looks to have a bigger focus on APIs
rather than built in visualization

~~~
swapnil95
Yes you got it right. The focus of SQueaLy is on providing fine grained
authorized REST APIs with auto generated documentation for all the REST APIs
using swagger.

------
guguli
nice software are you planning to support mssql ?

~~~
ksri
We hadn't thought about it honestly, mostly because we don't use MSSQL as much
with Python. However, it shouldn't be too difficult to add support if we have
the right drivers. I've created a ticket here to keep track of it -
[https://github.com/hashedin/squealy/issues/223](https://github.com/hashedin/squealy/issues/223).

~~~
guguli
there are several MSSQL drivers, but i could not understand from the project
file structure support should be added to jinjasql or SQuealy ?

~~~
ksri
You'd add that to Squealy.

JinjaSQL simply generates a SQL query with an associated array of parameters.
It isn't even aware of the SQL query that's written. JinjaSQL would work just
fine with any database.

Squealy is a django project, so if you can get the right database driver
configured and installed - in theory, squealy would just work fine with MSSQL.
After all, the user is writing the query, so squealy doesn't care much about
the underlying differences in the way SQL is written.

