Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: SQueaLy – Fast track analytics for business (github.com/hashedin)
163 points by swapnil95 on May 11, 2017 | hide | past | favorite | 41 comments



Good work with the README, actually explains exactly what the project is, with a very clear outline and screenshots.

Well done.


Thanks Josh. Hope to get more feedbacks from you.


Looks cool but unless I can run this locally it's a non-starter for our business. I opened an issue for clarification on how best to do that.


Hey, I have replied to your issue on github


Very nice -- looks like an open source Klipfolio alternative. Which would be great because Klipfolio has some serious limitations.


Yes, also with contrast to klipfolio, squealy lets you deploy the application within your own heroku account which eliminates the need to share database credentials with any third-party product.


If I read the docs correctly on JinjaSQL, it essentially allows you parameterize your queries, but doesn't let you build them dynamically like SQLAlchemy would.

Can someone comment on what's the benefit of this, as opposed to using parameterization and SQL-injection protections offered by say SQLAlchemy or other similar tools?


Author of jinjasql. The whole point of jinjasql is to dynamically build queries.

There comes a time when you need the power of sql, and an ORM gets in the way. Think unions, sql functions, group by, more complex sql. Think dynamically generated sql queries based on some data structure.

In such cases, you have to keep track of how many variables you are capturing in the query, and then manually bind them.

Jinjasql is just a query generator. At the end of the day, it doesn't actually execute the query. You take the query it generates, and the array of parameters it gives - and then execute it using traditional means using bind parameters.

This approach gives you the power of a template language to generate the query. You can create reusable macros, conditionals and other features that a template language provides.


Thank you for your response! Very interested to understand if JinjaSQL could have made the following easier?

I just prototyped a very expressive reporting API that takes database driven configurations for the reporting columns (dimensions and metrics), validates inputs and allows very expressive filtering (including partial string matches, etc.) in Flask and SQLAlchemy.

Did not use the ORM part, but rather the query API: http://docs.sqlalchemy.org/en/latest/orm/query.html

Queries included a with statement, the main query had 4 nested subqueries (4 level cascade basically) for ranking, filtering and ordering.

The prototype supports queries like this: Give me best performing shows and their stats in my top 5 countries by the number of viewers I have in each of these top countries, but I want shows not broken down by country, but instead by season and episode, including counts of uniques at each level. (Sorry don't have a copy of the resulting SQL on this computer, but it's fairly long.)

Queries I designed up front, and reverse engineered into being dynamically generated by SQLAlchemy query API. However, by not having a fixed template, I can simply update my reporting table configurations and not have to worry about updating the templates or the query structure.

When the user requests something that does not have top dimension and topN results, I can just omit that part of the query and build valid syntax without it.

All-in-all, the entire API is less than 1k lines of python.


Jinjasql is meant for reporting use cases, so yes - it should be able to help you. However, when we made JinjaSQL, we didn't want to build a super-expressive reporting API. Instead, for every report/chart we wanted to make, we wrote a comparatively simpler query and used jinjasql to get it working.

If you have a single complex query that is generating all your reports based on database configuration - I'm guessing you need the full power of python. So while JinjaSQL could have save some effort, I'm not sure it'd save you a lot.

Hope that helps!


Thanks!

The reports are all API driven. User specifies dimensions and metrics, as well as filters, etc. The API returns the JSON response.

I can't predict what users will want to visualize in the future, so dedicated templates for each of the visualizations are not practical for my use case.

The goal is to minimize dev work on the API and let front end config dictate what columns the API returns.


Let's assume you have a python object "userquery" that has the user specifications - including dimensions, metrics, filters and the cube to query. I'm also assuming you have whitelisted all the parameters in this object per your database model, otherwise you are open to security issues.

Here's how you could write that in jinjasql. I'm writing this out of memory, and it has errors/issues - but I hope you get the idea.

    SELECT
    {% for metric in userquery.metrics %}
        {{ metric.id }} as '{{metric.description}}'
    {% endfor %}
    FROM {{ userquery.cube }}
    WHERE 
    {% for filter in userquery.filters %}
        {{ filter.key }} {{ filter.operator }} {{ filter.value }}
    {% endfor %}
    GROUP BY 
    {% for dimension in userquery.dimensions %}
        {{ dimension }}
    {% endfor %}
This way, your template doesn't have to map to exactly one visualization.

Now, another interesting this in the security aspect. You can also pass the logged in user data model, and then append a where clause that restricts the rows based on the logged in user id or his role or whatever else is your application security model.

I'm not sure how much effort it would have saved you, but it does help us a lot - mostly because we don't have to translate between SQL and the way SQL ALchemy / ORM works.


Thanks for the example, this helps a lot in my understanding of this!

It's pretty similar to what I did, but the template would be a lot more complex due to various column definitions and calculations, for example uniques require the base query to have the IDs of whatever I'm counting, but all in all I see the use case.


Is that safe from sqli prospective? It would inject table name and all, which I believe you can't bind...


You're right. That's why I said upfront - whitelist values in the userquery object.


Are you guys planning to give support to Athena? In general to other databases also.


Yes SQueaLy does support Athena. As of now we support Athena, Redshift, Mysql, Postgres and sqlite

Soon we will be adding support for Mssql and Cassandra also


Pretty cool! We've been using Periscope Data at work and I've found the SQL -> Charts way pretty amazing, but the cost of that program makes it impossible for individuals to use. I'm psyched to try this out for my personal projects.


> One of the co-developers here:

Thanks kyyd, Periscope Data was definitely on our minds while planning this project. Please share your feedback when you use it.


Interesting stuff . Is there any demo site available to play around with the tool ?


No there isn't one yet, but we are planning to have a marketing site which will include the demo as well.

For now, all you need is a publicly accessible database and a free heroku account. Squealy supports one-click deploy to heroku. You can deploy within minutes and play around with the tool.

Please share your feedback with us once you use it.


I'm working on something a bit similar, it's generating around 200 different recurring reports at the moment.

Did you do anything to make it work with large datasets? Is the web-frontend still responsive if a query returns 1 million rows for example?


On the web frontend, we paginate the reports and so it is supposed to be responsive with large datasets. We tested this with around 10K rows, we'll surely scale-up our tests for this.

BTW which charting library are you using for your reports?


On the few places in the frontend where charts are use I am using chart.js, which I'm happy with so far.

Most of the recurring reports simply spits out excel or csv files with no graphs.


I'm also doing something similar and using c3.js for charts.


Can you post your project?


You mean the source? I am not in a position to post the source, it's a product I'm developing with one client (who has it running) and another prospective client.

We haven't had time to make a marketing/demo site for it yet, sales has been through networking so far.


How does it compare to metabase ?


Metabase dev here.

Metabase is a general purpose BI tool that anyone in your company can use (whether they know SQL or not) to ask questions about your data, make charts and dashboards, etc.

It sounds like Squealy is primarily focused on generating APIs from SQL for integration in another applications.

That isn't Metabase's primary focus right now, but we do have the ability to embed charts or dashboards in other applications, as well as JSON and CSV APIs for each question (though that's not well documented).

Each question or dashboard can be parameterized, for example by a customer ID, and authenticated via JWT.

This works for questions built using our UI as well as SQL. We don't have a full template system in SQL yet, but we do let you add parameters to your SQL.

If you feel like we're missing a feature that would be useful, please file an issue! https://github.com/metabase/metabase/issues


This was my question as well. Metabase is continuously getting better - why would one divest themselves from metabase for this? What are the primary advantages?


- Fine-grained authorized REST APIs

- Cron based scheduled emails embedded with reports

- Auto-generated documentation for all the REST APIs using swagger

- More than just SQL - SQueaLy uses JinjaSQL behind the scenes which gives you power to use JINJA inside the SQL query.

You can refer the readme for more details


Squealy focuses more on APIs rather than giving out-of-the box visualizations. With custom APIs, you can easily embed the reports within your own dashboard that can be shared with all your customers.

Also squealy lets you write Jinja templated SQL queries that's helpful in extracting some complex insights from the data.


Interesting. If someone here have used https://openbouquet.io/ before - how is SQeaLy different for Boquet?


I have not used Boquet but going through their documentation it looks like they help developers to get rid of SQL by providing some Javascript plugin in order to filter the reporting data from the database.

SQueaLy's approach is a bit different. It is rather focused on writing powerful and dynamic SQL queries. SQueaLy uses JinjaSQL behind the scenes which gives you the power to use JINJA inside the SQL query. For more details, check out JinjaSQL here - https://github.com/hashedin/jinjasql

With JinjaSQL in the scene, you can access real-time variables related to

- A logged-in user

- Web session

- URL query parameters

inside your SQL queries.

You can then use Jinja templating inside the SQL query to handle conditional statements or create macros based on these real time variables


Interesting, lots of solutions popping up in this space. We're using re:dash at the moment, quite happy though this looks to have a bigger focus on APIs rather than built in visualization


Yes you got it right. The focus of SQueaLy is on providing fine grained authorized REST APIs with auto generated documentation for all the REST APIs using swagger.


have not seen comparable solutions other then metabase maybe ?


nice software are you planning to support mssql ?


We hadn't thought about it honestly, mostly because we don't use MSSQL as much with Python. However, it shouldn't be too difficult to add support if we have the right drivers. I've created a ticket here to keep track of it - https://github.com/hashedin/squealy/issues/223.


there are several MSSQL drivers, but i could not understand from the project file structure support should be added to jinjasql or SQuealy ?


You'd add that to Squealy.

JinjaSQL simply generates a SQL query with an associated array of parameters. It isn't even aware of the SQL query that's written. JinjaSQL would work just fine with any database.

Squealy is a django project, so if you can get the right database driver configured and installed - in theory, squealy would just work fine with MSSQL. After all, the user is writing the query, so squealy doesn't care much about the underlying differences in the way SQL is written.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: