
Ask HN: Creating a query builder for end users - systematical
Our product is starting to get lots of requests for custom reports from our customers. I&#x27;ve taken some steps to really stream line the creation of the reports and exporting reports to PDF &#x2F; XLS format. This has really helped and most reports can be created fairly quickly with a very full feature set.<p>The other day I began thinking about giving users the power to create their own filters essentially creating their own queries through a  UI. This isn&#x27;t so bad because our framework, CakePHP, stores all the relations to the data and its easy to build on top of this.<p>The tricky part is striking a balance between usability and power. A concern of mine is that this sort of thing might be too complex to our non-techie customer base. Have you done something like this before? Do you have any great examples? How did you handle (or why did you not handle) complex things like: OR, AND, NOT IN, IN, &lt;, &gt;, BETWEEN etc... Specifically I&#x27;m interested in your thoughts on building an intuitive and beautiful UI.
======
stephenr
I faced the same basic concept a number of years ago, and we had to make it
work for barely computer literate police.

Basically, we used a single piece of software's single feature as our
reference, for describing it to others and for how to handle the experience.

iTunes Smart Playlists.
([https://www.evernote.com/shard/s136/sh/7f5f39a1-6c85-4853-aa...](https://www.evernote.com/shard/s136/sh/7f5f39a1-6c85-4853-aa86-970931ea73f3/6c8638c37d303d492f263d396ae27b3e))

I have yet to see a better interface (than iTunes) for building what are
essentially custom queries.

~~~
mneary
I think that this can be improved by removing the All and Any operators and
instead letting segmentations be nested. So for example:

    
    
      All
        a = b
        Any
          All
            c = d
            d = f
          c = e
    

would be

    
    
      a = b
        c = d
          d = f
        c = e
    

I think that it is more intuitive to segment the set as you go.

------
d4nt
I built QueryTree ([http://querytreeapp.com](http://querytreeapp.com)) as a
response to this sort of problem. It's a domain specific visual programming
language designed for working with tabular data. So people who want to build a
custom query on a database can do it just by dragging and dropping the right
"tool" onto the "worksheet".

It's a desktop app rather than web, although it is actually built using HTML,
canvas and JavaScript.

It can do _almost_ anything SQL can. It can't do "IN" queries yet but I have a
plan do that in a way that's consistent with the general drag and drop
approach of the app. It doesn't explicitly handle BETWEEN queries but they are
easily achieved by using two filter tools one after the other.

I think there are two types of complexity to be dealt with when enabling users
to build reports/queries. One is the complexity of the concepts involved, the
other is the complexity of communicating what you want to happen to the
computer. I built QueryTree because of a belief that many people understood
they the concept of a filter or a join, but just didn't like text based
programming languages. I replaced the SQL syntax but left everything else in
place. If I didn't think my target users were up to understanding the
conceptual complexity I would have restricted it a bit more and would probably
have ended up with something like IFTTT.com but for data. Mostly, people seem
to "get" QueryTree though, especially when they're using it on data that
they're familiar with.

~~~
j_s
Looks nice! What technologies are you using for cross-platform support in your
app?

~~~
d4nt
After evaluating all the desktop html frameworks I could find, I settled on
"node-webkit".

------
aaronblohowiak
I've done this when building the reporting side of a survey product.

The vast majority of your users won't use the complex filters.

Ask your customers what kinds of reports they want to build, usually it will
be a series of AND requirements.

You can look at email filter builders for inspiration.

Usually it is something like building a series of :

Field (drop down) operator (drop down) value (text)

The tricky part in the ui is combining these (and/or) and if you really want
to get fancy, supporting parena to change the precedence of the operators. A
AND (b or c).

You must not convert this to SQL on the client side. You must validate the
fields submitted are in your approved list on the backend (indexes/security).

~~~
capkutay
'You must not convert this to SQL on the client side. You must validate the
fields submitted are in your approved list on the backend (indexes/security).'

Interesting point. Can't you get around this by only exposing certain
tables/fields in the UI based on the users permissions?

~~~
shebson
You shouldn't trust it just because of your UI. With developer tools,
regardless of your UI, it would be trivially easy for a malicious user to send
any arbitrary SQL query. This would allow them to retrieve sensitive data or
modify/delete records from your production database. You really should only
build this query server-side, and, as the OP is using PHP, I should also add
the caveat that they should use PDO.

------
jwdunne
This is interesting because I'm building the exact same thing, which also has
the be plugged into a Cake app.

For choosing the tables to query, my idea is to present a grid. Each cell has
the table name and tables available through relations. When a table is
selected, the cells either remain "on" or are disabled based on relations
available. These will typically change depending on tables selected. Relations
are discovered as defined, say hasOne, hasMany, etc. this is step one.

Step 2 involves selecting the fields to include in reports. These are
typically prefixed with the model name and with underscores replaced with
spaces. So User.first_name becomes "User's first name", User's country name or
Country name, which I haven't worked out fully yet. I suppose some intuitive
way to combine these into virtual fields would be nice.

The hairy bit is selecting conditions. I get requests to hard code queries
such as finding the number of product ordered by users who have taken an
online course or an in-person course so I have to be flexible. This part is
tough to express in a UI simply, because X AND (Y OR Z) isn't the same as (X
AND Y) OR Z. However, having some way of grouping conditions and requiring
that such groups also have a binary operator by dragging and selecting seems
the way to go, a la selecting icons on a desktop.

All of this can be saved as a "virtual table" for reuse. Output is typically a
CSV for marketing purposes but my intention is to have this so I can plug in a
PDF renderer with nice charts and what not, like Google Analytics.

Though something tells me building all this may be overkill if I can find a
package that handles analytics and custom-defined events, along with the
ability to upload backdated data!

------
nairteashop
> The tricky part is striking a balance between usability and power.

Yeap, this is indeed the big issue. One thing that helps is to tune the UI for
your _specific domain_ so that it has just enough power to be usable in that
domain, but no more. We had to build a query tool for retailers in my last
gig, and tuning the UI just for retail helped a lot.

For example, make it easy to specify a BETWEEN operation in your UI if that
tends to be used often, provide commonly-used domain queries as templates so
your customers can learn by looking at and modifying these templates, etc.

For inspiration, you may also want to look at other products that have query
building as core feature. Two that stand out to me in terms of usability are
mixpanel and chart.io:
[https://mixpanel.com/segmentation/](https://mixpanel.com/segmentation/) (see
"Goodbye SQL" section)
[http://chartio.com/product/tour](http://chartio.com/product/tour) (see "Drag
and Drop Chart Creation")

------
ideaoverload
I was working on such system for performance measurements. Basic assumption
was that user would never see any mention of tables, joins, column names nor
any other SQL concept.

1.Table selection - there is predefined set of joined tables we call views.
Views have descriptive names e.g. 'Temperature measurements', no underlying
tables are shown.

2.Column selection - user selects columns from views using descriptive names ,
not actual column names . UI clearly indicates columns that land in 'group by'
section by calling them dimensions. There is predefined aggregation strategy
for all columns that are not dimensions - e.g. average for temperatures or sum
for number of measurements. In some cases more that one aggregate is available
e.g. average or maximum temperature. User just selects maximum or average
temperature not aggregation operator.

3\. Filters: user can select simple filters on columns: =<> and string
matching for text. All filters are ANDed

4.Sorting: user can select columns to sort on.

5.Display: results are displayed as tables or charts. More that one section
may be placed on single report.

The real system has tons of additional features but basic design as design as
described above has worked great for years.

------
grdvnl
I have worked on an app that gave the exact functionality to the users. The
users could create queries using the query builder, and also schedule when the
queries could be run.

The challenges of course was training the users to use the operators correctly
and in a way that made logical sense. It did take a while for the end users to
learn to build queries.

On the UI front, most of the fields and tables could be chosen by double-
clicking or drag and drop. Certain operators could not be used with fields
based on their types. We also provided a AST like tree to help users
understand the operators.

On the optimization front, letting users build their own queries could also
lead to very costly queries. We had to profile the user created queries
regularly and fine tune our query generator to handle such uses. We ended up
creating a lot of materialized views in Oracle to handle specific frequent
uses of joins in the table.

The users who were using this tool were top level finance and accounting
folks, who wanted to produce reports for the CFO has he dreamt up different
data points to work on!

------
mingabunga
We've been doing exactly this for our antispam app, we started off with a
working custom filters web app (Angularjs) which you can see at
[http://209.213.221.169:8080/](http://209.213.221.169:8080/) using text or
regular expressions (the regexp has a bug in it). This also shows the query in
plain english as you build the query. Original design credits for the test app
to [https://github.com/kindohm/knockout-query-
builder/](https://github.com/kindohm/knockout-query-builder/)

This is the design we've come up with so far, which incorporates grouping of
queries. We've tried to make it readable, but we also display the query in a
sentence as you build it.
[http://cdn.firetrust.com/images/misc/customfilter.jpg](http://cdn.firetrust.com/images/misc/customfilter.jpg)

------
korzh
Take a look at EasyQuery.JS:
[http://devtools.korzh.com/easyquery/javascript/](http://devtools.korzh.com/easyquery/javascript/)

Live demo: [http://demo.easyquerybuilder.com/asp-net-
mvc/](http://demo.easyquerybuilder.com/asp-net-mvc/)

------
petercooper
My advice regarding the usability vs power issue is to have multiple stages.
You'd have a DSL/query language that you parse, is secure, etc. And then you'd
have a UI over the top of that which automatically creates the right queries.

The win here is that you then have an easy interface for 90% of users but the
remainder with more bizarre or complex requirements can still get to where
they want to go by either writing a query themselves in an "advanced" field of
some sort, or you could even do it for them for $$$. Embedded JavaScript is an
option here.

I would contrast this with something I would not advise, having a visual
interface go straight to a database query behind the scenes.. because then
you're basically limited to what you exposed in the UI.

------
mkal_tsr
You could start with an intermediate stage, with say a query-builder wizard
that walks them through the process.

1\. What type of report would you like to create? (define the domain)

2\. What are you looking for? (define the topic)

3\. What limits are there? (define the data boundaries)

4\. How do you want it displayed? (report output format like excel, pdf, etc).

See how a select group of customers like it and get feedback on it ... some
might immediately say "yes we want something more powerful" or what you have
will be just fine. From there you can decide if you want to pursue exposing
the query-builder more openly than a guided wizard, and if you want to charge
extra for it.

------
santialbo
Mac Finder's query builder is, in my opinion, the friendliest you can get.
[http://i.stack.imgur.com/YdUSq.png](http://i.stack.imgur.com/YdUSq.png)

------
jschmitz28
I've done this before in dojo framework + dgrid. Our system is based off of a
dynamic metamodel where users define classes (e.g. Task) and attributes under
those classes (e.g. State, owner, project reference) from a set of predefined
types (e.g. Enum, user, reference). The defined metamodel ends up as a class
hierarchy and is presented as a tree for the query builder. After defining a
class, you can then instantiate that class and set attribute values on it.

For building queries, you drag classes and attributes from the metamodel tree
into either the filter tree or the report fields section. The filter tree is a
tree of AND's and OR's constructed as you drag attributes into it. Selecting
an item in the filter tree then brings up a value editor based on the
attribute type that allows you to filter your query based on values that make
sense for that type. So if you define your tasks state as enum(unassigned,
assigned, started, completed) you get a combo box with those values and
options to select in, not in, null, etc.

The report fields section is a list of attributes you want to see in the
result set. Basically just a select on that attributes value for the query
system. Imagine a todo list created instantly by filtering on tasks with owner
== current user and state not in completed, but also wanting to see and sort
on the priority, and not showing the owner since you know it's you. The report
fields area allows setting of sort order, and ascending or descending.

As you drag attributes into the filter or fields sections, a results pane
underneath automatically gets new columns added (for fields added of removed)
or new results based on a modified filter tree. After saving a query, users
can then register for updates of items that match the query. If any data is
changed that affects query results, the updated item or removal is pushed via
websocket and that item is updated or removed from the results grid. With
this, users never have to refresh the query since all the data is live.

Screenshots are slightly edited and a little out of date, but you get the
idea.

Query builder:
[http://i.imgur.com/AV071JV.png](http://i.imgur.com/AV071JV.png)

Dashboard (powered by query/report building):
[http://i.imgur.com/BvBW7L3.png](http://i.imgur.com/BvBW7L3.png)

------
camus2
there are a lot of query builders out there, i'm pretty sure you dont need to
write one. Doctrine has one :

[http://phpcr.github.io/](http://phpcr.github.io/)

which deals with pure PHP objects.it follows a java standard,you can write SQL
like queries with it.

[https://github.com/phpcr/phpcr-
docs/blob/master/tutorial/Tut...](https://github.com/phpcr/phpcr-
docs/blob/master/tutorial/Tutorial.md)

------
j_s
This blog post demonstrates grouping values under the associated filter
expression:

[http://kindohm.com/posts/2013/09/25/knockout-query-
builder/](http://kindohm.com/posts/2013/09/25/knockout-query-builder/)

There are also number of projects implementing JavaScript query builders that
can be found from there, including Red, YUI, jQuery and Ext Grid query
builders.

------
ecolner
I've done this when I was working on the Playstation Network so business users
could define storefront categories that contain products based on their own
custom logic. We didn't get a lot of requests to change the feature and it was
stupid simple. I wouldn't get too fancy unless your users get really excited
about it and ask for it to do more.

------
olalonde
MagentoCommerce (also PHP) has a query builder, maybe you could have a look
for inspiration?
[http://www.magentocommerce.com/wiki/_media/welcome_to_the_ma...](http://www.magentocommerce.com/wiki/_media/welcome_to_the_magento_user_s_guide/catalog_price_rule_conditions_combination.jpg?cache=cache)

~~~
aaronem
Having spent over a year working with Magento's community edition, it seems to
me that, for a developer unfamiliar with it, the reaction it's likeliest to
inspire would be panicked flight.

------
bryanh
You might look at how Wufoo's rule system works. Designed for not-very-
technical users and it is pretty powerful.

------
HeyImAlex
Check out HTSQL, should be able to act as your API.

------
eitally
I spent ten years of my career building and then altering something like this
and would be happy to talk with you about it.

~~~
progmanos
I'm actually looking to work on something similar for an intranet tool.

------
petilon
Take a look at Pebble Reports. It is exactly what you are describing. Although
it is a Windows app, not web.

------
bra-ket
SQL is pretty good

