
Does anyone have insight into how painful it is for non-technical people to query their data warehouses?

I'm building a tool that allows business people and non-technical analysts to query their data warehouses using natural language. (Currently, you must ask a technical person to write ad-hoc queries for you, or build you a dashboard. This bogs down your data people.)

Does anyone have insight into the demand for such a product?

[edit: I'd love to chat with anyone with insight into this topic. Reach me at Joseph at metaoptimize dot com]

Most of the time it's easier to have people learn a touch of SQL and ask developers for the harder queries. We used Tableau, and after a couple of weeks they had every query they ever wanted saved.

Non-technical people don't query.

They call me up, ask me to do a "quick report across the inventory db with the project cost data." I send it off to them. If they like it, we push a report (maybe with a couple of parameters) into production.

My gut is that we aren't lacking for good technical options in analytics and data warehousing. To be honest, the lion's share of my work in data warehousing is helping the users know what questions to ask.

But there is lots of room for good BI, and probably room for several excellent businesses, anywhere from lifestyle scale to 8 digits.

> Does anyone have insight into how painful it is for non-technical people to query their data warehouses?

Depends. Back when I did DW stuff, my general workflow was to speak with the analysts about what they were trying to accomplish. From there I would create the cubes and additional metrics. I would also set up all the processing schedules at this time. The analysts would then use an Excel plugin that provided a pivot table interface to any cubes to which they had access. It worked pretty well.

For straight data access I would teach them basic SQL and/or build SQL templates for them that they could extend.

My goal was always to teach a man to fish and get out of the way.
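
For anyone wondering what a "SQL template" might look like in practice, here is a minimal sketch, not the setup described above. It assumes a hypothetical sales table (region, sale_date, amount) in an in-memory SQLite database; the template name, the parameter names and the helper functions are all made up for illustration. The idea is that the analyst changes only the parameter values at first, then copies the template and edits the SQL once they are comfortable.

```python
import sqlite3

# Hypothetical template: the analyst supplies only the two date parameters.
# Table and column names are assumptions made for this sketch.
SALES_BY_REGION = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE sale_date >= :start_date AND sale_date < :end_date
    GROUP BY region
    ORDER BY region
"""


def build_demo_db():
    """Create a tiny in-memory table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("East", "2012-01-15", 100.0),
            ("East", "2012-02-10", 50.0),
            ("West", "2012-03-01", 75.0),
            ("West", "2012-05-01", 500.0),  # outside Q1, filtered out below
        ],
    )
    return conn


def run_template(conn, template, **params):
    """Run a template with named parameters bound by sqlite3 (no string pasting)."""
    return conn.execute(template, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    rows = run_template(conn, SALES_BY_REGION,
                        start_date="2012-01-01", end_date="2012-04-01")
    print(rows)  # [('East', 150.0), ('West', 75.0)]
```

Binding the values as named parameters rather than pasting them into the string keeps the template safe to hand around, and ISO-formatted date strings compare correctly as plain text in SQLite.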

The closest things I've seen are exploratory data visualization products such as Tableau (which is pretty awesome). The downside (or partial downside) is that it can end up writing some nasty, non-performant queries in certain cases.

Yes, it can be crazy painful, to the point that non-technical people just don't do the query unless it is burning, business-critical stuff. In large part that's because the technical people are often tasked on projects from IT, and getting them to do a query requires department-to-department deal making among middle management, which is slow and painful.

At ExxonMobil, a place I worked, you're going to have VPs asking each other, and IT is going to hedge with, "yeah, if we do this then project X will be late" (it's going to be late anyway, but they've kept quiet about it and no one knows).

My personal solution when I needed a query was to bring a six pack of beer down to IT on Friday afternoon, mostly because I wasn't given access to write queries myself, on the grounds that we had BI software.

I would suggest reading some books on the topic of Dimensional Modeling [1] such as "The Data Warehouse Toolkit" [2]. The critical thing you need to expose to your users is the ability to ask for things which make sense in their world but are actually really difficult for even an engineer to code. Things like: "Show me average 9am-12pm sales on Mondays, Wednesdays and Fridays for 1st quarter, 2012."

  [1] http://en.wikipedia.org/wiki/Dimensional_modeling
  [2] http://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247

Speaking as someone who does his fair share of dimensional modeling, I would just point out that the example you cite could involve only two tables in a well-designed dimensional model (sales fact and time/date dimension, I reckon). The challenge is in getting to that point.

To speak to OP's point about difficulty in querying data warehouses, most business intelligence tools that I'm aware of provide semantic layer[1]-type capabilities, whereby the user interface of the tool is presented in the language of the business domain. Nevertheless, I agree that this is still difficult work, unfortunately. That it is getting more complicated in some respects, such as through unstructured data, doesn't help either.

[1] http://en.wikipedia.org/wiki/Semantic_layer

I guess I wasn't clear enough if I came across like my example was complex. It's one that's easily solved via DM and extremely hard to execute in most non-dimensionally-modeled setups. That's exactly why I'm a huge advocate of DM instead of just throwing a ton of servers, Hadoop & MR at everything.
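
To make the two-table point above concrete, here is a rough sketch of that example query against a toy star schema in SQLite. Everything in it is an assumption for illustration: the table names (fact_sales, dim_datetime), the hour-grain combined date/time dimension, the surrogate key scheme and the made-up sales rows. It also reads "average sales" as the average amount per sale, which is only one possible interpretation. A real model would likely split date and time of day into separate dimensions.

```python
import sqlite3
from datetime import datetime, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
START = datetime(2012, 1, 1)  # first hour covered by the toy dimension


def datetime_key(ts):
    """Hypothetical surrogate key: whole hours elapsed since START."""
    return int((ts - START).total_seconds() // 3600)


def build_demo_warehouse():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_datetime (
            datetime_key INTEGER PRIMARY KEY,
            year INTEGER, quarter INTEGER, day_name TEXT, hour INTEGER
        );
        CREATE TABLE fact_sales (
            sale_id INTEGER PRIMARY KEY,
            datetime_key INTEGER REFERENCES dim_datetime(datetime_key),
            amount REAL
        );
    """)
    # One dimension row per hour of 2012 (a leap year, so 366 days).
    dim_rows = []
    for i in range(366 * 24):
        ts = START + timedelta(hours=i)
        dim_rows.append((i, ts.year, (ts.month - 1) // 3 + 1,
                         DAY_NAMES[ts.weekday()], ts.hour))
    conn.executemany("INSERT INTO dim_datetime VALUES (?, ?, ?, ?, ?)", dim_rows)
    # Made-up sales: three match the question, three are filtered out.
    sales = [
        (datetime(2012, 1, 2, 10), 100.0),   # Monday morning, Q1
        (datetime(2012, 1, 4, 9), 200.0),    # Wednesday morning, Q1
        (datetime(2012, 1, 6, 11), 300.0),   # Friday morning, Q1
        (datetime(2012, 1, 3, 10), 1000.0),  # Tuesday: wrong day
        (datetime(2012, 1, 2, 12), 1000.0),  # Monday at noon: outside 9am-12pm
        (datetime(2012, 4, 2, 10), 1000.0),  # Monday morning, but Q2
    ]
    conn.executemany(
        "INSERT INTO fact_sales (datetime_key, amount) VALUES (?, ?)",
        [(datetime_key(ts), amount) for ts, amount in sales],
    )
    return conn


# The business question becomes plain filters on dimension attributes.
QUERY = """
    SELECT AVG(f.amount)
    FROM fact_sales AS f
    JOIN dim_datetime AS d ON d.datetime_key = f.datetime_key
    WHERE d.year = 2012
      AND d.quarter = 1
      AND d.day_name IN ('Monday', 'Wednesday', 'Friday')
      AND d.hour BETWEEN 9 AND 11
"""


def average_morning_sales(conn):
    return conn.execute(QUERY).fetchone()[0]


if __name__ == "__main__":
    print(average_morning_sales(build_demo_warehouse()))  # 200.0
```

The weekday, quarter and hour logic lives in the dimension rows, populated once at load time, so the query itself needs no date arithmetic. Against a normalized transactional schema the same question would typically need date functions scattered through the WHERE clause.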

> Does anyone have insight into the demand for such a product?

Enormous, and there are dozens of such tools available.

Most of them work best if you build an actual data warehouse -- dimensionally structured, not normalised. This is because they can easily build query forms using the DW dimensions in a language that makes sense to end users.

You might want to look into rjmetrics or chart.io and see what they offer. I've been integrating with both, and it seems one of their goals is to give non-technical people access to analyze the data (after the connection and data sources are set up, which still requires technical knowledge).
