
Ask HN: Startup idea feedback – user-friendly AutoDataScience - vcidev
I would love to know what people think of this idea. Here is the
hypothesis/pain point:

ML and more broadly data science are very useful, but even some of the most
recent "easy" data science tools (e.g. Google AutoML tables) have too high of
a learning curve to be useful to the average consumer.

Normally if you were learning a new tool, you might learn through a
combination of study, and trial and error. However, many people don't have a
lot of time to sit down and learn something complex in this manner. (They need
a bit of an extra push to minimize their error/guide them toward reasonable
use cases.) The result is something like this:

1 - get excited to try something easy and get new value out of their data

2 - get frustrated because the tool is not easy enough, or they don't know
what questions are answerable with the available algorithms

3 - search the internet for guidance on what the algorithms do, get
overwhelmed

4 - abandon tool

Solution:

1 - send us your data (probably a spreadsheet/CSV/Excel file)

2 - we analyze the data, and send you a list of questions that we can
answer/insights that we can derive

3 - you select which of the questions you want answered

4 - we run our analyses and send you the results, including an explanation of
the algorithms that were used to derive the results

The key here is that the "learning" takes place after value is delivered to
the user. Even though a tool may allow you to do things with the click of a
button, the hidden complexity still presents a learning curve to the user.

Footnotes:

- I'm not claiming to have a large amount of data to back this up, hence why I
said this is a "hypothesis". I'm offering the idea up for feedback and am
interested in hearing what people say!

- This certainly does not apply to people who are used to self-directed
learning and enjoy a healthy challenge
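
As a rough illustration of steps 1 and 2 of the proposed solution, here is a
minimal Python sketch. It assumes a single CSV with a header row and uses a
deliberately naive type check (a column is "numeric" only if every non-empty
cell parses as a float). The function names and the question templates are
hypothetical, made up for this example, and not taken from any existing
product.

```python
import csv
import io


def is_number(value):
    """Return True if the cell text parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_columns(csv_text):
    """Label each column 'numeric' or 'categorical' from its non-empty cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    profile = {}
    for name in reader.fieldnames or []:
        cells = [row[name] for row in rows if row[name] not in (None, "")]
        if cells and all(is_number(c) for c in cells):
            profile[name] = "numeric"
        else:
            profile[name] = "categorical"
    return profile


def suggest_questions(profile):
    """Turn a column profile into candidate questions (hypothetical templates)."""
    numeric = [n for n, kind in profile.items() if kind == "numeric"]
    categorical = [n for n, kind in profile.items() if kind == "categorical"]
    questions = []
    for target in numeric:
        # Each numeric column can be compared across each categorical column.
        for group in categorical:
            questions.append(f"How does {target} differ across {group}?")
        # With at least one other numeric column, a prediction question applies.
        others = [n for n in numeric if n != target]
        if others:
            questions.append(f"Can {target} be predicted from {', '.join(others)}?")
    return questions
```

The user would then pick from the returned list (step 3). A real service
would need far better type inference (dates, IDs, free text) before the
suggested questions could be trusted.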
======
psv1
> Solution:

> 1 - send us your data (probably a spreadsheet/CSV/Excel file)

> 2 - we analyze the data, and send you a list of questions that we can
> answer/insights that we can derive

> 3 - you select which of the questions you want answered

> 4 - we run our analyses and send you the results, including an explanation
> of the algorithms that were used to derive the results

This order of points isn't quite right. The overwhelming majority of companies
will already have a question that needs answering or a problem that needs
solving. They will then want to know which parts of which datasets are
relevant. If the existing datasets aren't enough, they consider collecting
and/or purchasing more. Knowing what to collect and/or buy is another problem.
Then you need to set up systems for extracting any 'insight' from what they
have and continuously managing and processing data from multiple sources, and
so on and so on.

No one is really sitting with a single csv file open in front of them,
thinking "Hmm if only someone would tell me what I can do with this".

~~~
vincentinverso
> Then you need to set up systems for extracting any 'insight' from what they
> have and continuously managing and processing data from multiple sources,
> and so on and so on.

Agreed, this is quite tricky. The solution I described lends itself much
better to one-off analyses, thus limiting its usefulness.

> No one is really sitting with a single csv file open in front of them,
> thinking "Hmm if only someone would tell me what I can do with this".

Yea, most people don't do this. I think the target user would likely be
someone who has an ok idea of what they might want to do with their data, but
needs some guidance as to what the latest and greatest techniques can actually
do, given what they have. There may be a gap between what they think they can
do, and what is actually possible. Or they might discover new things that they
didn't think of. Perhaps this is more of a personal frustration that I've had
in the past.

The "single csv" issue is a concern I have as well. Many data-science-type
problems of course cannot be reduced to a single csv.

Thanks for the feedback!

------
ian0
I think it's a great idea! However, I think you may find that success with
this product will depend more on people being _able_ to use your service -
rather than people _wanting_ to use your service.

In my case I would require (1) confidence in the security of our data, (2)
some way to continue using the service without it being manual (e.g. the
latest month's data is reflected somewhere I can log into), (3) where a model
is being created, a way for existing systems to interact with it via an API.

PS: Personally I love the insertion of #2 in the solution points. Yes, I would
have questions going in, but I would appreciate validation that they can be
answered effectively, as well as a list of potential questions that I may
have missed myself.

~~~
vincentinverso
Thanks for the interesting thoughts ian0. I honestly think Google has a good
shot at all the things you describe with their AutoML tables, _if_ they spend
a lot of time on the UX. The problem that I'm guessing most services run into
though is how to tune (up or down) the informational/educational aspects of
the app depending on the prior knowledge of the user. IMO we hackers don't
spend enough time on UX, maybe because of time constraints, or maybe because
it's not something that's widely taught at university.

------
seektable
Step (2) involves humans, doesn't it? For simple cases this scenario is
already covered by:

- Google Sheets suggests pivot tables / charts that can be built on your
worksheet data

- with our BI tool ([https://www.seektable.com](https://www.seektable.com))
anyone can upload even a rather large (up to 500 MB) CSV file; the engine then
suggests dimensions/measures automatically and even suggests some typical
reports (the suggestions are very simple for now, just a set of heuristic
rules based on CSV column names). Beyond this, the user can 'ask' the CSV data
search-like queries and get an answer in the form of a pivot table.

~~~
vincentinverso
Maybe initially it involves humans if you are just validating the idea. Nice
tool!

------
eb0la
Good idea? Yes and no.

The biggest problem I see is that it sets expectations _high_ from the
beginning, which is a bad idea.

Most data projects are about managing expectations.

Also, it is _very_ hard to demonstrate that your model works if the customer
does not (yet) have some graphics to compare against.

Your first step should be _showing_ the data, just to have a visual baseline
to compare against.

------
vincentinverso
Thanks so much for the feedback everyone. It was fun and engaging hearing your
thoughts. Just wanted to let future readers know that I likely won't come back
and read this thread any time soon, so that I can actually go and get some
work done ;) Of course if you'd like to discuss without me, feel free.

------
temp_dr
That sounds close to what Displayr does -
[https://displayr.com](https://displayr.com)

~~~
vincentinverso
Seems like it! Very nice product.

------
andreshb
You should launch it and find out. It could be as simple as a
wizard-behind-the-curtain setup where you take their data and pass it through
existing tools out there.

~~~
vincentinverso
Definitely considering!

------
nudpiedo
Isn't that what consulting is about?

~~~
vincentinverso
You could think about it through that lens; however, this would still be more
"choose your own adventure" than hiring a consultant. On the spectrum of
problem solving tools, let's say that at one end you have do-it-yourself data
analysis tools (not a lot of hand holding). At the other end you have
consulting, where your hand is held pretty tightly. This idea would be
somewhere in between, but further toward the do-it-yourself end of things,
injecting some hand holding and education into the automation in order to get
rid of some of the more painful aspects of the learning curve.

------
nikalras1
We should talk - hit me up on Twitter @nikalras

~~~
vincentinverso
Followed you on Twitter, though I have to admit I'm not much of a Twitter
user! Feel free to email me at the address in my profile if that's easier.

