
Visualize data instantly with machine learning in Google Sheets - pmcpinto
https://www.blog.google/products/g-suite/visualize-data-instantly-machine-learning-google-sheets/
======
mbesto
When I worked for SAP back in 2007 (I was a fresh grad at the time), I was
working in the business intelligence (reporting, analytics, and data
warehousing) group and noticed how cumbersome it was for organizations to
simply create and view reports (we're talking millions of dollars). I once
said to my boss "you realize that in the future we'll simply just write 'show
me a line graph for sales in the northeast'".

And so here we are now.

~~~
tryitnow
I'm currently evaluating BI vendors for my company and just about every major
contender has this functionality or is developing it. Read the Gartner report
from 2017 and they basically just come out and say that this will be the new
standard in BI.

The much harder problem is data management and preparation. Anyone with half a
brain and a decent visualization tool can create basic graphs - but that
doesn't mean they should, especially if the organization doesn't have good
data management processes in place.

Issues like data governance, data prep, and data modelling are the major pain
points for me. And honestly, developing a useful BI solution is more about
culture change than it is technology. If a company has poor data governance,
it doesn't matter how whiz-bang their technology is, they're still not going
to get useful insight from their numbers.

~~~
ozataman
And data management / preparation is where major mistakes are made - get a
join wrong and you may easily be missing data or double counting something
just obscure enough to go unnoticed like "ancillary sales".

There is something potentially harmful, or perhaps that needs addressing,
about end-user tools growing in expressive power. A good friend who does
statistical genetics work once told me "but I don't want every user running
their own regressions and drawing nonsensical conclusions from badly prepared
data!"
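
To make the join point concrete, here is a minimal sketch of how a one-to-many join can silently double count. The tables, column names, and the `inner_join` helper are all hypothetical; nothing here comes from a real system discussed in the thread.

```python
from collections import defaultdict

# Hypothetical data: order 1 was shipped in two boxes, so it has two shipment rows.
orders = [
    {"order_id": 1, "category": "ancillary sales", "amount": 100.0},
    {"order_id": 2, "category": "core sales", "amount": 250.0},
]
shipments = [
    {"order_id": 1, "box": "A"},
    {"order_id": 1, "box": "B"},
    {"order_id": 2, "box": "C"},
]


def inner_join(left, right, key):
    """Naive inner join: one output row per matching (left, right) pair."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def total_by_category(rows):
    """Sum the 'amount' column per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)


true_totals = total_by_category(orders)
# Joining to shipments repeats order 1 once per box, so its amount is summed twice.
joined_totals = total_by_category(inner_join(orders, shipments, "order_id"))

print(true_totals)
print(joined_totals)
```

Only the category touched by the duplicated order is inflated, which is why a chart built on the joined rows can look plausible at a glance.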

~~~
shostack
Yep. Wake me when an AI can tell me "hey, that monthly trending conversion
report you asked me to pull...yeah, you're missing two days worth of data when
tracking broke, so it will just make your numbers look lower when rolled-up
monthly and be hard to notice."

BTW, that is also the reason I not only set alerts, but review data at a daily
level when pulling any rolled-up reports of significance.

~~~
kyleschiller
Why can't AI tell you you're missing data?

~~~
massaman_yams
For most use cases, you don't even need AI. Once you reach a certain scale,
there are certainly ML-based solutions available:
[https://medium.com/netflix-techblog/rad-outlier-detection-on...](https://medium.com/netflix-techblog/rad-outlier-detection-on-big-data-d6b0494371cc)
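
As a rough illustration of the "you don't even need AI" case, a plain completeness check can flag missing days before a monthly roll-up. This is a sketch under assumptions: the `missing_days` helper and the sample numbers are made up, and it only catches days with no rows at all, not days with partial data.

```python
from datetime import date, timedelta


def missing_days(observed, start, end):
    """Return the dates in [start, end] that do not appear in `observed`."""
    seen = set(observed)
    n_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(n_days))
    return [day for day in all_days if day not in seen]


# Hypothetical daily conversion counts; tracking broke on June 14 and 15.
daily = {date(2017, 6, d): 40 for d in range(1, 31) if d not in (14, 15)}

gaps = missing_days(daily.keys(), date(2017, 6, 1), date(2017, 6, 30))
monthly_total = sum(daily.values())

if gaps:
    print(f"Monthly total {monthly_total} is understated; no data for: {gaps}")
```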

------
harshaw
I played around with this the other day. I have a spreadsheet with a bunch of
columns. It wasn't immediately obvious how to use the explore feature
intuitively. It graphed data but not really the ones I wanted. I was also
hampered by it using only about 200 pixels on the right side of the screen.

I started typing in a question but it couldn't guess what I was interested in.
YMMV. Perhaps with a fairly simple spreadsheet you can intuit things? Back 10
years ago I built a google spreadsheet competitor called Numbler (well, I
didn't know it was a competitor, google sheets came out a couple of months
later). But one of the things I learned is that people use spreadsheets for
just about everything, and it can be in the weirdest format.

~~~
neovive
"people use spreadsheets for just about everything"

That is so true. I used to do consulting for small businesses and was amazed
at how non-programmers used Excel to solve so many problems outside of finance
and accounting. The CRM/HR solutions were very common and interesting (e.g.
Lead/prospect management, sales, timesheets, vacation schedules, etc.).

------
teej
Can we talk about getting data into Google Sheets? Is there a standard way to
build a pipe from, say, a reporting database to dump aggregates into Google
Sheets?

I built a private Add-on for my company that surfaces specific aggregates as
Sheets functions (i.e. getSalesByDay(...)) and I have found so many bugs with
that whole ecosystem. Deploys are completely manual and require copy-paste,
you can't reliably tell what version is being invoked in a sheet, invisible
cell-level caching that caches error state, concurrency limits that are too
low and impossible to work around, and more. It all kinda sorta works but
Google doesn't make it easy.

~~~
yeldarb
Yes, so much this. I've encountered this same issue.

For our non-technical employees I write an endpoint that gives them a CSV with
the most current info (often with a super simple front-end that they can use
to query for specific date ranges or with filters). They download this CSV and
upload it into their sheet manually to update the underlying data when they
need new info.

It would be so great if you could just say "here's a URL; keep my data fresh
from this source" and it would automatically do it.

~~~
jkaptur
Does the IMPORTDATA function work for that?

~~~
dopamean
It does. I use heroku dataclips to get data out of a database and import it
into google sheets using importdata. It's really simple and for my use case
incredibly useful.
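
For anyone without dataclips, the CSV-endpoint approach described above can be sketched with the Python standard library alone. Everything named here (`SALES_BY_DAY`, `to_csv`, `CsvHandler`, `serve`, the port) is a hypothetical placeholder, and a real version would query the reporting database and add authentication.

```python
import csv
import io
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical aggregates; in practice these would come from a reporting database.
SALES_BY_DAY = [("2017-06-01", 1200), ("2017-06-02", 950)]


def to_csv(rows, header):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


class CsvHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the current aggregates as CSV."""

    def do_GET(self):
        body = to_csv(SALES_BY_DAY, ("day", "sales")).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/csv; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), CsvHandler).serve_forever()
```

After calling `serve()`, a cell formula along the lines of `=IMPORTDATA("https://example.com/sales.csv")` (URL made up) should pull the rows in, assuming the endpoint is reachable from Google's side.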

------
inthewoods
Cool stuff - someone else mentioned Thoughtspot - that was my initial thought
as well. Very similar idea.

I wish that Google would take the same sort of "embed" idea further in
G-Suite. I find it amazing that I can't (as far as I know) reference slides from
another deck in Google Slides. The use case would be putting together a series
of "core" slides that are updated across your organization as they change.
Given the web nature of G-Suite, this, to me, would seem like a no brainer.

Also, inserting charts from Google Sheets into Google Presentations looks
pretty terrible. I often revert to Excel because the charting is far superior
imho (though just as challenging to wrangle).

------
wyck
They are solving a problem that doesn't really exist. The challenge is not the
last step of a data report, it's the steps involved in the beginning: getting
good data in, formatting, joining multiple sources, automation, dealing with
junk data, procedures, etc.

I don't understand the example. What's the difference between typing "Show me
a line graph" and clicking a button in Excel that does the same thing?

~~~
Nagyman
Did you try it? Yes, getting good data is a challenge but that's not one they
can solve in Sheets. Making a graph is also not always one button, unless you
have very simple data.

I found it immediately useful. They've solved some UX and discovery issues
around creating charts. And it's not _just_ charts... they're answering
questions and identifying trends within the data. e.g. I threw some pretty
basic data at this and it told me “Flights” contains a yearly cycle: “Flights”
increases until May 1, decreases until October 1, and increases until December
1.

That's pretty handy. YMMV, but this is awesome.

~~~
qooleot
Tableau did some research in this area, and found NLP queries were much
faster:

[https://pasteboard.co/dPeoSHM1s.png](https://pasteboard.co/dPeoSHM1s.png)

------
vgt
Mandatory shameless plug - Google Sheets integrates with Google BigQuery.

You can:

\- Query data in Google Sheets from BigQuery

\- Create virtual views in BigQuery that are powered by Google Sheets

\- One-click export data from BigQuery to Google Sheets (< 20k rows or so)

\- Using Apps Script, build dashboards and reports in Sheets that query BigQuery
for results.

(work at G)

~~~
dagss
I tried to rely on your first bullet point and randomly ended up with BigQuery
surpassing some API threshold towards Drive (paying customer). I have to
carefully manage doing copy-queries over to BQ, which is a pain, but better
than nothing I guess.

~~~
vgt
Thanks for the feedback! I'd love to get more detail on your experiences so
that we can iterate and improve and would be eternally thankful if you emailed
me.

------
taylorwc
Oh, wow. I love where this is headed. Spreadsheets are one of the most abused
products in a normal business--used for _everything_ , and then some poor
excel jockey ends up being forced to create a semblance of order from the
chaos.

~~~
cube00
Someone needs to create Spreadsheets Anonymous and share the crazy stuff
people do in Excel, the best one I've seen is a full GUI wizard (back, next
etc.) in VBA.

~~~
oaktowner
I once worked at a large company (70K employees) in which the check request
software (that is, if you needed to pay a vendor with a written check) was
written in VBA embedded in Excel, using Outlook (the client, not Exchange) as
Workflow.

You opened the "spreadsheet", which hid Excel itself and showed a form. You
filled it out, clicked Send and it used your local Outlook client to email a
copy of itself to your boss, who then opened it and was presented with another
form for approval, etc, up through finance and finally to the person who wrote
the check.

It was amazingly ugly and fragile, and of course, there was no way for anyone
to see the state of the system (because the 'state' was distributed among the
entire workforce's Exchange inboxes!).

Want to know the status of your check request? No problem, simply call
everyone who might have been in the approval chain and ask them to search
their inboxes. Hope you find the most current person!

(updated to correct the name of the software)

~~~
Declanomous
God, I work in an environment like that. We are a non-profit, and our primary
CRM is The Raiser's Edge. However, data is never manipulated within The
Raiser's Edge. Instead every process involves exporting data from The Raiser's
Edge, manipulating it with an MS Office application, and then re-importing it.

When I started, almost every process was based around Access, using queries
made using the query builder. With some basic scripting I managed to cut a
couple of man-months of work down into about an hour or two of work a month.
The entire thing is more fragile than a stack of cards though.

At this point I could literally write a 200 page dissertation on all of the
reasons why this is a bad idea. But it works, and it is cheap. So there you
go.

------
uberneo
Well, charts is a good addon, but I just wanted to understand how they are able
to do this... I mean the machine learning part. For example, if somebody asks
"Show me sales of X product in last year", how does that get interpreted, from
a machine learning perspective, into an actual SQL query?

~~~
spinlock
I wonder if it's actually machine learning or if they just implemented a query
language that tries to guess what you mean.

The definition of machine learning that I use is: an algorithm which improves
its performance through experience.

So, if charts doesn't get better the more you use it, it's not machine
learning.

~~~
lightbyte
If I had to guess I'd say the machine learning part is analyzing a sentence
("Show me a line graph for x, y, and z") and determining what exactly they
want to see.
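
For comparison, the non-ML alternative spinlock describes (a query language that guesses what you mean) can be sketched as a few keyword rules. This is purely illustrative and says nothing about how Sheets actually works; `parse_request`, `CHART_WORDS`, and the sample columns are invented for the example.

```python
import re

# Chart types the rules know about (an assumption made for this sketch).
CHART_WORDS = ("line", "bar", "pie", "scatter")


def mentions(word, text):
    """True if `word` appears in `text` as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(word.lower())}\b", text.lower()) is not None


def parse_request(text, columns, known_values):
    """Guess chart type, columns, and filters from a plain-English request."""
    chart = next((w for w in CHART_WORDS if mentions(w, text)), None)
    wanted = [c for c in columns if mentions(c, text)]
    filters = {
        col: value
        for col, values in known_values.items()
        for value in values
        if mentions(value, text)
    }
    return {"chart": chart, "columns": wanted, "filters": filters}


result = parse_request(
    "Show me a line graph for sales in the northeast",
    columns=["Region", "Sales", "Date"],
    known_values={"Region": ["Northeast", "South"]},
)
print(result)
```

Fixed rules like these never improve with use, so by the definition above they would not count as machine learning; they also break as soon as a request uses a synonym the rules don't list.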

------
andrea_s
Kind of goes in the direction of what Thoughtspot
([https://www.thoughtspot.com/](https://www.thoughtspot.com/)) is doing
([https://www.youtube.com/watch?v=D-y_EjFsDuk](https://www.youtube.com/watch?v=D-y_EjFsDuk))

~~~
amelius
Looks like a good idea. But, where does it get its data from?

Does it perform NLP on company documents?

~~~
andrea_s
Honestly I don't know, I do not work for Thoughtspot.

My assumption is that it works on metadata coming from a relational schema
with a rule parser on top.

Honestly I don't see how it would work well enough if it was based on NLP of
unstructured data.

------
pbreit
This is very neat.

But the best, still-mostly-hidden feature I've found recently is Apps Script
and especially the ability to do a UrlFetch.

I use it as an "API Runner" to run various batch jobs against APIs.

[https://developers.google.com/apps-script/reference/url-fetc...](https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app)

------
darwhy
I'm wondering how Microsoft is responding to this. Do they expect their
current Excel dominance to continue despite competitors constantly catching up
to feature parity and even extra goodies, like this one?

~~~
tryitnow
Actually, this has been available in Power BI for a while. I just went to a
meetup last night and saw it in action.

It's fairly trivial to get this to work with well-formatted data.

I'm currently evaluating BI solutions for my company and just about every
single one has something like this.

------
froindt
I wonder if we will see more software including query based input like their
charts, and what sort of speed improvement we could see? At first I was not
excited to type something where I could click a couple buttons, but then I
recognized the other enhancements such as applying a filter right away.

I'm not convinced it's better just because it has machine learning on the back
end, but if excel would learn how I want my graphs made from how I manually
adjust the graphs (adding axis labels and a title, color preferences, never a
3d bar or pie chart), that'd be a nice enhancement. I'm sure there's a
setting, but I haven't searched for it.

------
zitterbewegung
I thought Explore in google sheets has had this feature for a while? I remember
it suggesting visualizations in sheets a few months ago.

~~~
fakename
I've been using it for a while, too. I'm guessing it's been available in
Enterprise/g suite, but is now rolling out to all users.

~~~
zitterbewegung
From my experience the public versions of g suite run a more bleeding edge
version and they introduce stabilized features into g suite.

------
f00_
This is similar to statsbot.co 's Slack bot

You would message the bot something like "sessions for this month", and it
would send back a graph.

wonder if you could make a similar bot with google sheets if they provide an
api

[https://statsbot.co/slack](https://statsbot.co/slack)
[https://medium.com/slack-developer-blog/bots-you-can-count-o...](https://medium.com/slack-developer-blog/bots-you-can-count-on-meet-the-developers-behind-statsbot-1ddb2961ead0)

------
gsvclass
There's a lot of basic stuff like column titles, moving columns about,
filtering, search that I found had a quite a learning curve with sheets. I
built and use this instead. Bell+Cat
[https://bellpluscat.com](https://bellpluscat.com)

~~~
aeorgnoieang
I love your "Made with and ️ on the 3rd rock from the sun." tag.

~~~
gsvclass
Thanks, it's a little sad how I survive on tacos and coffee :)

------
sandGorgon
does anyone know how this kind of stuff gets built ? I'm considering a
spreadsheet-y internal admin dashboard for my startup. I was looking at
[https://github.com/JoshData/jot](https://github.com/JoshData/jot) to be able
to sync stuff on the client side to the server.

has anyone worked on something like this ? the big challenge is
synchronization - between server and multiple clients - while being able to
offload a lot of computations on to the client.

I wonder how the security is built: if I maliciously change the formulas in
my browser, will the backend datastore still accept the data?
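
One common answer to the tampering question is for the server to treat client-computed cells as untrusted and recompute them from the raw inputs. The sketch below assumes a single hypothetical derived column (`total = quantity * unit_price`); the function names and the rule itself are illustrative only.

```python
def recompute_total(row):
    """Server-side definition of the derived column (hypothetical formula)."""
    return row["quantity"] * row["unit_price"]


def accept_update(row):
    """Validate raw inputs from the client and overwrite any derived value."""
    clean = {
        "quantity": float(row["quantity"]),
        "unit_price": float(row["unit_price"]),
    }
    if clean["quantity"] < 0 or clean["unit_price"] < 0:
        raise ValueError("negative inputs are rejected")
    # Whatever 'total' the browser sent is ignored and recalculated here.
    clean["total"] = recompute_total(clean)
    return clean


# A client that edited its local formula to report a bogus total.
tampered = {"quantity": 3, "unit_price": 10.0, "total": 0.01}
stored = accept_update(tampered)
print(stored)
```

The trade-off is that the formulas then live in two places (client for responsiveness, server for trust), which adds to the synchronization problem raised above.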

~~~
fiatjaf
The big challenge is implementing the spreadsheet UI.

~~~
TomMarius
There's a React component for that. React-datasheet or something like that.

~~~
fiatjaf
I know there's a React component. There are many. They don't come even close
to the functionality of an actual Excel-like spreadsheet.

Implementing 70% of the most basic functionality of Excel is easy.
Implementing 90% is insanely difficult.

I would say Handsontable reaches the mark of 78%.

I've tried twice, once with React + Mori[0] and once in Cyclejs[1]. In the first
I may have reached 82%, in the second probably 85%. But in both cases the
latest features I added started to conflict with the oldest ones and I
abandoned everything.

[0]: [https://github.com/fiatjaf/react-microspreadsheet](https://github.com/fiatjaf/react-microspreadsheet)
[1]: [http://sheets.alhur.es/](http://sheets.alhur.es/)

------
blazespin
With this sort of thing it's easy to get to 80% but good luck getting that last 20%
without formalisms. Might be useful for getting a quick feel for a data set to
confirm some intuitions, but not really useful beyond that.

------
shostack
Is there any way yet to sort values in a pivot table? Kind of ridiculous that
we still need to resort to the query function...

------
synaesthesisx
The next major version of Tableau should implement NLP in a similar fashion.

------
KaoruAoiShiho
Wow amazing, when will this be available as a javascript library?

------
2_listerine_pls
still no tables. How hard is it?

------
p90puma
404 for me on the link.

------
mariogintili
how many corporate secrets will be leaked into this?

~~~
romanovcode
Leaked into what? If you are using Google Spreadsheets your "secrets" are
already on Google servers and were "leaked" a long time ago.

~~~
krick
While technically true, this is the first time I'm actually thinking "maybe
I'll start using Sheets". I hate Google and giving my data away, but lately I
often feel that I'm missing pleasure of easy use by avoiding all these hateful
botnets. So, eh, maybe I just die before it gets actually massively harmful,
and then fuck it.

We need open source implementations for stuff like that.

~~~
romanovcode
We have open source implementations for nearly everything that is provided
commercially. The problem is that no company wants to use said
implementations.

