
Google makes BigQuery available in Sheets - truth_seeker
https://techcrunch.com/2019/04/10/google-makes-the-power-of-bigquery-available-in-sheets/
======
spaceheretostay
All this fancy stuff, but where's my Sets at? Google took down the website
like 15 years ago. I used that thing _daily_. Then they put it into Google
Sheets. You could type "apple" then "pear" then "orange" and then select
cells, drag, and you'd get a list of fruits! Some you never heard of before!

I looked around and Google has a patent on it, and seemingly no public
implementation. This makes me super sad because I honestly considered Sets one
of the most useful exploration and ideation tools ever created.

BigQuery in Sheets, cool I guess? I want Sets back.

~~~
steve19
Try this:

[http://wordgrabbag.com/](http://wordgrabbag.com/)

Here is Google's Patent:

[http://patft.uspto.gov/netacgi/nph-
Parser?OS=pn/7,350,187&RS...](http://patft.uspto.gov/netacgi/nph-
Parser?OS=pn/7,350,187&RS=PN/7,350,187&S1=7,350,187.PN.&Sect1=PTO2&Sect2=HITOFF&d=PTXT&f=G&l=50&p=1&r=1&u=/netahtml/PTO/search-
adv.htm)

~~~
appleflaxen
that's the same functionality, but the results are not as good.

I put in: fat obese porcine

I get: wild-type hyperactive amnesic misshapen narcissistic slimy cancerous
motile putrid gravid

------
dalbasal
For all the complaints about spreadsheets, attempts to replace them and
such... They are a very reslilient paradigm.

The way the paradigm and learning curve work just seems to work for a lot of
people. I reckon it's still the most popular plaform for coding, data
engineering, analytics & data science... using liberal definitions for all
those terms.

It also seems like something that's in google's wheelhouse.

~~~
StevePerkins
Earlier this week, someone was passing this tweet around the office, which I
believe captures the cold truth succinctly:

> _" All enterprise software competes with Excel._

> _All productivity software competes with emailing things to yourself. "_

[https://twitter.com/pavelasamsonov/status/110653418815977881...](https://twitter.com/pavelasamsonov/status/1106534188159778817)

~~~
munchbunny
This is disappointingly true.

You'd think we would have come up with better answers. I'm constantly looking
for ways to better manage bug tracking, todo lists, and even managing job
searches and professional networking, because I'm still not really happy with
anything I've found.

As I bounce back and forth between different tools, I always land back on just
using Excel or Google Sheets. I'm never entirely happy with those either, but
I find that they're no worse at the job than the dedicated tools.

~~~
Nurw
Have you given notion a try? It sort of feels like one of those tools that are
so general they can actually compete with spreadsheets and emails and simple
lists. And pretty user-friendly to boot. I have started using it for more and
more simple lists and other personal stuff.

~~~
rabidrat
I just looked at notion, and it's unfortunately not general enough to warrant
making a version that works on a non-mobile device. I'll check it out when I
can use a version on my desktop (or at least on the web).

~~~
Veen
Notion has MacOS and Windows versions:

[https://www.notion.so/desktop](https://www.notion.so/desktop)

------
mises
I wish spreadsheets didn't get as much unnecessary hate as they do. They are a
pretty great tool, and can do almost anything. They are to finance/accounting
guys as shells are to many of us.

~~~
davemp
My hate for spreadsheets come from working in institutions that copy/paste
around hideous VBA infested sheets for metrics/reporting instead of using a
database.

There is a lot of painful excel abuse in business.

~~~
alexhutcheson
Most of the time that happens because the internal tools/databases built or
maintained by IT are unusable or just don't do the necessary reporting. Most
companies don't have in-house dev teams that are capable of building usable
business-relevant tools.

------
wodenokoto
Googles own blog [1] announcing the feature for G Suite Business, from back in
January. This is a great feature that I have been searching for, for a long
time.

[1] [https://cloud.google.com/blog/products/g-suite/connecting-
bi...](https://cloud.google.com/blog/products/g-suite/connecting-bigquery-and-
google-sheets-to-help-with-hefty-data-analysis)

~~~
lmkg
That looks different. The Data Connector, which you linked, lets you write a
query to BQ and have the results in Sheets. Connected Sheets sounds like it's
actually a front-end to the underlying data.

Notably, the Data Connector has a row limit of 10,000 results, while the
Connected Sheets claim to surpass the regular row limits of Google Sheets in
general.

------
polskibus
How does pricing work here? Say I have a pivot table on underlying 10TB data.
I share the sheet with 10 people, each one refreshes it 100 times a day. Do I
pay for each data refresh, ie 10TB * 10 * 100= 10000TB/day, that would be
$50000/day if I read the pricing table right. Can someone more familiar with
bigquery verify that?

~~~
dajonker
That depends on how many columns your pivot table uses. Less columns = less
data = lower price. If the implementation is smart enough it could use
BigQuery partitions for further cost savings.

~~~
polskibus
Is the data size calculated based on raw data on which queries operate (for
example all transactions with amount column), or on query result size (for
example sum(amount) -single row) ?

~~~
lmkg
BigQuery bills based on amount of data read. BigQuery is a column-store
database, so a read will always touch every row of a table, but only the
columns actually used. So e.g. "SELECT * LIMIT 1" will read the entire table,
while "SELECT SUM(transaction.total)" will only read 1 column out of
potentially hundreds. The latter query will be billed identically as a
filtered query like "SELECT SUM(transaction.total) WHERE transaction.total >
10." Filtering on a different column will be billed more, because it needs to
read the second column as well, regardless of how many rows (if any!)
contribute to the result set.

Idiomatic BigQuery will make use of partitioning, so that a large dataset will
span multiple tables in a way that you only read the tables of interest. (E.g.
the Google Analytics integration partitions by date, so reporting on 1 month
of data will only read 30 tables out of the 1,500 you might have.)

~~~
noer
It's worth noting that depending on the amount of Data you capture in GA, each
date table could be anywhere from 1s to 10s of GBs, and it can get pretty
pricey to query from those tables.

~~~
minimaxir
1s to 10s of GB is pretty much nothing in BigQuery-land. (unless you are using
them in production, which as noted elsewhere is a bad idea)

Especially with column-based querying and partitioning.

------
cosmie
Similar functionality called PowerPivot/PowerQuery[1][2] exists in Excel. It's
hands down one of my favorite things to use Excel for, and also one of the
least understood/known features by most Excel users.

I can't wait to see how Google's flavor of it for Sheets works.

[1] It's officially called "Get & Transform" now, but when it came out one
Excel 2010 it was called PowerQuery, and still gets referenced that way
everywhere (including the official help docs).

[2] [https://support.office.com/en-us/article/get-transform-in-
ex...](https://support.office.com/en-us/article/get-transform-in-
excel-881c63c6-37c5-4ca2-b616-59e18d75b4de)

~~~
mritchie712
This was one of the things I missed when moving from Excel to Sheets and led
to me building [https://seekwell.io](https://seekwell.io). It lets you grab
data from your database and drop it in Sheets using just SQL (Postgres, MySQL,
SQL Server and Snowflake).

We've also built a full SQL and Python IDE with a shared code repository, but
solving the simple problem of "write SQL -> get data in a Sheet" was where we
started.

------
danielecook
I’ve had trouble getting more than 10k rows in google sheets. It will be
interesting to see how well this works.

~~~
ramraj07
I have a couple sheets that are very close to the 2 million cell limits, in
which almost every cell either does a vlookup or is being queried or is
displaying some Google sheet SQL result.

Sheet takes time to load but still works fine!

~~~
richjdsmith
Boy, I'm not sure quite why, but the idea of a spreadsheet with nearly 2
million cells stresses me out...

Why that over a DB?

~~~
prepend
Spreadsheets are a DB. Do you mean over a DB server? Or a relational DB? Or a
DB without a bundled Model view controller?

------
falcor84
I've tried using the current version of Sheets App Script for some advanced
business functionality, but it's horrendously slow - taking over 10 seconds to
loop over just 5000 rows. And they're using an ancient version of js, making
development annoying.

I really hope that this announcement would herald a renewed interest on their
side in improving Sheets as a whole.

~~~
pcmaffey
Yes, I found it much faster to do everything in memory, minimizing the number
of reads/writes.

And it was a bit of shocker writing es5 js again.

------
minimaxir
It's worth noting that there has been a Sheets connector for BigQuery since
January, although only for G Suite customers:
[https://cloud.google.com/blog/products/g-suite/connecting-
bi...](https://cloud.google.com/blog/products/g-suite/connecting-bigquery-and-
google-sheets-to-help-with-hefty-data-analysis)

This announcement seems like an expansion on what's above (e.g. pivot tables).
I've tested what's above for my org and there's a lot of promise. (only
criticism is creating formulas in the query don't work automatically since
they are returned as Text cells; workaround is to cast as Automatic)

------
savrajsingh
Last time I checked, it was still hard to get a Google CloudSQL DB into
BigQuery, so I’m surprised they did the sheets integration first. We used a
third party that just copied data from our CloudSQL to BigQuery.

~~~
faizshah
Yea this is surprisingly annoying. I did CloudSQL -> csv -> dataprep ->
bigquery. For about 10 tables with ~20 columns each.

A job that should be less than an hour took over 6 hours due to various
annoying datatype issues.

Another interesting thing is that BigQuery doesn't have Full Text Search so
you need to go back into postgres or use elastisearch to use FTS which is
beyond annoying.

------
vvern
If you find this interesting you should take a look at
[https://www.sigmacomputing.com/](https://www.sigmacomputing.com/)

------
2_listerine_pls
I wish Google would let individual users upgrade to GSuite Business to try
these tools, rather than forcing the whole organization to upgrade. The
pricing model is impractical.

~~~
deckar01
I use GSuite as an individual. I just had to use a personal domain and forward
my other account to it.

~~~
2_listerine_pls
Yeah, but I really want to use it in my organization. It's too expensive to
upgrade for a single user.

------
polskibus
How's bigquery performance in comparison to clickhouse? When is it worth
switching from one to the other?

~~~
wikibob
BQ is considerably more performant.

~~~
msquog
This is uninformed and unsubstantiated.

The answer depends massively on the intended use case.

~~~
polskibus
What are the use cases where one would shine and the other wouldn't? I'm
interested in real life usage of any of those on >100 GB datasets.

~~~
monsieurbanana
With bigquery the cost is per query. If you do a "SELECT * FROM AAA" you'll
pay whatever the size of the table AAA is, price being 5$ per TB.

If you do a "SELECT field1, field2 FROM AAA" you'll pay only for the total
size of field1 and field2 rows.

So you usually want to use BigQuery in situations where you don't need to
query data all the time, but rather a fixed number of times a day.

Now about performance: BigQuery queues queries, so you don't have a guaranteed
time. It can take a couple of seconds before your query starts running, it can
take longer. If you need something that responds in < xxx ms, BigQuery is not
it.

But the queries themselves are fast. If you need to query across petabytes of
data, as a simple BigQuery query will gladly run on however many dozens or
hundreds of instances it needs, at no additional cost for you (since you only
pay by the size of your data queried).

It's really a great example of serverless. You can run your query across 100
instances, but you only use those instances for a few seconds.

~~~
m0zg

      > SELECT * FROM AAA
    

Can't think of a realistic use case where such a query would be appropriate,
even from just the performance standpoint. In fact, for a very long time the
internal counterpart of BigQuery didn't even support "SELECT star", and nobody
complained too badly. If you'd like to give Google a gift, however, sure
"SELECT star" all you want. :-)

    
    
      > "SELECT field1, field2 FROM AAA" you'll pay only for the total size of field1 and field2 rows.
    

Moreover, if you also use a WHERE clause, you'll pay even less.

~~~
hodgesrm
This can happen in data cleaning/loading where you load unclean data into a
table that is ready for analysis. I have loaded data through staging tables
regularly. There may be multiple stages.

Another example is materialized view creation. It's common for these to scan
large quantities of data to compute aggregates.

~~~
m0zg
That's not the recommended way of loading data into BigQuery, though.

[https://cloud.google.com/bigquery/docs/loading-
data](https://cloud.google.com/bigquery/docs/loading-data)

------
crystaln
Cool. I wonder when they will shut it down.

I largely ignore Google products announcements these days. Why bother getting
excited about things that won't last?

~~~
mda
This is getting tired and adds nothing to the discussion.

~~~
coldtea
If one is not familiar with Google's background on the matter, it's the number
one thing they should know.

~~~
jasonvorhe
No, the first thing you should know is that Google consumer products are not
Enterprise features.

~~~
puszczyk
Is Sheets consumer or enterprise?

~~~
joshuamorton
Sheets integration with bigquery that has an enterprise pricing plan is
enterprise.

There's also a decent argument to be made that sheets is an enterprise product
too. It certainly is for enterprise customers.

Disclosure: I work in Google Apps, but not on sheets or docs.

------
paulcole
> The beta of connected sheets should go live within the next few months.

In case anyone’s looking for it, there’s the buried lede. Whenever Google
announces something like this, don’t hold your breath. Remember that robo-
calling assistant they demoed awhile back? I remember that should’ve gone live
within a few months, too.

~~~
jasonvorhe
Why do you spread such negativity if you haven't even fact checked that what
you're saying is true?

~~~
lopuhin
That's the quote from the article. Do you have any additional info on when
that would become available? The link from another reply is completely
unrelated, as of writing this.

~~~
tacomonstrous
It's already working on Pixel devices in the US. Source: I used it.

~~~
lopuhin
Sorry, just to clarify, is that BigQuery integration into Sheets your are
talking about? Can't find it in the web UI.

~~~
tacomonstrous
No I was referring to the robocalling feature. It's an actual product now.

