

Google BigQuery brings Big Data analytics to all businesses - boundlessdreamz
http://googledevelopers.blogspot.in/2012/05/google-bigquery-brings-big-data.html

======
haberman
Hi everyone, I work on the BigQuery team. From a technical perspective, think
of this as an append-only cloud SQL service built on Dremel
(<http://research.google.com/pubs/pub36632.html>). You upload huge data sets,
we run SQL queries over them in seconds, even for billions of rows. And
without having to specify/build any indexes (we actually scan the billions of
rows, though only the columns we need to).
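
To illustrate the "scan every row, but only the columns we need" point, here is a toy sketch in Python. It is not Dremel or BigQuery code: the `ColumnTable` class, its methods, and the sample rows are all hypothetical, and it only assumes that each column is stored separately so a query can skip the ones it does not reference.

```python
from typing import Callable, Dict, List


class ColumnTable:
    """Toy append-only table that keeps one list per column (hypothetical)."""

    def __init__(self, columns: List[str]) -> None:
        self.columns: Dict[str, list] = {name: [] for name in columns}
        self.cells_read = 0  # counts how many stored values a query touched

    def append(self, row: dict) -> None:
        # Append-only: rows are added, never updated in place.
        for name, values in self.columns.items():
            values.append(row[name])

    def count_where(self, column: str, predicate: Callable[[object], bool]) -> int:
        # No index: every row is scanned, but only in the one column needed.
        values = self.columns[column]
        self.cells_read += len(values)
        return sum(1 for value in values if predicate(value))


# Made-up sample data, loosely shaped like an edit log.
edits = ColumnTable(["title", "editor", "bytes_changed"])
for row in [
    {"title": "Dremel", "editor": "alice", "bytes_changed": 120},
    {"title": "SQL", "editor": "bob", "bytes_changed": 15},
    {"title": "Dremel", "editor": "carol", "bytes_changed": 640},
    {"title": "REST", "editor": "alice", "bytes_changed": 101},
]:
    edits.append(row)

# Roughly: SELECT COUNT(*) FROM edits WHERE bytes_changed > 100
big_edits = edits.count_where("bytes_changed", lambda n: n > 100)
print(big_edits, edits.cells_read)
```

The query reads 4 of the 12 stored values, because the `title` and `editor` columns are never touched. The saving grows with the number of columns the query leaves out.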

All interaction with the system happens through REST interfaces. Even our own
UI uses only our publicly-available REST APIs.

There's a certain amount of free quota available. If you sign up you can try
queries against public data sets like Wikipedia edits. Also it looks like the
GitHub guys have been experimenting with analyzing GitHub data with BigQuery:
<https://github.com/blog/1112-data-at-github>

I've just joined the team recently, but I really believe in what we're doing.
I'd be happy to answer any questions I can.

~~~
DenisM
So, what are the publicly available data sets? I see there is wikipedia in one
of your screenshots, but short of that I couldn't find a list. I think if I
saw something enticing I would sign up just to play with it.

~~~
haberman
You can find the list of public data sets and descriptions of them here:
<https://developers.google.com/bigquery/docs/sample-tables>

~~~
DenisM
Thanks.

It's a bit thin, so I suggest you guys pump a lot of public datasets into it,
and then do a series of blog posts about "look what you can discover from
these public datasets with our awesome query engine in a matter of seconds".

------
oliverkofoed
With all the 'spring cleaning' going on recently at google, my main concern
would be the likelihood of this service staying available permanently.

~~~
pgrote
As a pay service focused on businesses, I think they would keep it going.

I cannot get to it right now. :)

"Error: Server Error The server encountered an error and could not complete
your request. If the problem persists, please report your problem and mention
this error message and the query that caused it."

~~~
batista
_> As a pay service focused on businesses, I think they would keep it going._

Unless not enough businesses end up paying for it. Then if yours does use it
and they cancel it, you're screwed. Or Google may decide that, while it makes
a nice revenue, they'd rather kill it to concentrate on something else...

That's the problem with putting out tons of products (including highly touted
stuff like Wave) and then killing them: nobody trusts you anymore to maintain
a product their business will depend on... Contrast that with Amazon AWS.

------
rabidsnail
Who wants to upload the CommonCrawl corpus as a public dataset? :P

------
JPKab
This unquestionably lowers the barrier to entry for crunching large data sets.
I'm looking forward to messing around with it. Are there any other
alternatives to this service? Something like a PigAsAService or HiveAsAService
offering?

~~~
pappnase12
Well, we are working on a project that provides Hive (and Hadoop Streaming) as
a service. It's <http://www.hadoopondemand.com> and uses Amazon EC2. We have
just started our private beta and you are very welcome to join. There is also
Amazon's offering, EMR (<http://aws.amazon.com/elasticmapreduce/>), which
provides an interface to Hive and Pig.

EDIT: link to amazon's offering

~~~
JPKab
Thanks. I look forward to checking out your project.

