
Why We Chose Redshift - alishiu
http://blog.amplitude.com/2015/03/27/why-we-chose-redshift/#.VRWOLj6Ra6A.hackernews
======
edwintorok
The title should say 'Amazon Redshift'. At first I thought it was going to be
about redshift vs f.lux:
[http://jonls.dk/redshift/](http://jonls.dk/redshift/)

Edit: Why the downvote? redshift (and flux) have existed since before 2010,
whereas Amazon Redshift was introduced only in 2012. I think it is reasonable to
assume that someone who has never heard of Amazon Redshift would think of the
open source project first (that exists in various distributions as packages),
and not the Amazon service.

~~~
untog
If we took a poll I suspect the majority would be thinking of the Amazon
service - I know I was. The date the projects were introduced isn't
necessarily relevant.

~~~
O____________O
The UI colorizer is what I thought of immediately, too.

_If we took a poll I suspect the majority would be thinking of the Amazon
service_

That's just personal projection, and is as irrelevant as an argument beginning
with, "I think most people would agree that..."

Personally, regardless of Amazon vs UI hack, I'm really tired of ambiguous
naming in tech projects.

~~~
chc
I'm much more tired of comments on ambiguous naming. There are at least two
other people in my city who have my name, and many more who share either my
first or last name. Somehow life goes on and this is not a topic of major
controversy. But when two pieces of software have similar names, people just
can't resist commenting endlessly and upvoting this content-free bikeshedding
at the expense of actual discussion.

~~~
O____________O
_There are at least two other people in my city who have my name_

I'm presented with information to parse about technology topics daily.
Sometimes, I have to search for them and have all sorts of name collisions.

I never, ever search for information about you.

_people just can't resist commenting endlessly and upvoting this content-free
bikeshedding at the expense of actual discussion_

One man's bike shedding is another thousand men's irritating trend.
Personally, I very rarely see anyone called out for the trendy names and
awful, buzzword-laden non-descriptions that infest projects.

------
ecaron
I wish he would talk about how they protect one customer from running a query
that brings down the full stack. When we permitted Tableau to start talking to
Redshift, we frequently encountered "Oh crap, Peter is running that query and
that's why everything is at a stand-still..."

~~~
omgbear
You can set up Workload Management[1] to restrict the amount of compute /
query_slots each query/user can use. It splits the memory/compute into slices,
and queries can use multiple slices, so you can get some fine-grained control,
but it takes a bunch of work.

[1] [http://docs.aws.amazon.com/redshift/latest/dg/cm-c-modifying...](http://docs.aws.amazon.com/redshift/latest/dg/cm-c-modifying-wlm-configuration.html)
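
To make the queue idea concrete, here is a rough Python sketch that builds a WLM JSON configuration with one queue per user group plus a default queue. The key names (`user_group`, `query_concurrency`, `memory_percent_to_use`) are what I understand the `wlm_json_configuration` parameter to accept, so check them against the docs linked above. The group names, percentages and `build_wlm_config` helper are made up for illustration.

```python
import json


def build_wlm_config(queues, default_concurrency=5):
    """Build a WLM JSON string from (user_group, concurrency, memory_pct) tuples.

    A default queue (no user group) is appended and receives whatever
    memory percentage is left over.
    """
    total = sum(mem for _, _, mem in queues)
    if total >= 100:
        raise ValueError(f"queues use {total}% of memory, nothing left for the default queue")
    config = [
        {
            "user_group": [group],           # hypothetical group names
            "query_concurrency": concurrency,
            "memory_percent_to_use": mem,
        }
        for group, concurrency, mem in queues
    ]
    # Default queue for every query not matched by a user group above.
    config.append(
        {
            "query_concurrency": default_concurrency,  # assumed value
            "memory_percent_to_use": 100 - total,
        }
    )
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Keep BI-tool users in a small queue so one heavy query cannot starve ETL.
    print(build_wlm_config([("bi_tools", 2, 20), ("etl", 3, 50)]))
```

The resulting JSON would be applied through the cluster's parameter group. A session can also claim more than one slot of its queue with `SET wlm_query_slot_count TO n;`, which is presumably the "query_slots" knob mentioned above.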

------
ernestipark
Curious if your funnels are just queries directly in Redshift or if there's
more going on behind the scenes.

~~~
silverrc21
Amplitude here - Most of our dashboards are powered separately from Redshift.
We offer Redshift access as a way for our customers to answer more complex
questions not offered by the dashboards.

~~~
ripberge
Why not power your dashboards with it? What do you use?

I am considering using a columnar data store (maybe redshift) with a BI tool
like bimeanalytics.com specifically to do dashboards.

~~~
nemothekid
My guess is latency - using Redshift for short lived, small queries might not
be the best.

------
fsaintjacques
That extra order of magnitude you pay in pricing you gain in response time.

~~~
exelius
Yeah, but a data warehouse isn't supposed to have great response times. Data
warehouses are for large, low-value sets of historical data that you don't
always know how you want to use.

If you want to use data in real-time, you should be driving it from your
transactional systems. Redshift and other data warehouse solutions are for
doing reporting and dashboards, not triggering real-time reactions.

~~~
luckydata
Well, that used to be true, but now those systems are converging. Full
disclosure: I work for Treasure Data, a company working on exactly that
problem.

~~~
exelius
Most companies are generally more concerned about reducing their data
warehouse costs than they are about improving the performance of their data
warehouses. Many companies implement a multi-tiered DW structure to get a mix
of the two, but the core driver is managing the cost of storing petabytes of
data while keeping performance acceptable.

------
luckydata
How do you guys handle the constant shifting of analytic schema that happens
when handling a fast iterating application?

~~~
silverrc21
We update the table schema as we run into new fields, up to a limit. We also
store the unstructured part of the data in a column that can be queried via
json_extract_path_text.
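
For readers wondering what that could look like, below is a minimal Python sketch of the general pattern, not Amplitude's actual pipeline. It promotes new fields to columns until a cap is reached and serializes everything else into one JSON column. The function name `split_event`, the column name `extra_json`, the table name `events` and the cap are all assumptions.

```python
import json


def split_event(event, known_columns, max_columns):
    """Split an event dict into (row, new_columns).

    Fields that already have a column go into the row. Unseen fields get
    a new column while the table has fewer than max_columns; anything
    beyond that is serialized into a single 'extra_json' column.
    """
    columns = list(known_columns)
    new_columns = []
    row, extra = {}, {}
    for key, value in event.items():
        if key in columns:
            row[key] = value
        elif len(columns) < max_columns:
            columns.append(key)      # caller would issue ALTER TABLE for these
            new_columns.append(key)
            row[key] = value
        else:
            extra[key] = value       # over the cap: keep it unstructured
    row["extra_json"] = json.dumps(extra, sort_keys=True)
    return row, new_columns


# Hypothetical query reading one overflow field back out in Redshift.
OVERFLOW_QUERY = "SELECT json_extract_path_text(extra_json, 'plan') FROM events;"

if __name__ == "__main__":
    event = {"user_id": 1, "country": "US", "plan": "pro"}
    print(split_event(event, known_columns=["user_id"], max_columns=2))
```

The trade-off is that fields left in the JSON column are parsed at query time, so they will be slower to filter on than real columns.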

------
bnastic
O.T. but what the hell is a "Director Of Customer Success"?

~~~
blumkvist
It's pretty straightforward I think.

You have a complex product/service, with very diverse application scenarios.
=> Customer adoption is hindered by this complexity. => Customer is not
getting value => Customer is angry and stops paying

You hire a person who is familiar with the applications of your technology. He
talks to customers to figure out what they want to do, how they plan to
achieve it, and what the hurdles are. He helps them. Writes best practices,
implementation plans, helps marketing to position and sales to close.

It turns out so well that you hire many such people, who specialize in
particular customer segments. Those people need management.

You need a Director of Customer Success.

It's something between account manager/service/marketing.

------
jlintz
Is each customer given their own redshift cluster for their data?

~~~
sskates
No, clusters are multi-tenant. We have a cap on the number of customers per
cluster and we monitor usage to make sure no one customer is hammering the
cluster.

------
deeviant
Redshift is like a prison, but with excellent accommodations. It's a great
platform, but it's pretty much the perfect example of vendor lock-in.

~~~
paladin314159
Compared to the lock-in of the AWS ecosystem in general, Redshift honestly
isn't that bad. You can unload all of your data into S3 and then do whatever
you want with it. I'd be surprised if most data warehousing solutions had such
an easy way of exporting the data.
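
As a rough illustration of the export step, this Python sketch assembles an `UNLOAD` statement as a string. The bucket, prefix, role ARN and the `build_unload` helper are placeholders I made up. You would run the result through whatever SQL client you already use against the cluster.

```python
def build_unload(query, s3_prefix, iam_role):
    """Return a Redshift UNLOAD statement that exports a SELECT to S3."""
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query have to be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "GZIP;"
    )


if __name__ == "__main__":
    print(
        build_unload(
            "SELECT * FROM events WHERE country = 'US'",    # hypothetical table
            "s3://example-bucket/export/events_",           # placeholder bucket
            "arn:aws:iam::123456789012:role/ExampleRole",   # placeholder role
        )
    )
```

Redshift writes the result as multiple files under the given prefix, so the consumer on the other side should expect a set of gzipped parts rather than one file.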

~~~
vosper
In addition, if you store your data in S3 and have Redshift load it from there
then you don't even need to do an export - just leave your source data in S3
after Redshift's loaded it, and you're all ready to switch to another
platform.

------
blumkvist
Greenplum (and associated tech) is partly open source now.

