

Be cautious about using Chartio (or at least, don’t follow their directions) - aiiane
http://codingkilledthecat.wordpress.com/2013/05/10/please-dont-use-chartio-or-at-least-dont-follow-their-directions/

======
thingsilearned
Hi, founder of Chartio here. Other than the title (which makes me a little
sad) I liked your post. It's great to fully inform people of the security
tradeoffs and you've done a nice job of laying out the levels and options of
security that we've spent a lot of time developing.

In anything that is cloud based there is going to be some level where some
hacker could get in and destroy everything. Most people on this site use cloud
hosted servers, all of which would be at risk if Amazon or Rackspace got
hacked. BI in the cloud is a new space and will be cautiously entered by some,
but the benefits will outweigh the potential risks, just as has happened in
every other segment of cloud computing.

(will write more soon)

~~~
aiiane
Let's go for less sad. I've changed the title a little to better reflect the
intent of the post.

With regards to cloud - you're right, Amazon and Rackspace and so on are a
single point of failure for a lot of businesses... but they also have a lot of
people dedicated specifically to keeping their systems secure. The average
startup, on the other hand, doesn't.

~~~
thingsilearned
I'm curious, what was the reason you chose to highlight Chartio out of all the
companies in the cloud BI space? We actually feel that we have the best
security practices in the space, mostly due to the fact that we're the only
ones not doing data warehousing, where you're required to upload a copy of
your database to the provider.

~~~
aiiane
Luck of the draw, actually. Someone in an IRC channel I'm in mentioned it [in
the context of "someone asked me to set this up, I told them heck no"], I
glanced at the page, did a double-take.

It's debatable whether DWH is more or less secure than your approach - and it
also depends heavily on how the DWH is done. Having to explicitly move data
around also gives an opportunity to scrub it.

For the record, the proactive approach you're taking with your responses here
is heartening. The goal of my posts is always, in the end, to push for
something better, not just tear down what's there. Glad to see that you've an
open mind towards improvements.

------
kevinpfab
I'm a founder of Emergent One, a startup that uses similar agent-based
technology to Chartio to access production databases and build out RESTful
APIs. Some of our APIs are write-enabled, which makes the proposed risk even
higher than that of Chartio's.

We've spent a _lot_ of time thinking about security risks and writing code to
reduce them. I thought I'd share a few things we do and that we've learned
from our experience:

* The agent approach is the most popular because it allows for a system administrator to easily sever the connection from the database server without having to worry about writing queries to revoke user access.

* We never run unindexed queries without an explicit request from a customer and a manual entry from an Emergent One employee.

* We're currently looking into security consultants to continuously test our production environment.

* We're building an appliance version of our software much like Github Enterprise in order to accommodate the customers that aren't comfortable with their data hitting the cloud.

* We strive to have very quick and personal customer service directly from engineers. The vast majority of the responses are within the hour.

and last but certainly not least...

* The very best thing we can do is be honest and straightforward about the inherent risk behind our platform. Being able to build and maintain a pristine level of trust is the only thing that will keep us in business.

I'm sure Chartio does things very similarly. Direct-database access technology
is not for everyone, but it's also proving to be extremely valuable for both
Chartio's customers and ours. The cloud advantage that makes most SaaS
software great is still there.

~~~
thingsilearned
Hi Kevin, thanks for the great response! We should continue to share notes on
best practices in all of this.

------
sehrope
Having a product that also involves connecting to other people's databases
we're pretty well versed in this problem. I will admit it's a bold step to
allow an external service access to your production databases, but as a
trade-off of convenience vs. security the former does win more often than
people would think. This is especially true for people using PaaS/DBaaS
providers, which by default allow access from all inbound IPs (i.e. not
whitelisted).

One thing we try to do is be open about all this and we explicitly mention it
in our docs[1]. This includes instructing users to explicitly limit the
permissions that they grant.

Another poster mentions loading a pre-built VM. That's actually our goal for
the enterprise, as there will always be systems that are not (and should not
be!) accessible to the open Internet. In the meantime, though, there are
plenty of folks happy with the convenience of living in the cloud.

[1]: <http://www.jackdb.com/docs/#security>

------
peteforde
While there are clearly some significant security considerations, the author
writes Chartio off without considering applications where it could be a good
fit.

Also, the oozing, patronizing tone of the author is really annoying. Just make
your points and be nice about it.

~~~
aiiane
From the article: "It’s quite possible that the risks I’ve highlighted above
are ones that you feel it’s okay to take, and in that case, go for it – as
long as you’re respecting your end users’ interests as well."

~~~
stdbrouw
That's not "considering applications where it could be a good fit".

------
joevandyk
What I've done when using chartio with postgresql:

* make a new schema called chartio_only

* add a chartio database user that can only access that schema

* add views to the chartio_only schema

Now chartio can only access the views in the chartio_only schema.
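
A minimal sketch of the three steps above, written as a Python helper that only builds the SQL text so it can be reviewed before anyone runs it. The function name, the `daily_signups` view and the `public.users` table are hypothetical, and the identifiers are assumed to be trusted, plain lowercase names. It does not audit privileges that may already be granted to PUBLIC in your database, so treat it as a starting point rather than a complete lockdown.

```python
def restricted_reporting_sql(schema, role, views):
    """Return SQL statements that confine `role` to read-only views in `schema`.

    `views` maps a view name to the SELECT statement that defines it.
    Identifiers are interpolated directly, so they must come from trusted input.
    """
    statements = [
        f"CREATE SCHEMA {schema};",
        # The role is created without a password here; set one separately.
        f"CREATE ROLE {role} LOGIN;",
    ]
    # Create the views first so the GRANT below covers them.
    for name, select in views.items():
        statements.append(f"CREATE VIEW {schema}.{name} AS {select};")
    statements += [
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        # In PostgreSQL, ALL TABLES IN SCHEMA also covers views.
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]
    return statements


if __name__ == "__main__":
    # Hypothetical view: expose only aggregated signup counts.
    example_views = {
        "daily_signups": (
            "SELECT date_trunc('day', created_at) AS day, count(*) AS signups "
            "FROM public.users GROUP BY 1"
        ),
    }
    for statement in restricted_reporting_sql("chartio_only", "chartio", example_views):
        print(statement)
```

Because a PostgreSQL view runs with its owner's privileges by default, the reporting role can read the view without being granted anything on the underlying table.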

------
capkutay
Chartio certainly isn't the only company guilty of this.

Take Splunk, a $4 billion (in market cap) machine log analytics
service. They rolled out a new service called DB connect which allows you to
integrate database analytics with their platform. Their implementation of DB
connect is equally intrusive. However, I can still see these services being
useful for visualizing operations that are less critical.

~~~
cerales
One could write a whole essay on the inability of this $4bn company to fix the
most basic bugs in their product. I'd love to see evidence of people actually
usefully using the features like this that Splunk provides.

------
cerales
I work in the BI / analytics space, with a fair amount of what's called "big
data". The security of something like cloud IO is typically not too much of a
concern on the kind of projects I work on: more often than not, our clients
deploy dedicated analytics databases. Perhaps I'm biased as we tend to work
from the outside, but I'd be surprised to find many sysadmins allowing Chartio
to connect to the production DB. The ETL process for analytics databases
tends to either obscure user data or aggregate it to an impersonal level of
detail.

I'm more concerned about the more enterprise-y products aggressively
advertising 'enterprise' security features like Active Directory integration,
smoothing over their total lack of transparency in vulnerabilities and bugs.

------
gkoberger
Of course giving out access is probably a bad idea -- so what would be a good
alternative for them to do instead?

~~~
swisspol
In our case we set up a dedicated cheap database server only for ChartIO. Then
we have a script that copies to it analytics data from our production servers
every few hours (while also anonymizing what needs to be just to be extra
safe).
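
A rough sketch of what such a copy script could look like. This is not the script described above: `sqlite3` stands in for both the production and the reporting database so the example runs anywhere, and the `events` and `analytics_events` tables, their columns and the salt are all hypothetical. Replacing emails with a salted hash is pseudonymization rather than full anonymization, so the salt should be kept secret and anything more sensitive should be dropped entirely.

```python
import hashlib
import sqlite3


def pseudonymize(value, salt):
    """Replace an identifying value with a truncated, salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def copy_anonymized(source, target, salt):
    """Refresh target.analytics_events from source.events without raw emails.

    Returns the number of rows now present in the target table.
    """
    target.execute(
        "CREATE TABLE IF NOT EXISTS analytics_events "
        "(user_key TEXT, event TEXT, created_at TEXT)"
    )
    # Full refresh: clear the previous copy before loading the new one.
    target.execute("DELETE FROM analytics_events")
    rows = source.execute("SELECT email, event, created_at FROM events")
    target.executemany(
        "INSERT INTO analytics_events VALUES (?, ?, ?)",
        (
            (pseudonymize(email, salt), event, created_at)
            for email, event, created_at in rows
        ),
    )
    target.commit()
    return target.execute("SELECT COUNT(*) FROM analytics_events").fetchone()[0]


if __name__ == "__main__":
    # In-memory databases stand in for the production and reporting servers.
    production = sqlite3.connect(":memory:")
    production.execute("CREATE TABLE events (email TEXT, event TEXT, created_at TEXT)")
    production.execute(
        "INSERT INTO events VALUES ('user@example.com', 'signup', '2013-05-10')"
    )
    reporting = sqlite3.connect(":memory:")
    copied = copy_anonymized(production, reporting, salt="change-me")
    print(f"copied {copied} row(s)")
```

Scheduling this every few hours (for example from cron) would give the reporting tool a database that never holds raw user identifiers.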

~~~
thingsilearned
Great use case! We don't support an API that you can push data to, but that's
because whatever API we build, it would never be easier to learn or have
more language support than just spinning up a PostgreSQL instance and writing
the data to it. Quite a few of our customers use us this way.

~~~
swisspol
You guys should really suggest this approach in the setup guide (it can be
phrased in a way not to scare potential users). The alternative of directly
accessing production database servers is really risky.

~~~
thingsilearned
Agreed. I'll work on it.

------
mfeldman
The options typically are: 1) Run this package on your own (SQL/HDFS) server
and pay us for a license. Keep your own data and maintain your own servers.
2) Send us the data and we will store a replica which we serve back to you
in dashboards. 3) Let your users use your data on your servers through our web
interface.

Each involves giving away some type of control.

------
markhelo
What we have done is use Chartio but connect it to our Slave database. I like
this approach for many reasons. Our charts implicitly test that our slave
connections are working well. If compromised, it won't _immediately_ impact our
site. You could argue it will eventually. All the load from Chartio is on our
slave which again does not impact our production.

------
cpncrunch
A Y-combinator startup asked me to set up chartio for them. The script
generated some random error during setup, so I logged a support ticket with
chartio. A month later they still hadn't bothered replying. I'll forgive you
having a buggy setup script, but I won't forgive you when you don't even
bother replying to my support query about your buggy product.

~~~
thingsilearned
Super odd... We use Zendesk and I don't have a record of any open support
tickets. Feel free to send me a note directly to dave at chartio.com if you're
still having an issue.

~~~
cpncrunch
I see it is closed now. I opened the ticket on June 25 2012 and it was
resolved on April 1 2013 (although he didn't actually resolve the problem).
Anyway, the client used google analytics instead.

Also your support system is annoying: why do we need to register for support
if we're already registered on your site? And having to have a password with
uppercase and symbol is overkill for a support system (and is pointless
anyway, as you'll probably have to write the password down).

------
bifrost
I've seen stuff like this pretty regularly, and I'm always dumbfounded by how
people don't even think about how bad this is.

This is the exact definition of a "crunchy center", and I wouldn't be
surprised if they do other horrible things. You know for certain there are no
passphrases on those SSH keys either.

