
Kite: Thoughts on Security - adamsmith
https://kite.com/blog/thoughts-on-security/
======
tptacek
Something people should consider when thinking about tools like Kite:

If you're a contract developer, or a developer working full-time for a
consulting firm, you might not have the authority to determine for yourself
whether it's contractually allowable to upload code to Kite's servers. But if
you're working for a pro shop, you can bet every dollar in your pocket that
the contracts your firm has with its clients technically prohibit it.

Kite is really neat, but I'm a little uncomfortable with the idea that I'd
have to remember to remind consulting vendors not to let their developers use
it when working with my codebase.

~~~
adamsmith
This is great feedback, and definitely something we need to address.

Initially we've been focused on giving users control and transparency. We need
to extend this to employers.

One quick idea that we'd love your feedback on: a .kiteignore file that can
exclude Kite from responding to files that match a certain pattern (e.g.
"secrets.py"). Presumably an employer could put a "[^.]*" in a repo-level
.kiteignore file, and anyone working with that repo would have to explicitly
delete that file to use Kite with it.

We'd love to hear any other ideas folks might have as well!
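
For anyone trying to picture the .kiteignore idea, here is a minimal sketch of how such a check might work on the client before a file is sent anywhere. Everything in it is an assumption for illustration: the file format (one shell-style glob per line, `#` for comments), the function names `load_patterns` and `is_ignored`, and the matching rules are hypothetical and not Kite's actual behavior. It uses shell-style globs from Python's `fnmatch` rather than the regex-like "[^.]*" mentioned above, so a single `*` line plays the role of the block-everything rule.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def load_patterns(text):
    """Parse hypothetical .kiteignore text: one glob per line, '#' starts a comment."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """Return True if the file name or the full relative path matches any pattern."""
    p = PurePosixPath(path)
    return any(fnmatch(p.name, pat) or fnmatch(str(p), pat) for pat in patterns)


# A repo-level file that excludes a few sensitive files (contents are made up).
repo_rules = load_patterns("# hypothetical repo-level .kiteignore\nsecrets.py\n*.pem\n")

# An employer's block-everything file: a single '*' matches every path.
block_all = load_patterns("*\n")

print(is_ignored("app/secrets.py", repo_rules))  # matches the literal file name
print(is_ignored("app/views.py", repo_rules))    # no pattern matches
print(is_ignored("app/views.py", block_all))     # '*' matches everything
```

With a check like this, the default matters more than the syntax: a repo that ships the block-everything file stays excluded until someone deliberately removes it.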

~~~
fweespee_ch
The only way to truly solve this problem is on-premise devices that do not
push data to Kite, honestly.

I'd need to ignore every file I have.

~~~
LeifCarrotson
Why does the device need to be on-premise? Would a Github Enterprise-like
setup, where you have a dedicated Kite instance/VPS at their datacenter, work
for you?

~~~
fweespee_ch
Contractual agreements, NDAs, etc. might make it impossible to disclose work
product and/or source code to a 3rd party which would include Kite.

> If you're a contract developer, or a developer working full-time for a
> consulting firm, you might not have the authority to determine for yourself
> whether it's contractually allowable to upload code to Kite's servers. But
> if you're working for a pro shop, you can bet every dollar in your pocket
> that the contracts your firm has with its clients technically prohibit it.

As tptacek said, you might be prohibited and/or not have the authority to do
so.

> Why does the device need to be on-premise? Would a Github Enterprise-like
> setup, where you have a dedicated Kite instance/VPS at their datacenter,
> work for you?

We run GitLab on-premise so I'm not familiar enough to answer. However, if by
"their" you mean Kite? Yeah, that won't work as it's essentially the same
thing. [i.e. Disclosing it to a 3rd party outside of my control]

------
dangoor
I had the top comment on the original post and was critical of Kite's
approach. I still think that Jetbrains has proven that you can be very
effective locally without consuming your entire machine, because the needs of
one project do not require making sense of all Python libraries. You only need
to make sense of the fairly small subset of libraries that a given project
actually uses.

That said, I think this response is terrific. The value proposition was
outlined well. Feedback loops are important and cloud services naturally have
much tighter loops. Security considerations are no different than they are for
GitHub, so those will not be insurmountable for many.

So, neat looking product and very nice response to initial feedback!

~~~
Zombieball
Expanding upon this line of thought, I am not sure their comment RE: storing
data indexed in 32GB of RAM holds any meaning, considering it requires a hop
over the wire to access. Locally stored indexes on disk would likely be much
faster.

I would imagine a smooth update mechanism could allow you to update file
indexes without losing too much agility on feature development, while coming
with huge security gains.

~~~
MichaelGG
Well, they say "machines" so it might be much more than 32GB. (Though on disk
it might be strongly compressed.) And on HDDs, a few seeks are actually slower
than a network round-trip.

~~~
Zombieball
Uncompressed might indeed eat up tons of disk space.

Admittedly I was thinking of an SSD + some caching in RAM when thinking of
this timing. But I was always under the impression an HDD seek would be on the
order of tens of ms? I'd assume a query to a cloud service would be on the
order of 100-200ms.

~~~
MichaelGG
The network latency should be somewhere around 40ms assuming you're in the US
on a proper connection. If a query takes 10 disk seeks (10ms each) vs 10 RAM
accesses + HTTP parsing...
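
To spell out the arithmetic in that comparison, here is a back-of-the-envelope sketch. The 40ms round trip, the 10 seeks, and the 10ms per seek come from the comments above. The server-side overhead figure is purely an assumption added for illustration, and RAM access time is treated as negligible at this scale.

```python
# Figures quoted in the thread.
NETWORK_ROUND_TRIP_MS = 40   # US client on a decent connection
HDD_SEEK_MS = 10             # one seek on a spinning disk
SEEKS_PER_QUERY = 10         # lookups needed to answer one query

# Assumption, not from the thread: HTTP parsing plus in-RAM lookups on the server.
SERVER_OVERHEAD_MS = 5


def local_hdd_ms(seeks=SEEKS_PER_QUERY, seek_ms=HDD_SEEK_MS):
    """Estimated time for a query served from a local index on an HDD."""
    return seeks * seek_ms


def remote_ram_ms(rtt_ms=NETWORK_ROUND_TRIP_MS, overhead_ms=SERVER_OVERHEAD_MS):
    """Estimated time for a query served from a remote in-RAM index."""
    return rtt_ms + overhead_ms


print("local HDD index:", local_hdd_ms(), "ms")    # 10 seeks * 10 ms = 100 ms
print("remote RAM index:", remote_ram_ms(), "ms")  # 40 ms + 5 ms = 45 ms
```

Under these numbers the remote lookup wins against a spinning disk, but the conclusion flips if the local index sits on an SSD or is cached in RAM, which is the scenario raised earlier in this subthread.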

------
stegosaurus
It's not really about 'security being addressed', we all know that. It's about
marketers creating the illusion that security has been addressed for end users
that don't understand it anyway. In the case of stuff like Windows, it's about
eventually forcibly removing control from the users by pushing updates.

I think it's dishonest to not just simply post that 'if you're afraid of the
cloud, you are not our target market, go away'. I guess it's a capitalism
thing. Wouldn't look good to investors, or whatever.

I am sad because I have a beefy machine, and I want to use this, but I can't.
I'd pay for it, you know? But I don't 'do' SaaS, the reasons are too long to
list here.

More concretely, 32GB RAM is trivial, and preselecting my languages is... I've
already pre-selected them! It takes months, years to learn a language :P

Kite looks really super cool.

------
nikolay
You "launched", really? Maybe internally, but I haven't received even an email
confirmation that you've got my "signup request". As far as most of us are
concerned, you launched a video.

~~~
polartx
Second that. I never got so much as a confirmation email when I signed up.
Signed up again today--I'd really love to have a tool like this while I
attempt to learn programming.

------
educar
Many people are complaining about security because of the nature of the
'cloud' here. This is what I thought when GitHub came out as well. But look
today, everyone has their code (supposedly their IP) on GitHub. Ultimately,
Kite's track record on the cloud will trump any security considerations.

~~~
welder
The difference here is GitHub doesn't auto-upload Python files without me
knowing. If I open my secrets.py file I know it's not getting uploaded to
GitHub because I see it's absent from the staged files.

I don't think Kite prompts you before uploading every time you open a file.

~~~
franciscop
Maybe add a .kiteignore then?

------
richard_mcp
This sort of open communication is a great way to start building trust with
users (both relating to security and otherwise). I was thrilled to see the devs
responding to comments positively in both the HN and reddit threads.

I'm looking forward to seeing Kite expand their security support and I can't
wait to try it out on Linux.

------
zuck9
Reposting since it wasn't answered earlier:

Even though I trust you, there's no way anyone can guarantee that a hacker
won't get into your database and get my proprietary source code.

I'm no security expert but one way I can think of is creating an encryption
system which works like this: all my source code will be stored encrypted on
your (non-ephemeral) databases. The decryption key will be stored on my
computer, and it'll be transferred to the server when I run Kite and destroyed
as soon as I quit Kite. The key will be stored in your server only in an
ephemeral storage (in-memory database etc.) Do you have something like this in
the works?

~~~
nickles
The approach you described is largely security theater. Supposing an attacker
has compromised a machine and is capable of retrieving stored data, it is
reasonably likely that the attacker will be capable of either capturing the
key as it is transmitted or reading the key while it is stored in memory.

If you start with the assumption that a machine is compromised, then there's
not really a way to guarantee secrecy of anything done on the machine.
Homomorphic encryption resolves this, but (as far as I'm aware) it is too
computationally expensive to be viable at present.

------
CameronBanga
Some things I didn't see mentioned (I may have skipped over them) which would
be awesome:

* Let us delete all of the data we have stored on your servers, whenever we
  want.
* Let us see all of the data we have stored on your servers, whenever we want.

I really don't care how you're manipulating it, but would like to see (and
additionally delete) any information you have stored on me that I'm
uncomfortable with.

~~~
LeifCarrotson
While those would help with consumer confidence, the reason the EULA probably
disclaims all that is because the data is very hard to track down.

It may be on magnetic tape backup in archives. Do they have to dig out each
roll of tape each time for each customer who wants to erase a segment of their
data?

It may be indexed or used as input to shape a larger or disconnected part of
the software. It shows up in logs, and therefore in archived statistics
analysis. It may have been copied and modified incrementally by a dozen other
users. Should all these developments from user data be destroyed because they
are based on old data with a delete request?

Perhaps an opt-in to a feature that doesn't back up or analyze your data for
an hour could help prevent "committed the password to version control"
accidents. But it would need to be opt-in, because the whole point of Kite is
that your code is indexed in real time, right?

------
Alex3917
Not (directly) related to the security, but I was wondering if you were
inspired by any research from academia when creating this. I only ask because
I just watched a talk from a Stanford professor from 2012 where he talks about
anonymizing and aggregating everyone's code in the cloud to create better
documentation, albeit as a one sentence aside at the end of an otherwise
unrelated talk.

~~~
adamsmith
The initial idea came from thinking about live contextual search and
community. This thinking intersected with programming because of my
background, and once that happened there was a lot of iteration—informed by
research, and mostly conversations with friends—to get where we are today.

------
mbrock
I'd like to hear a statement like this that explicitly acknowledges something
like, "as a private company funded by Silicon Valley investors, we need a
clear way to capture more and more value, and that's a big reason why we want
to collect your data on our servers and use a client model for our proprietary
algorithms."

That's how it works, we all get it!

------
borski
For the record, we went through this too. Kite is awesome, and I have no doubt
it will succeed, but we fought the enterprise virtual appliance train for
years. The cloud grew, we got more customers, but there were certain verticals
we could never reach: finance, government, etc.

We recently built a virtual appliance. It's growing infinitely faster than our
cloud solution ever did. Those numbers speak for themselves.

Certain products just need to have the virtual appliance option. I'm sure Kite
will get there one day.

For now, I'm going to use it for personal projects because it's still a badass
solution to a problem I have. :)

------
chinathrow
"Some folks still use Garmin GPS due to privacy concerns, but most of the
world uses internet-connected navigation for its many advantages: fresher
maps, more coverage, better tuned navigation algorithms, better user
experience because iteration is 10x cheaper, etc."

There is one thing missing: people use Google Maps, Waze etc simply because
it's free.

------
andy_ppp
Just out of interest is there a plan to make Kite work on an iPad rather than
just a window on my computer - even some kind of screen sharing with (basic)
touch support would be excellent.

I will probably never use it because it scares the crap out of me that I'll
type my password in the wrong window though :-)

------
w8rbt
This is more about Thoughts on Privacy than security.

------
nikolay
Unfortunately,

    Convenience > Security

Laziness is both a curse and a blessing.

------
pbreit
So how do I sign up?

~~~
stronglikedan
[https://kite.com/](https://kite.com/)

You sign up for an invitation. I am curious to know how long that invite
typically takes. I just signed up about an hour ago.

~~~
pbreit
I "signed up" last week. Nothing yet.

~~~
maxpupmax
Same. I have heard no anecdotes of people being invited yet either.

------
franciscop
> "we believe we will set industry standards that will be adopted across
> multiple categories of tools such as continuous integration and code review
> systems"

Excuse me? Why is that? It sounds like either they think they are programming
wizards or they believe the CI folks are incompetent, neither of which signals
a company I would like to trust.

