
From the copilot telemetry docs [0]:

> GitHub Copilot collects activity from the user’s Visual Studio Code editor, tied to a timestamp, and metadata.

[...]

> This data will only be used by GitHub for:

[...]

> - Improving the underlying code generation models, e.g. by providing positive and negative examples (but always so that your private code is not used as input to suggest code for other users of GitHub Copilot)

I'm inclined to believe this. After all, why would they taint the training data with code from a random guy who is asking for help when they have more than a hundred thousand repos with 100+ stars?

[0] [https://docs.github.com/en/github/copilot/about-github-copil...]




Sure, perhaps they won't use it for training, but the fear is that they'd use it for corporate espionage, market research, etc. Compared to training, that would be both more useful and much easier to keep deniable / under wraps.

I've had previous employers that were highly concerned about far more innocuous data leaking to competitors, e.g. autocomplete search terms. By that standard, a Copilot installation at any company that competes (or might compete) with MS should be considered a security breach, IMO. Given Microsoft's presence in so many markets, I think it would generally be foolish to risk this at any company.


You're still sending your intellectual property to Microsoft and hoping that they only do what they say they do with it, and that their policy never changes.


From context, I thought you were suggesting the strategy of "just type pirated proprietary code into the IDE and the Copilot plugin will automatically include it in the training data", since my earlier comment was about the difficulty of training Copilot on such code. To be clear, I don't trust them not to abuse your work in other ways either.



