FauxPilot – An open-source GitHub Copilot server (github.com/fauxpilot)
499 points by mlboss on March 22, 2023 | 52 comments



It seems “Copilot” is being used as a general name for a coding-focused language model. A more literal interpretation of the title suggests you can host Microsoft’s model yourself, which does not seem to be the case (unsurprisingly).

I guess I’m just surprised to see it used this way, but “coding-focused language model” doesn’t exactly roll off the tongue.


It's called that because it's an implementation of the Copilot API.


I always assumed it was a name for the application of a language model, i.e. I’m using this language model as a programming copilot. Some models make better copilots than others.


Could always take the Facebook approach and call them CoFLaM.


Promising. Combine that with LLaMA or BLOOM, perhaps fine-tuned on code, and perhaps on your own codebase/documentation, and you could have an interesting solution.


What is wrong with the CodeGen model that they are using?

It is a reasonably large model (up to 16B params) that has already been trained on both natural language and code. I would expect it to underperform larger models, including GPT-3.5 and GPT-4, but it should still be very useful for autocomplete and boilerplate in simpler cases. It is a bit undertrained compared to Chinchilla, T5, or LLaMA, but it still performs well.

According to the paper[1], this model is competitive with the largest Codex model that was the basis for Copilot originally.

[1] https://arxiv.org/pdf/2203.13474.pdf


I haven't run any of these models yet; I had just assumed CodeGen was worse at "understanding" prompts. You're right that it's probably enough, especially if fine-tuning is an option.

Now I wonder: as the codebase grows, how often, and how, should such tuning take place?


Would love to see a side by side comparison of the completions from this with completions from Copilot


This is likely much worse if Copilot uses GPT-3.5 Turbo or GPT-4. Who knows what they're actually using, though.


Before today, Copilot was based on GPT-3.


Copilot uses "OpenAI’s Codex model, a descendent of GPT-3".

Soon to be GPT-4, though.


How is this possible? They don't even use the same tokens, right? Whitespace is different IIRC.


I think the GPT-3 thing is wrong, but it's not impossible. You can always train an existing neural net to do new things, right? I'm not sure how to handle the different whitespace tokens, though.
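
For what it's worth, Codex's tokenizer is a superset of GPT-3's: OpenAI added dedicated tokens for runs of whitespace so that indented code encodes more compactly. You can see the difference with the tiktoken library; a minimal sketch, assuming the published encoding names (r50k_base for GPT-3, p50k_base for Codex):

    import tiktoken

    indent = " " * 8  # an 8-space run, common in code

    gpt3 = tiktoken.get_encoding("r50k_base")   # GPT-3's encoding
    codex = tiktoken.get_encoding("p50k_base")  # Codex's encoding

    # Codex's vocabulary includes single tokens for whitespace runs,
    # so the same indentation costs fewer tokens.
    print(len(gpt3.encode(indent)))   # several tokens
    print(len(codex.encode(indent)))  # typically just one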


The release today for Copilot X says it's using GPT-4: "With chat and terminal interfaces, support for pull requests, and early adoption of OpenAI’s GPT-4" ~ https://github.com/features/preview/copilot-x


They only positively say that they are using GPT-4 for the new pull requests feature (and one other feature that I forgot). It’s unclear what model they are using for the main copilot code generation feature. It may be that they are only using GPT-4 for features that require the larger context size.


I'm pretty sure the actual Copilot autocompletions are still on GPT-3, for speed and cost reasons.


Yes, there's a waitlist for access to Copilot X.


I'd also be interested in a "bring your own OpenAI API key" Copilot X clone if anyone knows of one :)
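
Not aware of an existing one, but the core of such a clone is small: take the text before the cursor, send it to the completions endpoint, and splice in the first suggestion. A rough sketch of the API-call half with the openai Python package (the model choice and stop sequence here are illustrative assumptions, not what Copilot does):

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def complete(code_before_cursor: str) -> str:
        """Ask the completions API to continue a code snippet."""
        resp = openai.Completion.create(
            model="text-davinci-003",  # assumption: any completions-capable model
            prompt=code_before_cursor,
            max_tokens=64,
            temperature=0.2,
            stop=["\n\n"],  # roughly: stop at the end of the block
        )
        return resp["choices"][0]["text"]

    print(complete("def fizzbuzz(n):\n"))

The hard part is the editor plumbing (debouncing keystrokes, rendering ghost text), not the request itself.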


I feel like that's going to be way more expensive than just using Copilot?

Especially since Copilot is updating to GPT-4 soon anyway.


They cost about the same, so you might as well have one.


The API is quite expensive, but I don't code daily, so having no fixed cost is a big plus for me.

(The GPT-4 API has no base cost, unlike ChatGPT Plus.)


There is an option here: https://codeium.com/


That's just another provider (free, though).


I wonder if FauxPilot's models (the Salesforce CodeGen family) can be quantized and run on the CPU. I was able to run the 350M model on my machine, but it wasn't able to compete with Copilot in any way. Salesforce claims their model is competitive with OpenAI Codex in their GitHub description[1]. Maybe their largest 16B model is, but I haven't been able to try it.

[1] https://github.com/salesforce/CodeGen
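
For the CPU question, PyTorch's dynamic quantization will at least shrink the linear layers of the smaller checkpoints to int8; a minimal sketch with transformers (this assumes you have the RAM for the initial fp32 load, and makes no claims about completion quality afterwards):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "Salesforce/codegen-350M-mono"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Replace Linear layers with int8 dynamically-quantized versions.
    qmodel = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    ids = tok("def hello_world():", return_tensors="pt").input_ids
    out = qmodel.generate(ids, max_new_tokens=32)
    print(tok.decode(out[0]))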


We will add quantized CodeGen for fast inference on CPUs to cformers (https://github.com/NolanoOrg/cformers/) later today.


> by later today

Wow, that's the timeframe things are moving at right now; we'd better get used to it!


Whoa, is there a PR or wiki about this?


4-bit GPTQ, maybe?


Are there alternatives that don't require an Nvidia GPU and work with AMD? And, more on-brand, ones hosted on a different code forge?


Should be possible to make it work on AMD. CUDA is just easier to work with.


This makes me glad to have chosen an Nvidia GPU on Linux. You get a whole lot of hidden benefits and QoL improvements that just aren't there with AMD, and that nobody really mentions.


You also have to deal with proprietary drivers. This can be mostly alright until it's not; look at the hidden costs of dealing with Nvidia for Wayland, for instance. CUDA lock-in is like DirectX lock-in. It would be preferable to have something platform agnostic, not just to support the current alternatives, but so that new players can join by supporting common, open APIs as well.


Careful: if Microsoft doesn't like your wording ("open-source GitHub Copilot server"), you'll be quickly kicked off GitHub.


FauxPilot's a project created by moyix, aka Brendan Dolan-Gavitt, a professor at NYU Tandon.

https://www.theregister.com/2022/08/06/fauxpilot_github_copi...

https://engineering.nyu.edu/faculty/brendan-dolan-gavitt

Moyix is worth following, because he's always up to something cool.


From the setup.sh, the VRAM requirements are:

    echo "[1] codegen-350M-mono (2GB total VRAM required; Python-only)"
    echo "[2] codegen-350M-multi (2GB total VRAM required; multi-language)"
    echo "[3] codegen-2B-mono (7GB total VRAM required; Python-only)"
    echo "[4] codegen-2B-multi (7GB total VRAM required; multi-language)"
    echo "[5] codegen-6B-mono (13GB total VRAM required; Python-only)"
    echo "[6] codegen-6B-multi (13GB total VRAM required; multi-language)"
    echo "[7] codegen-16B-mono (32GB total VRAM required; Python-only)"
    echo "[8] codegen-16B-multi (32GB total VRAM required; multi-language)"
So I could try the 350M models on a laptop with a 2GB Nvidia card.

Another tidbit: I noticed Vultr is offering FauxPilot images in their GPU instance provisioning menu.

The model used (Salesforce Codegen) is on huggingface: https://huggingface.co/models?search=salesforce+codegen
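
Once a model is up, the server exposes an OpenAI-style completions API you can poke directly; a sketch assuming the default port and engine path from the FauxPilot README:

    import requests

    # Assumes a local FauxPilot instance on its default port (5000).
    resp = requests.post(
        "http://localhost:5000/v1/engines/codegen/completions",
        json={
            "prompt": "def fibonacci(n):",
            "max_tokens": 64,
            "temperature": 0.1,
        },
    )
    print(resp.json()["choices"][0]["text"])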


Has anyone tried a quantized model with this, like LLaMA?


I wonder... is MS likely going to make it harder to swap the URL on the official copilot client in light of this? Will they continue hosting the "competition"?
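
(For context, today the swap is just a settings change: FauxPilot's docs describe pointing the official VS Code extension at a local server with unofficial debug overrides along these lines; the exact keys are undocumented and could break at any time:)

    {
        "github.copilot.advanced": {
            "debug.overrideEngine": "codegen",
            "debug.testOverrideProxyUrl": "http://localhost:5000",
            "debug.overrideProxyUrl": "http://localhost:5000"
        }
    }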


There’s an alternative client that already exists for Neovim; it’s actually better than the official client IMO, and it should be easy to change the URL there!


I'm pretty sure VS Code Extensions aren't compiled.


They can still lock them down, like Mozilla did with Firefox extensions...


VS Code is an Electron app, so that's much harder.


I don't think so; in both cases the code that runs the extensions is code you can change.


Except one of them is a C++/Rust app, and the other is a JavaScript app, so one is harder to lock down extensions on than the other.


The integration with Copilot doesn’t quite compete with MS, because, well, using the extension requires a functioning subscription. You’d be paying MS regardless of whether you point it at FauxPilot or Copilot.

But there are other extensions that are specifically made for FauxPilot, and they work well.


It has existed since October 2022, so...


This is the first I'm hearing of it. I wonder how many others are seeing this via HN for the first time? I would imagine MS has been aware of this since more or less its inception; however, any product threat isn't really about existence, but about how viable and how well known/used it is.


+1 for the name


So much potential in this! Would love to run this locally!


Anyone know how this compares to Tabnine?


Copilot is a million times better than Tabnine. Tabnine was promising, but it totally stopped making any improvements to the model after it got bought out years ago.


Is Tabnine still going (and is it any good)? I used to use Kite back in the day.


The free version is average at best. It only really autocompletes a line (if you're lucky); usually it autocompletes a few characters in an expression.



