CodeTF: One-Stop Transformer Library for State-of-the-Art Code LLM (arxiv.org)
95 points by pabo on June 7, 2023 | 6 comments



Link to the GitHub repo: https://github.com/salesforce/CodeTF

It would be helpful to see some Colab notebook examples of how I could use this or incorporate my own codebase with these open source coding models.

The examples show some smaller interesting prediction tasks and translation between C# and Java, but it would probably be easier to try it out in Colab than to install it locally.

I would also want to be able to compare GitHub Copilot's autocomplete with what CodeT5 would return as a prediction.
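
For a quick Colab-style experiment that skips installing anything beyond transformers, the base CodeT5 checkpoint can be queried directly. This is a minimal sketch (the checkpoint name and prompt are just examples, and note that CodeT5 fills a masked span rather than autocompleting at the cursor the way Copilot does):

    # Minimal sketch: ask CodeT5 to fill a masked span via Hugging Face
    # transformers. The checkpoint and prompt are illustrative only.
    from transformers import RobertaTokenizer, T5ForConditionalGeneration

    tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
    model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

    code = "def greet(user): print(f'hello <extra_id_0>!')"
    input_ids = tokenizer(code, return_tensors="pt").input_ids

    # Generate and decode the model's prediction for the <extra_id_0> span.
    outputs = model.generate(input_ids, max_length=10)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))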


Salesforce really seems to invest a lot of resources into coding-focused applications for LLMs, which is of course great, especially as they seem very transparent, sharing both papers and usable implementations [0]. However, I'm really starting to lose track of the differences between their releases (CodeT5 vs. CodeT5+ vs. CodeGen vs. this), especially as they come out so fast.

Perhaps this reflects poorly on me, but I find it hard to stay up to date with the constant stream of preprints and releases (not just from Salesforce); it tends to take me a while to fully internalize what makes the newest developments special. So I was very happy to find that they added an overview to the newest repo [1] comparing the different supported models, including their own releases.

Of course, size alone does not determine performance, but the overview still helped me get a better grip on what they mean by "one-stop Python transformer-based library for code large language models (Code LLMs) and code intelligence" and how that relates to existing models.

Very grossly oversimplified, CodeTF intends to make working with these models easier in a multitude of ways.

[0] https://github.com/orgs/salesforce/repositories?q=llm

[1] https://github.com/salesforce/CodeTF
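
To make that a bit more concrete, the README shows usage along the following lines. This is just my sketch of it (I haven't run it, and the exact entry point and argument names may differ, so check the repo before copying):

    # Rough sketch modeled on the CodeTF README example; the function
    # name and arguments are my best reading of it, not verified.
    from codetf.models import load_model_pipeline

    model = load_model_pipeline(
        model_name="codet5",     # model family
        model_type="plus-220M",  # checkpoint size
        task="pretrained",
        is_eval=True,
        load_in_8bit=True,       # quantized loading to save memory
    )

    print(model.predict(["def print_hello_world():"]))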


I'm hacking on some really dumb code this week and just tried to get StarCoder.cpp to give me some help, but I have no idea how to prompt it to work with code I already have.

I was really surprised that all the Hugging Face stuff needed an account. I didn't have any faith that my data would stay local, and I didn't understand what the account was even for. Which sucks a bit, because StarCoder seems to have a fairly friendly VS Code extension; I'm just too scared to use it.

I think maybe the trick is to just write code comments and ask for help in them? The VS Code extension seems to just upload the file, wrapping everything before your cursor in {start token}/* your code here */{end token}.
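
If that's right, the prompt the extension builds is essentially a fill-in-the-middle string. A minimal sketch of that format, assuming StarCoder's documented FIM special tokens (the surrounding code is made up, and how you pass the string to StarCoder.cpp depends on that binary's options):

    # Sketch of a fill-in-the-middle prompt using StarCoder's FIM tokens.
    # The snippet being completed is invented; feed the resulting string
    # to your local runner however it accepts raw prompts.
    prefix = "def median(xs):\n    xs = sorted(xs)\n    "
    suffix = "\n    return result\n"

    fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
    print(fim_prompt)  # the model's continuation should be the missing middle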

I'm obviously a total newb here, but I know a tiny bit about LLMs and how they work as tokenizing systems. It still stuns me a bit to see that these systems have only the most minimal ability to capture context/hints from the rest of the project, e.g. from TypeScript definitions.


I'm also struggling a bit to get decent code output (TS) from local LLMs via oobabooga. I've been getting mixed results, and I think it's pretty sensitive to the exact combo of model/parameters/prompt. If any one of those is off or not tuned to work with the others, the results aren't great.

E.g. some models expect very specific formatting of prompts. Some models give nonsense output at temperature=0.7 while others work great. Some models have been fine-tuned on chatbot-style instructions (à la ChatGPT, the so-called "Instruct" models) while others just autocomplete the given prompt. It's very important to use the right combo to get the best results.
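
To illustrate what I mean by the combo, here is a rough sketch of the same request phrased for a plain autocomplete model versus an Alpaca-style "Instruct" fine-tune. The template and sampling values are just common starting points, not something from a specific model card:

    # Base (autocomplete) models just continue the text you give them.
    base_prompt = "// TypeScript: debounce a function\nfunction debounce("

    # Many instruct fine-tunes expect an Alpaca-style template; individual
    # models often want their own variant, so check the model card.
    instruct_prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\nWrite a TypeScript debounce function.\n\n"
        "### Response:\n"
    )

    # Sampling settings are equally model-dependent; these are plausible
    # defaults for code, not recommendations from this thread.
    generation_params = {"temperature": 0.2, "top_p": 0.95, "max_new_tokens": 256}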


Seems there's no support for C, C++, or Rust out of the box, sadly.


Is there one complete example of effective code generation comparable to what OpenAI's models produce?



