I'm super excited to show you the newly published DocsGPT LLMs on Hugging Face, tailor-made for tasks some of you asked for: documentation-based QA, RAG (Retrieval-Augmented Generation), and assisting developers and tech support teams by conversing with your data! (Basically the same thing tbh, all started by the 2020 paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks".)
Fine-tuned on 50k high-quality examples using LoRA! Training took around 2 days for the smaller ones and 4 for the large one, 2 epochs each.
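For anyone curious, here's a minimal sketch of what a LoRA setup for a base like Falcon-7b can look like with the peft library. The hyperparameters below (r, alpha, target modules, dropout) are illustrative assumptions, not the exact values used for these models:

```python
# Minimal LoRA fine-tuning setup sketch using the peft library.
# NOTE: r, lora_alpha, target_modules, and dropout are illustrative
# assumptions, not the actual values used to train the DocsGPT models.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", trust_remote_code=True
)

lora_config = LoraConfig(
    r=16,                                # low-rank adapter dimension (assumed)
    lora_alpha=32,                       # scaling factor (assumed)
    target_modules=["query_key_value"],  # Falcon's fused attention projection (assumed)
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights get trained
```

LoRA trains only small low-rank adapter matrices on top of the frozen base weights, which is what keeps fine-tuning runs this short.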
Check them out:
- DocsGPT-7b-falcon
- DocsGPT-14b
- DocsGPT-40b-falcon
Why do I think they're useful?
- Improved extraction of explicit information from sources
- Reduced hallucinations
- No repetition at the end of outputs
| Name | Base Model | GPU Requirements (or similar) |
|------|------------|-------------------------------|
| DocsGPT-7b-falcon | Falcon-7b | 1xA10G |
| DocsGPT-14b | llama-2-13b-hf | 2xA10 |
| DocsGPT-40b-falcon | falcon-40b | 8xA10G |
You can also use bitsandbytes to run them with less memory.
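Here's a minimal sketch of 8-bit loading, assuming you have `bitsandbytes` and `accelerate` installed (the quantization setting is illustrative, exact memory savings will vary):

```python
# Sketch: load a DocsGPT model in 8-bit via bitsandbytes to cut GPU memory
# roughly in half. Assumes `bitsandbytes` and `accelerate` are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Arc53/docsgpt-7b-falcon"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    trust_remote_code=True,
    device_map="auto",
)
```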
A snippet to jumpstart:
```python
# Minimal text-generation pipeline for DocsGPT.
import torch
import transformers
from transformers import AutoTokenizer

model = "Arc53/docsgpt-7b-falcon"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # halves memory vs fp32
    trust_remote_code=True,      # Falcon ships custom modeling code
    device_map="auto",           # spread layers across available GPUs
)
```
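And a quick usage sketch. The prompt template here (instruction plus retrieved context) is my assumption for illustration, so check the model cards for the exact format the models were tuned on:

```python
# Hypothetical usage: this prompt template is an assumption, not
# necessarily the exact format the DocsGPT models were fine-tuned on.
prompt = (
    "### Instruction\n"
    "How do I run DocsGPT locally?\n"
    "### Context\n"
    "DocsGPT can be run locally with Docker via docker-compose.\n"
    "### Answer\n"
)
outputs = pipeline(prompt, max_new_tokens=200, do_sample=False)
print(outputs[0]["generated_text"])
```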
License? Apache-2.0
Will publish GGML versions if you guys like them; I'm also hoping I can tune a nice 3b-sized model in the future too.