Show HN: Retrieval Augmented Generation Optimised LLMs (huggingface.co)
2 points by alextttty on Aug 17, 2023
I'm super excited to show you the newly published DocsGPT LLMs on Hugging Face, tailor-made for tasks some of you asked for: documentation-based QA, RAG (Retrieval Augmented Generation), and assisting developers and tech support teams by conversing with your data! (Basically the same thing, tbh; it all started with the 2020 "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" paper.)

Fine-tuned on 50k high-quality examples using LoRA. Training took around 2 days for the smaller models and 4 days for the large one, 2 epochs each.
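If anyone is curious what the LoRA setup looks like in code, here is a minimal sketch using Hugging Face's peft library. This is an illustration of the general approach, not my actual training script; the rank/alpha/dropout values are made up:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "tiiuae/falcon-7b"  # one of the base models from the table below
    tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        base, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto"
    )

    # LoRA freezes the base weights and trains small low-rank adapter matrices
    # on top of them, which is what keeps fine-tuning this cheap.
    lora_config = LoraConfig(
        r=16,                                # adapter rank (illustrative value)
        lora_alpha=32,                       # scaling factor (illustrative value)
        lora_dropout=0.05,                   # illustrative value
        target_modules=["query_key_value"],  # Falcon's fused attention projection
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of the base model

    # From here it's a normal causal-LM training loop / Trainer over the
    # examples for the 2 epochs mentioned above.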

Check them out:

    DocsGPT-7b-falcon

    DocsGPT-14b

    DocsGPT-40b-falcon

Why I think it's useful:

Improved explicit info extraction from sources

Reduced hallucinations

No repetition at the end of answers

    Name               | Base model     | GPU requirement (or similar)
    DocsGPT-7b-falcon  | Falcon-7b      | 1x A10G
    DocsGPT-14b        | llama-2-13b-hf | 2x A10
    DocsGPT-40b-falcon | falcon-40b     | 8x A10G

You can also use bitsandbytes to run them with less memory.
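For example, a rough sketch of 8-bit loading via the transformers bitsandbytes integration (needs bitsandbytes and accelerate installed; the plain bfloat16 snippet follows just below):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Arc53/docsgpt-7b-falcon"
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # load_in_8bit quantises the weights with bitsandbytes, roughly halving
    # memory versus bfloat16 at some cost in speed/quality.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_8bit=True,
        device_map="auto",
        trust_remote_code=True,
    )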

A snippet to jumpstart:

    import torch
    import transformers
    from transformers import AutoTokenizer

    model = "Arc53/docsgpt-7b-falcon"

    tokenizer = AutoTokenizer.from_pretrained(model)
    pipeline = transformers.pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        device_map="auto",
    )
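And, continuing from the snippet above, roughly how you'd call it with retrieved context pasted into the prompt. The prompt template and sample strings here are just an illustration of the RAG pattern; check the model cards for the exact format the models were tuned on:

    # context would normally come from your retriever / vector store;
    # this string and the "### ..." template are illustrative assumptions.
    context = "DocsGPT is an open-source documentation assistant. To run it locally, ..."
    question = "How do I run DocsGPT locally?"
    prompt = f"### Instruction\n{question}\n### Context\n{context}\n### Answer\n"

    sequences = pipeline(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.1,
        eos_token_id=tokenizer.eos_token_id,
    )
    print(sequences[0]["generated_text"])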

License? Apache-2.0

Will publish GGML versions if you guys like them; I'm also hoping I can tune a nice 3B-sized model in the future too.



