The embeddings are computed using llama-cpp, but LangChain provides a convenient wrapper for getting them directly, so I use that. For Llama2-7B, the embeddings are 4096-dimensional vectors.
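As a rough sketch of what that wrapper usage looks like, the snippet below uses LangChain's `LlamaCppEmbeddings` class around a local GGML model file (the model path here is an assumption, not a path from this project), plus a small cosine-similarity helper for comparing the resulting vectors:

```python
# Sketch: embedding text with a local GGML model via LangChain's
# LlamaCppEmbeddings wrapper. The model path is a placeholder assumption.
import math


def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed_texts(texts, model_path="models/llama-2-7b.ggmlv3.q4_0.bin"):
    # LlamaCppEmbeddings wraps llama-cpp-python's embedding mode;
    # imported lazily so the helper above stays usable without the model.
    from langchain.embeddings import LlamaCppEmbeddings

    embedder = LlamaCppEmbeddings(model_path=model_path)
    # Returns one embedding vector per input string
    # (4096 dimensions for Llama2-7B).
    return embedder.embed_documents(texts)


if __name__ == "__main__":
    vecs = embed_texts(["first document", "a second, different document"])
    print(len(vecs[0]))  # embedding dimensionality
    print(cosine_similarity(vecs[0], vecs[1]))
```

Swapping in a different GGML model is just a matter of changing `model_path`; the embedding dimensionality then depends on the model you load.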
And no, I haven’t benchmarked them against OpenAI’s embeddings. I should point out that this code works with any model in GGML format, so if there are fine-tuned Llama2 variants optimized for embedding (or any other model), you could swap one in with very little effort. This project is more about making it easy to go from model to embeddings on demand via an API, and then letting you do useful things with those embeddings.