Great idea. I’ll see if I can add a new endpoint that returns word-level highlighting annotations, which could be parsed and used to control the brightness or color of each word based on its semantic relevance to a query term.
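To make that concrete, here’s a rough sketch of how the per-word weights could be derived from token-level embeddings and a query embedding. Nothing here is an existing API; the function and the response shape are just placeholders for whatever the endpoint ends up returning.

```python
import numpy as np

def highlight_weights(token_embeddings: np.ndarray, query_embedding: np.ndarray) -> np.ndarray:
    """Cosine similarity of each token embedding to the query, rescaled to [0, 1].

    token_embeddings: shape (num_tokens, dim); in practice subword tokens would
    need to be merged back into whole words before or after this step.
    """
    tokens = token_embeddings / np.linalg.norm(token_embeddings, axis=1, keepdims=True)
    query = query_embedding / np.linalg.norm(query_embedding)
    sims = tokens @ query                      # one similarity score per token
    lo, hi = sims.min(), sims.max()
    return (sims - lo) / (hi - lo + 1e-9)      # 0 = dimmest word, 1 = brightest

# The endpoint could then return something like:
# [{"word": w, "weight": float(s)} for w, s in zip(words, weights)]
```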
To be honest, I hadn’t even thought about token-level embeddings until someone on Reddit asked about it and I realized it was possible to do with llama-cpp, so I just quickly added the functionality without closely examining the best use cases.
It’s a LOT more data and compute than using the normal sentence-level embeddings, so it would really have to unlock some useful new functionality to be worth it. But I do think the “combined feature vector” concept is helpful, since it at least makes them fixed length.
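Roughly, the idea is to collapse the variable-length sequence of token embeddings into fixed-size summary statistics and concatenate them. The particular statistics below are just illustrative, not the exact scheme used:

```python
import numpy as np

def combined_feature_vector(token_embeddings: np.ndarray) -> np.ndarray:
    """Collapse (num_tokens, dim) token embeddings into a fixed-length vector.

    Mean/max/min/std pooling is one common choice; the point is that the
    output length is 4 * dim no matter how many tokens the input text had.
    """
    return np.concatenate([
        token_embeddings.mean(axis=0),
        token_embeddings.max(axis=0),
        token_embeddings.min(axis=0),
        token_embeddings.std(axis=0),
    ])
```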