
This may be a dumb question but with OpenAI embeddings do we need to use cosine similarity or is the simple distance equivalent? I used cosine similarity before but not sure.



Blog author. You can use any distance metric. One reason cosine similarity is popular is that for many of these high-dimensional datasets, angular distance gives a better representation of "nearness" across all the data. But depending on how your data is distributed, something like L2 (Euclidean) distance could make more sense.
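
To make the comparison concrete, here is a minimal sketch in plain Python. It assumes the vectors are unit-normalized (my understanding is that OpenAI embeddings are returned at length 1, but check that for your model). Under that assumption, squared L2 distance equals 2 - 2 * cosine similarity, so both metrics rank neighbors identically. The toy vectors and helper names are made up for illustration, not taken from the blog post.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line (L2) distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(v):
    # Scale a vector to length 1.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rank_by_cosine(query, docs):
    # Highest similarity first.
    return sorted(range(len(docs)), key=lambda i: -cosine_similarity(query, docs[i]))

def rank_by_l2(query, docs):
    # Smallest distance first.
    return sorted(range(len(docs)), key=lambda i: euclidean_distance(query, docs[i]))

# Case 1: unit-length vectors (hypothetical stand-ins for embeddings).
unit_query = normalize([1.0, 2.0, 3.0])
unit_docs = [normalize(v) for v in ([1.0, 2.0, 2.5], [3.0, -1.0, 0.5], [0.2, 0.1, 4.0])]
unit_cosine_rank = rank_by_cosine(unit_query, unit_docs)
unit_l2_rank = rank_by_l2(unit_query, unit_docs)

# Case 2: vectors with very different lengths, where the metrics can disagree.
raw_query = [1.0, 0.0]
raw_docs = [[10.0, 0.0], [1.0, 0.5]]
raw_cosine_rank = rank_by_cosine(raw_query, raw_docs)
raw_l2_rank = rank_by_l2(raw_query, raw_docs)

print("unit vectors:", unit_cosine_rank, unit_l2_rank)
print("raw vectors: ", raw_cosine_rank, raw_l2_rank)
```

In the second case the first document points in exactly the same direction as the query but is far away in absolute terms, so cosine similarity prefers it while L2 prefers the shorter, slightly off-angle vector. That is the kind of situation where the choice of metric depends on how your data is distributed.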



