woolr's comments

Can't repro some of the numbers in this blog post, for example:

  from sentence_transformers import SentenceTransformer, util

  model = SentenceTransformer('all-MiniLM-L6-v2')

  # The two sentences to compare
  data_to_check = [
    "I have recieved wrong package",
    "I hve recieved wrong package"
  ]
  embeddings = model.encode(data_to_check)
  # Pairwise cosine similarity between the two embeddings
  print(util.cos_sim(embeddings, embeddings))

Outputs:

  tensor([[1.0000, 0.9749],
        [0.9749, 1.0000]])

Your data differs from theirs. They have "I have received wrong package" vs "I hve received wrong pckage". You misspelled "received" in both strings and didn't drop the "a" from "package" in the "bad" one.
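
A quick way to catch this kind of mismatch before blaming the model is to diff the input strings character by character. Below is a minimal sketch using only the standard library's difflib. The helper name char_diff and the variable names are made up for illustration, and the strings are copied as quoted in this thread, not from the blog post itself.

```python
import difflib


def char_diff(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return (operation, text_in_a, text_in_b) for every span where a and b differ."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Strings as quoted in this thread (hypothetical variable names).
blog_good = "I have received wrong package"
blog_bad = "I hve received wrong pckage"
mine_good = "I have recieved wrong package"
mine_bad = "I hve recieved wrong package"

# An empty list would mean the inputs are identical.
print(char_diff(blog_good, mine_good))
print(char_diff(blog_bad, mine_bad))
```

Any non-empty result means the strings being embedded are not the ones from the post, so the similarity scores are not comparable.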
