
Those are typically called “stop words,” and the article removed them for the example. Many NLP pipelines do remove stop words as a first step.
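A minimal sketch of what that first step looks like (the stop list below is a tiny hand-picked set for illustration; real pipelines usually pull a full list from a library such as NLTK):

```python
# Toy stop-word removal: filter out high-frequency, low-information words.
# STOP_WORDS here is an illustrative subset, not a complete list.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

def remove_stop_words(text: str) -> list[str]:
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

print(remove_stop_words("The cat sat in the hat"))  # ['cat', 'sat', 'hat']
```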



Although worth noting that more recent techniques (e.g., Transformers) keep stop words, since they carry contextual signal.


Does it actually help, though? I would think the embeddings of “cat” and “the cat” would be functionally similar.
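One way to make that intuition concrete: with simple averaged word vectors (the 3-d vectors below are made up for illustration, not from any real model), adding “the” barely moves the sentence vector, so the cosine similarity between “cat” and “the cat” stays high:

```python
import math

# Made-up toy vectors; stop words tend to have low-magnitude,
# low-information embeddings in practice.
VECS = {
    "cat": [0.9, 0.1, 0.0],
    "the": [0.1, 0.1, 0.1],
}

def embed(tokens: list[str]) -> list[float]:
    """Average the word vectors of the tokens (bag-of-words style)."""
    dims = len(next(iter(VECS.values())))
    return [sum(VECS[t][i] for t in tokens) / len(tokens) for i in range(dims)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

sim = cosine(embed(["cat"]), embed(["the", "cat"]))
print(round(sim, 3))  # high similarity, close to 1
```

For contextual models like Transformers the picture differs: the stop word changes the attention pattern over the whole sequence, so the resulting embeddings need not be this close.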



