Hacker News

Read the article. It claims they have found a better estimate of word importance in a document than tf-idf, which would be very significant. It also doesn't seem to need text segmentation, which means no language-specific tokenizers are required. The research paper is here (I haven't read it yet):

http://bioinfo2.ugr.es/publi/PRE09.pdf



Hmm, I have no background in this, but how can they determine what a 'term' is for the purposes of frequency without some form of tokenization, unless they're imposing an arbitrary maximum term length and eliminating short terms?


tf–idf: term frequency–inverse document frequency
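For reference, a minimal sketch of the standard tf-idf scoring the article compares against. Whitespace tokenization is a simplifying assumption here; avoiding exactly that step is what the linked paper claims to do.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each term in each document by tf-idf.

    docs: list of strings. Assumes naive lowercase whitespace
    tokenization (an assumption, not the paper's method).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # document frequency: how many docs contain each term
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        # tf-idf = (term count / doc length) * log(N / df)
        scores.append({term: (count / total) * math.log(n / df[term])
                       for term, count in tf.items()})
    return scores
```

A term appearing in every document gets idf = log(1) = 0, so ubiquitous words like "the" score zero regardless of how often they occur.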



