
My company, Luminoso, uses ConceptNet Numberbatch as one component for interpreting the meaning of text. When the benchmarks went up, the understandability of its results did too. You get better search results, better topics, clearer visualizations.

I'm not just trying to squeeze out extra performance; I'm trying to make computers understand text better. The benchmarks are the evidence that it's better. I do consider myself a researcher despite leaving academia, and having some respect for evidence is part of that identity.

When academia decides to disregard evidence because evidence is for stupid Kagglers (I don't use Kaggle but I respect a good result when I see one), that's how they end up lagging behind open source software.

I understand that it's not worthwhile to chase every evaluation. For example, some evaluations are unrepresentative. Some evaluations, like parsing according to the Penn Treebank, were once useful but have been squeezed dry in a way that doesn't reflect real-world performance. And some tasks chase these unhelpful evaluations.

But I would credit Kaggle with making academics realize, slowly, that they should use random forests as a baseline when evaluating a classification method. People were content to publish classifiers that were worse than random forests until Kaggle presented overwhelming evidence that random forests worked better than many techniques.

In text understanding, the fact that seems not to have taken hold -- one that I think should be obvious, even -- is "computers understand text better when they know more facts about words". This is what ConceptNet (not the whole ensemble, but ConceptNet itself) is about.




This is all good. It justifies a technical report, a blog post, a workshop paper, or publication in venues looking for this kind of work.

It doesn't necessarily justify publication in a venue looking for totally new ideas.



