
Ask HN: How to categorize block of text - throwawayML
Hello HN,

First some background: I have a working knowledge of CS and programming, but I don't know anything about machine learning, neural networks, natural language processing, or related topics.

Now to my issue: I have a list of categories, each specified by some characteristic keywords. I need to write a program that takes ~1 page of text, analyzes it, and returns the category it fits best. I'll be implementing it in Python 3 and I've already gotten some hints, like using word2vec, but I need a more specific strategy. I'm under some time pressure here, so even though I find the topic interesting, I don't have time to study complex documentation or dig into ML/NLP theory.

If anyone here could point me to some code or a tutorial that already does what I need (or something close to it) and is easy to grasp without much theoretical knowledge, I'd be very grateful. Thank you!
======
ethiclub
Is ML / complex algorithmic solving really needed here rather than simply
keyword scanning and tagging?
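
For what it's worth, plain keyword scanning could look something like the sketch below. The category names, keyword lists, and the function name `categorize_by_keywords` are all made up for illustration; it just counts exact keyword hits, so plurals and other word forms are not matched.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, for illustration only.
CATEGORIES = {
    "sports": ["football", "match", "league", "goal"],
    "finance": ["stock", "market", "bank", "investor"],
}


def categorize_by_keywords(text, categories=CATEGORIES):
    """Return the category whose keywords occur most often in text, or None."""
    # Lowercase the text and count every word-like token.
    words = Counter(re.findall(r"[a-z0-9']+", text.lower()))
    # Score each category by the total number of exact keyword hits.
    scores = {
        name: sum(words[kw] for kw in keywords)
        for name, keywords in categories.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched at all: report "no category" instead of guessing.
    return best if scores[best] > 0 else None
```

Calling `categorize_by_keywords(page_text)` on each page would probably be enough for a first pass; on a tie it simply returns whichever category comes first in the dict.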

If ML is needed, maybe [http://scikit-learn.org/stable/](http://scikit-learn.org/stable/)?

It is used in [https://www.taggernews.com/](https://www.taggernews.com/)

See [https://techcrunch.com/2017/05/14/building-a-smarter-hacker-...](https://techcrunch.com/2017/05/14/building-a-smarter-hacker-news/).
One of the creators posted Tagger News to HN recently, so you might want to
find that and ask them about data classification & clustering.
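
If you do go the scikit-learn route, one low-effort option (my assumption, not necessarily what Tagger News does) is to treat each category's keyword list as a tiny document, turn everything into TF-IDF vectors, and pick the category most similar to your page. A rough sketch, with made-up categories and a hypothetical function name `categorize_tfidf`:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical categories and keywords, for illustration only.
CATEGORIES = {
    "sports": ["football", "match", "league", "goal"],
    "finance": ["stock", "market", "bank", "investor"],
}


def categorize_tfidf(text, categories=CATEGORIES):
    """Return the category whose keyword list is most similar to text, or None."""
    names = list(categories)
    # One pseudo-document per category, built from its keywords.
    keyword_docs = [" ".join(categories[name]) for name in names]
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the keyword documents plus the input text (the last row).
    matrix = vectorizer.fit_transform(keyword_docs + [text])
    # Cosine similarity between the text and every category document.
    sims = cosine_similarity(matrix[-1:], matrix[:-1])[0]
    best = sims.argmax()
    # Zero similarity means no keyword overlap with any category.
    return names[best] if sims[best] > 0 else None
```

This needs no training data, which is its main appeal under time pressure. If you have labeled example pages, a trained classifier would likely do better, but that is a bigger step.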

