
Ask HN: What approach would you suggest for Text classification? - gerenuk
Hey everyone!<p>We are trying to solve a problem where we need to classify articles into the right categories.<p>Currently, we are using FastText to train a model on 100,000 articles categorized into 600 categories. The loss seems to be converging, but the precision is not going up. Another thing we would like to clarify is whether we can use pre-trained English Wikipedia embeddings to categorize text.<p>Would you recommend using FastText, or some other algorithm/approach, for this problem?<p>Any suggestions/ideas would be appreciated.<p>Thanks.
======
smithmayowa
FastText is state of the art when it comes to word embeddings because it can
generate embeddings even for words it has not seen, so perhaps your problem
lies in your model's architecture. Are you using convolutional neural nets or
just basic feed-forward networks? I have had great success using CNNs for text
classification. Also, in your preprocessing, are you filtering out stopwords
(very common English words that can hurt a model's ability to classify texts
correctly)?
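
To make the stopword suggestion concrete, here is a minimal sketch of that
preprocessing step, assuming plain English text and a tiny hand-picked
stopword list. The list and the helper names (tokenize, remove_stopwords,
to_fasttext_line) are made up for illustration; a real pipeline would likely
use a fuller list from a library. The `__label__` prefix is the default label
marker that fastText's supervised mode expects.

```python
import re

# Assumption: a tiny illustrative stopword list, not a complete one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def to_fasttext_line(category, text):
    """Build one training line: '__label__<category> <filtered tokens>'."""
    # Labels must not contain spaces, so replace them with underscores.
    label = "__label__" + category.strip().replace(" ", "_")
    return label + " " + " ".join(remove_stopwords(tokenize(text)))


if __name__ == "__main__":
    print(to_fasttext_line("world news", "The markets are up in Asia and Europe"))
```

Writing one such line per article gives a training file in the format
fastText reads. Whether removing stopwords actually helps with 600 categories
is something to check on a held-out set rather than assume.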

