
Show HN: Active learning and model explainability for document classification - Der_Einzige
https://github.com/Hellisotherpeople/Active-Explainable-Classification
======
Der_Einzige
Idea: My model takes in documents from the user via standard input. The
model then classifies each document, explains why it classified it the way
it did, and asks the user whether the predicted label is the ground truth.
The user supplies the ground truth, the model incrementally trains on the
new example, and that example (with its human-supplied label) is appended
to my dataset; the cycle continues. This is called active learning.
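The loop can be sketched with scikit-learn's incremental learners. This is a hypothetical stand-in for the repo's actual Flair-based pipeline, with `partial_fit` playing the role of the incremental update and a plain function call standing in for the stdin prompt:

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.exceptions import NotFittedError

# Minimal active-learning loop sketch (not the repo's actual code).
# The model guesses a label, the human confirms or corrects it, and
# partial_fit folds the corrected example back into the model.
vectorizer = HashingVectorizer(n_features=2**16)
model = SGDClassifier()
classes = ["sports", "politics"]

labeled = []  # the growing dataset of (text, human label) pairs

def active_learning_step(text, human_label):
    X = vectorizer.transform([text])
    try:
        predicted = model.predict(X)[0]   # model's guess, shown to the user
    except NotFittedError:
        predicted = None                  # no training examples seen yet
    model.partial_fit(X, [human_label], classes=classes)  # incremental update
    labeled.append((text, human_label))   # append to the dataset
    return predicted

active_learning_step("the team won the match", "sports")
active_learning_step("parliament passed the bill", "politics")
guess = active_learning_step("a striker scored twice", "sports")
```

In the real tool the "human label" would come from the stdin prompt after the model shows its prediction and explanation.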

I want to take in large volumes of articles from a news API and file each
one according to where my classifier says it belongs. To do that I have to
generate my own labeled data, which is a problem. What most people don't
realize is that models using transfer learning are so sample-efficient
that AI-assisted data labeling becomes extremely useful and can
significantly shorten an ordinarily painful labeling process.
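AI-assisted labeling can be illustrated in a few lines: fit on a handful of human-labeled seeds, then have the model propose labels for the unlabeled pool so the human only confirms or corrects rather than labeling from scratch. The texts and labels below are made up for illustration, and TF-IDF stands in for the pretrained embeddings the tool actually uses:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Sketch of AI-assisted labeling (hypothetical data, not the repo's code).
seed_texts = ["goal scored in the final", "election results announced",
              "coach praised the defense", "senate debated the budget"]
seed_labels = ["sports", "politics", "sports", "politics"]

vec = TfidfVectorizer()
X = vec.fit_transform(seed_texts)
clf = LogisticRegression().fit(X, seed_labels)

# Instead of labeling these by hand, the human just reviews the proposals.
unlabeled = ["another goal scored late in the match",
             "the senate budget debated again"]
proposals = clf.predict(vec.transform(unlabeled))
```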

We need a way to quickly create word-embedding-powered document
classifiers that learn with a human in the loop. For some classes, an
extremely limited number of examples may be all that is needed to get
results a user would consider successful for their task.
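To make the "extremely limited number of examples" point concrete, here is a sketch of a classifier fit on just two examples per class. The examples are invented, and TF-IDF again stands in for the pretrained word embeddings that carry the prior knowledge in the real tool:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid

# Few-shot sketch (hypothetical data): two labeled examples per class
# can already be enough for a rough classifier when the features carry
# prior knowledge.
texts = ["the team won the championship game",
         "the striker scored in the match",
         "the senate passed the new tax bill",
         "voters went to the polls today"]
labels = ["sports", "sports", "politics", "politics"]

vec = TfidfVectorizer().fit(texts)
clf = NearestCentroid().fit(vec.transform(texts), labels)

# An unseen document lands with the class whose centroid it sits nearest.
pred = clf.predict(vec.transform(["the goalkeeper saved the match"]))[0]
```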

I want to know what my model is learning - so I take the word embeddings
available in Flair, combine them with classifiers from scikit-learn and
TensorFlow/Keras/PyTorch, and finish it off with a nice squeeze of the
LIME algorithm for model interpretability (as implemented in the ELI5
library).
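The repo delegates this to ELI5's LIME implementation; the hand-rolled version below just illustrates the idea on invented data: perturb the document by masking words, record the black-box model's probability on each perturbation, and fit a local linear surrogate whose coefficients score each word's contribution.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, Ridge

# A toy black-box classifier to explain (hypothetical training data).
texts = ["the team won the cup", "tax policy vote passed",
         "striker scored a goal", "the senate bill failed"]
labels = ["sports", "politics", "sports", "politics"]
vec = TfidfVectorizer().fit(texts)
clf = LogisticRegression().fit(vec.transform(texts), labels)

def explain(document, n_samples=500, seed=0):
    """LIME-style sketch: score each word by its local linear weight."""
    rng = np.random.default_rng(seed)
    words = document.split()
    # Random binary masks: 1 keeps a word, 0 drops it.
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0, :] = 1  # include the unperturbed document
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep)
                 for m in masks]
    # Probability of the predicted class on every perturbation.
    target = list(clf.classes_).index(clf.predict(vec.transform([document]))[0])
    probs = clf.predict_proba(vec.transform(perturbed))[:, target]
    # Interpretable surrogate: mask -> probability, fit locally.
    local = Ridge().fit(masks, probs)
    return sorted(zip(words, local.coef_), key=lambda p: -p[1])

explanation = explain("striker scored the winning goal")
```

The words with the largest positive weights are the ones whose removal most hurts the predicted class's probability, which is exactly the kind of per-word highlighting ELI5 renders.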

