
Understanding and explaining accuracy in machine learning systems - prabhatjha
https://engineering.wootric.com/understanding-accuracy-in-ml-systems
======
sharkenstein
What are other metrics that can help validate the F1 score? I'm asking because
I believe having a small sample size can skew the number or hide a flaw in the
classification algorithm it's scoring.

In other words, what else should I ask to validate that a 70% F1-score on a
larger data set is better than a 90% F1-score on a smaller one?

~~~
rsmith49
That is a good point. It helps to think of Precision and Recall (so by
extension F1-Score) from your test data as random variables sampled from a
distribution modeling the probability of getting each value in your sample
based on a "True" precision/recall value. I won't go too deep into the math,
but this was part of the approach in the confidence calculations towards the
end of the paper: being able to factor in the uncertainty of your
classification metrics to confidence calculations.
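
As a rough illustration of the "random variable" view (a sketch only, not
necessarily the calculation used in the article): assume a uniform prior, so
that precision given TP and FP counts follows Beta(TP + 1, FP + 1), and
likewise recall with FN. Treating the two as independent is a simplifying
assumption, since both share TP. The function name `f1_interval` and its
parameters are made up for this example.

```python
import random


def f1_interval(tp, fp, fn, draws=10000, level=0.95, seed=0):
    """Approximate credible interval for F1 from confusion counts.

    Assumption: uniform Beta(1, 1) priors, and precision and recall
    sampled independently (an approximation, they share TP).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        p = rng.betavariate(tp + 1, fp + 1)  # plausible "true" precision
        r = rng.betavariate(tp + 1, fn + 1)  # plausible "true" recall
        samples.append(2 * p * r / (p + r) if (p + r) > 0 else 0.0)
    samples.sort()
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi


# Same observed precision and recall (0.9 each), very different test-set sizes.
print(f1_interval(9, 1, 1))
print(f1_interval(900, 100, 100))
```

The interval from the small counts comes out much wider than the one from the
large counts, which is the sense in which a score from a small test set is
less trustworthy.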

To formally answer your question, the main things that matter in determining
how stable the F1-Score from your test set is are:

- Size of the test set
- % of the test set that has the label (in our case, a feedback tag)
- The values found for precision and recall
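
One way to see the effect of these factors on your own test set is a
bootstrap: resample the (label, prediction) pairs with replacement and look at
how much F1 moves. This is a minimal sketch of a standard percentile
bootstrap, not the method from the article; `make_test_set` is a hypothetical
helper that builds toy data with 20% of examples carrying the label.

```python
import random


def f1_score(pairs):
    """F1 for a list of (label, prediction) booleans."""
    tp = sum(1 for y, p in pairs if y and p)
    fp = sum(1 for y, p in pairs if not y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1(pairs, rounds=2000, seed=0):
    """95% percentile bootstrap interval for F1."""
    rng = random.Random(seed)
    n = len(pairs)
    scores = sorted(
        f1_score([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(rounds)
    )
    return scores[int(0.025 * rounds)], scores[int(0.975 * rounds) - 1]


def make_test_set(scale):
    """Toy data (hypothetical): per unit of scale, 9 TP, 1 FP, 1 FN, 39 TN."""
    return (
        [(True, True)] * (9 * scale)
        + [(False, True)] * (1 * scale)
        + [(True, False)] * (1 * scale)
        + [(False, False)] * (39 * scale)
    )


small = make_test_set(1)   # 50 examples
large = make_test_set(20)  # 1000 examples, same proportions
print(f1_score(small), bootstrap_f1(small, rounds=500))
print(f1_score(large), bootstrap_f1(large, rounds=500))
```

Both sets have the same point estimate, but the interval for the 50-example
set is far wider. Lowering the share of labeled examples in `make_test_set`
has a similar widening effect, because fewer positives feed the estimate.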

------
MallikaSinha
If you have a large dataset and you train your model until accuracy reaches
100%, what about the noisy data? Will this classification system reduce the
noisy values in the system? Machines do pick up some amount of noise during
training or runtime prediction. Can we reduce that through this
classification?

~~~
rsmith49
That depends on what you mean by noise. In the case of text classification, a
lot of the noise in training is disagreement between human labelings.
Unfortunately, classification systems will only be as good as the labels they
are trained on.

However, if you are referring to noise as in typos and misspellings - then
yes, depending on the training and/or preprocessing steps, classification
systems could potentially reduce the noise in the input data to still achieve
good results.

~~~
MallikaSinha
By noise I mean overfitting your data, where the model starts to take your
wrong input data and fits the graph to those inputs as well. Anyway, I got
your point. Thank you.

