So about 50% more words for French. The very extreme tail is less interesting because it mostly reflects the size of the dictionary and words that are very rarely used.
I think the exponent of the tail would be the most relevant metric, but I can't open those PDFs. Can someone plot the inverse CDF on a log-log scale?
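
In case it helps whoever has the data, here is a rough sketch of what I have in mind. I'm assuming "inverse CDF" means the complementary CDF (1 - CDF), and that the data is just a list of positive numbers (say, per-word frequencies); the function names and the choice of the Hill estimator for the tail exponent are mine, not something taken from the PDFs.

```python
import math


def empirical_ccdf(values):
    """Return (xs, ps): sorted values and a rank-based estimate of P(X >= x)."""
    xs = sorted(values)
    n = len(xs)
    # For the i-th smallest value (0-based), n - i observations are at or above it.
    # With tied values, only the first occurrence gets the exact fraction.
    ps = [(n - i) / n for i in range(n)]
    return xs, ps


def hill_tail_exponent(values, k):
    """Hill estimate of alpha in P(X >= x) ~ x**(-alpha), using the k largest values.

    Assumes all values are positive and that the k largest are not all
    equal to the threshold (otherwise the sum of logs is zero).
    """
    xs = sorted(values, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(values)")
    threshold = xs[k]  # the (k+1)-th largest value
    return k / sum(math.log(x / threshold) for x in xs[:k])


def plot_ccdf(values, label):
    """Log-log plot of the empirical CCDF (needs matplotlib installed)."""
    import matplotlib.pyplot as plt

    xs, ps = empirical_ccdf(values)
    plt.loglog(xs, ps, marker=".", linestyle="none", label=label)
    plt.xlabel("x")
    plt.ylabel("P(X >= x)")
    plt.legend()
    plt.show()
```

If the tail really is a power law, the plot from `plot_ccdf` should look roughly like a straight line in the tail, with slope about minus the exponent. The Hill estimate depends on the cutoff `k`, so it is worth trying a few values and checking that the result is stable.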