
Gender Distribution in North Korean Posters - fcambus
http://digitalnk.com/blog/2017/09/30/gender-distribution-in-north-korean-posters/
======
groceryheist
Hacker News likes this as a practical application of transfer learning. Fans
of machine learning want to see transfer learning as more than a cool trick.

Unfortunately, there is really no good reason for someone seriously interested
in accurate information, like a researcher or journalist, to use machine
learning for this particular task. Labeling a couple thousand images yourself
or with friends is not that big of a task. Do it over a few evenings while
watching TV and drinking beer. You could have mechanical turk workers do it
for you for a few hundred dollars. In either case you will get extremely
reliable information. If you use multiple judges you will have a good estimate
of uncertainty for every classification. There is no way transfer learning can
provide this uncertainty information.
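
As a rough illustration of the multiple-judges point, here is a minimal sketch of how per-image agreement could be computed. The poster names, labels, and the `label_agreement` helper are all hypothetical, and the share of judges who agree is only a crude stand-in for a proper uncertainty estimate.

```python
from collections import Counter


def label_agreement(votes):
    """Return (majority_label, share_of_judges_agreeing) for one image."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Hypothetical labels from three independent judges (not real data).
judgments = {
    "poster_001": ["woman", "woman", "woman"],
    "poster_002": ["man", "woman", "man"],
    "poster_003": ["man", "man", "man"],
}

# One (label, agreement) pair per image; low agreement flags uncertain cases.
summary = {name: label_agreement(v) for name, v in judgments.items()}

for name, (label, share) in summary.items():
    print(f"{name}: {label} (agreement {share:.2f})")
```

Images where the agreement share is well below 1.0 are the ones worth a second look or an extra judge.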

The main advantage of this technique remains the ability to quickly label very
large amounts of data on the order of hundreds of thousands of rows, or
thousands of columns. For smaller data, machine learning can sometimes achieve
marginal improvements in predictive performance through model complexity.
However, prediction in smaller data regimes is mainly useful for out-of-sample
prediction. The machine learning paradigm offers limited support for measuring
uncertainty for out-of-sample predictions, which is super important if you are
a researcher.

One capability of transfer learning could be to support many applications
from one model, but I have yet to see demonstrations of this in practice. The
problem is that knowing how well learning has transferred requires measuring
generalizability and so cannot be done blindly.

~~~
anigbrowl
I don't think extreme accuracy matters that much when dealing with large
sample sizes. If the margin of error is under 3% (like opinion polls) then
it's good enough for many purposes.

~~~
groceryheist
The "margin of error" reported in opinion polls is typically just a function
of the sample size and sample variance. The error is assumed to be due to
statistical variation, not to problems with measurement. Measurement error is
a bigger problem than statistical variation because statistical variation can
often be assumed to be unbiased.
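
For reference, the textbook margin of error for a sampled proportion can be sketched as below. This assumes a simple random sample and a normal approximation at roughly 95% confidence (z = 1.96); the function name and the example numbers are illustrative only. Note that nothing in the formula accounts for mislabeled observations.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion.

    p: observed proportion, n: sample size, z: critical value (1.96 ~ 95%).
    Only sampling variation is modeled; measurement error is ignored.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Illustrative: a worst-case proportion of 0.5 with about 1,067 respondents
# gives roughly the familiar 3% margin.
moe = margin_of_error(0.5, 1067)
print(f"margin of error: {moe:.3f}")
```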

My comment above is not about statistical variation; it is about measurement
error. Inaccuracies in models for labeling images are a form of measurement
error. Having humans label an (often stratified) sample of images is required
to understand the measurement error. This requirement is the main difficulty
when it comes to generalizing learning from one context to another. If you use
multiple independent humans to label all your images, you can understand the
errors made by the humans very well.
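
One way this could look in practice, as a sketch only: estimate the model's error rates on a human-labeled sample, then adjust the model's raw share. The adjustment used here is the standard Rogan-Gladen correction, which is my choice for illustration and not something the linked article does. The labels are made up, the helper names are hypothetical, and the human labels are assumed to be correct.

```python
def error_rates(human, model, positive="woman"):
    """Estimate (sensitivity, specificity) of the model on a labeled sample,
    treating the human labels as ground truth (an assumption)."""
    pairs = list(zip(human, model))
    tp = sum(h == positive and m == positive for h, m in pairs)
    fn = sum(h == positive and m != positive for h, m in pairs)
    tn = sum(h != positive and m != positive for h, m in pairs)
    fp = sum(h != positive and m == positive for h, m in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def corrected_share(observed_share, sensitivity, specificity):
    """Rogan-Gladen correction of a share measured with a noisy classifier."""
    return (observed_share + specificity - 1) / (sensitivity + specificity - 1)


# Made-up validation sample: 10 posters per class, labeled by humans,
# alongside the model's predictions for the same posters.
human = ["woman"] * 10 + ["man"] * 10
model = ["woman"] * 9 + ["man"] + ["man"] * 8 + ["woman"] * 2

sens, spec = error_rates(human, model)
# Suppose the model labels 50% of the full collection as "woman".
print(sens, spec, round(corrected_share(0.5, sens, spec), 3))
```

The correction is only as good as the validation sample: with few labeled images the estimated error rates are themselves noisy, and the formula breaks down when sensitivity plus specificity is close to 1.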

~~~
anigbrowl
Quite, but my point is that good enough is sufficient for many real-world
purposes, as opposed to theoretical ones.

------
Mz
It might be better if the title were this line from the subtitle:

 _A brief analysis of gender distribution in visual representations of
everyday life in North Korea_

The actual title gave me the impression this was a run of the mill, shallow
complaint about sexism (with, possibly, some ugly politics thrown in to boot).
It took me a while to get curious about it based on other cues that this might
not be the case. This piece is interesting for reasons having nothing to do
with the sexism angle. It is a rich piece about history, culture, AI and
possibly other things, given that I don't have time to read all the way
through it at this time.

------
tomphoolery
This makes sense to me, because North Korea is still technically at war with
South Korea. During WWII, when all members of the male-dominated workforce
were overseas fighting, the propaganda posters were mostly centered around
female participation in the workforce.

If your country has been at war for over 60 years, I suppose your society
would develop a kind of "gender-agnostic" workforce, or perhaps a total
reversal of demographics from a country that is at peace.

~~~
emodendroket
So why don't you see the same dynamic in South Korea?

~~~
maskedSlacker
Because, while North Korea is still at war, South Korea is not.

~~~
emodendroket
I don't know how you figure.

------
autokad
I like how the bar chart has a shadow 'total' behind the two groups

~~~
yathern
I'm glad it has 'total' - but for this type of data, I'd prefer it looked
something more like
[this](https://www.livepopulation.com/images/chart_sex_ratio_india.png)
where distribution and total size are both easily viewable

~~~
autokad
that has its benefits, but you also lose the ability to see the difference
between men and women across the distribution. like the ratio at 70+ vs the
ratio at 0-4

~~~
yathern
True - I tried to find an example where it was centered horizontally - but
couldn't quickly. Ideally I would like this (ignore the terrible paint job and
alpha channel messing things up):

[https://imgur.com/B6enQhw](https://imgur.com/B6enQhw)

Where you see volume, distribution, and percent, as well as the trendline
(white) where you can see how the distribution changes - how the total count
changes, and easily understand any data point at any given spot.

------
anigbrowl
Most interesting. It'd be nice to see this sorted by age as well, to infer
whether the distribution of gender had changed over time. And...well it's easy
to imagine many other contexts that it would be good to see similar analyses
applied to.

I wonder if/how this will change now that Kim Jong Un has promoted his sister
to more visible public roles.

------
alvil
I like North Korean look of Ubuntu UI :)

------
zitterbewegung
It would be interesting to take multiple regimes and see if there are
similarities between them.

