
Tech companies pay poor Kenyans to produce training data for AI - zeristor
https://www.bbc.co.uk/news/technology-46055595?
======
Fricken
I think it's underappreciated just how much human labour goes into annotating
data for autonomous vehicles. Scale is a fast-growing start-up, with 35
employees and around 10,000 contractors, many based out of India and Africa,
who do data annotation for Cruise, Zoox, Lyft, and Nutonomy among others.

[https://techcrunch.com/2018/08/07/scale-whose-army-of-
humans...](https://techcrunch.com/2018/08/07/scale-whose-army-of-humans-
annotate-raw-data-to-train-self-driving-and-other-ai-systems-nabs-18m/)

~~~
api
Given that this training data basically gets loaded into neural nets, etc.,
can it be said that in some indirect sense these data labelers are actually
driving these cars? Are we still just building the mechanical turk?

To me it shows how far we still are from biological intelligent systems. I
don't recall having to point out what an apple is to my daughter more than a
few times. Watching children learn is breathtaking. Even much simpler brains
like those of rodents or cats are far beyond anything built so far in silico.

~~~
david-gpu
Generally, neural networks do not memorize their training data. If they worked
that way, they would not do well with samples they've never seen before.

So no, I would not say that the labellers are driving the cars. They do play
an important role in the process of building these systems, though.

------
NightlyDev
"Brenda trains data used for artificial intelligence"

I like how the writer always mentions that they are training the data used for
AI, when they are in fact tagging data, not training anything...

~~~
improbable22
That's a curious slip, well spotted. I wonder if the caption-writer heard
about "training data" and misunderstood that it's an adjective?

------
pentae
I just find the mental gymnastics fascinating that they are talking about how
these people are living in abject poverty, and yet in the same article have
the gall to highlight how people in the US are losing their jerbs. Which is
it? Are we helping people in poor countries have a slightly better quality of
life by working in an air conditioned office vs agriculture, or are we taking
US jobs? It's one or the other.

~~~
azernik
Why can't it be both?

~~~
zeroname
You obviously can't save the jobs in one country and also create _those same
jobs_ in another country.

The only way for both sides to profit (over the long run) is for either
country to do the job it can do the most efficiently, i.e. comparative
advantage. In practice, that's a real tough concept for people to come to
terms with. People want to have their cake and eat it too.

~~~
xvedejas
There are enough non-zero-sum effects in economics that this isn't plainly
apparent to me. It seems like quite a claim, really. For instance, I find it
plausible that cheap labor from overseas may actually help subsidize the
creation of an industry, which might grow enough to need more employees in the
US for related work.

~~~
cloakandswagger
It's actually a pretty reasonable and logical claim, supported by 20 years of
globalism that gutted American employment.

------
sandGorgon
Is there an open-source product to manage this kind of labeling? Features
should include double entry, statistical error checking, etc.
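
As a rough illustration (not taken from any particular tool): double entry
means two people label the same items independently, and one common
statistical check is Cohen's kappa, which measures agreement beyond chance.
The function names and sample labels below are made up for the example.

```python
from collections import Counter


def disagreements(labels_a, labels_b):
    """Return the indices of items the two annotators labeled differently."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical sample: two annotators, four images.
annotator_1 = ["car", "car", "sign", "light"]
annotator_2 = ["car", "sign", "sign", "light"]
to_review = disagreements(annotator_1, annotator_2)  # items to send to a third reviewer
kappa = cohens_kappa(annotator_1, annotator_2)
```

Items in `to_review` would go to a tie-breaker, and a low kappa for a pair of
annotators would suggest unclear guidelines or careless labeling.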

~~~
gajju3588
It's not open source, but this tool helps in managing labeling by your in-house
team or contract workers. It also provides annotation software for most
image and text use cases. Here is the link:

[http://dataturks.com](http://dataturks.com)

~~~
sandGorgon
thanks ! but we would prefer something that had open source availability (the
gitlab model) as a fallback measure.

I'm seeing Labelbox and I think they follow that -
[https://www.labelbox.com/](https://www.labelbox.com/)

------
baybal2
That's closer to the human captcha-solving business. I wonder if Google's most
recent captcha is actually human-generated?

~~~
1023bytes
I think it's the other way around. Users solving the captcha are helping
classify training data.

~~~
mtgx
I'm surprised that not everyone has figured this out. You're doing free labor
for Google's Waymo. Why else would Google's reCAPTCHA have all of the photos
include cars, bridges, and traffic signs and lights?

~~~
threeseed
1\. This really has been obvious since the start.

2\. After Google acquired the Captcha service they initially used it for
Google Maps, where you were always asked to pick street numbers and store names.
Only recently has it pivoted to Waymo use cases.

------
joaomacp
Working conditions aside, this is ironic: AI development would supposedly get
rid of boring jobs, instead it's creating exactly that.

------
wrong_variable
This is wonderful news !!

\- By sending dollars to Kenya it's helping reduce inflation for the Kenyan
govt. The biggest problem for many developing countries is finding a way to
get their hands on the global reserve currency.

\- It's allowing private wealth to be created (by women !!). Empowering women
has very positive cascading effects in a country (fertility rate drops, women
have personal freedom to do other things). Private wealth directly in the
pocket of Kenyans is one of the most efficient ways the developed world can help
the developing world.

Normally giving cash to the govt results in looting of public coffers.

\- Teaching useful computer skills to the average Kenyan.

~~~
nashashmi
I don't follow the "very positive cascading effects" line. How is low
fertility rate and women being free to do other things a positive cascading
effect? Historically, it makes families more unstable.

Edit: I'm not saying women working is a bad thing. It's good on many fronts.
But I feel the ripple effects are not positive.

~~~
iciac
It's not only developing countries that benefit from increased female
participation in the labour force. For Western nations, the increased
proportion of women in the labour force was a core component of the rapid
increase in economic growth between the 1940s and today.

Low fertility is an interesting thing when it comes to development; for most
countries it has quite dramatically declined as incomes have increased. The
effects are likely bidirectional (increased economic growth -> lower fertility
rate; lower fertility rate -> increased per capita income).

The drivers are complex and interrelated: high child mortality rates, low life
expectancy, and high poverty are associated with high fertility. It's an
unjust comparison to make; however, leading theories share common features with
K-r reproduction strategies in ecology (with increased survival / longevity,
investing in a few highly educated/skilled children becomes feasible).
Similarly, increased education / maternal health is highly correlated with
economic development and child wellbeing. These are often highly compounding
effects over time - a virtuous cycle in which increased economic freedoms,
health, education, and opportunities (male and female - often females are
relatively impoverished, so there's the potential for bigger immediate
benefits) can lead to rapid development with the right institutional
conditions.

------
kelvin0
They are not 'programme'-ing anything. They are tagging images and labeling
objects in those same images.

They 'sweatshopped' the whole image tagging industry to 3rd world countries.

~~~
dang
Yes, that title was baity and misleading, so we've changed it to a more
representative phrase from the article. A photo caption, actually; it's
surprising how often those make better titles.

------
themihai
>> "But one thing that's critical in our line of work is to not pay wages that
would distort local labour markets. If we were to pay people substantially
more than that, we would throw everything off. That would have a potentially
negative impact on the cost of housing, the cost of food in the communities in
which our workers thrive."

What a joke! We don't want to pay them more because they would be able to buy a
decent house/invest in infrastructure or god forbid even start a small
business...that would totally distort local labour markets, right?

~~~
ozim
Or become target for local mob that did not get the job. Become target for
police which might also be corrupt.

In some places you can get badly beaten up, robbed and whatever else for
having only a bit more than others.

I am not saying they are entirely right with what they do, but there _are_
legitimate reasons for that kind of approach.

~~~
luch
Exactly if you're disrupting existing power structures either by "overpaying"
employees (see recent PSG football leaks) or unfairly cutting costs
(Uber/Taxis or Airbnb/Hotels), you're going to make enemies of a lot of angry
people.

I'm not saying Samasource does not profit from this situation, they clearly
do, but Silicon Valley taught me that money spent paying developers through
the nose ends up mainly in the landlord's pocket.

~~~
themihai
>> Exactly if you're disrupting existing power structures either by
"overpaying" employees (see recent PSG football leaks) or unfairly cutting
costs (Uber/Taxis or Airbnb/Hotels), you're going to make enemies of a lot of
angry people.

This is pure BS. What "power structures" are you talking about? Well paid jobs
bring money which is spent in the local economy. I doubt the nearby shop owner
will be angry because now you can buy food instead of begging for it.

The Uber/Airbnb model is a totally different issue because they compete with
the local market and enjoy unfair advantages which is not the case here.

------
nl
To be clear (and to no one's surprise): No one is paying poor Kenyans to
program self-driving cars.

They are being paid to label data.

~~~
billfruit
While the mainstream media is eager to portray fake news as the gravest threat
to free society, this type of clickbait headline writing essentially exposes
their duplicity. So while the press like to pontificate from their ivory
towers on their unalloyed noble purpose, they are ultimately human
institutions, with all the ensuing fallibility and flaws.

~~~
kopo
It's more complicated than that.

Matt Taibbi -

"As it turns out, there is a utility in keeping us divided. As people, the
more separate we are, the more politically impotent we become. This is the
second stage of the mass media deception originally described in Manufacturing
Consent. First, we’re taught to stay within certain bounds, intellectually.
Then, we’re all herded into separate demographic pens, located along different
patches of real estate on the spectrum of permissible thought. Once safely
captured, we’re trained to consume the news the way sports fans do. We root
for our team, and hate all the rest. Hatred is the partner of ignorance, and
we in the media have become experts in selling both."

[https://taibbi.substack.com/p/introduction-the-
fairway](https://taibbi.substack.com/p/introduction-the-fairway)

------
BonfaceKilz
Clickbait title

~~~
nmstoker
You can give the BBC feedback on it here:
[https://www.bbc.co.uk/news/20039682](https://www.bbc.co.uk/news/20039682)

I did - I like the idea that they cover this, I just don't want it to be misleading.

------
r_singh
Off-topic, but does anyone have more info on the “call through encrypted line
on Signal” bit at the bottom of the article?

Seems like Signal is catching on; I’m gonna get on it too.

~~~
emiliobumachar
I have no inside info, but presumably it's a call-out to potential
journalistic sources. The call-out emphasizes secrecy to assuage fears of
retaliation by third parties not happy to see the news spread.

The reporters Edward Snowden reached out to almost missed out on the story
because it seemed too much work to install and learn to use the secure
communication tools Snowden asked for before saying what it was all about.

------
MaupitiBlue
So these are the white collar knowledge jobs of the future?

~~~
statguy
These are the blue collar jobs. The white collar jobs are about designing the
network architectures and training them.

------
amrx431
Clickbait articles like these are the reason that Trump's branding of
prestigious media houses as "Fake news" found resonance among the masses.
Kenyans are not being paid to program driverless cars but to label data. There
is a Himalayan difference in the skillsets required for programming driverless
cars and labelling data. The BBC should know better. But why would they care? They
got me to click the article and increase their visitor count.

