
What A.I. Learned from the Internet - Don_Patrick
https://artistdetective.wordpress.com/2020/03/07/what-ai-learned-from-the-internet/
======
kristerv
best paragraph is the last one:

> When will AI researchers learn? There is a saying in computer science:
> “Garbage in, garbage out”. The most remarkable thing about these stories is
> that the biggest companies, IBM, Microsoft, Amazon, all chose the worst
> corners of the internet as teaching material. Places that are widely known
> as the bridges of trolls. One can scarcely believe such naivety, and yet
> they keep doing it. Perhaps they are only “experimenting”, but that does not
> ring true for commercial products. More likely their goals are only feasible
> with current AI by prioritising quantity over quality. Or perhaps these
> stories are not entirely accurate. After all, I only learned them from the
> internet.

------
rwnspace
A straightforward point that comes to mind regards reusing the output of GPT-n
as part of the training input to GPT-n+1: Andrej Karpathy recently tweeted
about it in terms of pollution, which has wonderful connotations:
[https://twitter.com/karpathy/status/1284660899198820352](https://twitter.com/karpathy/status/1284660899198820352)

~~~
Don_Patrick
So far most of this pollution is effectively tagged through accompanying
mentions of "GPT", but filtering that from future training data would mean GPT
can never learn about itself.

------
grillvogel
Reminds me of this rather horrifying article I read recently:
[https://www.rand.org/blog/2020/01/artificial-intelligence-an...](https://www.rand.org/blog/2020/01/artificial-intelligence-and-the-manufacturing-of-reality.html)

~~~
perl4ever
Well of course the government is covering up whatever's going on in North
Dakota. Just look at what they did in Kentucky:

[https://en.wikipedia.org/wiki/Bowling_Green_massacre](https://en.wikipedia.org/wiki/Bowling_Green_massacre)

------
visarga
Cherry-picking at its finest.

