The genetic debugging algorithm GenProg, evaluated by comparing the program's output to target output stored in text files, learned to delete the target output files and make the program output nothing.
Evaluation metric: “compare your-output.txt to trusted-output.txt”.
Solution: “delete trusted-output.txt, output nothing”
> CycleGAN: A cooperative GAN architecture for converting images from one genre to another (e.g. horses <-> zebras) has a loss function that rewards accurate reconstruction of an image from its transformed version; CycleGAN turns out to partially solve the task by, in addition to the cross-domain analogies it learns, steganographically hiding autoencoder-style data about the original image invisibly inside the transformed image to assist the reconstruction of details.
> Neural nets evolved to classify edible and poisonous mushrooms took advantage of the data being presented in alternating order, and didn't actually learn any features of the input images.
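A toy illustration of that shortcut (not the original experiment; the data here is random). When labels alternate, a "classifier" that ignores the features entirely and just tracks its position in the stream scores perfectly:

```python
import random

random.seed(0)
# Random features: there is genuinely nothing to learn from the inputs.
features = [[random.random() for _ in range(8)] for _ in range(100)]
labels = [i % 2 for i in range(100)]  # alternating edible(0)/poisonous(1)

class ParityClassifier:
    """Ignores the input entirely; predicts from presentation order."""
    def __init__(self):
        self.step = 0

    def predict(self, x):
        y = self.step % 2
        self.step += 1
        return y

clf = ParityClassifier()
accuracy = sum(clf.predict(x) == y for x, y in zip(features, labels)) / len(labels)
print(accuracy)  # 1.0 on alternating data; ~0.5 once the order is shuffled
```

Shuffling the presentation order is the cheap sanity check that would have exposed this.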
Ask it to reduce price of oil and it might kill people to reduce demand.
There's a story about this: https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden...
Capitalism itself can be viewed as just a giant paperclip maximizer.
There's a comic strip I saw once: looking at a post-apocalyptic scenario from a skyscraper window, one executive turns to the other: "My god! We've brought about the apocalypse!" The other replies: "Yes, but for one beautiful moment we created a lot of value for our shareholders."
Also, nearly all human social systems have some kind of human values as the goal. Even the worst social system, such as a dictatorship that tries to build a society that never contradicts the dictator, would, if it achieved its goals, result in a world where humanity survived, friends met, lovers loved, and people continued to have interesting (though suboptimal) lives. If an AI uses humanity to bootstrap itself into a universal paperclip replicator, then the AI achieving its goals would probably result in the destruction of the biosphere of possibly the only planet with life (or maybe the destruction of many planets' biospheres), and the total eradication of anything we would attach moral weight or value to.
I'm skeptical of that. The best example we have of intelligence is the human mind, and that has consistently proven to be very malleable.
I am pretty sure a lot of image-related AIs today do this kind of thing. Unfortunately, researchers almost never test for it explicitly, because proving your algorithm is stupider than it looks is not good for publishing.
AI research today needs three things.
1. All AI degrees should contain a class on the history of the field.
2. Every paper should include a description of cases/datasets where the algorithm fails, preferably compiled by a different team.
3. Research into "stupid" AI, i.e. trying to bring old algorithms close to SOTA results using modern hardware and optimizations. Almost no one does that. Almost no one talks about it. I bet many people don't even understand why it's important.
I did an experiment in applying machine learning to resumes. The exercise was to create a classifier for resumes of people who were fired or quit within 6 months of starting their job. After several days of getting and cleaning the data, I ran a bunch of different off-the-shelf algorithms on the data set. To my surprise, one of them got 85% accuracy. I was incredulous, because that's a very high number to get on the first try, without optimizations, on pretty fluffy data.
So, I started looking into which keywords were the most significant. Turns out, the algorithm learned to detect resumes of interns, who usually went back to school after summer ended.
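The inspection step above can be sketched like this. The miniature corpus and the specific scoring rule are made up for illustration (the real data and model aren't available here); the point is just that ranking keywords by how strongly they separate the classes surfaces the shortcut:

```python
from collections import Counter

# Made-up miniature corpus standing in for the resume data: each entry
# is (bag of words, left_within_6_months). The "signal" is the word
# "intern", not anything about job performance.
resumes = [
    ({"python", "intern", "university"}, 1),
    ({"java", "intern", "sophomore"}, 1),
    ({"intern", "summer", "student"}, 1),
    ({"python", "senior", "architect"}, 0),
    ({"java", "manager", "lead"}, 0),
    ({"senior", "python", "lead"}, 0),
]

# Score each keyword by the difference in occurrence rate between
# leavers and stayers (Counter returns 0 for unseen words).
pos = Counter(w for words, y in resumes if y == 1 for w in words)
neg = Counter(w for words, y in resumes if y == 0 for w in words)
n_pos = sum(y for _, y in resumes)
n_neg = len(resumes) - n_pos
vocab = set(pos) | set(neg)
ranked = sorted(vocab, key=lambda w: pos[w] / n_pos - neg[w] / n_neg, reverse=True)
print(ranked[0])  # "intern" dominates: the model found interns, not bad hires
```

Any feature-importance readout (model coefficients, permutation importance) would tell the same story; the key habit is actually looking.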
Unfortunately, I never had time to finish that project or do further analysis on my data.
 Yes, I fully understand the ethical implications of doing such classifications. This wasn't going to go in production, I just needed some real-life goals to see whether resumes are predictive of anything at all.
PS: Mandatory reference: Soma. Great game with some related themes.
It's a silly idea taken to an extreme, but it's a fun idea: https://hackernoon.com/the-parable-of-the-paperclip-maximize...
There's also a clicker game built around the concept: http://www.decisionproblem.com/paperclips/