Hacker News
Slight Street Sign Modifications Can Fool Machine Learning Algorithms (ieee.org)
194 points by itcrowd on Aug 5, 2017 | 131 comments

Whenever this sort of issue comes up, there are a number of responses pointing out that humans can be fooled too, and I am guessing that the generally unstated implication is that therefore it is not a big deal. Two important differences that this point of view overlooks are that humans are usually able to tell when they are not fully understanding their visual input, and they are able to respond appropriately, which includes acting cautiously and taking actions that will help resolve the uncertainty. Artificial visual systems that have, at best, only a rudimentary understanding of what they are looking at, are in no position to act this way, and can assign high confidence values to what we would regard as ludicrous interpretations of the scene.

These things will be fixed in time. There is nothing to be gained by pretending that they are not problems.

They aren't real problems because nobody smart is trying to directly hook up a road sign classifier to a steering wheel. They are trying to build complex systems where this is just one signal.

If they were trying to directly use this info, this would be the least of the serious issues. For example, there are plenty of stop signs well hidden by trees. Such a car would probably last about a mile in a non-freeway setting.

A trivial model of "hey, does this make any sense appearing at this place on a map" defeats this, etc.
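The kind of map sanity check mentioned above could be sketched roughly like this (the coordinates, rounding, and confidence threshold are all invented for illustration, not any real system's logic):

```python
# Hypothetical sketch: sanity-check a detected sign against map data
# before letting it influence control. Names and thresholds are assumptions.

KNOWN_SIGNS = {  # (rounded lat, lon) -> expected sign type, from map data
    (37.7749, -122.4194): "stop",
}

def plausible(detected_type, lat, lon, confidence):
    """Accept a detection only if it agrees with the map, or is very confident."""
    expected = KNOWN_SIGNS.get((round(lat, 4), round(lon, 4)))
    if expected is None:
        # No sign expected here: require high confidence before acting.
        return confidence > 0.99
    return detected_type == expected

print(plausible("speed_limit_45", 37.7749, -122.4194, 0.9))  # map says stop -> False
```

A detection that contradicts the map would then be escalated rather than acted on directly.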

Of course this stuff is in its infancy and can do dumb things. But this just isn't that big a deal.

Humans also act much worse than you imply. Look at the number of people who accelerate into crashes instead of braking, etc. Again, doesn't mean AI should get a pass, and humans are infinitely more complex, but we shouldn't be held up as amazing at this either.

(That's actually the part that worries me. That we aren't good at it)

The fragility of AI decision-making in general absolutely is an issue. Feel free to post a submission when it has been solved to a level that matches or exceeds human capabilities.

I build AI/ML systems for a living and it's not as big of a deal as you think. Production machine learning systems built by experienced teams will almost always have business logic protections built-in to stop the machine from doing something totally stupid. For example, an auto-bidder may have capped velocities or auto-shutoff mechanisms (those examples of bidders spending billions of dollars in minutes are exceedingly sloppy).

This is not unlike the human body's deeply-evolved reflexes - before the neocortex can even process that a pan is boiling hot, the nervous system bypasses the brain and tells the hands to drop it.

In the case of driverless cars, DannyBee is absolutely correct: truly driverless cars will not make decisions from a single data point. I have some past experience with unmanned aerial vehicles and those things had insane amounts of redundancy in almost every signal and decision making process; driverless cars are more complex and will come decades later and will probably take this even further.
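The capped-velocity / auto-shutoff idea mentioned for auto-bidders might look something like this minimal sketch (class and parameter names are hypothetical, not any production system's API; a real guard would also reset the per-minute counter on a timer):

```python
# Illustrative business-logic guard: cap how fast an auto-bidder can spend,
# and shut off entirely when a hard limit is reached. All names invented.

class BidGuard:
    def __init__(self, max_spend_per_minute, hard_cap):
        self.max_rate = max_spend_per_minute
        self.hard_cap = hard_cap
        self.spent = 0.0
        self.spent_this_minute = 0.0  # a real system resets this on a timer

    def approve(self, bid):
        if self.spent + bid > self.hard_cap:
            return False  # auto-shutoff: refuse everything past the cap
        if self.spent_this_minute + bid > self.max_rate:
            return False  # velocity cap: the model wants to bid, the guard says no
        self.spent += bid
        self.spent_this_minute += bid
        return True

guard = BidGuard(max_spend_per_minute=100.0, hard_cap=1000.0)
print(guard.approve(50.0))   # True
print(guard.approve(500.0))  # False: exceeds the per-minute velocity cap
```

The point is that the guard is dumb, deterministic code sitting between the learned model and the action, so a misbehaving model can only do bounded damage.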

As another such practitioner, why oh why would we ignore hard-learned unambiguous signals and wisdom because we jumped on the machine learning bandwagon? We embraced machine learning to detect and uncover the things we don't see, not to replace the things we already know. There's no conflict here whatsoever.

> Production machine learning systems built by experienced teams will almost always have business logic protections built-in to stop the machine from doing something totally stupid.

In the case of driving, however, the problem of deciding what behavior to veto is itself a hard problem - perhaps the hardest of all. When the system monitoring the AI is itself AI, the argument becomes circular - which is not to say it cannot be solved, but I think it makes it harder to dismiss the concerns.

> They aren't real problems because nobody smart is trying to directly hook up a road sign classifier to a steering wheel. They are trying to build complex systems where this is just one signal.

What are the guarantees that those higher-level "complex systems" aren't going to have some weird behaviors as well?

The problem isn't just that ANNs misclassify adversarial examples. The problem is that it's counter-intuitive to an average observer and that no one clearly knows why those examples generalize so well. It's a remarkable property that a lot of AI "enthusiasts" try to downplay.

It's one thing to wire up unreliable but simple and fully understood components into a more reliable system. It's an entirely different level of challenge if the "unreliable components" are complex and poorly understood.

> Humans also act much worse than you imply. Look at the number of people who accelerate into crashes instead of braking, etc.

Except we know very well how and how often people make mistakes. For most of those mistakes we have a pretty good idea why they happen (at the high level, I'm not talking about neuroscience). Our roads, our cars and our laws are designed to handle these failures. Also, we have a pretty good model of how other people behave on the road, so we can react accordingly.

All of this goes out of the window with self-driving cars.

Hell, most people naively assume that if a single self-driving car gets into 50% fewer accidents than an average person, then replacing all drivers with self-driving cars will reduce the global accident rate by at least 50%. This assumption doesn't take into account the fact that many accidents are caused by complex interactions between several vehicles and the environment. So introducing many self-driving cars can lead to some nasty emergent behaviors and some mass accidents that just aren't possible with human drivers.

How difficult would a GPS spoofer be, i.e. transmitting the wrong GPS coords via very cheap devices, to at least confuse the heck out of any automated car?

Oh look:


Again, this assumes this is the only input, and that the system can't detect when some inputs seem completely insane with high probability and raise an error.

Seriously. Y'all realize how much in your daily life would pretty much explode and kill people if that wasn't true, right?

It's worth mentioning that car GPS navigation explicitly calculates how plausible each GPS reading is, based on a combination of other position signals and basic physics modeling, and discounts anything that sounds too crazy. If it didn't, simple things like tall buildings would be huge problems.
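One toy version of such a plausibility check, using only a speed bound from basic physics (the threshold is made up; real systems fuse many more signals, e.g. with a Kalman filter over GPS, accelerometer, and gyroscope data):

```python
# Toy physics plausibility check (not any vendor's actual filter):
# reject a GPS fix that implies an impossible jump from the last fix.

MAX_SPEED_MPS = 70.0  # ~250 km/h; anything faster is treated as noise

def accept_fix(prev_pos, new_pos, dt):
    """prev_pos/new_pos are (x, y) in meters; dt is seconds between fixes."""
    dx = new_pos[0] - prev_pos[0]
    dy = new_pos[1] - prev_pos[1]
    implied_speed = (dx * dx + dy * dy) ** 0.5 / dt
    return implied_speed <= MAX_SPEED_MPS

print(accept_fix((0.0, 0.0), (30.0, 40.0), 1.0))      # 50 m/s -> plausible
print(accept_fix((0.0, 0.0), (3000.0, 4000.0), 1.0))  # 5000 m/s -> rejected
```

A multipath reflection that teleports the reported position across a city block fails this check immediately, which is exactly why spoofing a navigator takes more than broadcasting one bogus coordinate.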

Tall buildings are huge problems for GPS. Multipath reflections will inflate the uncertainty and can put you some distance from your true location.

Yes, they're a serious problem. What I'm saying is that they'd be a much bigger problem for GPS-based car navigation if there weren't logic in place to combine the GPS info with accelerometers and gyroscopes.

Yeah, but that wasn't my point... my point is that these attacks will be tried as soon as opportunity > cost-to-attempt.

Tesla did it for their billboard detection. Didn't turn out so well and their "fix" is a horrid hack.

I think another big issue is that humans can tell when a sign is manipulated or damaged to the point where it will be problematic for other humans to identify, so it can be corrected before something goes wrong. If it's only misleading to computers, chances are humans won't fix it until something goes wrong.

Well, given the example in the article, any human with functioning vision would be able to identify that stop sign. A driverless car with inadequate detection would run through the intersection without stopping.

I'd say that's a pretty big deal.

I think the problem with the driver-less car is that it doesn't have goal-oriented general perception like human beings do. Humans filter the world through goals. The world is interpreted as a set of affordances and threats to your goals, your brain is predisposed with distinguishing the two.

As it stands, this is not how driver-less cars are being designed, they are being designed to specifically recognize the presence and content of street features like signs and other moving bodies. A human driver might see a tall parked car with a big gap in front, and recognize a pedestrian affordance which he should steer clear of in case somebody pops out. The way I see people talk about driver-less cars, this "common sense" type of judgement would have to be deliberately integrated.

Is the important thing that humans and machines classify them differently? Or that one or both might be fooled? Because the latter already exists as a problem but doesn't seem to be a major issue. You can already, with physical access to a sign, change what it looks like to fool any observer.

"These things will be fixed in time."

... and the solutions will be written in blood.

Yes, as happened with trains, automobiles, and airplanes.

But I suspect that driverless cars will start out pretty safe and improve from there.

"Yes, as happened with trains, automobiles, and airplanes."

I think there is an interesting distinction that is worth discussing ...

At the advent of train/plane travel, you could opt out of those novel risks by eschewing those modes of travel.

However, you could be driving along in your driver-piloted car, minding your own business and find yourself involved in an accident caused by the software in a driverless car you are sharing the road with.

Consider that trains cannot stop quickly or turn and safety consists almost entirely of making sure people get out of the way. And yet, lots of people (even kids) would walk along tracks, etc. And yet there are still urban areas with active train tracks running along surface streets, for example near Jack London Square in Oakland.

Before cars, people routinely walked in the middle of streets. "Jaywalking" wasn't a concept. It was entirely up to early cars to stop in time, and they didn't have good brakes. Fortunately they weren't very fast.

Also, early pilots, besides the obvious risks, were sometimes daredevils who would do things like buzz houses.

I think the main point here is that society's attitude towards risk has changed; we take small risks much more seriously than people did when earlier forms of transportation were introduced. Accidents involving driverless cars make the national news and manufacturers are being very cautious. With earlier attitudes towards risk, I expect they'd be in widespread use by now.

I don't think that is an important distinction because that is still very true of driver-piloted vehicles as well. Also people still die in train and plane accidents. You still have to eschew any form of travel to be safe from its risks.

As opposed to bunnies and kittens like they are now?

We are talking about cars, where over a million people die every year. We have already accepted that a huge death toll is the price of convenience.

> Artificial visual systems that have, at best, only a rudimentary understanding of what they are looking at, are in no position to act this way, and can assign high confidence values to what we would regard as ludicrous interpretations of the scene.

It's not a good idea to generalize like that. There are many AI systems that use neural nets for perception and then use different models, such as conditional random fields (CRF) to see if entities make sense in context.

Not to mention that these adversarial examples don't work well from all distances and angles. As a car changes position slightly, the adversarial example disappears. It's very hard to create an adversarial example that works from all angles on all networks. There is also a special training regimen that fortifies neural nets against adversarial examples.

The problem of adversarial images is closely related to GANs and has been brought to attention by the same researcher, Ian Goodfellow. So it's not a neglected problem - some of the top minds are on it.

The adversarial examples don't need to work from all distances and angles.

Many street signs, especially in rural areas, might have a pretty small visibility window, and they will only be visible from a fairly fixed angle until you pass them.

So while you might not be able to develop an adversarial example that works for all signs from all distances and angles it wouldn't be hard to develop one for a specific sign. The angle for that sign is known, the distance in which most cars could resolve the sign is also known, the average speed on the road can also be taken into account which gives you a rough idea of the time span the autonomous driver would have to identify and resolve the sign.

Heck if you know the algorithm it wouldn't be that hard to build an app that basically allows you to take a picture of the sign with your phone from the road, select a car or subset of cars you want to jam and it would generate a likely pattern to apply to the sign to jam those cars.

I should also add that in this case some of the examples that do seem to work look to be of the shape and size of common bumper stickers, this is pretty worrisome since these are not that uncommon on street signs, especially in rural areas where they are more or less at reachable heights.

I also wonder how well do these systems deal with graffiti.

Wait until AI puts most truckers out of a job and you suddenly have a substantial number of people in rural America feeling resentful towards fully automated freight trucks... adversarial examples are one thing, but I think the US military's experience in Iraq has taught us that a determined, intelligent, and constantly innovating human adversary is a continuing challenge.

Especially given that the cost of crashing an unmanned truck will be 100% capital and 0% human, I expect we'll start seeing some pretty serious criminal penalties passed once the tech gets there.

Unless that truck crashes into people :)

I think that scenario isn't as likely as unintentional screwups at the moment.

Of course. But there's a lot of road between population dense areas...

> So it's not a neglected problem - some of the top minds are on it.

That's why I believe it will be fixed. It is the "it's not a problem" crowd that I take issue with, and I would be surprised if many of the top researchers held that position, unless they are confident that they are close to a solution.

I think you are under-generalizing here. The risk here is not of a sign being mistaken for a gorilla specifically, or even about signs specifically, but the fragility of image recognition in general. Similarly, because the issue is not (just) about adversarial images, the fragility of adversarial images is not much of a mitigating factor.

At the point when AI-operated cars can handle all kinds of situations better than humans, they will be intelligent enough to be classified as human beings.

The fact is that these adversarial examples are incredibly rare. 1 in a trillion trillion or more chance that they would occur in natural data. We don't know that human brains aren't vulnerable to something similar. I bet if we could back propagate through the visual cortex, we would find similar things. They can also be mitigated a bit by training on them.

Optical illusions certainly suggest that our visual processing involves trade-offs.

I think you are overstating what "understanding" is. It is well known that birds see more of the EM spectrum than humans. We can make -- and indeed there are -- designs that are only visible to birds but are invisible to humans in the sense that our three-cone visual system conflates certain inputs from the physical world. That stop sign can be designed in a way to look like a stop sign to a bird but make us think it is something else entirely. Are we not understanding the stop sign?

No. It's just a fact of parsimonious signal representation that sometimes you get null spaces with which you can manipulate to maximally separate the performance of two very different receivers.

No, that's nothing like analogous. Adversarial examples aren't giving machines information that is invisible to humans.

Related, but different: Autonomous Trap 001


They used salt to construct a circle with a solid line on the inside ("do not cross") and a dashed line on the outside ("come on in").

Everyone makes it sound like this art project was tried with a real autonomous car, but it hasn't been.

This is just a guy sprinkling salt around his regular car.

So that was just him driving in and not an autonomous car? https://vimeo.com/208642358

The car drives into the salt circle then stops.

Yes. From the interview above:

Is this actually an autonomous car, or is it conceptual?

I don't actually have a self-driving car, unfortunately....

We're talking about it on HN, so I'd say he did it exactly right :)

In my state, there's a (fast) road that turns down under a bridge. From a distance, it kind of looks like I'm going to crash into the bridge.

Even though I know the road doesn't run directly into the bridge, I slow down a little and look carefully to make sure I'm actually not going to crash into it.

When my perception doesn't fit my internal model, I gather more data (look at different parts of the bridge and what other cars are doing), or transform the data (ie turn my head slightly and look at the bridge and road from different angles)

Edit: Likewise, when someone's tone doesn't match their words, I gather more data (look at their body language).

Have any researchers experimented with neural nets to do the same? I haven't noticed any posts here about that.

There was a paper that claimed those sorts of rotations and movements can help alleviate adversarial examples:


However, OpenAI quickly refuted it by creating adversarial examples that continue to fool the classifier even when rotated, scaled, etc:


So it looks like there's no "easy" way out here. Multiple types of sensors may help, but it seems likely that it will still be possible to construct examples that fool the network over all sensor inputs at once.

Ian Goodfellow and Nicolas Papernot have a good blog on machine learning security issues. One relevant post on why this is such a hard problem:


Is it a big surprise that you can construct adversarial examples for algorithms? Don't humans have the same class of problems with optical illusions? And those are not even adversarial, just confusing.

If we constructed truly adversarial examples for human neurology, I bet they would be equally insane.

In some sense, yes, optical illusions are similar to such "adversarial examples". But if you think about it, any kind of image is somewhat delusive, since we perceive it as whatever object it depicts while actually staring at a piece of paper with some ink on it.

Also, adversarial in this case seems to refer to images perceived differently by machines than by humans, so it's not really possible to create such ones for humans.

No, adversarial simply means deliberately trying to engineer false positives and negatives. This can be done against humans, machines, ants, trees, viruses ...

What is a false positive or false negative in this case? The "ground truth" here is what human perceive.

Optical illusion?

Camouflage is an adversarial example.

Wait... the dress was blue and black?

Thank you for your awesome response. It probably would have taken me hours of googling and reading to find these exact posts. A mere upvote is not enough of a thank you for such a quality reply.

Storrow Drive in Boston has several bridges over it like this. The worst part about it is that a car could be stopped under the bridge and it's hard to tell until you are on top of it (the car not the bridge). Granted the speed limit on the road is low enough that if you are obeying it you should be able to stop in time.

So a self driving vehicle would have to somehow know that the road dips and from that know that they won't run into the bridge and know that there may be cars hidden after the dip.

Edit: tried to get a good photo from Google Maps, but since the Google Street View car camera is high up off the ground, you can actually see more clearly what is going on with the bridge than you would be able to at street level, so it's not a good example. I wonder what implications camera height has on the safety of self-driving systems. Also, the bridges I have in mind seem to be on Soldier Field Road, not Storrow (Storrow turns into Soldier Field and I haven't lived in Boston for years, so forgive the mistake :))

I think this isn't well-applicable to machine learning because if there are more data sources to be considered (like other cars' behaviour in addition to just camera images), they'll be always considered anyway. For humans, it requires considerable additional effort to do so, but for machines it's fairly cheap, so why not do it anyway?

The uncertainty part should ideally, in my understanding, be represented by the confidences returned by neural networks. They don't claim "there's a crossroad ahead, and not a tollgate" but rather "85% match for crossroad, 30% match for tollgate, [...]". If those results are not distinct enough, the surrounding application should probably go into a more cautious mode and slow the car down to begin with. I suppose that's what such systems do, but maybe someone with more field knowledge can confirm/negate that.
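The "not distinct enough" logic described above could be sketched like this (the thresholds are invented; a real system would tune them against field data):

```python
# Sketch: commit to a classification only if the top score is high enough
# AND clearly separated from the runner-up; otherwise fall back to caution.

def decide(scores, min_confidence=0.8, min_margin=0.3):
    """scores: dict of class -> confidence. Returns a class or 'cautious'."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best < min_confidence or best - runner_up < min_margin:
        return "cautious"  # slow down, gather more data
    return best_label

print(decide({"crossroad": 0.85, "tollgate": 0.30}))  # "crossroad"
print(decide({"crossroad": 0.55, "tollgate": 0.50}))  # "cautious"
```

One caveat from the adversarial-examples discussion above: these attacks specifically produce *high*-confidence misclassifications, so confidence thresholds alone don't close the hole.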

The road environment is designed for people to see and understand. Without a human-emulating "general AI", it's unlikely that every sign and surface marking out there will be legible to a machine. IMO, the whole premise of training these machines to see roads like a human is flawed. Their developers should be working with government agencies (like the FHWA in the US) to create new standards that are aware of the capabilities of the machine.

I agree and I think the whole self driving push should be incremental anyway:

1. To begin with it is only used on motorways/highways as they are long straight roads, where the AI can take over the boring part and the driver can be left to take over if something complex happens.

2. Shared AI maps across all cars that can route everyone to/from work. As this is shared, it can balance the load and result in the least amount of traffic jams. Humans still do the driving.

3. The government sets up AI cameras that monitor city routes, and cars can tap into this information as they drive. The benefit here would be seeing the road from all angles, as well as massive computers bigger than cars doing all the number crunching. Over time more and more city routes could become approved to be fully driven by AI.

That's my approach to an incremental design at least.

So before transportation there were no roads; then humans began using animals, and roads were formed to accommodate animal-based transport.

Then animal-powered mechanical transport (carriages) was invented, and people made roads to accommodate the new means of transport.

Then fully mechanical transport came, and we just made roads for these machines.

For some reason, this time humans are trying to keep the roads the same. Well, actually not; there are projects to create roads designed for self-driving vehicles.

I think those in the software industry have a hammer and now see everything as a nail. What would a successful AI transportation system be? Probably one based on specifically designed infrastructure, where the AI tech that never worked on old-school roads actually can.

The machine evolves too quickly right now for standards to be worth it. Let the craze die down before doing something like that.

Even if we intentionally supported machine drivers with special signs, beacons, or other things, those are still going to be subject to interference and defacing.

This article says that the training set was small. And presumably the misclassified images were unlike images in the training set. If the training set includes images intended to mislead, then wouldn't the classifier then be more tolerant?

This is the price we all pay for lazily defining a complex cognitive task as a mere image <-> label problem.

The right solution to this is to have an official database of signs and their GPS coordinates provided by the government or whoever is responsible for road safety, free of charge (you're not directly paying for having signs on the road, why would you have to pay for an electronic version of that?).

Road signs were made for humans because we don't have the ability to connect to the internet and fetch the data in less than a second, but autonomous cars do, so why not use it?

Introducing a single point of failure (1 database, across the internet), organized by a large, complex system (the DoT, possibly), to cope with an edge case doesn't feel like a very elegant solution.

A more costly, less elegant solution whose problem is already shared by street signs in general (meaning we could plan for it) would be RFID tags or something that tells the computer what sign this is if it can't read the sign. You could also use this for training, so it learns to filter away poorly drawn swastikas from the sign.

An attack on something like this would scale very poorly, as you would need physical access to all street signs. Issuing RFID to a street sign would be just another step along the manufacturing process of the sign, or as a step to the mounting of the sign.

I can't take credit for this idea though, it's already been [explored in a paper](https://link.springer.com/chapter/10.1007/978-3-642-41647-7_...).
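A sketch of how the camera classifier and an RFID tag read might be fused (the tag IDs and the trust policy are invented for illustration, not taken from the linked paper):

```python
# Hypothetical vision/RFID fusion: the tag acts as a redundant channel
# that can override a defaced or adversarially perturbed sign face.

RFID_SIGN_TYPES = {0x51: "stop", 0x52: "yield"}  # invented tag mapping

def fuse(vision_label, vision_conf, rfid_id):
    tag_label = RFID_SIGN_TYPES.get(rfid_id)
    if tag_label is None:
        # No tag read: fall back to vision, but only when it's confident.
        return vision_label if vision_conf > 0.95 else "unknown"
    if tag_label == vision_label:
        return vision_label
    # Disagreement: trust the tag, and flag the sign for inspection
    # (the face may have been defaced or tampered with).
    return tag_label

print(fuse("speed_limit", 0.97, 0x51))  # tag says "stop" -> "stop"
```

Of course, the tag itself becomes an attack surface (cloned or swapped tags), so disagreement between channels is best treated as a signal to slow down, not silently resolved.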

>An attack on something like this would scale very poorly, as you would need physical access to all street signs

I wouldn't underestimate the eventual legions of unemployed truck drivers.

They will probably lose eventually, but I expect a spirited effort.

This actually doesn't seem that far fetched. Ex-drivers who are angry and have plenty of time on their hands out at night picking the RFID tags out of stop signs.

Another scenario is more of a slow and steady war. I'd expect that autonomous vehicles will occasionally be alone and unoccupied, which makes them a tempting and defenseless target. You wouldn't even need to do any physical damage to disable them - just a little electrical tape over their sensors, or some chocks under their tires. Soon, the autonomous trucking cos have to employ some of the ex-truck drivers to go around re-enabling their vehicles. And of course, those ex-truck drivers are in cahoots with the disablers. That could certainly drive down their savings on labor costs.

They would be most successful where population and sign density is high, which would be the cities. During such an attack, public transport could move around people while the authorities would take care of the attackers.

The problem for them is that where they would cause the most harm, rural areas (since those are in most dire need of supplies a few towns away and don't have any good alternatives to cars), is where such an attack would be the hardest to implement. Canada, Australia, Iceland and Alaska have many roads where vital street signs can be tens of miles apart, as can any actual people who might be affected by this. Also, demographic movement is working to their disadvantage; more and more people every day are moving to large cities.

In the US they can focus on just the interstates. And, they are unemployed, already used to long boring trips, and have established communication networks between them. Oh, and disenfranchised friends at the various rural truck stops and motels that will also be razed by self driving tech.

This isn't a single point of failure by any stretch of the imagination.

Waze and other navigation systems have a database of signs, road speeds and traffic enforcement cameras.

I don't know about the US but there are places where signs are in a database already.

This can be crowd sourced just like anything else.

Google Street View already has a pretty good DB to start from.

> Road signs were made for humans because we don't have the ability to connect to the internet and fetch the data in less than a second, but autonomous car[s] do, so why not use it?

Because you shouldn’t need to connect to the Internet and resolve GPS coordinates to know there’s a sign 50 meters away. If autonomous cars can’t "see" signs, let’s make them "see-able" with e.g. small radio transmitters along the road.

The issue is that upgrading every sign with a transmitter is costly, would require power and ongoing maintenance (not to mention vandalism - a sign is just metal and usually pretty resistant, but any electronics will be destroyed by vandals in no time).

An online database that can be cached (so you aren't in trouble if your network drops while you go through a tunnel or similar) is much cheaper to implement than upgrading physical signs.

This whole line of thought is somewhat beside the point, because the larger issue is not just about recognizing signs, but features of the visual field in general. If a system cannot be trusted to identify a road sign, I do not trust it to recognize a person.

The system cannot be trusted to recognise a road sign that has been very specifically adjusted to fool it, just like a person running into the road at night dressed in black is hard to see.

I'm not really concerned about the risk to humans that have been very precisely dressed to fool cars.

As mentioned elsewhere in this thread, until these systems can recognize that what they are seeing is not trustworthy like humans do on a day to day basis, the systems cannot be trusted or even depended on in a real world setting.

The larger systems can, just because the visual system says one thing doesn't mean it's trusted. Google made a short comment on it recently about cars with stickers of realistic scenes on them.

But in the context of deliberately hidden or altered pedestrians, there's no risk I can see here. The pedestrian would have to be trying to look like something else or hide in a very precise way, and they can do that to regular drivers right now.

The idea that this is only an issue of disguised pedestrians is a red herring that should not stop people considering the broader implications of the fragility of vision and other ML systems. When a system does not always function according to its intended purpose, it is sound engineering judgement to consider whether this has implications beyond the specific cases that have been found, and there have been some tragic outcomes when the people in charge found it expedient to not do so. In the case of ML, the principle that systems can generalize appropriately beyond their training sets is central, and anything that raises concerns over the generality of that capability needs to be taken seriously. You can certainly hold the opinion that it will not turn out to be a major problem, but the burden of proof lies with those claiming that the systems (after modification, if necessary) are safe enough, and avoiding the question is the opposite of discharging that burden.

It so happens that both Space Shuttle losses involved "the problems seen so far are not directly relevant, so we will ignore them" thinking.

Or any other obstacle for that matter.

Since the sign is just metal, it could serve as the antenna. Bury the transmitter beneath the ground, where you could also hook it up to the power grid (unless you use the sign's surface to act as a solar panel). I would like to see vandals mess with that.

However, it ain't cheap and simple to implement, I agree with that. In countries like Sweden, Finland, Iceland, Canada and Australia they probably have the means to outfit their cities since they have the wealth to do it, but the rural areas would be a real challenge since they are so sparsely populated even when the money is there. In contrast to when these countries pioneered the implementation of the Internet, there's nothing like the phone network to piggy back this time.

Passive or Semi-Active RFID isn't expensive.

All signs can be victims of vandalism.

I think computer vision, a sign DB, an IR-watermarked QR code, and some RFID combined can produce a pretty robust and tamper-resilient system.

> The right solution to this is to have an official database of signs and their GPS coordinates

Why does everyone think self driving cars can use GPS to identify sign locations? GPS is not accurate enough to do that, and the labeling would need to be done by hand, which is not feasible. Plus, signs change and move around, making the labeling task endless.

One solution is to add redundancy. A stop sign, for example, could broadcast a low-power radio signal, and be accompanied by a retro-reflective strip with a coded pattern on the side of the road leading up to it, and emit tones at particular ultrasonic frequencies…

GPS could be a part of this, but it's easy to imagine what could go wrong. Somebody mistakenly installs a stop sign without updating the database. Another person makes a mistake configuring a cache parameter and the CDN starts serving up last year's map. There's a whole class of problems eliminated by keeping the information local. Imagine if you had to drive looking only at road information served over the internet -- would you trust it?

What do you think will happen when it (inevitably) gets hacked?

If you had a diversity of differently trained algorithms, they would not admit the same sorts of adversarial examples. The risk isn't that adversarial examples exist -- an exponentially small number of them always exist with any representation. The risk is that you can search for them en masse like offline cracking of passwords. If you could do that with the human mind, I hesitate to think what you would find.
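To make the consensus idea concrete, here's a toy sketch: only accept a classification when enough independently trained models agree, since a perturbation crafted against one model is less likely to transfer to all of them. The model interface and agreement threshold are made up for illustration.

```python
import numpy as np

def ensemble_classify(models, x, agreement=0.8):
    """Classify x with several independently trained models and accept
    the label only when a large enough fraction of them agree.
    `models` is a list of callables returning class probabilities
    (a hypothetical interface for this sketch).
    """
    preds = [int(np.argmax(m(x))) for m in models]
    labels, counts = np.unique(preds, return_counts=True)
    best = int(np.argmax(counts))
    if counts[best] / len(models) >= agreement:
        return int(labels[best])   # confident consensus
    return None                    # disagreement -> flag for caution

# Toy stand-ins for three differently trained classifiers; the third
# behaves as if it were fooled by an adversarial example:
m1 = lambda x: np.array([0.1, 0.9])   # predicts class 1
m2 = lambda x: np.array([0.2, 0.8])   # predicts class 1
m3 = lambda x: np.array([0.7, 0.3])   # predicts class 0
```

With an 80% agreement bar, the 2-of-3 split above is rejected rather than acted on, which is the behavior you'd want from a safety-critical pipeline.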

Cognitive biases, optical illusions probably.

Couldn't slightly randomizing the image before inputting it to the neural network invalidate such manipulations?
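Something along those lines has been tried; a crude sketch of the idea is to average predictions over several noisy copies of the input, so a perturbation tuned to one exact pixel pattern loses some of its effect. (Known caveat: randomization alone is a weak defense against an attacker who anticipates it.) The toy model here is invented for illustration.

```python
import numpy as np

def randomized_predict(model, image, n=16, sigma=0.05, seed=None):
    """Average a model's class probabilities over n noisy copies of the
    input image (pixel values assumed to lie in [0, 1])."""
    rng = np.random.default_rng(seed)
    return np.mean(
        [model(np.clip(image + rng.normal(0.0, sigma, image.shape), 0.0, 1.0))
         for _ in range(n)],
        axis=0,
    )

# Toy "classifier" that decides purely on mean pixel intensity:
toy_model = lambda img: np.array([1.0 - img.mean(), img.mean()])

img = np.full((8, 8), 0.7)
p = randomized_predict(toy_model, img, seed=0)
```

A real defense would combine this with adversarial training rather than rely on noise alone.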

Yet humans are totally unfazed by these modifications. What makes humans still excel at these edge cases?

Our brains are pretty good at image processing. Plus we have an inherent understanding of the world, including the way objects rotate and change shape. We're looking for the stop sign (big red octagon).

Though I've seen a new stop sign added recently, and the number of cars that don't see it is remarkable.

I'm not an expert on AI though (a few classes at university, and I specialized in something else).

It's the general classifiers that give us trouble. We want to show it hundreds of stop sign pictures and have it figure out when we show it a new one. But we're not asking it "is this a stop sign"; we're asking "what is this".

We can write software that is probably good at finding octagons and colors, and thus stop signs. Take facial recognition: it's remarkably good at this point, but it's doing one thing (though I've seen a computer ID 3 people in a photo with 2 in it, because a third person's photo was in the background).

We're looking for the stop sign (big red octagon).

Though I've seen a new stop sign added recently, and the number of cars that don't see it is remarkable.

These two sentences together show that our brains aren't just looking for the signs; we're also looking at many other aspects of the situation and even taking into account past experience (e.g. is this an intersection? Have I seen a stop sign here before? If I'm new to this area, I'm likely going to be far more alert to the signage.)

If someone planted a (non-modified) stop sign on the side of a highway, where the road is completely straight and with no intersection, I bet some drivers won't even see it, those who do will be puzzled, and approximately none of them will even try to stop.

I'll give a counterpoint based on my own experience.

When I travelled to the US, from New Zealand, to interview with a certain large company, I actually managed to mis-read road signs, in particular traffic lights, on a number of occasions, much to my alarm.

I am not a bad driver, and there were a number of factors (perhaps being tired from the trip), so it isn't clear cut. But I remember feeling like the traffic lights just didn't look like traffic lights. The intersections felt off; so off that I actually failed to recognise them in some instances.

It was actually a bit upsetting at the time, I've never quite experienced anything like it, but I think it's possible that the human system can fail to work if the input is sufficiently different from what is "typical".

To be fair, this was in Christchurch not too long after one of the bigger earthquakes but I got severely messed around by the ambiguity of some of the road signs, to the point where I once ended up on the wrong side of the road.

> What makes humans still excel at these edge cases?

It is rather that we built the signs in a way that makes them very easy for the human visual system to detect robustly.

These edge cases are designed specifically knowing the structure of (or at least being able to run large numbers of experiments on) the system in question.

We can't do that with humans. So we don't know what similar situations we could create if we did.

A more comprehensive world model. And the ability to realise that we may be interpreting an external stimulus incorrectly, and acting cautiously when that happens.

Humans have weird edge cases as well. There are hundreds of visual "illusions".

I was pondering on this topic earlier today and came to the conclusion, "because our performance is judged only by humans." Imagine if an alien visited Earth, and was completely dumbfounded by our sketches and cartoon drawings. "But it looks nothing like a chicken", they say, looking at a sketch of a chicken and a photo of a chicken. "It's just an edge case that happens to trigger your 'chicken' neuron."

We designed the signs to fit the human vision system, and the modifications target a different vision system. I imagine there will be similar edge cases that fool human vision.

These don't even need to be very clever edge cases. Consider that birds see more of the spectrum than humans, so there are designs that are visible to birds but invisible to humans. There, we just found examples that are adversarially designed to blind humans but are obvious to a bird: it wasn't that hard!

Possibly that we're trained on different training sets. In particular, I think most machine learning training sets have way too many positives. For every second in which you see a traffic sign, there are likely a hundred or more in which you don't (especially in our youth).

Also, the claim that humans excel at this can be called anthropocentrism. You could also point at the features those machine learning algorithms use and say "why don't humans see these very prominent features?"


"We'll just cover all of the possible use cases" - Self Driving Car Engineer

The first time I learned of adversarial research my initial thought was GD, in our automated AI-driven future, the people who master this are going to live like wizards.

You might really like https://cvdazzle.com/, they at least look the part!

What makes me wonder is how slight modifications make the algorithm miss gross, highly visible features, e.g. mistaking a blue sign for a red sign, or an upward-pointing triangle for a downward-pointing one. I suspect it won't be very hard to make the algorithm pay more attention to these, by specifically teaching it to tell such differences apart, and maybe by running several separate networks taught to tell apart particular narrow features, not complete signs.
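As a toy illustration of checking one narrow feature in isolation, here's a crude guard on dominant color, separate from the full classifier. The margin and the color model are arbitrary choices for this sketch, not a real detector.

```python
import numpy as np

def dominant_color_check(rgb, expected="red", margin=1.3):
    """Cheap sanity check on a gross feature: is the image patch mostly
    the color its claimed sign class should be? `rgb` is an HxWx3 array
    of floats in [0, 1]; `margin` is how strongly the expected channel
    must dominate the others."""
    r = float(rgb[..., 0].mean())
    g = float(rgb[..., 1].mean())
    b = float(rgb[..., 2].mean())
    if expected == "red":
        return r > margin * g and r > margin * b
    if expected == "blue":
        return b > margin * r and b > margin * g
    raise ValueError(f"unknown color: {expected}")

# A mostly-red patch should pass the "red" check and fail "blue":
stop_like = np.zeros((16, 16, 3))
stop_like[..., 0] = 0.8
```

A handful of such independent narrow checks, each trivially simple, could veto a classifier output that contradicts them.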

It seems to indicate that these machines are nowhere near as smart as they appear to be based on earlier successes. I'm starting to get worried about the possibility of another AI Winter if it turns out that reality and hype are too far apart.

The AI Winter will come only when the investment money runs out. As the Fed is keen to raise rates as slowly as humanly possible (similar to 2004, and unlike 1994), I think AI practitioners still have quite a bit of runway left.

I would also hope that the car uses contextual geolocation info. That a speed limit sign is not posted on the corner of a 4-way intersection, and a stop sign is not typically put on the side of a limited access highway. In fact I would expect that most of the driving regulations should be encoded in the map. Anomalies would be treated with extreme caution (and reported back to the home office).
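A sketch of that kind of contextual sanity check; the rule table and road-context labels here are invented for illustration.

```python
def plausible(sign, road_context):
    """Reject sign detections that make no sense for the road context.
    Anomalies would be treated with extreme caution (and reported back)."""
    rules = {
        "stop": {"intersection", "ramp_end"},
        "speed_limit": {"highway", "arterial", "residential"},
        "yield": {"intersection", "merge"},
    }
    return road_context in rules.get(sign, set())

# A stop sign detected on a limited-access highway fails the check:
print(plausible("stop", "highway"))   # False
```

In practice the rule table would come from the map data the comment describes, not from a hard-coded dict.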

I would expect that some of the more basic rules win out, e.g. don't crash into another vehicle, don't hit a pedestrian, etc.

Because CCD based cameras see so much more IR than we do, you could probably make a set of IR reflective/absorptive stickers that could be placed on street signs that would be virtually invisible to humans but totally alter what the CCD sees.

OT: My favorite street sign graffiti is the hula-hoop stickers that artists put on pedestrian walking signs.

I have wondered this about mirrors and LIDAR, sounds like it is being worked on [1]

[1] http://ieeexplore.ieee.org/document/5409636/

I don't understand the use of 'adversarial attack.' It makes it sound like the machines, or the creators of the machines, are at war. When in reality a stop sign sticker adds texture to a mostly mundane public realm.

Adversarial is a technical term.

Give a general machine learning algorithm a long enough string of 1s with a few missing, and it will have no clue what fills the missing digits. ML is inherently incapable of matching human cognition.

Sometimes it seems that NN systems and Deep learning systems are a dead end.

Are there any other promising technologies that can replace, or at least augment, current Machine Learning systems?

I would assume that you could somewhat trivially defend against this by validating the top n hypotheses from your classifier against reference images.
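A toy version of that defense: walk the classifier's top-n candidate labels and only accept one whose trusted reference template actually resembles the input. The embedding interface and threshold are made up for this sketch.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def validate_top_n(probs, embedding, references, n=3, threshold=0.9):
    """Accept the first of the classifier's top-n labels whose reference
    template matches the input embedding; otherwise reject.
    `references` maps label -> trusted template vector (illustrative)."""
    for label in np.argsort(probs)[::-1][:n]:
        label = int(label)
        if label in references and cosine(embedding, references[label]) >= threshold:
            return label
    return None  # no candidate matched a reference -> flag for caution

# Toy setup: the classifier is fooled into preferring label 1, but the
# input embedding still matches the label-0 reference template:
references = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
probs = np.array([0.2, 0.8])
embedding = np.array([0.9, 0.1])
```

Here `validate_top_n(probs, embedding, references)` overrides the fooled top-1 prediction and returns label 0. The catch, of course, is that the embedding used for validation can itself be attacked, so the reference check would need to be a separately trained (and ideally simpler) model.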

I want to know now how I can paint my face slightly to fool Facebook's (or some other widespread) facial recognition system.

Perhaps not "slightly" but there's this: https://cvdazzle.com/

To other humans, those are very much recognisable humans, but face detectors won't think they are.

Sorry I'm late, boss. Someone put a sticker on a street sign so my car drove into a bakery.

The first thing my mind went to was the image of Muttley putting up a detour sign.

Some context knowledge would help. For example, approaching an intersection... should I stop here? Oh, yah, there is a distorted red sign.

Also, figuring out the type of sign from the outline, then the icon inside, seems like an approach that could work.

Future humans only street.

Sounds like Artificial Intelligence is Stupid.

It's not fair to compare with AI. In this case, a visual network is more like a reflex. They can fool the equivalent of a human reflex. The vision neural net feeds into a "world model" where such inconsistencies are resolved on a more abstract level. The same world model is being used to plan the path of the car. Even if the vision net makes an error, the internal model has ways to detect that there's a perception error if it doesn't make sense in the context.

...stupid like a newborn baby: in very early development, but gradually learning. Personally, I'm not sure that AI will ever be as intelligent as a human, but the possibility is still somewhat uncomfortable.

Alright, so this is an image augmentation problem. Another training set with white noise variations and random unrelated pixelation/overlay text can solve this. Simply put, your training set wasn't general enough.
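A minimal sketch of that kind of augmentation: each training variant gets white noise plus a few flat-colored square occlusions, roughly mimicking stickers on a sign. Patch sizes and noise levels are arbitrary; real pipelines would use a library like imgaug or albumentations.

```python
import numpy as np

def augment(image, rng, max_patches=3, patch=4, sigma=0.1):
    """Return a noisy, randomly occluded copy of a grayscale image
    (values in [0, 1]) for use as an extra training example."""
    out = image + rng.normal(0.0, sigma, image.shape)
    h, w = image.shape[:2]
    for _ in range(int(rng.integers(1, max_patches + 1))):
        y = int(rng.integers(0, h - patch))
        x = int(rng.integers(0, w - patch))
        out[y:y+patch, x:x+patch] = rng.random()  # flat "sticker"
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(42)
sign = np.full((32, 32), 0.5)
variant = augment(sign, rng)
```

Worth noting that random augmentation alone is known not to be a complete fix; adversarial examples are optimized, not random, which is why adversarial training uses attack-generated examples rather than noise.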

Constructing adversarial examples is a sophisticated task, but it's being done.

Sure, and it should go hand in hand with appropriate image augmentation. Maybe make something like a GAN that would try to generate anti-obfuscation image augmentation for every new adversarial example...?

Our eyes can be fooled easily anyway.

Slight street sign modifications can completely fool humans too. Luckily: 1) most people aren't assholes so don't want to cause traffic accidents; and 2) we have laws and police organizations to track down and punish assholes who do

Slight sign modifications will not fool humans, at least not in this manner.

It looks like a couple of bumper stickers will cause problems even when they do not obstruct the actual sign.

This is a pretty big issue that will have to be dealt with.

Except that in a world where self-driving vehicles are relying on visual recognition of signage, it would become obvious PRETTY QUICKLY if the modifications to a STOP sign made it look like a 65 mph speed limit, and it could be treated exactly as severely as if someone had covered the stop sign with a 65 mph speed limit sign.

We're not talking about modifications causing a sign to accidentally be mistaken for something else, we are talking about deliberate modifications to road signs that cause vehicles to misinterpret them.

If you modified a sign so that most users still perceived it as a stop sign but colorblind people misread it as a speed limit you'd be doing exactly the same kind of thing.

I mean, it's not like there's some free-speech right to graffiti on road signage in the first place, let alone to modify it so that some road users will misunderstand the sign's meaning. If you interfere with a roadsign in order to deliberately confuse road users, you are a criminal.


Images with Zero Modifications Can Completely Fool Human Sight!

Examples: http://www.ritsumei.ac.jp/~akitaoka/index-e.html

(In case it's not obvious, the implication is that it's possible to engineer street sign modifications that fool human beings too.)

And there is a reason why we design signs the way we do and not use optical illusions.

This is about fooling vision systems with adversarial modifications.

Everyone needs to step outside themselves and see that human vision systems can be fooled with adversarial mods as well, just not with the same bugs as computer vision...
