Google's self-driving car gathers almost 1 GB/sec (twitter.com)
133 points by hammock 1729 days ago | 73 comments

Dear moderator, why did you change the title? I worked hard to abbreviate it to 80 characters without changing the wording or editorializing. And it's not the fact that I found interesting, it was the picture. Thanks.

For the record I had GOOG Self-Driving Car gathers almost 1 GB/sec.What it "sees" making L turn [pic]

I can't believe that you were willing to abbr. "Google" with "GOOG" but had to use "almost" in this sentence (utterly redundant word here) or "1 GB/sec" instead of "1 GB/s"

Sorry, I don't know how much "effort" you put into your abbreviation, but your title looks like a dirty hack that I would never want on the front page.

Google Car gathers 1GB/sec. What it "sees" making Left turn

That would have been better in my opinion.

With all due respect, I think the mod changing the title is justified in this case.

I don't think you can leave out the self-driving car. 'Google Car' is ambiguous and could mean either the Google Streetview cars or their self-driving cars.

If you think so:

-) Google Self-Driving Car gathers 1GB/s. Picture of it analyzing left turn

-) Google Self-Driving Car gathers 1GB/s. What it "sees" making Left Turn

My criticism is not so much of the title OP chose as of the fact that he/she felt the need to argue that a mod shouldn't have changed it.

OP wanted to know why the mod did it; I answered.

The word "almost" was not kept for accuracy's sake (we could have said ~1GB/sec), but for its storytelling quality. Imagine if we had said the car "collects barely 1 GB/sec." Without "almost," it becomes more difficult, however slightly, to grasp that 1 GB/sec is an impressive amount of data.


I really dislike the mods here changing titles to bend to personal views. I've had a few submissions that gained good traction and got heavy upvotes until a mod changed the title to something meaningless and unrelated.

They seem to mostly change the titles to what the title of the actual article was. This is of course a terrible policy when the blog in question [edit: does not] use descriptive titles, or titles that only make sense if you are coming from the perspective of already being on the blog.

A good example would be the post I made about a third-party audit of CloudFlare's security and how they failed to block any of the most basic, widely known attacks. The mods' change to the academic title of the PDF paper failed to convey why it was interesting to HN readers at all.

Some stories would drop off the front page fast if the original title was used initially. By the time the moderator changes it, usually there is enough weight that people read them because of the # of upvotes.

A lot of the titles are terrible. Like a single word or phrase that means absolutely nothing without context. I hope the mods are not doing that.

It doesn't seem to have changed the meaning much. If anything, I think it's a bit easier to read at a quick glance. I like to scan titles or I'd never get past the first few pages on HN.

GOOG Self-Driving Car gathers almost 1 GB/sec.What it "sees" making L turn


Google's self-driving car gathers almost 1 GB/sec

Reading the edited title, I wasn’t going to click through to Twitter. I figured it was just a one line tweet that wouldn’t explain much.

After reading hammock’s comment, I clicked the link to see the picture. The ‘[pic]’ suffix to the title was informative, no need to remove it.

I would have never clicked the link to the twitter post if not for the callout to the "image".

Titles do change our behavior by providing us with clues as to why we should be interested.

He is off by an order of magnitude. The Velodyne 64 on that car only has a 100 Mbit connection [1]. Each of the 5-10 radars on that vehicle is connected over a 1 Mbit CAN bus [2]. The front-facing camera is almost definitely a GigE or FireWire camera, so it's at most 1000 Mbit.

Doing the math: 100 + 10*1 + 1000 = 1110 Mbit/s != 8192 Mbit/s (1 GB/s)

1. http://velodynelidar.com/lidar/products/brochure/HDL-64E%20S...

2. http://www.kvaser.com/zh/about-can/the-can-protocol/19.html
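For anyone who wants to check the upper bound above, here is a minimal sketch of the same link budget in Python. The link speeds and the radar count are the figures quoted in the comment, not measured values, and the constant and function names are made up for illustration. Note that the sum comes to 1110 Mbit/s, and that "1 GB/s" is 8192 Mbit/s only with binary prefixes (8000 with decimal ones).

```python
# Upper-bound link budget, using the figures quoted in the comment (Mbit/s).
VELODYNE_LINK_MBIT = 100   # Ethernet link of the Velodyne, per the comment
CAN_BUS_MBIT = 1           # one CAN bus per radar, per the comment
RADAR_COUNT = 10           # upper end of the "5-10 radars" estimate
CAMERA_LINK_MBIT = 1000    # GigE camera link, per the comment


def total_link_mbit(radars: int = RADAR_COUNT) -> int:
    """Sum the maximum link capacity of every sensor listed above."""
    return VELODYNE_LINK_MBIT + radars * CAN_BUS_MBIT + CAMERA_LINK_MBIT


def gigabyte_per_sec_in_mbit(binary: bool = True) -> int:
    """Express 1 GB/s in Mbit/s, with binary (1024) or decimal (1000) prefixes."""
    return 8 * (1024 if binary else 1000)


total = total_link_mbit()
claimed = gigabyte_per_sec_in_mbit()
ratio = round(claimed / total, 1)
print(f"{total} Mbit/s available vs {claimed} Mbit/s claimed ({ratio}x)")
```

Even with every link saturated, the claimed figure is roughly seven times larger than what these particular sensors could deliver.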

If other robots I've worked with (e.g. the PR2) are any indication, 1 GB/sec is probably on the low end for the total data throughput. The PR2 had a high-res camera (GigE?), quad stereo cams in the head (narrow and wide FoV), dual "wrist" cameras, a Kinect (usually), dual Hokuyo UTM LRFs, full 14+ DoF kinematics at 1 kHz, and lots and lots of diagnostic info (e.g. "heartbeat", system diagnostics, localization signals, IMUs, etc.).

Granted, we didn't log all that data all the time, but TL;DR: I wouldn't nitpick over a factor of 8 based on your own calculations.

That was an upper bound to show the absurdity of his unsourced "fact". The reality is much lower because the main sensor is the Velodyne, and it only produces 5-6 MB/s. The primary focus of his picture is the Velodyne point cloud, which he says is "1 GB per second", which is 100 times larger than it really is.

I have worked on several fully autonomous vehicle projects based on Velodynes, and their processing and logging rates are around 5-20 MB/s (as you would expect), so seeing someone quote "1 GB/s" is pretty awful.

How well does the Velodyne work in heavy rain or snow?

Wow, never thought about that in the Google Car context. To be fair, I dislike driving in heavy rain myself, and most people drive much slower in such conditions. Still, that might be more of an issue in rainy Hamburg compared to the Bay Area ...

Not well at all. Snow covers road markings and doesn't always reflect back to the sensor; rain is similar. Puddles are invisible.

It's also not officially waterproof. That said, we used ours in light rain a few times and nothing bad happened to it. The data was pretty much useless, however (although determining that was a large part of the point of the experiment).

Agree. The point cloud in the self-driving car itself is probably on the order of 1/4 to 1/2 gig a second.

No, the Velodyne only has a 100 Mbit connection. It sends about 5000 1206-byte packets per second [1], giving a data rate of ~50 Mbit/s (1/20 of a gigabit).

1 - http://velodynelidar.com/lidar/products/manual/63-HDL64ES2h%...
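The packet arithmetic above is easy to reproduce. This is only a sketch: the packet rate and packet size are the approximate numbers quoted in the comment, and the function name is hypothetical.

```python
PACKETS_PER_SEC = 5000  # approximate packet rate quoted in the comment
PACKET_BYTES = 1206     # packet size quoted in the comment


def data_rate(packets_per_sec: int, packet_bytes: int) -> tuple[float, float]:
    """Return the stream rate as (MB/s, Mbit/s), using decimal prefixes."""
    bytes_per_sec = packets_per_sec * packet_bytes
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6


mb_per_sec, mbit_per_sec = data_rate(PACKETS_PER_SEC, PACKET_BYTES)
print(f"{mb_per_sec:.2f} MB/s, {mbit_per_sec:.2f} Mbit/s")
```

That works out to about 6 MB/s, or a little under 50 Mbit/s, which fits inside the 100 Mbit link with room to spare.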

The picture Gross shared is a frame from a 1.5 year old promo[1], so he may also have misheard 1Gb for 1GB.

[1] http://imgur.com/IfxYZL2 – the bottom image is from this presentation: http://youtu.be/YXylqtEQ0tk

You are making an unwarranted assumption that they are logging the raw Velodyne data. Most likely that is not the case. Pointcloud data computed from raw Velodyne data can easily require an order of magnitude more space.

That said, this guy may be more than an order of magnitude off (confused MB and GB?). If the 1 GB/sec figure is correct, then they'd fill up a 10 TB NAS in 3 hours. Forget about network throughput, how much storage do they carry in the back of that vehicle?
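A quick way to sanity-check the storage argument, assuming decimal units (1 TB = 1,000,000 MB) and a constant write rate. The helper name is made up, and the 10 MB/s line is only there for comparison with the lower estimates elsewhere in the thread.

```python
def hours_to_fill(capacity_tb: float, rate_mb_per_sec: float) -> float:
    """Hours needed to fill `capacity_tb` terabytes at a constant write rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB
    return capacity_mb / rate_mb_per_sec / 3600


at_1_gb_per_sec = round(hours_to_fill(10, 1000), 2)
at_10_mb_per_sec = round(hours_to_fill(10, 10), 1)
print(at_1_gb_per_sec, "hours at 1 GB/s")
print(at_10_mb_per_sec, "hours at 10 MB/s")
```

At 1 GB/s a 10 TB array lasts a bit under three hours; at 10 MB/s the same array lasts well over a week of continuous driving.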

Doesn't it also know about Google Maps data?

Meaning, it can look at a lot more data than just its raw sensors, every second.

Also, why do you assume the front-facing camera is the only one?

As far as I know, the "google maps" data is used for things like road locations, stop signs, and other traffic marker information. Their primary mapping data source is based on Velodyne data itself. You also have to think about storage needs: would it really carry a 4-6 TB storage array for just a few hours of driving?

I saw the tech lead of the project talk about 1.5 years ago. He said the primary sensor was the Velodyne, followed by radars. The only use for the camera mentioned is detecting stop light colors. At that time their main computer was a quad-core machine; processing that much data sounds beyond its limits as well.

Just because they're capturing it doesn't mean they're processing all of it immediately. This kind of data is great to keep around for simulations when you return home. Having ~10TB onboard storage isn't crazy.

From what I gather, they are capturing the Velodyne data and turning it into a thinned point cloud of the areas of the USA in which they drive. Within this point cloud they can then effectively do a least-squares fit of the real-time view to the historical view and get another data source for localisation. Of course, there's GPS/IMU fusion too, but this sort of SLAM approach is fairly robust under GPS loss, so combining them all in a filter is pretty standard practice. The scale of Google's data dwarfs most research projects, however.
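To make the "least-squares fit" idea concrete, here is a toy 2D sketch. It is not Google's method, which nobody in this thread has seen. It assumes point correspondences between the live scan and the stored map are already known, whereas real systems work in 3D and have to find the correspondences themselves (for example with an ICP-style loop). All names are hypothetical.

```python
import math


def transform(point, theta, tx, ty):
    """Rotate a 2D point by theta (radians), then translate by (tx, ty)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty)


def fit_rigid_2d(live, reference):
    """Least-squares rotation and translation mapping `live` onto `reference`.

    Both lists must have the same length, with live[i] matching reference[i].
    """
    n = len(live)
    lcx = sum(x for x, _ in live) / n
    lcy = sum(y for _, y in live) / n
    rcx = sum(x for x, _ in reference) / n
    rcy = sum(y for _, y in reference) / n
    dot = cross = 0.0
    for (lx, ly), (rx, ry) in zip(live, reference):
        ax, ay = lx - lcx, ly - lcy  # live point relative to its centroid
        bx, by = rx - rcx, ry - rcy  # map point relative to its centroid
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)   # angle minimising the squared error
    c, s = math.cos(theta), math.sin(theta)
    tx = rcx - (c * lcx - s * lcy)   # translation aligns the centroids
    ty = rcy - (s * lcx + c * lcy)
    return theta, tx, ty


# Synthetic check: the "map" is the live scan moved by a known pose.
live_scan = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]
map_points = [transform(p, 0.1, 2.0, -1.0) for p in live_scan]
theta, tx, ty = fit_rigid_2d(live_scan, map_points)
print(round(theta, 6), round(tx, 6), round(ty, 6))
```

The recovered pose is the vehicle's offset relative to the map, which is the extra localisation measurement that would then be fed into the filter alongside GPS and IMU.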

There's a great presentation about Google's self-driving car here (check the second half for video of the real-time telemetry that it gathers):


One of the things that I found really interesting is how the car inches forward at a stop sign in order to show the "driver's" intent to other drivers. Lots of actions that seem to be second-nature for human drivers have to be carefully emulated here.

Thank you for the link. It looks like the picture here is almost the same as what is shown in that video (from 1.5 years ago).

It is exactly the same. The picture on Twitter is the frame shown at 9:07 in the video. Given that the image is from a 1.5-year-old promo video, which has probably been circulating all over, that puts the rest of the information in a new perspective as well.

Here’s a comparison I just put together: http://imgur.com/IfxYZL2

For perspective, the ATLAS [1] experiment at the LHC records O(400 MB/sec). To disk. Permanently. (With a duty cycle of something like ~0.5, in principle.) So, I guess that's a different ballgame than temporary acquisition for decision making, but still. A decade ago it seemed like an insane amount of data, and now it is becoming more normal by the day.

ATLAS "sees" something more like 1.5 MB * 40 MHz, but the vast majority of it is discarded after at most three seconds and there is zero suppression involved. Most of "the full data for a collision" isn't even involved in the decision making whether to keep a particular event.

[1] http://atlas.ch/
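Multiplying out the ATLAS numbers above gives a sense of how much gets thrown away. This is a rough sketch using only the round figures from the comment (1.5 MB per event, 40 MHz, about 400 MB/s recorded); the variable names are made up.

```python
EVENT_SIZE_MB = 1.5         # approximate event size quoted in the comment
CROSSING_RATE_HZ = 40e6     # 40 MHz rate quoted in the comment
RECORDED_MB_PER_SEC = 400   # order-of-magnitude recording rate from the comment

raw_mb_per_sec = EVENT_SIZE_MB * CROSSING_RATE_HZ
raw_tb_per_sec = raw_mb_per_sec / 1e6
reduction_factor = raw_mb_per_sec / RECORDED_MB_PER_SEC
print(f"{raw_tb_per_sec:.0f} TB/s seen, reduced by a factor of {reduction_factor:,.0f}")
```

So the detector "sees" on the order of 60 TB/s and keeps roughly one part in 150,000, which is the same sense-a-lot, keep-a-little pattern people are speculating about for the car.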

A number I've found useful is that 1080p60 raw video is roughly 1Gb/s (using YUV). So the car has the equivalent of 8 such cameras. Doesn't seem unreasonable.

The impressive bit is paying attention to that much meaningful data. It would be extremely taxing to actively pay attention to that many screens for an extended period of time; granted, filtering limits that, just as peripheral vision does for us. I tend to take for granted the immense amount of data processed subconsciously.

What are things city planners can do from an infrastructure perspective so that we can drop the amount of data that needs to be analyzed in real time? For example, can we use special paint for road lines in order to lower the cost of detecting road lanes?

Keeping up with maintenance is probably the most important thing towards making either self-driving or human-driven cars safe. Faded, worn, or poorly visible road markings and signage have got to be a huge challenge. Signage and marking is already very well standardized and optimized for low ambiguity.

Making anything special is probably just going to make things more difficult: the cars have to be able to drive anywhere, regardless of whether the road has the correct markings or not.

I'll argue that signage and marking is already very well standardized and optimized for low ambiguity for human vision, but we are able to build things that humans cannot see but technology can.

The cars do have to be able to drive anywhere, but that doesn't mean we cannot improve quality of service in other areas. Suppose we can build roads where self driving cars can safely travel 100mph+ due to special infrastructure - I do see that as a big benefit.

Some people are using LiDAR/computer vision to decide where maintenance is needed http://www.nicta.com.au/media/previous_releases3/2012_media_...

I can't imagine too many cases where they'd be useful in everyday situations, because if the car can't handle everyday situations then it's not going to be a good car, and it will fall victim to the "we're concerned for safety" lobbying that rivals will engage in to keep it off the road until they've caught up. Perhaps machine-readable QR-code parking signs with the available hours; that'd be useful even without self-driving cars.




The purpose of most signage and road lines is to decrease ambiguity (Are there three or four lanes here? Which is better for a left turn?) and increase throughput safely. If signage could be made less ambiguous in such a way that self driving cars benefit, that's a good thing.

I think, however, that such a scheme should be based on a single point of truth, i.e. no QR codes: the signage that a human can read should be the same that the computer can read.

The car knows where it is to within a few cm. No need for anything except a database of parking information. Don't clutter up the environment with QR codes that humans can't read.

Last week we had snow on the road and I couldn't see the lines. On my way into work this morning, I noticed some truck had dropped clumps of hay on the road. I am thinking that the car had better be able to handle things without extra work on the road.

They should let every electronic light have some infrared LEDs and come up with a standard communications protocol to allow for fast transfers of alerts without the overhead of image processing.

For example, emergency vehicles could be broadcasting their presence and speed.

School buses could be doing the same and also broadcast when they are letting children out. Same with the blinking light on school zone signs.

Emergency vehicles already broadcast their presence to traffic signals:


What you're looking for is something similar to ADS-B in aircraft, hopefully designed properly this time with encryption and authentication.

I'd love to see better integration of live traffic safety data into the cars' navigation (ie. Highway police logging lane or road closures). So, if there's a big accident on a major freeway, cars will automatically begin to find early detours rather than getting caught in gridlock.

No! Do it like humans do it - use deep learning to let it grok its environment better with less / different sensor data. Also, make it pay attention to history (it may already, I don't know) and give it something like object permanence.

Move car parking between bike lanes and the street, as they do in Amsterdam.

Drop billions of RFID tags into asphalt, make a millimetre-accurate real world map and rent the data to Google. Or do nothing and let Google figure it out.

For reference, the raw RGB data for 1080p video at 30 fps works out to 178 MB/s.
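Here is the arithmetic behind that figure as a minimal sketch, assuming 8 bits per channel (24 bits per pixel for RGB) and no blanking or container overhead. The function name is hypothetical. The 178 figure matches the result in MiB/s; in decimal MB/s it is closer to 187. The last two lines apply the same formula to the 1080p60 YUV estimate earlier in the thread, assuming 8-bit 4:2:0 subsampling (12 bits per pixel), which the original comment does not specify.

```python
def raw_video_bytes_per_sec(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video data rate in bytes per second."""
    return width * height * bits_per_pixel * fps // 8


rgb_1080p30 = raw_video_bytes_per_sec(1920, 1080, 30, 24)
rgb_mb = round(rgb_1080p30 / 1e6, 1)       # decimal megabytes per second
rgb_mib = round(rgb_1080p30 / 2**20, 1)    # binary mebibytes per second
print(rgb_mb, "MB/s,", rgb_mib, "MiB/s")

# Assumption: 8-bit YUV 4:2:0, i.e. 12 bits per pixel on average.
yuv_1080p60 = raw_video_bytes_per_sec(1920, 1080, 60, 12)
yuv_gbit = round(yuv_1080p60 * 8 / 1e9, 2)
print(yuv_gbit, "Gbit/s")
```

Under those assumptions a raw 1080p60 YUV stream is nearer 1.5 Gbit/s than 1 Gbit/s, so 1 GB/s would correspond to about five such cameras rather than eight. The conclusion that it "doesn't seem unreasonable" still holds.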

How much data does a human driver gather per second?

From what I've seen on the highway? About 160 characters.

Does anyone know what happens with this data? Obviously it is analyzed in realtime for driving but is it also stored for machine learning purposes? Is that data shared so that other cars can take advantage of it?

It's 1 GB/sec of data from sensors. Likely processed, but not stored.

To get 1 GB/sec write throughput in a moving car you would need to be using some form of solid state storage, which would be pretty darn pricey over an hour test drive.

It's likely far cheaper to store it than it would be to run the test again. This is Google, they can certainly afford it – I'd be incredibly surprised if they weren't storing it.

Having developed autonomous vehicles in a previous career, I can tell you that raw sensor data is far less useful than you think. By the time you are starting to hand over controls to the system itself you are pretty damn confident that your pipeline up to the control logic is all very solid.

Also, I'm not entirely sure what you think the costs involved in a test are? We purchased about 20 acres of land for a test track, but that was the only real expense involved in day-to-day development.

The true data rates (see my other posts) are most likely around 5-10 MB/s, which is so low that the benefits outweigh the costs. After all, what if there is an issue with the software? How are you going to reproduce it without the logs, especially on an R&D system like Google has (yes, it's still R&D; they can't even drive in the rain)?

Your "true data rate" is completely wrong. You have no idea at what frequency they are converting analog to digital signals. You don't know for sure they are using off-the-shelf sensors. You're ignoring data sources in your calculations like OBD from the ECU or even GPS and AGPS data, etc.

You only need to store the data until you can upload it, which might not be that long if the car has an internet connection. Furthermore, I assume not all parts of that data are equally valuable. If this is raw data coming from sensors, I think it is fair to assume that a significant part is noise that can be discarded.

So I would assume that even though you are probably unable to store 1GB/s of data - today at least - you might still be able to store the interesting parts and use those.

Computers get hacked every now and then. I predict software assassinations in the future, i.e. somebody hacks into a self-driving car and installs a software component which will be activated at some point (e.g. when it identifies a specific car or person, it crashes into it). Then it can self-destruct, leaving no evidence. I ask, who will be responsible?

Probably find the hacker? Just like what we do right now ...

You are assuming detecting manipulation is straightforward or even possible. What if someone is able to MITM software updates?

I'm not assuming it's possible or feasible. I'm saying the problems are the same even today; it's difficult even now. Of course, hackers' capabilities will increase, but the same goes for those who are trying to catch them. Eventually, everyone will be monitored though ... And apparently the US govt. has already started doing that ...

Here is my analysis:

When somebody hacks into one's computer, he can't do any physical harm. He sure can steal money or spy on the victim, but he can't make the computer burn or otherwise harm the victim.

The computers that can be used for doing real damage (e.g. bomb activation, drone command and control centres, etc.) are well protected from hackers (not connected to the internet and physically inaccessible).

Contrary to this, self-driving cars are potentially dangerous and will be widely accessible and hackable. I see this as a real problem. We should not allow a computer to do things that can kill people, and a computer-driven car can kill people.

> We should not allow a computer to do things that can kill people

Should we not allow computers (auto-pilot) to fly airplanes?

Though your example is not the best (pilots are still there and can disable the auto-pilot and fly in manual mode), I will correct myself: I think we should not allow computers which can be easily accessed and potentially altered/hacked to do things that can kill people.

The thing is, computers are dangerous enough already, and depending on which ones you hack, they can cause plenty of destruction today. So of course security is important for all of them, and I'm sure Google will do what it can to make its self-driving car as secure as possible.

Most Boeing 747 craft run a very old, unpatched version of Solaris. Someone with user access could almost certainly crash a plane using that. The same holds true for a lot of consumer goods running embedded Linux. Even my damn TV has vulnerabilities.

>pilots are still there and can disable the auto-pilot

If the computer decides to allow the pilot to do that.

> We should not allow a computer to do things that can kill people, and a computer-driven car can kill people.

Your car's computer already has control of throttle, brakes, door locks, etc.

While we're at it, we should probably prevent people from doing things that can kill people.

People have already demonstrated this capability with modern human-driven cars. The only way you can avoid that would be to remove electronics from cars to get back to a 1960s level.

The loss of ABS, traction control, and engine management systems would likely kill more people than any phantom assassination threat ever will.

Would it be possible for someone to remotely interfere with the sensors on this car, causing a malfunction or collision?
