Two dimension data for a three dimension problem - Is Elon Wrong About LiDAR?

This turns out to be an interesting article.

Musk’s prediction cast a spotlight on a rapidly growing divide in the world of AV (autonomous vehicle) development: whether to aim for vehicles that, like human drivers, can navigate the world through sight alone or if sensors like LiDAR are still necessary to counter-balance some of the limitations of computer vision.

Is Elon Wrong About LiDAR? (Scale)

While the world is 3D, the self-driving car needs a 2D “map” to navigate. Lidar works on the same principle as radar but uses reflected laser light instead of radio waves. It aids the 3D-to-2D conversion by measuring ranges directly, whereas video alone has difficulty with precise ranges.
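To make the 3D-to-2D conversion concrete, here is a minimal sketch of flattening lidar returns into a top-down grid of occupied cells. It is only an illustration under simple assumptions: the function name `lidar_to_grid`, the (range, azimuth, elevation) input format, and the grid parameters are all made up for this example and are not taken from the article or from any real lidar library.

```python
import math

def lidar_to_grid(returns, cell_size=1.0, half_extent=10.0):
    """Flatten lidar returns into a set of occupied 2D grid cells.

    Each return is (range_m, azimuth_deg, elevation_deg), with the sensor
    at the origin and azimuth measured clockwise from straight ahead.
    All names and parameters here are hypothetical.
    """
    occupied = set()
    for range_m, azimuth_deg, elevation_deg in returns:
        azimuth = math.radians(azimuth_deg)
        elevation = math.radians(elevation_deg)
        # Project the slant range onto the ground plane.
        horizontal = range_m * math.cos(elevation)
        x = horizontal * math.sin(azimuth)  # metres to the right
        y = horizontal * math.cos(azimuth)  # metres ahead
        # Ignore anything outside the area the grid covers.
        if abs(x) > half_extent or abs(y) > half_extent:
            continue
        occupied.add((math.floor(x / cell_size), math.floor(y / cell_size)))
    return occupied

# One return dead ahead, one off the starboard bow, one too far away to plot.
cells = lidar_to_grid([(5.5, 0.0, 0.0), (5.5, 45.0, 0.0), (20.0, 0.0, 0.0)])
```

The point is that every lidar return already carries a measured range, so the conversion is plain trigonometry. A camera-only system has to estimate that range before it can do the same thing.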

This is very similar to the issue discussed in the thread about the third mate on the Exxon Valdez. A pilot, being in familiar waters, can convert the 3D view he sees to the 2D “map” in his head, while the third mate has to use a literal map (the chart) to do the same task.

Just as the self-driving car needs lidar (or not), the third mate needs radar ranges and compass bearings to make the conversion.
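As a rough sketch of what that conversion involves, the fix from a single radar range and bearing to a charted landmark can be worked out as below. The assumptions are mine: a flat local chart with coordinates in nautical miles east and north, a true bearing (already corrected for variation and deviation), and a hypothetical function name `fix_from_range_and_bearing`.

```python
import math

def fix_from_range_and_bearing(landmark_east, landmark_north, range_nm, bearing_deg):
    """Return the ship's (east, north) position on a flat local chart.

    The landmark position is read off the chart, range_nm comes from the
    radar, and bearing_deg is the true bearing FROM the ship TO the landmark.
    Hypothetical helper for illustration only.
    """
    bearing = math.radians(bearing_deg)
    # The ship lies range_nm back along the bearing line from the landmark.
    ship_east = landmark_east - range_nm * math.sin(bearing)
    ship_north = landmark_north - range_nm * math.cos(bearing)
    return ship_east, ship_north

# Landmark charted at (3.0, 4.0), bearing 090 true at a range of 2.0 miles.
east, north = fix_from_range_and_bearing(3.0, 4.0, 2.0, 90.0)
```

The pilot does the equivalent by eye and from memory. The third mate has to take the measurements and plot them each time.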

What does it mean that humans struggle to annotate these types of 2D scenes (while managing to walk around our houses and drive to work without a second thought)? It’s really a testament to just how differently our brains go about perceiving the world compared to the software of a self-driving car. When it comes to planning physical movements, we don’t need to perform mathematical calculations in our head about the environment to know to hit the brakes when a driver runs a red-light in front of us.

If I needed a “chart” of the inside of my house and had to plot bearings and ranges to navigate it, going to the kitchen and back for a cup of coffee would be much more difficult. Likewise, the third mate has a much more difficult task than the pilot.

I find it fairly obvious that advances in neural network AI will soon lead to self-driving cars that need only a camera input or two, probably a CNN stacked on some sort of LSTM (but I’m no expert). Just consider how well a gifted amateur can do on his own these days:

This field is advancing with blinding speed, ref. the obliquely related GPT-2, which scares me quite a bit.

Lidar is just another means of finding one’s own position relative to others.
The problem always remains at the “Artificial Intelligence” level. AI does not fall from the sky; humans must program it.

How can you program the decision to hit a full school bus or an innocent tree?
How can you program the decision to hit a full cruise ship or a floating empty wreck?

That’s only partially true. Programming is a relatively small piece of the puzzle; the heavy lifting happens in the training phase. It’s all about number-crunching power and a good training data set. GPT-2 is an example of combining an old training algorithm with an enormous data set and enough computing power to create something hair-raising.

You reward your AI for doing what you want, and punish it for doing what it shouldn’t.
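A toy illustration of the reward-and-punish idea, nothing like a real driving system: a single made-up situation (a car runs a red light ahead), two possible actions, and a running value estimate that gets nudged toward the reward each time an action is tried. The reward numbers, the function name `train`, and the parameters are all invented for this sketch.

```python
import random

def train(episodes=500, learning_rate=0.1, explore=0.1, seed=0):
    """Learn which action is preferred from rewards alone (toy example)."""
    rng = random.Random(seed)
    actions = ["brake", "continue"]
    # Hypothetical reward scheme: braking is rewarded, carrying on is punished.
    reward = {"brake": 1.0, "continue": -1.0}
    value = {action: 0.0 for action in actions}
    for _ in range(episodes):
        if rng.random() < explore:
            # Occasionally try a random action to see what happens.
            action = rng.choice(actions)
        else:
            # Otherwise pick whatever currently looks best.
            action = max(actions, key=value.get)
        # Nudge the estimate for that action toward the reward received.
        value[action] += learning_rate * (reward[action] - value[action])
    return value

learned = train()
preferred = max(learned, key=learned.get)
```

Nobody wrote a rule saying “brake when someone runs a red light”. The preference comes out of the rewards, which is why the training data and the reward scheme matter more than the program itself.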

My take is that the question is whether to use video imagery alone or video imagery AND lidar. Without lidar, fewer sensors are needed, but the computational problem becomes much more difficult. Musk believes that problem can be solved on the computational side without lidar.

Here is a quote from the post:

In our experience the best way around these issues is to reference high resolution 3D data — this is where LiDAR comes in. Looking at the LiDAR point cloud takes almost all this guesswork out…

I worked on the Honeywell entry for the DARPA Autonomous Vehicle challenge in the 70’s. That being then and this being now, we outlined/partly wrote a Concept of Operations (Conops) for the Gold-Plated Golf Cart (as we called it). Interesting exercise, especially if you toss malicious actors into the mix; Elon should try it 🙂

So one takeaway was this: if you are going to reduce the perceived risk of collision/allision to a level that “ordinary” people will accept, you are never going to do it by just having a computer look at a picture. Transmitters/transponders are the only thing that will work. Objects of great interest (other vehicles, stoplights, stop signs, pedestrians in crosswalks, etc.) are going to have to announce their presence and bona fides actively. The aircraft collision avoidance folks learned this lesson a while ago. It’s not a vehicle problem, it’s an infrastructure problem.

A bit of artistry to reinforce the point:



…punish it, as in the other thread, “I wonder if that AI-box floats”?
