Here is how to perform testing of the Point-GNN algorithm using LiDAR data in the simulated environment for autonomous vehicle designs.
There was a time when self-driving cars taking over the roads was a distant vision. Today, the situation has changed dramatically. Although autonomous vehicles (AVs) took longer than expected to make their first appearance, automotive experts believe that this technology is poised to mature quickly.
Automotive OEMs and their key technology partners are already in the process of developing and testing truly self-driving vehicles—Level 5 autonomous cars—that can cruise along open roads without human drivers. This is partly made possible through the technology of sensor fusion, powered by machine-learning algorithms.
Perception and decision-making
While perception includes the ability of the machine-learning model to accurately identify and classify surrounding objects, decision-making encompasses the ability of the algorithm to determine the position of the object in the future and the possibility of a collision. The vision of a Level 5 autonomous vehicle would be possible only if both these milestones are achieved with accuracy approaching 100%.
Both milestones pose a fair degree of challenge to the machine-learning engineer. To test and ensure supreme accuracy in the perception and decision-making processes, our engineering team designed a simulation environment to test autonomous vehicles. Any algorithm can be tested using this simulated environment. Algorithms can utilize inputs from LiDAR, camera, and radar (as necessary) and provide accurate predictions.
In this article, we discuss the testing of the Point-GNN algorithm in the simulated environment.
Point-GNN is a machine-learning algorithm that processes point-cloud data for object detection. Here, we discuss the use of LiDAR data and Point-GNN, as well as a modified ResNet, for the detection and tracking of 3D objects across multiple frames. We also use this data to predict collisions and the time to collide—a crucial factor for avoiding accidents.
We used sensor fusion to improve the accuracy of the prediction. We also analyzed the performance of the graph-based Point-GNN approach when compared with alternative methods.
Development of simulation environment
The work on this project started toward the end of 2020. In the initial days of the project, we focused on developing the simulation environment. Sensor-fusion algorithm development started after the simulation environment was ready for testing.
In our project, we used the Carla Simulator to develop the demonstrator. The underlying Unreal Engine of Carla manages the rendering, physics of objects, and their movements. It also controls the movement of non-player characters (NPCs) that are essential for realism. Additionally, Carla supports the simulation of various weather and lighting conditions to test our algorithms for ADAS/AD functionalities.
Carla also provides various sensor suites such as radar, LiDAR, color/grayscale cameras, IMUs, and more. For this project, we configured a LiDAR sensor for 3D point-cloud data and a camera for visualization.
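As an illustration, a LiDAR sensor and a camera can be attached to an ego vehicle through the Carla Python API roughly as follows. This is a configuration sketch, not our exact setup: it assumes a Carla server is running on the default port and that an ego vehicle has already been spawned, and the attribute values are illustrative.

```python
import carla

client = carla.Client('localhost', 2000)
world = client.get_world()
bp_lib = world.get_blueprint_library()

# Rotating LiDAR for 3D point-cloud data (illustrative attribute values)
lidar_bp = bp_lib.find('sensor.lidar.ray_cast')
lidar_bp.set_attribute('channels', '64')
lidar_bp.set_attribute('range', '100.0')
lidar_bp.set_attribute('points_per_second', '1300000')
lidar_bp.set_attribute('rotation_frequency', '10')

# RGB camera for visualization
camera_bp = bp_lib.find('sensor.camera.rgb')

# Attach both sensors to an already-spawned ego vehicle
ego = world.get_actors().filter('vehicle.*')[0]
lidar = world.spawn_actor(lidar_bp,
                          carla.Transform(carla.Location(z=2.4)),
                          attach_to=ego)
camera = world.spawn_actor(camera_bp,
                           carla.Transform(carla.Location(x=1.5, z=2.0)),
                           attach_to=ego)

# Stream each LiDAR sweep to disk as a .ply point cloud
lidar.listen(lambda data: data.save_to_disk('lidar/%06d.ply' % data.frame))
```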
Carla also offers a data pipeline for training and validation of ADAS/AD algorithms. In our project, we used the classic data pipeline, which comprises subsystems for perception, planning, and control.
KITTI vision benchmark suite
The need for simulated data in autonomous-driving applications has become increasingly important, both for validation of pretrained models and for training new models. For these models to generalize to real-world applications, it is critical that the underlying dataset contains a variety of driving scenarios and that the simulated sensor readings closely mimic real-world sensors. In this project, we used the KITTI dataset for training and testing.
Because each sensor has its own coordinate system, it is necessary to transform coordinates from one reference frame to another in order to fuse data from multiple sensors. For instance, the bounding boxes of objects detected in the point-cloud data (from the LiDAR sensor) must be transformed into the camera frame so that they can be drawn accurately on the image data.
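This LiDAR-to-image transformation can be sketched with the calibration-matrix conventions that the KITTI files use (`Tr_velo_to_cam`, `R0_rect`, `P2`). The matrices below are toy values for illustration, not actual KITTI calibration data.

```python
import numpy as np

def project_lidar_to_image(points_lidar, Tr_velo_to_cam, R0_rect, P2):
    """Project Nx3 LiDAR points into pixel coordinates.

    Tr_velo_to_cam: 3x4 rigid transform from the LiDAR to the camera frame.
    R0_rect:        3x3 rectification rotation.
    P2:             3x4 camera projection matrix.
    (The matrix names follow the KITTI calibration files.)
    """
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])    # Nx4 homogeneous
    pts_cam = R0_rect @ (Tr_velo_to_cam @ pts_h.T)        # 3xN in camera frame
    pts_img = P2 @ np.vstack([pts_cam, np.ones((1, n))])  # 3xN on image plane
    pts_img = pts_img[:2] / pts_img[2]                    # perspective divide
    return pts_img.T                                      # Nx2 pixel coordinates

# Toy calibration: LiDAR x-forward/y-left/z-up -> camera z-forward/x-right/y-down
Tr = np.array([[0., -1., 0., 0.],
               [0., 0., -1., 0.],
               [1., 0., 0., 0.]])
R0 = np.eye(3)
P2 = np.array([[700., 0., 600., 0.],
               [0., 700., 200., 0.],
               [0., 0., 1., 0.]])
pts = np.array([[10.0, 0.0, 0.0]])  # a point 10 m directly ahead of the sensor
print(project_lidar_to_image(pts, Tr, R0, P2))  # [[600. 200.]] (principal point)
```

The corners of a detected 3D bounding box can be passed through the same function to draw the box on the camera image.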
A high-level architecture diagram of our project is shown below:
Figure 1 The block diagram shows the high-level view of the solution architecture. Source: Embitel
To detect 3D objects from a point cloud, the point cloud is first represented as a graph. A graph is well suited to point-cloud data because, like a point cloud, it has no fixed form. It consists of vertices/nodes and edges, as shown below:
Figure 2 Point cloud to graph conversion comprises vertices/nodes and edges. Source: Embitel
A graph G can be constructed for a point cloud as G = (P, E), where P is the set of points (vertices) and E is the set of edges connecting points that lie within a fixed radius of each other.
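This fixed-radius construction of G = (P, E) can be sketched with a k-d tree, which avoids the quadratic cost of comparing every pair of points. The points and radius below are illustrative, not taken from our dataset.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_graph(points, radius):
    """Construct G = (P, E): vertices are the points, and edges connect
    every pair of points within a fixed radius of each other."""
    tree = cKDTree(points)
    edges = tree.query_pairs(r=radius)  # set of (i, j) index pairs, i < j
    return points, sorted(edges)

P = np.array([[0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.6, 0.0],
              [5.0, 5.0, 5.0]])  # an isolated point with no neighbors
vertices, E = build_graph(P, radius=1.0)
print(E)  # [(0, 1), (0, 2), (1, 2)]
```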
We also used the Voxel downsampling technique before the construction of the initial graph to reduce computational complexity.
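Voxel downsampling partitions space into a regular grid and keeps one representative point per occupied cell, shrinking the graph before it is built. A minimal NumPy sketch (keeping the centroid per voxel; the points and voxel size are illustrative):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Keep one representative point (the centroid) per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)  # voxel index per point
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    counts = np.bincount(inverse)
    out = np.zeros((inverse.max() + 1, points.shape[1]))
    for dim in range(points.shape[1]):
        out[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
    return out

pts = np.array([[0.1, 0.1, 0.0],
                [0.2, 0.2, 0.0],   # falls in the same 0.5 m voxel as the first
                [1.6, 0.1, 0.0]])  # falls in a different voxel
print(voxel_downsample(pts, voxel_size=0.5).shape)  # (2, 3): 3 points -> 2
```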
GNNs use the concept of nodes being defined by their neighbors to derive output. So, in Figure 2, the state of A can be updated based on the states of B, E, and D and the features of edges AB, AE, and AD.
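The update of node A from its neighbors can be sketched as a single message-passing step: each neighbor's state is mapped through a shared weight matrix, the messages are combined with a permutation-invariant max pooling, and the result is merged with the node's own state. This is a simplified illustration, not the actual Point-GNN update, which additionally feeds the relative point coordinates into a learned edge function; the weights and states below are placeholders.

```python
import numpy as np

def update_node(states, edges, node, W_edge, W_self):
    """One simplified message-passing step for a single node: map each
    neighbor state through W_edge, max-pool the messages, combine with
    the node's own state through W_self, and apply ReLU."""
    neighbors = [j for (i, j) in edges if i == node] + \
                [i for (i, j) in edges if j == node]
    messages = np.stack([W_edge @ states[j] for j in neighbors])
    aggregated = messages.max(axis=0)  # permutation-invariant pooling
    return np.maximum(W_self @ states[node] + aggregated, 0.0)

rng = np.random.default_rng(0)
states = {n: rng.standard_normal(4) for n in range(5)}  # A..E as nodes 0..4
edges = [(0, 1), (0, 3), (0, 4)]  # edges AB, AD, AE from Figure 2
new_state_A = update_node(states, edges, node=0,
                          W_edge=np.eye(4), W_self=np.eye(4))
print(new_state_A.shape)  # (4,)
```

Because the max pooling ignores the order of the neighbors, the same update applies to any node regardless of how many neighbors it has.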
Prediction of collision and time to collide
After the object-detection stage, the next steps are prediction of collision and computation of time to collide. In this context, “ego vehicle” is the term used to refer to the vehicle that has the sensor suite fitted and is controlled by the user (manually or through API calls).
Prediction is done in two steps: the detected bounding boxes are first extended based on the motion of each tracked object, and the extended boxes are then checked for intersection with the ego vehicle's path. Examples of bounding boxes before and after extension are shown below, along with the LiDAR data.
Figure 3 This image shows the detected bounding boxes. Source: Embitel
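The time-to-collide computation can be sketched under a constant-velocity assumption: given the relative position and velocity of a tracked object with respect to the ego vehicle, the time to collide is the separation divided by the closing speed. This is a simplified point-mass illustration, not our full bounding-box-based implementation; the positions and velocities are made up.

```python
import numpy as np

def time_to_collide(ego_pos, ego_vel, obj_pos, obj_vel):
    """Time to collide under a constant-velocity assumption.
    Returns None when the object is not closing on the ego vehicle."""
    rel_pos = np.asarray(obj_pos, dtype=float) - np.asarray(ego_pos, dtype=float)
    rel_vel = np.asarray(obj_vel, dtype=float) - np.asarray(ego_vel, dtype=float)
    distance = np.linalg.norm(rel_pos)
    # Component of relative velocity along the line of sight, sign-flipped
    # so that a positive value means the gap is shrinking.
    closing_speed = -np.dot(rel_pos, rel_vel) / distance
    if closing_speed <= 0:
        return None  # separating or parallel: no collision predicted
    return distance / closing_speed

# Ego vehicle at 15 m/s closing on a stopped vehicle 30 m ahead
print(time_to_collide([0, 0], [15, 0], [30, 0], [0, 0]))  # 2.0 (seconds)
```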
Testing and evaluation of different algorithms
After installing the necessary libraries in the programming environment, we used the 3D object-detection data from the KITTI dataset. For Point-GNN, a pretrained model is employed, and evaluation is run over all 7,500+ test samples.
For comparison, a modified ResNet is trained and tested on the KITTI data. We used pretrained ResNet parameters and added a layer of our own to fine-tune the model on the KITTI training set. For the modified ResNet, training is performed on roughly 7,400 training samples, and testing uses the same 7,500+ test samples.
The workflow of Point-GNN in Carla is shown below:
Figure 4 The Point-GNN workflow is shown in Carla Simulator. Source: Embitel
We found that Point-GNN has better mean precision than the modified ResNet. However, the precision for pedestrians is significantly lower than that for cars.
Figure 5 This is what 3D object detection with LiDAR looks like. Source: Embitel
Figure 6 The comparison shows the difference in object-detection performance between the Point-GNN and modified ResNet approaches. Source: Embitel
The road ahead
Currently, our team is working on further fortifying the system's perception and prediction capabilities.
This article was originally published on EDN.
Vidya Sagar Jampani is head of Embitel’s IoT Business Unit.
Leya Lakshmanan is head of marketing at Embitel Technologies.