Parallel Programming on the GPU
SLAM, or Simultaneous Localization And Mapping, is a key technology for enabling robots and autonomous vehicles to operate in complex, unknown environments. The basic idea behind SLAM is to use sensor measurements (such as camera images, LIDAR scans, or sonar data) to build a map of the environment and, at the same time, to estimate the robot's location and orientation within that map. Visual SLAM, or vSLAM, uses a camera as the sensor, extracting image features to map the environment.
ORB-SLAM2 is a state-of-the-art vSLAM system that uses feature-based methods for real-time camera tracking and 3D reconstruction. It is built on the ORB (Oriented FAST and Rotated BRIEF) feature detector and descriptor, a fast and efficient method for detecting and matching feature points in images, and it combines geometric and photometric information to estimate the camera pose and build a 3D map of the environment. ORB-SLAM2 is computationally intensive, so harnessing the parallel processing power of the GPU can yield significant performance improvements. Implementing ORB-SLAM2 in CUDA involves modifying the existing CPU-based code to offload work to the GPU: the computationally intensive parts of the code are identified and converted into GPU kernels that can be executed in parallel.
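As an illustration of the kind of kernel such a refactoring produces, the sketch below (not taken from ORB-SLAM2 itself) parallelises brute-force matching of 256-bit ORB descriptors by Hamming distance, with one query descriptor handled per thread. The descriptor layout (eight 32-bit words per descriptor) and the kernel name matchOrbDescriptors are illustrative assumptions.

    // Minimal sketch: brute-force ORB descriptor matching on the GPU.
    // Each thread takes one query descriptor and scans all train descriptors
    // for the smallest Hamming distance.
    #include <cuda_runtime.h>
    #include <cstdint>

    __global__ void matchOrbDescriptors(const uint32_t* query,   // numQuery * 8 words
                                        const uint32_t* train,   // numTrain * 8 words
                                        int numQuery, int numTrain,
                                        int* bestIndex)          // best train index per query
    {
        int q = blockIdx.x * blockDim.x + threadIdx.x;
        if (q >= numQuery) return;

        int bestDist = 257;   // larger than any possible 256-bit Hamming distance
        int bestIdx  = -1;
        for (int t = 0; t < numTrain; ++t) {
            int dist = 0;
            for (int w = 0; w < 8; ++w)
                dist += __popc(query[q * 8 + w] ^ train[t * 8 + w]);  // popcount of XOR = Hamming distance
            if (dist < bestDist) { bestDist = dist; bestIdx = t; }
        }
        bestIndex[q] = bestIdx;
    }

Because every query descriptor is independent, this loop is an easy candidate for the "identify a hotspot, turn it into a kernel" approach described above; the real ORB-SLAM2 pipeline would apply the same idea to its own hotspots (feature extraction, matching, pose optimisation preparation).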
Our research group has implemented a lidar-based SLAM application on an automotive robot powered by an NVIDIA Jetson Nano and a lidar sensor. The Jetson Nano is an embedded system-on-module (SoM) and developer kit from the NVIDIA Jetson family, featuring an integrated 128-core Maxwell GPU, a quad-core 64-bit ARM Cortex-A57 CPU, and 4 GB of LPDDR4 memory.
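The device's capabilities can be confirmed at runtime with the standard CUDA runtime API. The minimal sketch below simply queries and prints the GPU properties; on a Jetson Nano this is expected to report a Maxwell GPU of compute capability 5.3 with a single 128-core multiprocessor sharing the LPDDR4 memory with the CPU.

    // Minimal sketch: query the properties of device 0 with the CUDA runtime API.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        cudaDeviceProp prop;
        cudaError_t err = cudaGetDeviceProperties(&prop, 0);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("Device:              %s\n", prop.name);
        printf("Compute capability:  %d.%d\n", prop.major, prop.minor);
        printf("Multiprocessors:     %d\n", prop.multiProcessorCount);
        printf("Global memory:       %zu MB\n", prop.totalGlobalMem / (1024 * 1024));
        return 0;
    }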
The code is organised into the following modules; a sketch of how the host code drives the kernels follows the listing.
main
|- kernels
|- host code
|- pre-processing
|- evaluation
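The example below is a hedged sketch of how the host code module might drive a kernel from the kernels module on the Jetson Nano. It allocates unified memory (cudaMallocManaged), which suits the Nano's CPU/GPU-shared LPDDR4, launches the matchOrbDescriptors kernel sketched earlier (assumed to be compiled and linked in from the kernels module), and reads the result back on the CPU after synchronisation. All names and sizes are illustrative, not the project's actual values.

    // Minimal host-code sketch: allocate unified memory, launch a kernel from
    // the kernels module, synchronise, and read back the result.
    #include <cuda_runtime.h>
    #include <cstdint>
    #include <cstdio>

    // Declaration of the kernel assumed to be defined in the kernels module.
    __global__ void matchOrbDescriptors(const uint32_t* query, const uint32_t* train,
                                        int numQuery, int numTrain, int* bestIndex);

    int main() {
        const int numQuery = 1000, numTrain = 2000, words = 8;

        uint32_t *query, *train;
        int *bestIndex;
        // Unified memory is visible to both CPU and GPU, avoiding explicit copies.
        cudaMallocManaged(&query,     numQuery * words * sizeof(uint32_t));
        cudaMallocManaged(&train,     numTrain * words * sizeof(uint32_t));
        cudaMallocManaged(&bestIndex, numQuery * sizeof(int));

        // ... fill query/train with ORB descriptors produced by the pre-processing stage ...

        const int threads = 128;                                // warp-multiple block size
        const int blocks  = (numQuery + threads - 1) / threads; // enough blocks to cover all queries
        matchOrbDescriptors<<<blocks, threads>>>(query, train, numQuery, numTrain, bestIndex);
        cudaDeviceSynchronize();                                // wait for the GPU before reading results

        printf("best match for query 0: %d\n", bestIndex[0]);

        cudaFree(query); cudaFree(train); cudaFree(bestIndex);
        return 0;
    }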