Parallel Programming on the GPU
SLAM, or Simultaneous Localization And Mapping, is a key technology for enabling robots and autonomous vehicles to operate in complex, unknown environments. The basic idea behind SLAM is to use sensor measurements (such as camera images, LIDAR scans, or sonar data) to build a map of the environment and, at the same time, to estimate the robot's location and orientation within that map. Visual SLAM, or vSLAM, uses a camera as the sensor, extracting image features to map the environment.
ORB-SLAM2 is a state-of-the-art vSLAM system that uses feature-based methods for real-time camera tracking and 3D reconstruction. It is built on the ORB (Oriented FAST and Rotated BRIEF) feature detector and descriptor, a fast and efficient method for detecting and matching feature points in images, and it combines geometric and photometric information to estimate the camera pose and build a 3D map of the environment. ORB-SLAM2 is computationally intensive, so harnessing the parallel processing power of the GPU can yield significant performance improvements. Implementing ORB-SLAM2 in CUDA involves modifying the existing CPU-based code to offload work to the GPU: the computationally intensive parts of the code are identified and converted into GPU kernels that can be executed in parallel.
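As an illustration of the kind of kernel such a refactoring produces, the sketch below (not taken from ORB-SLAM2 itself) parallelises brute-force matching of 256-bit ORB descriptors by Hamming distance, with one query descriptor handled per thread. The descriptor layout (eight 32-bit words per descriptor) and the kernel name matchOrbDescriptors are illustrative assumptions.

    // Minimal sketch: brute-force ORB descriptor matching on the GPU.
    // Each thread takes one query descriptor and scans all train descriptors
    // for the smallest Hamming distance.
    #include <cuda_runtime.h>
    #include <cstdint>

    __global__ void matchOrbDescriptors(const uint32_t* query,   // numQuery * 8 words
                                        const uint32_t* train,   // numTrain * 8 words
                                        int numQuery, int numTrain,
                                        int* bestIndex)          // best train index per query
    {
        int q = blockIdx.x * blockDim.x + threadIdx.x;
        if (q >= numQuery) return;

        int bestDist = 257;   // larger than any possible 256-bit Hamming distance
        int bestIdx  = -1;
        for (int t = 0; t < numTrain; ++t) {
            int dist = 0;
            for (int w = 0; w < 8; ++w)
                dist += __popc(query[q * 8 + w] ^ train[t * 8 + w]);  // popcount of XOR = Hamming distance
            if (dist < bestDist) { bestDist = dist; bestIdx = t; }
        }
        bestIndex[q] = bestIdx;
    }

Because every query descriptor is independent, this loop is an easy candidate for the "identify a hotspot, turn it into a kernel" approach described above; the real ORB-SLAM2 pipeline would apply the same idea to its own hotspots (feature extraction, matching, pose optimisation preparation).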
Our research group has implemented a lidar-based SLAM application on an automotive robot powered by an NVIDIA Jetson Nano and a lidar sensor. The Jetson Nano is an embedded system-on-module (SoM) and developer kit from the NVIDIA Jetson family, featuring an integrated 128-core Maxwell GPU, a quad-core 64-bit ARM Cortex-A57 CPU, and 4 GB of LPDDR4 memory.
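The device's capabilities can be confirmed at runtime with the standard CUDA runtime API. The minimal sketch below simply queries and prints the GPU properties; on a Jetson Nano this is expected to report a Maxwell GPU of compute capability 5.3 with a single 128-core multiprocessor sharing the LPDDR4 memory with the CPU.

    // Minimal sketch: query the properties of device 0 with the CUDA runtime API.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        cudaDeviceProp prop;
        cudaError_t err = cudaGetDeviceProperties(&prop, 0);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("Device:              %s\n", prop.name);
        printf("Compute capability:  %d.%d\n", prop.major, prop.minor);
        printf("Multiprocessors:     %d\n", prop.multiProcessorCount);
        printf("Global memory:       %zu MB\n", prop.totalGlobalMem / (1024 * 1024));
        return 0;
    }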
The code is organised into the following modules; a sketch of how the host code drives the kernels follows the listing.
main
|- kernels
|- host code
|- pre-processing
|- evaluation
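The example below is a hedged sketch of how the host code module might drive a kernel from the kernels module on the Jetson Nano. It allocates unified memory (cudaMallocManaged), which suits the Nano's CPU/GPU-shared LPDDR4, launches the matchOrbDescriptors kernel sketched earlier (assumed to be compiled and linked in from the kernels module), and reads the result back on the CPU after synchronisation. All names and sizes are illustrative, not the project's actual values.

    // Minimal host-code sketch: allocate unified memory, launch a kernel from
    // the kernels module, synchronise, and read back the result.
    #include <cuda_runtime.h>
    #include <cstdint>
    #include <cstdio>

    // Declaration of the kernel assumed to be defined in the kernels module.
    __global__ void matchOrbDescriptors(const uint32_t* query, const uint32_t* train,
                                        int numQuery, int numTrain, int* bestIndex);

    int main() {
        const int numQuery = 1000, numTrain = 2000, words = 8;

        uint32_t *query, *train;
        int *bestIndex;
        // Unified memory is visible to both CPU and GPU, avoiding explicit copies.
        cudaMallocManaged(&query,     numQuery * words * sizeof(uint32_t));
        cudaMallocManaged(&train,     numTrain * words * sizeof(uint32_t));
        cudaMallocManaged(&bestIndex, numQuery * sizeof(int));

        // ... fill query/train with ORB descriptors produced by the pre-processing stage ...

        const int threads = 128;                                // warp-multiple block size
        const int blocks  = (numQuery + threads - 1) / threads; // enough blocks to cover all queries
        matchOrbDescriptors<<<blocks, threads>>>(query, train, numQuery, numTrain, bestIndex);
        cudaDeviceSynchronize();                                // wait for the GPU before reading results

        printf("best match for query 0: %d\n", bestIndex[0]);

        cudaFree(query); cudaFree(train); cudaFree(bestIndex);
        return 0;
    }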