[Series] 3D Point Cloud Processing and Learning for Autonomous Driving (4)

2021-03-04

Title: 3D Point Cloud Processing and Learning for Autonomous Driving

Authors: Siheng Chen, Baoan Liu, Chen Feng, Carlos Vallespi-Gonzalez, Carl Wellington

Compiled by: 点云PCL

Source: arXiv, 2020

This article is shared for academic purposes only; if there is any infringement, please contact us and it will be removed. Comments pointing out errors are welcome. Please do not repost without permission.

Foreword

This paper gives a fairly complete picture of the role that point clouds play in each module of an autonomous driving system, introducing their use module by module. After reading the whole paper you will not only have a more comprehensive understanding of autonomous driving technology, but also appreciate how important point clouds are to it. The HD-map creation, localization, and perception modules introduced here are core technologies of the field. For example, when describing the two localization approaches, the paper explains when to match on semantic and geometric information and when to match on point-cloud intensity, which lines up closely with the Apollo autonomous driving stack and should be very instructive for readers. I have therefore decided to translate the paper in full and share it with more interested readers.

[Series] 3D Point Cloud Processing and Learning for Autonomous Driving (1)

[Series] 3D Point Cloud Processing and Learning for Autonomous Driving (2)

[Series] 3D Point Cloud Processing and Learning for Autonomous Driving (3)

There may be typos or misunderstandings in the translation; comments and discussion are welcome. Because the paper is long, the translation is being published across several posts. The complete English PDF is available in the free Knowledge Planet (知识星球) group.

Table of Contents

1. Introduction
1-A Significance, history, and current status of autonomous driving
1-B Modules of a complete autonomous driving system
1-C 3D point cloud processing and learning
1-D Outline

2. Key ingredients of 3D point cloud processing and learning
2-A Properties of point clouds
2-B Matrix representations
2-C Representative tools

3. HD map creation and 3D point cloud processing
3-A Overview of the HD map creation module
3-B 3D point cloud stitching
3-C Semantic feature extraction from point clouds
3-D Challenges of map creation

4. Processing for point-cloud-based localization
4-A Overview of the localization module
4-B Map-based localization
4-C Challenges of point-cloud-based localization

5. Point cloud perception
5-A Overview of the perception module
5-B 3D object detection from point clouds
5-C Challenges of point cloud perception

6. Summary and open problems
6-A The relationship between academia and industry
6-B Qualitative results

4. Processing for Point-Cloud-Based Localization

4-A Overview of the Localization Module

As described in Section 1-B, the task of the localization module is to determine the pose of the autonomous vehicle relative to the reference frame of the HD map. It requires real-time measurements from multiple sensors, including LiDAR, IMU, GPS, wheel odometry, and cameras, together with the HD map; see the figure below.

A standard map-based localization system contains two core components: LiDAR-to-map registration and multi-sensor fusion. LiDAR-to-map registration uses both geometry-based matching and laser-reflectivity-based matching to achieve high precision and high recall; multi-sensor fusion uses a Bayesian filter to fuse the multiple modalities.

Since the HD map is represented as a 3D point cloud, the localization output for the autonomous vehicle is a 6-degree-of-freedom (6-DOF) pose (translation and rotation), i.e., the rigid transformation between the map frame and the LiDAR frame. The localization module matters for autonomous driving because it connects the HD map to the other modules of the autonomy system. For example, by projecting HD-map priors such as lane geometry into the LiDAR frame, the autonomous vehicle knows which lane it is driving in and which lanes the detected traffic occupies.
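To make this projection concrete, here is a minimal sketch (not from the paper; the function name and the 4x4 pose convention are my assumptions) of pushing map-frame lane points into the LiDAR frame given the localization output:

```python
import numpy as np

def map_to_lidar(points_map: np.ndarray, T_map_from_lidar: np.ndarray) -> np.ndarray:
    """Project HD-map points (N, 3) into the LiDAR frame.

    T_map_from_lidar is assumed to be the 4x4 rigid transform output by
    localization, mapping LiDAR-frame coordinates into the map frame, so
    projecting map priors into the LiDAR frame uses its inverse: R^T (p - t).
    """
    R = T_map_from_lidar[:3, :3]
    t = T_map_from_lidar[:3, 3]
    return (points_map - t) @ R  # row-vector form of R^T (p - t)

# Example usage (hypothetical inputs):
# lane_in_lidar = map_to_lidar(lane_points_map, pose_from_localization)
```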

To achieve full autonomy, high accuracy and robustness are the key measures of the localization module's performance.

High accuracy. The translation error should be at the centimeter level and the rotation error at the micro-radian level. This allows a traffic sign detected 1 km away to be associated with the correct lane in the HD map, and allows the lane-change intent of nearby traffic to be anticipated by measuring the distance from its wheels to the lane boundary, which greatly helps the motion prediction and planning modules;

Robustness. The localization module is expected to work under variations in illumination, weather, traffic, and road conditions. Note that although a commercial-grade GPS/IMU unit with real-time kinematic (RTK) mode provides accurate position measurements in open areas, its accuracy degrades in urban areas with complex road conditions, so it is not robust enough for autonomous driving. To meet the criteria above, map-based localization with multi-sensor fusion is the standard approach.

4-B Map-Based Localization

The basic idea of map-based localization is to estimate the LiDAR pose by matching the real-time LiDAR sweep against the point-cloud map in the HD map, aided by measurements from the IMU, GPS, and cameras. A map-based localization system usually consists of two components: the first is LiDAR-to-map registration, which computes the LiDAR pose by registering the LiDAR sweep to the point-cloud map; the second is multi-sensor fusion, which estimates the final pose from the IMU, odometry, GPS, and the LiDAR-to-map registration result.

LiDAR-to-map registration. The LiDAR-to-map registration component directly estimates the LiDAR pose by matching the LiDAR sweep against the point-cloud map. Let S and S^(map) denote the real-time LiDAR sweep and the point-cloud map, respectively. The LiDAR-to-map registration problem can be formulated as

$$\hat{P} \;=\; \arg\min_{P} \sum_{x_i \in S} g\!\left(P x_i,\; x_i^{(\mathrm{map})}\right),$$

where P is the LiDAR pose, x_i is the i-th 3D point of the real-time LiDAR sweep, x_i^(map) ∈ S^(map) is the 3D point in the point-cloud map associated with the i-th point of the LiDAR sweep, and g(·,·) is the loss between a LiDAR-sweep point and an HD-map point. Typically, g(·,·) takes the form of a point-to-point, point-to-line, or point-to-plane distance between the LiDAR sweep and the point-cloud map. To solve this problem with high precision and recall, two main approaches are used.
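As an illustration of this objective, here is a minimal sketch assuming a point-to-point loss and already-established correspondences x_i ↔ x_i^(map); the variable names are mine, not the paper's, and in practice the associations come from nearest-neighbor search while the pose is optimized iteratively.

```python
import numpy as np

def registration_cost(pose: np.ndarray,
                      sweep_pts: np.ndarray,
                      map_pts: np.ndarray) -> float:
    """Sum of squared point-to-point distances g(P x_i, x_i^(map)).

    pose      : 4x4 homogeneous transform P (LiDAR frame -> map frame)
    sweep_pts : (N, 3) points x_i of the real-time LiDAR sweep
    map_pts   : (N, 3) associated map points x_i^(map)
    """
    R, t = pose[:3, :3], pose[:3, 3]
    transformed = sweep_pts @ R.T + t   # P x_i
    residuals = transformed - map_pts   # P x_i - x_i^(map)
    return float(np.sum(residuals ** 2))
```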

Geometry-based matching. This approach computes a high-precision 6-DOF pose by matching the LiDAR sweep to the point-cloud map with the ICP algorithm [19]. It usually works well under heavy traffic and challenging weather conditions such as snow, because the point-cloud map contains abundant geometric priors for the LiDAR sweep to match against. However, in geometrically degenerate scenes such as tunnels, bridges, and highways, ICP can drift because geometric patterns are missing, leading to low accuracy;
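For intuition, the sketch below shows one iteration of point-to-point ICP in the spirit of [19]: associate each sweep point with its nearest map point, then solve the rigid transform in closed form with an SVD. It is a simplification for illustration, not a production registration pipeline.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(sweep_pts, map_pts, map_tree=None):
    """One point-to-point ICP iteration: associate, then solve R, t by SVD."""
    tree = map_tree if map_tree is not None else cKDTree(map_pts)
    _, idx = tree.query(sweep_pts)          # nearest map point for each sweep point
    matched = map_pts[idx]

    # Kabsch algorithm: align the centered point sets with an SVD.
    mu_s, mu_m = sweep_pts.mean(0), matched.mean(0)
    H = (sweep_pts - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_m - R @ mu_s
    return R, t  # apply as x -> R x + t, then re-associate and iterate to convergence
```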

Laser-reflectivity-based matching. This approach computes the pose by matching the LiDAR sweep to the point-cloud map using the laser reflectivity signals. The matching can be carried out either as dense 2D image matching or as feature-extraction-based ICP.

In the first method, the laser reflectivity readings of the LiDAR sweep and of the point-cloud map are first converted into grayscale 2D images in the bird's-eye view (BEV), and the pose is then computed with image-matching techniques. Note that this method only estimates the x, y, and yaw components of the pose; to obtain the full 6-DOF pose, the z, roll, and pitch components must be estimated from the terrain information in the HD map.
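A minimal sketch of this idea follows; the grid resolution, image size, and brute-force search are illustrative assumptions. It rasterizes reflectivity into a BEV grayscale image and searches for the planar offset that best correlates with the map image; yaw can be handled by repeating the search over a set of rotated copies of the sweep.

```python
import numpy as np

def reflectivity_bev(points, intensity, res=0.1, size=400):
    """Rasterize LiDAR reflectivity into a (size x size) grayscale BEV image."""
    img = np.zeros((size, size), dtype=np.float32)
    cols = np.clip((points[:, 0] / res + size / 2).astype(int), 0, size - 1)
    rows = np.clip((points[:, 1] / res + size / 2).astype(int), 0, size - 1)
    np.maximum.at(img, (rows, cols), intensity)  # keep the brightest return per cell
    return img

def best_xy_offset(sweep_img, map_img, search=20):
    """Brute-force search for the pixel shift (dx, dy) maximizing image correlation."""
    best, best_shift = -np.inf, (0, 0)
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            shifted = np.roll(np.roll(sweep_img, dy, axis=0), dx, axis=1)
            score = float((shifted * map_img).sum())
            if score > best:
                best, best_shift = score, (dx, dy)
    return best_shift  # multiply by res to convert to meters
```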

In the second method, regions of interest, such as lane markings and poles, are first extracted from the LiDAR sweep based on the laser reflectivity values [20]. The LiDAR pose is then computed with ICP by matching these regions of interest between the real-time LiDAR sweep and the priors in the HD map. In highway and bridge scenes, this approach usually outperforms geometry-based matching, because such scenes lack geometric features but have rich laser-reflectivity texture on the ground (for example, dashed lane markings). However, it does not work well in challenging weather such as heavy rain or snow, where the laser reflectivity of the ground changes significantly. For the best performance, both approaches can be used together for LiDAR pose estimation. Even so, LiDAR-to-map registration alone cannot guarantee 100% precision and recall for pose estimation. As an extreme example, if the LiDAR is completely occluded by vehicles alongside or by trucks driving in front of and behind the vehicle, the LiDAR-to-map registration component will fail. To handle such corner cases and make the localization module robust, the multi-sensor fusion component is needed.
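A very rough sketch of the region-of-interest extraction step in this second method is shown below; the intensity threshold and ground-height cutoff are invented values for illustration, and real systems would use calibrated reflectivity and a proper ground model.

```python
import numpy as np

def extract_lane_marking_points(points, intensity,
                                intensity_thresh=0.7, ground_z_max=0.3):
    """Keep near-ground points whose reflectivity suggests painted lane markings."""
    mask = (intensity > intensity_thresh) & (points[:, 2] < ground_z_max)
    return points[mask]  # these points would then be fed to feature-based ICP
```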

Multi-sensor fusion. The multi-sensor fusion component estimates a robust, high-confidence pose from the measurements of multiple sensors, including the IMU, GPS, odometry, and cameras, together with the pose estimated by the LiDAR-to-map registration component. The standard approach to multi-sensor fusion is a Bayesian filter, such as a Kalman filter, an extended Kalman filter, or a particle filter. The Bayesian filter iteratively predicts and corrects the LiDAR pose and other states based on the vehicle motion dynamics and the multi-sensor measurements.

In an autonomous driving system, the states tracked and estimated by the Bayesian filter typically include motion-related states, such as pose, velocity, and acceleration, as well as sensor-related states, such as IMU biases. The Bayesian filter iterates between two steps: prediction and correction. In the prediction step, during the gaps between sensor measurements, the filter predicts the vehicle states based on the vehicle motion dynamics and assumed sensor models. For example, by approximating the vehicle motion as constant acceleration over a short period, the evolution of pose, velocity, and acceleration can be predicted with Newton's laws; by assuming white-noise characteristics for the IMU, the IMU bias states can be predicted. In the correction step, when a sensor reading or pose measurement arrives, the filter corrects the states based on the corresponding observation model. For example, when an IMU reading arrives, the acceleration, angular velocity, and IMU-bias states are corrected; when a pose measurement arrives, the pose states are corrected. Note that these corrections are necessary because the prediction step is imperfect and accumulates error over time.
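The sketch below illustrates the predict/correct loop with a plain linear Kalman filter over a simplified constant-velocity state [x, y, vx, vy], where the "measurement" plays the role of the pose from LiDAR-to-map registration. A real system tracks many more states (full 6-DOF pose, IMU biases) and typically uses an EKF or particle filter, so treat this only as an illustration of the two steps.

```python
import numpy as np

class SimpleKalman:
    """Constant-velocity Kalman filter over the state [x, y, vx, vy]."""

    def __init__(self, q=0.1, r=0.05):
        self.x = np.zeros(4)                 # state estimate
        self.P = np.eye(4)                   # state covariance
        self.Q = q * np.eye(4)               # process noise (imperfect motion model)
        self.R = r * np.eye(2)               # measurement noise (registration pose)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)  # we observe position only

    def predict(self, dt):
        """Propagate the state between measurements using the motion model."""
        F = np.eye(4)
        F[0, 2] = F[1, 3] = dt               # x += vx*dt, y += vy*dt
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + self.Q

    def correct(self, z):
        """Correct with a position measurement z = [x, y], e.g. from LiDAR-to-map."""
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

Without corrections, repeated calls to predict() drift exactly as described above; each correct() pulls the estimate back toward the measured pose in proportion to the Kalman gain.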

4-C Challenges of Point-Cloud-Based Localization

The challenge for the localization module is the uncertainty it faces in extreme scenarios. For example, when the autonomous vehicle drives through a straight tunnel without dashed lane markings, there are almost no geometric or texture features, and LiDAR-to-map registration fails; when the vehicle is surrounded by large trucks, the LiDAR may be completely occluded, which also causes LiDAR-to-map registration to fail. If LiDAR-to-map registration keeps failing for several minutes, the LiDAR pose estimated by the multi-sensor fusion component will drift significantly and the localization module will lose its accuracy.

References


[1] A. Taeihagh and H. Si Min Lim, “Governing autonomous vehicles: emerging responses for safety, liability, privacy, cybersecurity, and industry risks,” Transport Reviews, vol. 39, no. 1, pp. 103–128, Jan. 2019.

[2] National Research Council, “Technology development for army unmanned ground vehicles,” 2002.

[3] C. Badue, R. Guidolini, R. Vivacqua Carneiro, P. Azevedo, V. Brito Cardoso, A. Forechi, L. Ferreira Reis Jesus, R. Ferreira Berriel, T. Meireles Paixo, F. Mutz, T. Oliveira-Santos, and A. Ferreira De Souza, “Self-driving cars: A survey,” arXiv:1901.04407 [cs.RO], Jan. 2019.

[4] M. Bansal, A. Krizhevsky, and A. S. Ogale, “ChauffeurNet: Learning to drive by imitating the best and synthesizing the worst,” CoRR, vol. abs/1812.03079, 2018.

[5] C. Urmson, J. Anhalt, D. Bagnell, C. R. Baker, R. Bittner, M. N. Clark, J. M. Dolan, D. Duggins, T. Galatali, C. Geyer, M. Gittleman, S. Harbaugh, M. Hebert, T. M. Howard, S. Kolski, A. Kelly, M. Likhachev, M. McNaughton, N. Miller, K. M. Peterson, B. Pilnick, R. Rajkumar, P. E. Rybski, B. Salesky, Y-W. Seo, S. Singh, J. M. Snider, A. Stentz, W. Whittaker, Z. Wolkowicki, J. Ziglar, H. Bae, T. Brown, D. Demitrish, B. Litkouhi, J. Nickolaou, V. Sadekar, W. Zhang, J. Struble, M. Taylor, M. Darms, and D. Ferguson, “Autonomous driving in urban environments: Boss and the urban challenge,” in The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, George Air Force Base, Victorville, California, USA, 2009, pp. 1–59.

[6] G. P. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez, and C. K. Wellington, “Lasernet: An efficient probabilistic 3d object detector for autonomous driving,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2019.

[7] C. Ruizhongtai Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2017, pp. 77–85.

[8] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, “Multi-view 3d object detection network for autonomous driving,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2017, pp. 6526–6534.

[9] M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[10] X-F. Hana, J. S. Jin, J. Xie, M-J. Wang, and W. Jiang, “A comprehensive review of 3d point cloud descriptors,” arXiv preprint arXiv:1802.02297, 2018.

[11] J. Peng and C.-C. Jay Kuo, “Geometry-guided progressive lossless 3D mesh coding with octree (OT) decomposition,” ACM Trans. Graph. Proceedings of ACM SIGGRAPH, vol. 24, no. 3, pp. 609–616, Jul. 2005.

[12] A. Ortega, P. Frossard, J. Kovacevic, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proceedings of the IEEE, vol. 106, no. 5, pp. 808–828, 2018.

[13] S. Chen, D. Tian, C. Feng, A. Vetro, and J. Kovačević, “Fast resampling of three-dimensional point clouds via graphs,” IEEE Trans. Signal Processing, vol. 66, no. 3, pp. 666–681, 2018.

[14] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, “Dynamic graph CNN for learning on point clouds,” ACM Transactions on Graphics (TOG), vol. 38, no. 5, November 2019.

[15] S. Chen, S. Niu, T. Lan, and B. Liu, “Large-scale 3d point cloud representations via graph inception networks with applications to autonomous driving,” in Proc. IEEE Int. Conf. Image Process., Taipei, Taiwan, Sept. 2019.

[16] G. Li, M. Müller, A. K. Thabet, and B. Ghanem, “DeepGCNs: Can GCNs go as deep as CNNs?,” in ICCV, Seoul, South Korea, Oct. 2019.

[17] G. Grisetti, R. Kümmerle, C. Stachniss, and W. Burgard, “A tutorial on graph-based SLAM,” IEEE Intell. Transport. Syst. Mag., vol. 2, no. 4, pp. 31–43, 2010.

[18] D. Droeschel and S. Behnke, “Efficient continuous-time SLAM for 3d lidar-based online mapping,” in 2018 IEEE International Conference on Robotics and Automation, ICRA, 2018, Brisbane, Australia, May 21-25, 2018, 2018, pp. 1–9.

[19] P. J. Besl and N. D. McKay, “A method for registration of 3D shapes,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 2, pp. 239–256, 1992.

[20] A. Y. Hata and D. F. Wolf, “Road marking detection using LIDAR reflective intensity data and its application to vehicle localization,” in 17th International IEEE Conference on Intelligent Transportation Systems, ITSC 2014, Qingdao, China, October 8-11, 2014, 2014, pp. 584–589.

[21] S. Shi, X. Wang, and H. Li, “PointRCNN: 3d object proposal generation and detection from point cloud,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., Long Beach, CA, June 2019.

[22] B. Li, “3d fully convolutional network for vehicle detection in point cloud,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, BC, Canada, September 24-28, 2017, 2017, pp. 1513–1518.

[23] B. Yang, W. Luo, and R. Urtasun, “PIXOR: real-time 3d object detection from point clouds,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2018, pp. 7652–7660.

[24] J. Zhou, X. Lu, X. Tan, Z. Shao, S. Ding, and L. Ma, “Fvnet: 3d front-view proposal generation for real-time object detection from point clouds,” CoRR, vol. abs/1903.10750, 2019.

[25] B. Li, T. Zhang, and T. Xia, “Vehicle detection from 3d lidar using fully convolutional network,” in Robotics: Science and Systems XII, University of Michigan, Ann Arbor, Michigan, USA, June 18 - June 22, 2016, 2016.

[26] Y. Zhou and O. Tuzel, “Voxelnet: End-to-end learning for point cloud based 3d object detection,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., Salt Lake City, UT, USA, June 2018, pp. 4490–4499.

[27] S. Shi, Z. Wang, X. Wang, and H. Li, “Part-a2 net: 3d part-aware and aggregation neural network for object detection from point cloud,” CoRR, vol. abs/1907.03670, 2019.

[28] Y. Yan, Y. Mao, and B. Li, “Second: Sparsely embedded convolutional detection,” Sensors, vol. 18, no. 10, 2019.

[29] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “Pointpillars: Fast encoders for object detection from point clouds,” CoRR, vol. abs/1812.05784, 2018.

[30] T-Y. Lin, P. Dollár, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie, “Feature pyramid networks for object detection,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2017, pp. 936–944.

[31] F. Yu, D. Wang, E. Shelhamer, and T. Darrell, “Deep layer aggregation,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2018, pp. 2403–2412.

[32] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. E. Reed, C-Y. Fu, and A. C. Berg, “SSD: single shot multibox detector,” in Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part I, 2016, pp. 21–37.

[33] T-Y. Lin, P. Goyal, R. B. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, 2017, pp. 2999–3007.

[34] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. L. Waslander, “Joint 3d proposal generation and object detection from view aggregation,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018, Madrid, Spain, October 1-5, 2018, 2018, pp. 1–8.

[35] C. Ruizhongtai Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, “Frustum pointnets for 3d object detection from RGB-D data,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2018, pp. 918–927.

[36] D. Xu, D. Anguelov, and A. Jain, “PointFusion: Deep sensor fusion for 3d bounding box estimation,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn. 2018, pp. 244–253, IEEE Computer Society.

[37] M. Liang, B. Yang, S. Wang, and R. Urtasun, “Deep continuous fusion for multi-sensor 3d object detection,” in Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI, 2018, pp. 663–678.

[38] M. Liang, B. Yang, Y. Chen, R. Hu, and R. Urtasun, “Multi-task multi-sensor fusion for 3d object detection,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2019, pp. 7345–7353.

[39] G. P. Meyer, J. Charland, D. Hegde, A. Laddha, and C. Vallespi-Gonzalez, “Sensor fusion for joint 3d object detection and semantic segmentation,” CoRR, vol. abs/1904.11466, 2019.

[40] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the KITTI vision benchmark suite,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., Providence, RI, June 2012, pp. 3354– 3361.

[41] K-L. Low, “Linear least-squares optimization for point-to-plane ICP surface registration,” Tech. Rep., University of North Carolina at Chapel Hill, 2004.

[42] J. Yang, H. Li, D. Campbell, and Y. Jia, “Go-icp: A globally optimal solution to 3d ICP point-set registration,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 11, pp. 2241–2254, 2016.

[43] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” International Journal of Robotics Research (IJRR), 2013.

[44] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, “3d shapenets: A deep representation for volumetric shapes,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recogn., 2015, pp. 1912–1920.

[45] F. Pomerleau, F. Colas, and R. Siegwart, “A review of point cloud registration algorithms for mobile robotics,” Foundations and Trends in Robotics, vol. 4, no. 1, pp. 1–104, 2015.

[46] H. Fathi, F. Dai, and M. I. A. Lourakis, “Automated as-built 3d reconstruction of civil infrastructure using computer vision: Achievements, opportunities, and challenges,” Advanced Engineering Informatics, vol. 29, no. 2, pp. 149–161, 2015.

[47] B. Reitinger, C. Zach, and D. Schmalstieg, “Augmented reality scouting for interactive 3d reconstruction,” in IEEE Virtual Reality Conference, VR 2007, 10-14 March 2007, Charlotte, NC, USA, Proceedings, 2007, pp. 219–222.

To be continued...
