Article(id=1200070539386646535, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070538191274203, articleNumber=null, orderNo=null, doi=10.20104/j.cnki.1674-6546.20230474, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=null, receivedDateStr=null, revisedDate=1699286400000, revisedDateStr=2023-11-07, acceptedDate=null, acceptedDateStr=null, onlineDate=1764048712821, onlineDateStr=2025-11-25, pubDate=1705248000000, pubDateStr=2024-01-15, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1764048712821, onlineIssueDateStr=2025-11-25, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1764048712821, creator=13701087609, updateTime=1764048712821, updator=13701087609, issue=Issue{id=1200070538191274203, tenantId=1146029695717560320, journalId=1189918454225211397, year='2024', volume='', issue='1', pageStart='1', pageEnd='48', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=null, createTime=1764048712537, creator=13701087609, updateTime=1764049067190, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1200072025768293209, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070538191274203, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1200072025768293210, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070538191274203, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=12, endPage=18, ext={EN=ArticleExt(id=1200070539671859212, articleId=1200070539386646535, tenantId=1146029695717560320, journalId=1189918454225211397, 
language=EN, title=LiDAR Semantic Segmentation Network for Real-Time Multimodal Projection, columnId=1200070539592167435, journalTitle=Automotive Engineer, columnName=Special Topic on Autonomous Driving Environment Perception and Positioning Technology at Chongqing Jiaotong University, runingTitle=null, highlight=null, articleAbstract=

Traditional point-based methods for LiDAR semantic segmentation cannot balance inference speed and accuracy. To address this issue, this paper proposes a multimodal fusion network for LiDAR point cloud semantic segmentation. A point-grid module extracts semantic features, a spatial attention module aggregates spatial and contextual information, and a 2D Fully Convolutional Network (FCN) feature fusion pyramid performs the segmentation. Finally, 2D and 3D features are fused to reduce information loss, and the loss function is used to update the weights and optimize the model. Validation on the SemanticKITTI dataset shows that the model achieves a mean intersection over union (mIoU) of 63.3%. Compared with other leading algorithms, it balances real-time performance and accuracy and significantly improves the accuracy of LiDAR semantic segmentation.
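
The 63.3% figure reported in the abstract is a mean intersection over union (mIoU), which is conventionally the unweighted average of the per-class IoU, TP / (TP + FP + FN). The following is a minimal sketch of that metric, assuming one integer class label per point. The function names `per_class_iou` and `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def per_class_iou(pred, target, num_classes):
    """Return the IoU of each class, or None for a class that never occurs."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for p, t in zip(pred, target):
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the point is not p
            fn[t] += 1  # the point is class t, but t was not predicted
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average the IoU over the classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Toy example (hypothetical labels): six points, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(pred, target, 3)
miou = mean_iou(ious)
```

Averaging per class rather than per point means that rare classes such as motorcyclist weigh as much as frequent ones such as road.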

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Binhong Tang), CN=ArticleExt(id=1200070542255550521, articleId=1200070539386646535, tenantId=1146029695717560320, journalId=1189918454225211397, language=CN, title=基于多模态投影的激光雷达点云实时语义分割网络, columnId=1200070539772522510, journalTitle=汽车工程师, columnName=重庆交通大学自动驾驶环境感知与定位技术专题, runingTitle=null, highlight=null, articleAbstract=

针对激光雷达点云语义分割过程中,传统方法无法权衡检测速度和精度的问题,提出了一种多模态融合的激光雷达点云语义分割网络架构。利用点-网格模块提取语义特征,经过空间注意力模块聚合空间和上下文信息,通过二维全卷积网络(FCN)特征融合金字塔模块实现语义分割,最后通过二维和三维特征融合减少信息损失,并使用损失函数更新权重优化模型。SemanticKITTI数据集上的验证结果表明,该模型平均交并比达到63.3%,与其他优秀算法相比兼顾了实时性与准确性,显著提高了激光雷达点云语义分割的精度。

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=qD80aI0CaN9KAxV12IUAqA==, magXml=r9E9W5shTipNkQ4vPxeV6w==, pdfUrl=null, pdf=v/jpdKJ7ddhwH2mBPrsrLw==, pdfFileSize=1482847, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=q9P98DWNUIuTfGUUPBYTOg==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=Iu0vcxmWn9uLfNL+yAXItg==, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=唐彬洪)}, authors=[Author(id=1200407190952604172, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1200407191057461783, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, authorId=1200407190952604172, language=EN, stringName=Binhong Tang, firstName=Binhong, middleName=null, lastName=Tang, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=Chongqing Jiaotong University, Chongqing 400074, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1200407191158125089, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, authorId=1200407190952604172, language=CN, stringName=唐彬洪, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=重庆交通大学, 重庆 400074, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1200407190789026302, tenantId=1146029695717560320, journalId=1189918454225211397, 
articleId=1200070539386646535, xref=null, ext=[AuthorCompanyExt(id=1200407190801609218, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, companyId=1200407190789026302, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=Chongqing Jiaotong University, Chongqing 400074), AuthorCompanyExt(id=1200407190826775046, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, companyId=1200407190789026302, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=重庆交通大学, 重庆 400074)])])], keywords=[Keyword(id=1200407191304925747, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, orderNo=1, keyword=Automatic driving), Keyword(id=1200407191397200441, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, orderNo=2, keyword=LiDAR), Keyword(id=1200407191518835263, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, orderNo=3, keyword=Semantic segmentation), Keyword(id=1200407191627887183, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, orderNo=4, keyword=Deep learning), Keyword(id=1200407191753716315, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, orderNo=1, keyword=自动驾驶), Keyword(id=1200407191892128359, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, orderNo=2, keyword=激光雷达), Keyword(id=1200407192022151796, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, orderNo=3, keyword=语义分割), Keyword(id=1200407192139592317, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, 
language=CN, orderNo=4, keyword=深度学习)], refs=[Reference(id=1200407195461481234, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2017, volume=null, issue=null, pageStart=652, pageEnd=660, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=QI C R, SU H, MO K, journalName=2017 IEEE Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=QI C R, SU H, MO K, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 652-660., articleTitle=PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, refAbstract=null), Reference(id=1200407195583116057, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2017, volume=30, issue=null, pageStart=5105, pageEnd=5114, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=QI C R, YI L, SU H, journalName=Advances in Neural Information Processing Systems, refType=null, unstructuredReference=QI C R, YI L, SU H, et al. Pointnet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space[C]// Advances in Neural Information Processing Systems, 2017, 30: 5105-5114., articleTitle=Pointnet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space, refAbstract=null), Reference(id=1200407195692167965, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2018, volume=31, issue=null, pageStart=820, pageEnd=830, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=LI Y Y, BU R, SUN M C, journalName=Advances in Neural Information Processing Systems, refType=null, unstructuredReference=LI Y Y, BU R, SUN M C, et al. 
PointCNN: Convolution on X-Transformed Points[C]// Advances in Neural Information Processing Systems, 2018, 31: 820-830., articleTitle=PointCNN: Convolution on X-Transformed Points, refAbstract=null), Reference(id=1200407195809608484, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=11108, pageEnd=11117, url=null, language=null, rfNumber=[4], rfOrder=3, authorNames=HU Q Y, YANG B, XIE L H, journalName=Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=HU Q Y, YANG B, XIE L H, et al. Randla-Net: Efficient Semantic Segmentation of Largescale Point Clouds[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2020: 11108-11117., articleTitle=Randla-Net: Efficient Semantic Segmentation of Largescale Point Clouds, refAbstract=null), Reference(id=1200407195922854694, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=4490, pageEnd=4499, url=null, language=null, rfNumber=[5], rfOrder=4, authorNames=ZHOU Y, TUZEL O, journalName=2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=ZHOU Y, TUZEL O. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 
Salt Lake City, UT, USA: IEEE, 2018: 4490-4499., articleTitle=VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection, refAbstract=null), Reference(id=1200407196065461042, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=8, pageStart=1173, pageEnd=1182, url=null, language=null, rfNumber=[6], rfOrder=5, authorNames=黄润辉, 胡立坤, 苏鸣方, journalName=汽车工程, refType=null, unstructuredReference=黄润辉, 胡立坤, 苏鸣方, 等. 基于三维锥形栅格的激光雷达点云语义分割方法[J]. 汽车工程, 2022(8): 1173-1182., articleTitle=基于三维锥形栅格的激光雷达点云语义分割方法, refAbstract=null), Reference(id=1200407196203873075, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=8, pageStart=1173, pageEnd=1182, url=null, language=null, rfNumber=[6], rfOrder=6, authorNames=HUANG R H, HU L K, SU M F, journalName=Automotive Engineering, refType=null, unstructuredReference=HUANG R H, HU L K, SU M F, et al. Semantic Segmentation of Lidar Point Cloud Based on 3D Conical Grids[J]. Automotive Engineering, 2018(8): 1173-1182., articleTitle=Semantic Segmentation of Lidar Point Cloud Based on 3D Conical Grids, refAbstract=null), Reference(id=1200407196300342069, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2015, volume=null, issue=null, pageStart=3431, pageEnd=3440, url=null, language=null, rfNumber=[7], rfOrder=7, authorNames=LONG J, SHELHAMER E, DARRELL T, journalName=2015 IEEE Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=LONG J, SHELHAMER E, DARRELL T. Fully Convolutional Networks for Semantic Segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. 
Boston, MA, USA: IEEE, 2015: 3431-3440., articleTitle=Fully Convolutional Networks for Semantic Segmentation, refAbstract=null), Reference(id=1200407196459725629, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[8], rfOrder=8, authorNames=MILIOTO A, VIZZO I, BEHLEY J, journalName=2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, refType=null, unstructuredReference=MILIOTO A, VIZZO I, BEHLEY J, et al. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation[C]// 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE, 2019., articleTitle=RangeNet++: Fast and Accurate LiDAR Semantic Segmentation, refAbstract=null), Reference(id=1200407196598137667, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=11, pageStart=32, pageEnd=38, url=null, language=null, rfNumber=[9], rfOrder=9, authorNames=王玮琦, 游雄, 苏明占, journalName=测绘通报, refType=null, unstructuredReference=王玮琦, 游雄, 苏明占, 等. SANet:空间注意力机制下的LiDAR点云实时语义分割方法[J]. 测绘通报, 2022(11): 32-38., articleTitle=SANet:空间注意力机制下的LiDAR点云实时语义分割方法, refAbstract=null), Reference(id=1200407196736549706, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=11, pageStart=32, pageEnd=38, url=null, language=null, rfNumber=[9], rfOrder=10, authorNames=WANG W Q, YOU X, SU M Z, journalName=Bulletin of Surveying Mapping, refType=null, unstructuredReference=WANG W Q, YOU X, SU M Z, et al. SANet: Real-Time Semantic Segmentation of LiDAR Point Clouds with Spatial Attention Mechanisms[J]. 
Bulletin of Surveying Mapping, 2022(11): 32-38., articleTitle=SANet: Real-Time Semantic Segmentation of LiDAR Point Clouds with Spatial Attention Mechanisms, refAbstract=null), Reference(id=1200407196895933265, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2016, volume=null, issue=null, pageStart=770, pageEnd=778, url=null, language=null, rfNumber=[10], rfOrder=11, authorNames=HE K M, ZHANG X Y, REN S Q, journalName=2016 IEEE Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778., articleTitle=Deep Residual Learning for Image Recognition, refAbstract=null), Reference(id=1200407197021762389, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[11], rfOrder=12, authorNames=CHEN X Y L, MILIOTO A, PALAZZOLO E, journalName=2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, refType=null, unstructuredReference=CHEN X Y L, MILIOTO A, PALAZZOLO E, et al. SuMa++: Efficient LiDAR-Based Semantic SLAM[C]// 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. 
Macau, China: IEEE, 2019., articleTitle=SuMa++: Efficient LiDAR-Based Semantic SLAM, refAbstract=null), Reference(id=1200407198175195999, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[12], rfOrder=13, authorNames=XU C F, WU B C, WANG Z N, journalName=16th European Conference on Computer Vision, Glasgow, UK, refType=null, unstructuredReference=XU C F, WU B C, WANG Z N, et al. SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation[C]// 16th European Conference on Computer Vision, Glasgow, UK, 2020., articleTitle=SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation, refAbstract=null), Reference(id=1200407198334579557, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[13], rfOrder=14, authorNames=CORTINHAL T, TZELEPIS G, AKSOY E E, journalName=15th International Symposium on Advances in Vision Computing, refType=null, unstructuredReference=CORTINHAL T, TZELEPIS G, AKSOY E E. SalsaNext:Fast, Uncertainty-Aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving[C]// 15th International Symposium on Advances in Vision Computing. 
San Diego, California, USA: Springer, Cham, 2020., articleTitle=SalsaNext:Fast, Uncertainty-Aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving, refAbstract=null), Reference(id=1200407198460408684, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, doi=null, pmid=null, pmcid=null, year=2021, volume=null, issue=null, pageStart=1800, pageEnd=1809, url=null, language=null, rfNumber=[14], rfOrder=15, authorNames=ALNAGGAR Y A, AFIFI M, AMER K, journalName=2021 IEEE/CVF Winter Conference on Applications of Computer Vision, refType=null, unstructuredReference=ALNAGGAR Y A, AFIFI M, AMER K, et al. Multi Projection Fusion for Real-Time Semantic Segmentation of 3D LiDAR Point Clouds[C]// 2021 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, HI, USA: IEEE, 2021: 1800-1809., articleTitle=Multi Projection Fusion for Real-Time Semantic Segmentation of 3D LiDAR Point Clouds, refAbstract=null)], funds=null, companyList=[AuthorCompany(id=1200407190789026302, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, xref=null, ext=[AuthorCompanyExt(id=1200407190801609218, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, companyId=1200407190789026302, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=Chongqing Jiaotong University, Chongqing 400074), AuthorCompanyExt(id=1200407190826775046, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, companyId=1200407190789026302, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=重庆交通大学, 重庆 400074)])], figs=[ArticleFig(id=1200407192403833496, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=lGR6fqEXDZOCrnyzothQmg==, 
figureFileBig=p+wBOuZ9dCfs8IRfBpbsPQ==, tableContent=null), ArticleFig(id=1200407192546439838, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图1, caption=点网格融合模块, figureFileSmall=lGR6fqEXDZOCrnyzothQmg==, figureFileBig=p+wBOuZ9dCfs8IRfBpbsPQ==, tableContent=null), ArticleFig(id=1200407193733427885, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=CbForefpQYg8QhQ5v+2BEQ==, figureFileBig=7jPWerVsnhRy+XgX2JkXcg==, tableContent=null), ArticleFig(id=1200407193821508273, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图2, caption=点到网格操作, figureFileSmall=CbForefpQYg8QhQ5v+2BEQ==, figureFileBig=7jPWerVsnhRy+XgX2JkXcg==, tableContent=null), ArticleFig(id=1200407193930560184, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=liHlktukPAwpigdsw9Ffzg==, figureFileBig=0/rZDXamBvgiUwSjYoFwbg==, tableContent=null), ArticleFig(id=1200407194048000707, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图3, caption=空间注意力模块, figureFileSmall=liHlktukPAwpigdsw9Ffzg==, figureFileBig=0/rZDXamBvgiUwSjYoFwbg==, tableContent=null), ArticleFig(id=1200407194173829832, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=bnC4f+f2qIGkPWaCK4EEBA==, figureFileBig=JHwFOsZBloedUl7b/KrlDw==, tableContent=null), ArticleFig(id=1200407194299658957, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图4, caption=三重下采样模块, figureFileSmall=bnC4f+f2qIGkPWaCK4EEBA==, figureFileBig=JHwFOsZBloedUl7b/KrlDw==, tableContent=null), ArticleFig(id=1200407194400322261, 
tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=motIjrCtNCq5HqbA3BYKGQ==, figureFileBig=M1zcqcDozotRzqVjDQNy0Q==, tableContent=null), ArticleFig(id=1200407194500985563, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图5, caption=网格到点操作, figureFileSmall=motIjrCtNCq5HqbA3BYKGQ==, figureFileBig=M1zcqcDozotRzqVjDQNy0Q==, tableContent=null), ArticleFig(id=1200407194605843167, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=nTRw+QlyLFDykS6a4olVWA==, figureFileBig=xk6DZS5bRjy5Jrq3Z1D/wQ==, tableContent=null), ArticleFig(id=1200407194723283683, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=图6, caption=序列21数据集定性分割结果, figureFileSmall=nTRw+QlyLFDykS6a4olVWA==, figureFileBig=xk6DZS5bRjy5Jrq3Z1D/wQ==, tableContent=null), ArticleFig(id=1200407194836529899, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=null, figureFileBig=null, tableContent=
Range-view branch  Bird's-eye-view branch  Triple downsampling  Spatial attention  mIoU/%
× × × 58.3
× × 61.2
× 61.6
63.3
), ArticleFig(id=1200407194962359024, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=表1, caption=

特定融合块分析 %

, figureFileSmall=null, figureFileBig=null, tableContent=
距离视图
分支
鸟瞰图
分支
三重下
采样
空间注
意力
平均交
并比
× × × 58.3
× × 61.2
× 61.6
63.3
), ArticleFig(id=1200407195071410935, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=EN, label=null, caption=null, figureFileSmall=null, figureFileBig=null, tableContent=
Class              IoU/%
                   RangeNet++  SqueezeSegV3[12]  SalsaNext[13]  MPF[14]  Ours
car                91.4        92.5              90.9           93.4     93.5
bicycle            25.7        38.7              36.4           30.2     55.8
motorcycle         34.4        36.5              29.5           38.3     54.1
truck              25.7        29.6              21.7           26.1     50.5
other-vehicle      23.0        33.0              19.9           28.5     47.2
person             38.3        45.6              52.0           48.1     65.1
bicyclist          38.8        46.2              52.7           46.1     62.3
motorcyclist       4.8         20.1              16.0           18.1     33.8
road               91.8        91.7              90.9           90.6     91.7
parking            65.0        63.4              58.1           62.3     65.4
sidewalk           75.2        74.8              74.0           74.5     75.5
other-ground       27.8        26.4              27.8           30.6     22.5
building           87.4        89.0              87.9           88.5     88.5
fence              58.6        59.4              58.2           59.7     62.4
vegetation         80.5        82.0              81.8           83.5     81.9
trunk              55.1        58.7              61.7           59.7     65.1
terrain            64.6        65.4              66.3           69.2     66.2
pole               47.9        49.6              51.7           49.7     56.8
traffic-sign       55.9        58.9              58.0           58.1     65.3
mIoU               52.2        55.9              54.5           55.5     63.3
Time per frame/ms  82.3        124.3             40.7           31.0     42.6
), ArticleFig(id=1200407195205628674, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070539386646535, language=CN, label=表2, caption=

SemanticKITTI测试数据集的平均交并比分数定量对比

, figureFileSmall=null, figureFileBig=null, tableContent=
对象 交并比/%
Range-Net++ Squeeze-SegV3[12] Salsa-Next[13] MPF[14] 本文
汽车 91.4 92.5 90.9 93.4 93.5
自行车 25.7 38.7 36.4 30.2 55.8
摩托车 34.4 36.5 29.5 38.3 54.1
卡车 25.7 29.6 21.7 26.1 50.5
其他车辆 23.0 33.0 19.9 28.5 47.2
行人 38.3 45.6 52.0 48.1 65.1
自行车骑行者 38.8 46.2 52.7 46.1 62.3
摩托车骑行者 4.8 20.1 16.0 18.1 33.8
道路 91.8 91.7 90.9 90.6 91.7
停车区域 65.0 63.4 58.1 62.3 65.4
人行道 75.2 74.8 74.0 74.5 75.5
其他路面 27.8 26.4 27.8 30.6 22.5
建筑物 87.4 89.0 87.9 88.5 88.5
栅栏 58.6 59.4 58.2 59.7 62.4
植被 80.5 82.0 81.8 83.5 81.9
树干 55.1 58.7 61.7 59.7 65.1
绿地 64.6 65.4 66.3 69.2 66.2
线杆 47.9 49.6 51.7 49.7 56.8
交通标志 55.9 58.9 58.0 58.1 65.3
平均交并比 52.2 55.9 54.5 55.5 63.3
单帧时间/ms 82.3 124.3 40.7 31.0 42.6
)], attaches=null, journal=Journal(id=1189918244568731652, delFlag=0, nameCn=汽车工程师, nameEn=Automotive Engineer, nameHistory1=null, nameHistory2=null, issn=1674-6546, eissn=null, cn=22-1432/U, coden=null, periodic=0, language=CN, oaType=null, ccby=null, superviseOffice=null, ownerOffice=null, pubOffice=null, editorOffice=null, officeType=null, aims=null, clcCode=null, officeProv=null, officeCity=null, officeAddr=null, officeZip=null, officeEmail=null, officePhone=null, editDirector=null, officeDirector=null, officeDirectorPhone=null, officeStaffNum=null, officeEmpNum=null, coverPicUrl=+bJsKkKt/pjz9u6EwhnksQ==, journalPrice=null, startedYear=null, abbrevIsoEn=null, journalRemark=null, publicationField=null, createdTime=1761628217121, updatedTime=1761735708780, createdBy=18614031015, updatedBy=13701087609, firstLetterCn=A, firstLetterEn=A, subjectCode=Engineering, subjectName=Engineering, subjectCodeEn=Engineering, subjectNameEn=null, picCn=+bJsKkKt/pjz9u6EwhnksQ==, picEn=O3Sn3tnYYrh/jm6emnnMWA==, jcr=null, cjcr=null, exts=[JournalExt(id=1190369097415233706, language=CN, name=汽车工程师, nameHistory1=null, nameHistory2=null, managedBy=, sponsoredBy=, publishedBy=, editorOffice=, officeProv=null, officeCity=null, officeAddr=, officeZip=, editDirector=, officeDirector=null, officePhone=null, coverPicUrl=null, journalRemark=, submitArticleUrl=null, websiteUrl=, createdTime=1761735708812, updatedTime=1761735708812, createdBy=13701087609, updatedBy=13701087609, submissionGuidelinesUrl=, submissionAuthorUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=1, submissionEditorUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=3, submissionReviewUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=2, submissionCeEditorUrl=, submissionAeEditorUrl=, option={"copyright":""}), JournalExt(id=1190369097553645739, language=EN, name=Automotive Engineer, nameHistory1=null, nameHistory2=null, managedBy=, sponsoredBy=, publishedBy=, editorOffice=, officeProv=null, officeCity=null, officeAddr=, officeZip=, 
editDirector=, officeDirector=null, officePhone=null, coverPicUrl=null, journalRemark=, submitArticleUrl=null, websiteUrl=, createdTime=1761735708845, updatedTime=1761735708845, createdBy=13701087609, updatedBy=13701087609, submissionGuidelinesUrl=, submissionAuthorUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=1, submissionEditorUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=3, submissionReviewUrl=https://tjqc.cbpt.cnki.net/index.aspx?t=2, submissionCeEditorUrl=, submissionAeEditorUrl=, option={"copyright":""})], databaseList=null, tenantJournalId=1189918454225211397, websiteList=[Website(id=1189918982430847716, webName=null, webTitle=null, webDomain=null, webCopyrigh=null, webIpcNo=null, seoTitle=null, seoKeywords=null, seoDescription=null, tenantJournalId=null, journalId=1189918454225211397, journalNameCn=null, journalNameEn=null, grayFlag=null, tenantId=1146029695717560320, platformId=null, journalGroupId=null, journalGroupNameCn=null, journalGroupNameEn=null, type=1, domain=https://castjournals.cast.org.cn/joweb/qcgcs/CN, language=CN, createTime=1761628393037, createBy=18614031015, updateTime=1761628422913, updateBy=18614031015, name=汽车工程师-中文, tplId=1146099689490845704, title=汽车工程师, delFlag=0, indexPage=/home, props=[WebsiteProps(id=1189919800185917791, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1189918982430847716, code=articleTextType, value=kx, createTime=1761628588005, updateTime=1761628588005, creator=18614031015, updator=18614031015), WebsiteProps(id=1189919800164946268, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1189918982430847716, code=banner, value=null, createTime=1761628588000, updateTime=1761628588000, creator=18614031015, updator=18614031015), WebsiteProps(id=1189919800211083618, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1189918982430847716, code=grayFlag, value=0, createTime=1761628588011, updateTime=1761628588011, creator=18614031015, updator=18614031015), 
LiDAR Semantic Segmentation Network for Real-Time Multimodal Projection
(Original title: 基于多模态投影的激光雷达点云实时语义分割网络)
Binhong Tang (唐彬洪)
Chongqing Jiaotong University, Chongqing 400074
Automotive Engineer (汽车工程师) | Special Topic on Autonomous Driving Environment Perception and Positioning Technology at Chongqing Jiaotong University, 2024, (1): 12-18
Published: 2024-01-15. DOI: 10.20104/j.cnki.1674-6546.20230474

Traditional methods for LiDAR point cloud semantic segmentation struggle to balance inference speed and accuracy. To address this, this paper proposes a multimodal fusion network architecture for LiDAR point cloud semantic segmentation. Semantic features are extracted by point-grid modules, spatial and contextual information is aggregated by a spatial attention module, and semantic segmentation is performed by a 2D fully convolutional network (FCN) with a feature fusion pyramid module. Finally, 2D and 3D features are fused to reduce information loss, and the model weights are optimized with a loss function. Validation on the SemanticKITTI dataset shows that the model reaches a mean intersection over union (mIoU) of 63.3%. Compared with other strong algorithms, it balances real-time performance and accuracy, and it significantly improves the accuracy of LiDAR point cloud semantic segmentation.

Autonomous driving  /  LiDAR  /  Semantic segmentation  /  Deep learning
Binhong Tang. LiDAR Semantic Segmentation Network for Real-Time Multimodal Projection[J]. Automotive Engineer, 2024, (1): 12-18. DOI: 10.20104/j.cnki.1674-6546.20230474
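In autonomous driving, LiDAR sensors are widely used because they provide rich scene information and are highly resistant to interference. LiDAR point cloud semantic segmentation assigns a semantic label to each 3D point and adds semantic information to maps, which improves driving accuracy and safety.
Previous work has proposed various deep learning models for processing 3D LiDAR point clouds, including point-based methods, sparse-voxel-based methods, and 2D-projection-based methods. Point-based methods include PointNet[1] and PointNet++[2]. PointNet applies a point-wise multi-layer perceptron (point-wise MLP) to each input point independently. PointNet++ builds on it with sampling and grouping modules that aggregate local neighborhoods. PointCNN[3] uses point convolutions and introduces the X-transformation so that the convolution output is invariant to the ordering of the input points. RandLA-Net[4] also uses point-wise MLPs; it introduces a lightweight network structure that relies on random sampling and a local feature aggregator to capture spatial relationships and point features over a larger neighborhood. These point-based methods operate directly on unordered 3D point clouds and incur no information loss, but operations such as neighborhood search are relatively time-consuming.
Because 3D LiDAR point clouds are sparse, voxel-based methods quantize the point cloud into a 3D grid. VoxelNet[5] processes 3D voxel data with 3D convolutions combined with the fully convolutional network (FCN) structure used in image semantic segmentation. Huang Runhui et al.[6] proposed a 3D conical grid to address the sparsity and inconsistent density of LiDAR point clouds. Voxel-based methods regularize point cloud data, but voxelization itself introduces discretization artifacts and information loss, and at higher resolutions it suffers from low computational efficiency and high memory usage.
2D-projection-based methods apply mature 2D convolutional neural networks (CNNs) to 2D grid feature maps obtained by projecting the 3D point cloud. FCN[7] replaces the fully connected layers of the original network with convolutional layers. RangeNet++[8] uses the K-nearest neighbor (KNN) algorithm as a post-processing step, whereas SANet[9] combines spatial correlation with spatial attention as a pre-processing step.
To combine the strength of point-based methods (preserving complete point cloud information) with that of 2D-projection-based methods (real-time performance), this paper proposes a real-time LiDAR point cloud semantic segmentation network architecture based on multimodal projection. Point-to-grid (P2G) and grid-to-point (G2P) modules project points onto 2D grids in both the bird's-eye view and the range view and extract semantic features there. A spatial attention module aggregates the features, which are then fed into a feature fusion pyramid module and combined with the preliminarily processed 3D points to output the segmentation prediction. The proposed multimodal projection enriches the point cloud feature information, the spatial attention module processes the projected 2D grid feature maps, and a triple downsampling module is added to the 2D fully convolutional network to improve downsampling. Finally, the speed and accuracy of the network are tested on the SemanticKITTI dataset.
Accurate and fast LiDAR point cloud semantic segmentation requires both efficient extraction of semantic features and preservation of complete point cloud information. 2D-projection-based methods effectively reduce computation, and using the bird's-eye view and the range view together helps preserve the point cloud information. The proposed point-grid fusion module is shown in Fig. 1. Its steps are as follows: the point-to-grid module projects the input point features onto the bird's-eye view and the range view; the spatial attention module, which consists of an attention module and a context module, processes the projected features before they enter the 2D fully convolutional network, where the attention module uses a large receptive field to capture the spatial distribution and learn the more important features, and the context module aggregates contextual information by fusing receptive fields of different sizes; the 2D fully convolutional network processes the 2D feature maps to extract semantic features; the grid-to-point module transfers the 2D grid features back to the 3D points; and the point fusion module fuses the features of the preliminarily processed 3D points with those from the bird's-eye-view and range-view branches to keep the point cloud information complete.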
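The point-to-grid module converts 3D point features into a 2D grid feature map. As shown in Fig. 2, a suitable grid size is chosen first, and the $k$-th 3D point $p_k^{3D}=(x_k,y_k,z_k)$ is then projected onto the 2D grid to obtain its 2D coordinates $p_k^{2D}=(u_k,v_k)$. The set $R_{h,w}$ contains the indices of the points that fall into the same 2D grid cell $(h,w)$, that is, $R_{h,w}=\{k \mid \lfloor u_k \rfloor = h,\ \lfloor v_k \rfloor = w\}$, where $\lfloor u_k \rfloor$ and $\lfloor v_k \rfloor$ are the integer parts of $u_k$ and $v_k$. Each point is thus quantized to an integer cell index and stored in the corresponding grid cell. The 2D grid feature $G_{h,w,c}^{2D}$ is formed by taking, for each channel $c$, the maximum of the 3D features $F_{k,c}^{3D}$ over the points in the cell:
$G_{h,w,c}^{2D} = \max_{k \in R_{h,w}} F_{k,c}^{3D}$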
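The channel-wise maximum over each cell can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function name `point_to_grid_max`, the dictionary-based sparse grid, and the choice to skip points that fall outside the grid are all hypothetical.

```python
from math import floor


def point_to_grid_max(coords_2d, feats, height, width):
    """Max-pool per-point features into a sparse 2D grid.

    coords_2d: list of (u, v) continuous grid coordinates, one per point.
    feats:     list of per-point feature lists, each of length C.
    Returns a dict mapping (h, w) -> list of C channel-wise maxima.
    Cells that receive no point are absent from the dict.
    """
    grid = {}
    for (u, v), f in zip(coords_2d, feats):
        h, w = floor(u), floor(v)  # integer parts, as in the set R_{h,w}
        if not (0 <= h < height and 0 <= w < width):
            continue  # assumption: points outside the grid are skipped
        if (h, w) in grid:
            grid[(h, w)] = [max(a, b) for a, b in zip(grid[(h, w)], f)]
        else:
            grid[(h, w)] = list(f)
    return grid
```

A practical implementation would vectorize this scatter-max on the GPU; the loop form is used here only to make the per-cell reduction explicit.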
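The bird's-eye view drops the height dimension, that is, the $z$ dimension, whereas the range view drops the range dimension $r$. Using the two views together is therefore complementary and reduces the information loss of 2D projection. Both use a similar point-to-grid module and differ only in the 2D projection. The bird's-eye view is discretized on a rectangular 2D grid: the 3D point cloud is projected onto the $x$-$y$ plane within the rectangle $(x_{min},y_{min},x_{max},y_{max})$, which is discretized into width $W_{bev}$ and height $H_{bev}$:
$\begin{pmatrix} u_k \\ v_k \end{pmatrix} = \begin{pmatrix} \dfrac{x_k - x_{min}}{x_{max} - x_{min}} \times W_{bev} \\ \dfrac{y_k - y_{min}}{y_{max} - y_{min}} \times H_{bev} \end{pmatrix}$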
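As a worked illustration, the bird's-eye-view mapping can be written as a small function. The name `bev_coords` is hypothetical; the default bounds and resolution are the values the paper reports for its bird's-eye-view branch (a 600 × 600 grid covering -50 to 50 on both axes).

```python
def bev_coords(x, y, x_min=-50.0, y_min=-50.0, x_max=50.0, y_max=50.0,
               w_bev=600, h_bev=600):
    """Map a point's (x, y) to continuous bird's-eye-view grid coordinates.

    The z coordinate is dropped by this projection. The returned (u, v)
    are continuous; taking their integer parts gives the cell index.
    """
    u = (x - x_min) / (x_max - x_min) * w_bev
    v = (y - y_min) / (y_max - y_min) * h_bev
    return u, v
```

With these defaults each cell spans 100 / 600 of a coordinate unit per side, so the sensor origin (0, 0) lands at the grid center (300, 300).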
For the range view, each 3D point $p_k^{3D}=(x_k,y_k,z_k)$ is mapped to spherical coordinates $p_k^{sph}=(r_k,\theta_k,\varphi_k)$:
$\begin{pmatrix} r_k \\ \theta_k \\ \varphi_k \end{pmatrix} = \begin{pmatrix} \sqrt{x_k^2 + y_k^2 + z_k^2} \\ \arcsin\left(\dfrac{z_k}{\sqrt{x_k^2 + y_k^2 + z_k^2}}\right) \\ \arctan\left(\dfrac{y_k}{x_k}\right) \end{pmatrix}$
where $r_k$, $\theta_k$, and $\varphi_k$ are the range, the vertical angle, and the azimuth angle, respectively.
$\theta_k$ and $\varphi_k$ are then discretized and $r_k$ is ignored, giving a range view of width $W_{rv}$ and height $H_{rv}$:
$\begin{pmatrix} u_k \\ v_k \end{pmatrix} = \begin{pmatrix} \dfrac{1}{2}\left[1 - \varphi_k \pi^{-1}\right] W_{rv} \\ \left[1 - (\theta_k + f_{up}) f^{-1}\right] H_{rv} \end{pmatrix}$
where $f = f_{up} + f_{down}$ is the vertical field of view of the LiDAR, and $f_{up}$ and $f_{down}$ are its upper and lower parts.
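The spherical mapping and its discretization can be sketched as below. Several choices here are assumptions rather than details from the paper: the function name `range_view_coords`, the use of `math.atan2` so that the azimuth covers the full circle, the symmetric 15° upper and lower field of view, and offsetting the vertical angle by the lower field-of-view magnitude `f_down`, as common open-source range-projection code does. For a symmetric field of view (`f_up` equal to `f_down`) that offset coincides with the formula above. The default resolution of 1024 × 16 is the one the paper reports for its range-view branch.

```python
import math


def range_view_coords(x, y, z, f_up_deg=15.0, f_down_deg=15.0,
                      w_rv=1024, h_rv=16):
    """Map a 3D point to its range and continuous range-view coordinates.

    Assumes the point is not at the sensor origin (r > 0).
    Returns (r, u, v); the range r is not used for the grid position.
    """
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.asin(z / r)   # vertical angle
    phi = math.atan2(y, x)     # azimuth in (-pi, pi]
    f_up = math.radians(f_up_deg)
    f_down = math.radians(f_down_deg)  # magnitude of the lower part
    f = f_up + f_down                  # total vertical field of view
    u = 0.5 * (1.0 - phi / math.pi) * w_rv
    v = (1.0 - (theta + f_down) / f) * h_rv  # assumption: offset by f_down
    return r, u, v
```

A point straight ahead on the horizon maps to the image center, and a point at the top of the field of view maps to row 0.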
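The bird's-eye-view branch takes a 2D feature map of shape $(W_{bev}=600, H_{bev}=600)$ covering the range $(x_{min}=-50, y_{min}=-50, x_{max}=50, y_{max}=50)$. The range-view branch takes a 2D feature map of shape $(W_{rv}=1024, H_{rv}=16)$. For each grid cell, the aggregated feature is computed by max pooling over the features of the points inside it.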
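The spatial attention module consists of an attention module and a context module. In the bird's-eye view and the range view, contextual correlation shows up mainly in the strong attachment of vehicles and pedestrians to the road: pixels around vehicles and pedestrians very likely carry the road label. Spatial distribution regularity refers to the correlations and general patterns in how object classes are distributed in space; for example, pixels beside pedestrians and vegetation are likely to belong to the road. In addition, because of the way a LiDAR generates its point cloud, the road in the range view generally lies around the central axis and the bottom of the image.
As shown in Fig. 3, the 2D grid feature map produced by the point-to-grid module is fed into both the attention module and the context module.
The attention module uses large-kernel convolutions to obtain a large receptive field and applies a Sigmoid function to normalize the weights to the range 0 to 1, which scale each channel. The context module uses 1×1 convolutions and dilated convolutions to change the number of channels and enlarge the receptive field, and it fuses feature maps of different scales to obtain more accurate contextual information. The outputs of the two branches are then multiplied element-wise to obtain the spatial attention feature. In addition, the output of the context branch is added to this result to further update the spatial attention feature.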
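The way the two branch outputs are combined can be illustrated with a minimal sketch. It covers only the final fusion step (Sigmoid gating, element-wise product, residual addition of the context output); the convolutions that produce the two inputs are omitted, and the names `sigmoid` and `fuse_attention` as well as the single-channel list-of-lists layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_attention(attention_logits, context):
    """Combine attention and context maps of equal shape.

    attention_logits: 2D list of raw attention-branch outputs.
    context:          2D list of context-branch outputs.
    Returns sigmoid(attention) * context + context, element-wise.
    """
    return [[sigmoid(a) * c + c for a, c in zip(a_row, c_row)]
            for a_row, c_row in zip(attention_logits, context)]
```

Because the gate lies between 0 and 1, each output value lies between one and two times the context value: the residual term keeps the context feature even where the attention weight is near zero.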
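A 2D fully convolutional network with an encoder-decoder architecture is applied to the bird's-eye view and the range view separately to extract semantic features. The encoder is based on ResNet[10]; the network uses four encoders and three decoders, forming a lightweight nine-layer backbone. The two views use similar 2D fully convolutional networks, except that the range view is not downsampled along the height dimension. The numbers of feature channels are set to 64, 32, 64, 128, 128, 96, 64, and 64. High-resolution feature maps show more detail, such as contours, edges, and textures, whereas low-resolution feature maps carry more semantic information, for example about roads and large buildings. The decoder uses a feature pyramid to upsample and to fuse high-level and low-level feature maps.
In the downsampling stage, to retain more information, this paper proposes a triple downsampling module based on the Inception block. Three branches, built from 2D convolutions and 2D max pooling, extract features separately; their outputs are then concatenated along the channel dimension and passed through a rectified linear unit (ReLU) layer. By using 1×1 convolutions, the triple downsampling module limits parameter growth: it increases the depth and width of the network while reducing the number of model parameters and retaining more information, as shown in Fig. 4.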
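In contrast to the point-to-grid operation, the grid-to-point module extracts features from the cells of the 2D grid and maps them back to the 3D points. As shown in Fig. 5, it applies bilinear interpolation over the four neighboring grid cells, that is, the four cells closest to the point. Distance-based weights between the point and these four cells are computed, and each feature channel is interpolated from the four cell values with these weights:
$F_{k,c}^{3D} = \sum_{i=0}^{1} \sum_{j=0}^{1} w_{i,j,k}\, G_{\lfloor u_k \rfloor + i,\ \lfloor v_k \rfloor + j,\ c}^{2D}$
where $w_{i,j,k} = \left(1 - \left|u_k - (\lfloor u_k \rfloor + i)\right|\right)\left(1 - \left|v_k - (\lfloor v_k \rfloor + j)\right|\right)$ is the bilinear interpolation weight and $G_{\lfloor u_k \rfloor + i,\ \lfloor v_k \rfloor + j,\ c}^{2D}$ is the corresponding 2D grid feature.
Equation (5), the interpolation formula above, accounts for the distance from the point position $(u_k,v_k)$ to each target cell offset $(i,j)$. For points near the border, neighboring cells that fall outside the view are treated as invalid.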
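The bilinear grid-to-point lookup can be sketched in plain Python. This is a sketch under stated assumptions: the name `grid_to_point` and the dense nested-list grid layout `grid[h][w][c]` are hypothetical, and invalid neighbors outside the view are handled by leaving out their contribution, which is one possible reading of "treated as invalid".

```python
from math import floor


def grid_to_point(grid, u, v):
    """Bilinearly interpolate grid features at continuous position (u, v).

    grid: nested list indexed as grid[h][w][c] with C channels per cell.
    Returns a list of C interpolated feature values for the point.
    """
    height, width = len(grid), len(grid[0])
    channels = len(grid[0][0])
    h0, w0 = floor(u), floor(v)
    out = [0.0] * channels
    for i in (0, 1):
        for j in (0, 1):
            h, w = h0 + i, w0 + j
            if not (0 <= h < height and 0 <= w < width):
                continue  # assumption: invalid neighbors contribute nothing
            weight = (1 - abs(u - h)) * (1 - abs(v - w))
            for c in range(channels):
                out[c] += weight * grid[h][w][c]
    return out
```

For a point with all four neighbors inside the view the weights sum to one, so the result is a convex combination of the four cell features.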
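The point fusion module fuses the point features from the original point cloud, the bird's-eye view, and the range view. The feature channels are merged by concatenation, so that the fusion module takes the semantic information of both views into account. Unlike the segmentation network in SuMa++[11], this paper uses no post-processing module; the point fusion module alone, through feature concatenation followed by multi-layer perceptron layers, produces the final output, which makes the framework end-to-end.
The bird's-eye view and the range view have different resolutions, so the range of points that project onto each 2D grid differs. A point that projects outside the 2D grid of one view has an invalid feature in that view, but it can still carry information from the other view. In this paper, a point that lies outside the bird's-eye-view range but inside the range-view range is treated as valid and passes on its range-view feature.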
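The concatenation step, including a point that is valid in only one view, might look like the sketch below. The function name `fuse_point_features`, the use of `None` to mark a view in which the point is out of range, and the zero-filling of the missing feature are assumptions made for illustration; the paper does not specify how an invalid feature is represented.

```python
def fuse_point_features(point_feat, bev_feat, rv_feat, bev_dim, rv_dim):
    """Concatenate per-point features from the three sources.

    point_feat: feature list from the preliminarily processed 3D point.
    bev_feat:   bird's-eye-view feature list, or None if out of range.
    rv_feat:    range-view feature list, or None if out of range.
    bev_dim, rv_dim: channel counts used for zero-filling (assumption).
    """
    bev = list(bev_feat) if bev_feat is not None else [0.0] * bev_dim
    rv = list(rv_feat) if rv_feat is not None else [0.0] * rv_dim
    return list(point_feat) + bev + rv
```

The fused vector always has the same length, so it can be passed to the following multi-layer perceptron layers regardless of which views cover the point.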
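The segmentation prediction is obtained by passing the output features of the point fusion module through a fully connected layer. Because of the characteristics of LiDAR point cloud data and the imbalance in the amount of data per label in LiDAR semantic segmentation datasets, a segmentation network has difficulty learning small classes. Within a single frame, environmental elements such as roads, cars, vegetation, and buildings occupy a clearly larger share than classes such as motorcycles and pedestrians. Across the whole dataset, roads, sidewalks, and buildings are hundreds of times more frequent than pedestrians and cyclists. This extreme imbalance biases the network toward high-frequency classes during training and makes low-frequency classes hard to extract and predict. To reduce the resulting performance loss, this paper uses a weighted cross-entropy loss that emphasizes low-frequency classes:
$L_{wce} = -\sum_{c=1}^{C} \alpha_c\, y_c \log\left(\hat{y}_c\right)$
with
$\alpha_c = \dfrac{1}{F_c + \epsilon}$
where $y_c$ is the ground-truth label, $\hat{y}_c$ is the predicted probability, $F_c$ is the frequency of class $c$ in the whole dataset, $\alpha_c$ is the weight of class $c$, $\epsilon$ is a small positive number that prevents division by zero, and $C$ is the number of classes in the dataset.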
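A minimal sketch of the weighted cross-entropy for a single point is given below. The function names and the default value of `eps` are assumptions for illustration; the loss follows the two formulas above, with a one-hot ground-truth vector.

```python
import math


def class_weights(frequencies, eps=1e-3):
    """Inverse-frequency weights: alpha_c = 1 / (F_c + eps)."""
    return [1.0 / (f + eps) for f in frequencies]


def weighted_cross_entropy(y_true, y_pred, weights):
    """Weighted cross-entropy for one point.

    y_true:  one-hot list marking the ground-truth class.
    y_pred:  list of predicted class probabilities (each > 0 where used).
    weights: per-class weights alpha_c.
    """
    return -sum(a * y * math.log(p)
                for a, y, p in zip(weights, y_true, y_pred) if y > 0)
```

With class frequencies 0.5 and 0.1, the rarer class receives a weight about five times larger, so the same prediction error on a rare-class point contributes about five times as much to the loss.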
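To evaluate the proposed algorithm, the SemanticKITTI dataset is used for training and testing. The dataset contains 43,552 frames of 360° LiDAR scans, divided into 22 sequences. Sequences 00 to 07 and 09 to 10, 19,130 frames in total, form the training set; sequence 08, with 4,071 frames, is the validation set; and sequences 11 to 21, 20,351 frames in total, form the test set. A Velodyne VLP-16 LiDAR with 16 vertical beams is used, and each frame contains about $3.2 \times 10^4$ points. The mean intersection over union (mIoU) is used to evaluate performance:
$I_m = \dfrac{1}{S} \sum_{i=1}^{S} \dfrac{T_i}{T_i + F_i + N_i}$
where $S$ is the number of classes; $T_i$ is the number of true positives for class $i$, that is, points correctly predicted as class $i$; $F_i$ is the number of false positives, that is, points wrongly predicted as class $i$; and $N_i$ is the number of false negatives, that is, points of class $i$ that the model failed to predict as class $i$.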
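The metric can be computed from per-point labels as in the sketch below. The function name `mean_iou` is hypothetical, and skipping classes that appear in neither the ground truth nor the prediction (rather than dividing by zero) is an assumption; when every class is present, the result equals the formula above.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union from per-point integer labels."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    ious = []
    for i in range(num_classes):
        denom = tp[i] + fp[i] + fn[i]
        if denom > 0:  # assumption: classes absent everywhere are skipped
            ious.append(tp[i] / denom)
    return sum(ious) / len(ious) if ious else 0.0
```

For ground truth `[0, 0, 1, 1]` and prediction `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.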
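All tests were run on a GeForce RTX 4090 graphics processing unit (GPU).
To verify the effectiveness of each module, ablation experiments were carried out on the dataset. The module configurations and results are listed in Table 1.
The proposed algorithm was compared with other state-of-the-art algorithms on the SemanticKITTI test set; the results are listed in Table 2. Table 2 shows the following: compared with RangeNet++[8], the proposed algorithm improves both accuracy and speed, and its accuracy is clearly higher; compared with SqueezeSegV3[12], it improves accuracy while greatly improving per-frame speed; compared with SalsaNext[13] and MPF[14], it is slightly slower, but its accuracy is at least 7.8 percentage points higher than that of MPF. These results demonstrate the efficiency of the proposed multimodal projection semantic segmentation network.
In addition, Fig. 6 shows a visual comparison between the proposed algorithm and other algorithms on sequence 21. The proposed algorithm predicts objects stably while maintaining high accuracy. For small objects, it correctly segments the "pedestrian" object, whereas the other algorithms are less accurate or miss it altogether.
This paper proposes a real-time LiDAR point cloud segmentation algorithm based on multimodal projection. Point and grid modules convert between 2D and 3D feature maps; the spatial attention module and the feature pyramid fusion module are combined, together with a triple downsampling module, to extract important positional information efficiently and to integrate rich semantic features; and the 3D point cloud, the bird's-eye view, and the range view complement one another's point cloud information.
Comparison with state-of-the-art algorithms such as RangeNet++, SqueezeSegV3, SalsaNext, and MPF confirms that the proposed algorithm offers a favorable trade-off between accuracy and speed. The visualization results show that it performs well on small objects, with high accuracy and stability.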
References
[1] QI C R, SU H, MO K, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 652-660.
[2] QI C R, YI L, SU H, et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space[C]// Advances in Neural Information Processing Systems, 2017, 30: 5105-5114.
[3] LI Y Y, BU R, SUN M C, et al. PointCNN: Convolution on X-Transformed Points[C]// Advances in Neural Information Processing Systems, 2018, 31: 820-830.
[4] HU Q Y, YANG B, XIE L H, et al. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2020: 11108-11117.
[5] ZHOU Y, TUZEL O. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 4490-4499.
[6] HUANG R H, HU L K, SU M F, et al. Semantic Segmentation of Lidar Point Cloud Based on 3D Conical Grids[J]. Automotive Engineering, 2022(8): 1173-1182. (in Chinese)
[7] LONG J, SHELHAMER E, DARRELL T. Fully Convolutional Networks for Semantic Segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 3431-3440.
[8] MILIOTO A, VIZZO I, BEHLEY J, et al. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation[C]// 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE, 2019.
[9] WANG W Q, YOU X, SU M Z, et al. SANet: Real-Time Semantic Segmentation of LiDAR Point Clouds with Spatial Attention Mechanisms[J]. Bulletin of Surveying and Mapping, 2022(11): 32-38. (in Chinese)
[10] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
[11] CHEN X Y L, MILIOTO A, PALAZZOLO E, et al. SuMa++: Efficient LiDAR-Based Semantic SLAM[C]// 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE, 2019.
[12] XU C F, WU B C, WANG Z N, et al. SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation[C]// 16th European Conference on Computer Vision, Glasgow, UK, 2020.
[13] CORTINHAL T, TZELEPIS G, AKSOY E E. SalsaNext: Fast, Uncertainty-Aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving[C]// 15th International Symposium on Advances in Vision Computing. San Diego, California, USA: Springer, Cham, 2020.
[14] ALNAGGAR Y A, AFIFI M, AMER K, et al. Multi Projection Fusion for Real-Time Semantic Segmentation of 3D LiDAR Point Clouds[C]// 2021 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, HI, USA: IEEE, 2021: 1800-1809.
Publication history: revised 2023-11-07; published 2024-01-15; first online 2025-11-25.
Article URL: https://castjournals.cast.org.cn/joweb/qcgcs/CN/10.20104/j.cnki.1674-6546.20230474