Article(id=1200070652263756232, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070646895051378, articleNumber=null, orderNo=null, doi=10.20104/j.cnki.1674-6546.20240158, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=null, receivedDateStr=null, revisedDate=1714752000000, revisedDateStr=2024-05-04, acceptedDate=null, acceptedDateStr=null, onlineDate=1764048739733, onlineDateStr=2025-11-25, pubDate=1723651200000, pubDateStr=2024-08-15, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1764048739733, onlineIssueDateStr=2025-11-25, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1764048739733, creator=13701087609, updateTime=1764048739733, updator=13701087609, issue=Issue{id=1200070646895051378, tenantId=1146029695717560320, journalId=1189918454225211397, year='2024', volume='', issue='8', pageStart='1', pageEnd='48', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=null, createTime=1764048738454, creator=13701087609, updateTime=1764049350066, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1200073212257203051, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070646895051378, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1200073212257203052, tenantId=1146029695717560320, journalId=1189918454225211397, issueId=1200070646895051378, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=15, endPage=21, ext={EN=ArticleExt(id=1200070652523803085, articleId=1200070652263756232, tenantId=1146029695717560320, journalId=1189918454225211397, 
language=EN, title=Infrared Pedestrian Object Detection Algorithm Based on Improved YOLOv7, columnId=1200070647679386243, journalTitle=Automotive Engineer, columnName=Special Issue on Intelligent Vehicle Environmental Perception and Target Detection Technology, runingTitle=null, highlight=null, articleAbstract=

To address the incomplete detection and high false-detection rate caused by weak pedestrian features, dense small targets, and complex backgrounds in infrared images, this paper proposes an infrared pedestrian detection algorithm based on an improved YOLOv7. First, taking the YOLOv7-tiny model as the baseline, the original Spatial Pyramid Pooling (SPP) module is replaced by a Channel Attention based Spatial Pyramid Pooling (CASPP) module, so that the model pays more attention to extracting pedestrian features. Second, a convolution module (CBM) based on the Meta-ACON activation function is introduced to further suppress background noise and preserve pedestrian details. Finally, an alpha fusion data augmentation method is proposed to enrich sample diversity and improve the stability of the model in complex environments. Validation on the FLIR dataset shows that, compared with YOLOv7-tiny, the proposed method improves accuracy by 3% (mAP at IoU=0.5 rises from 0.75 to 0.78) and reduces computation by 38%, making it better suited to infrared pedestrian detection scenarios.
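
The abstract names the Meta-ACON activation (reference [16], "Activate or Not: Learning Customized Activation") without stating its form. The sketch below is only an illustration of the ACON-C formula from that reference, f(x) = (p1 - p2)·x·sigmoid(beta·(p1 - p2)·x) + p2·x, with beta derived from a per-channel mean. It is not the paper's CBM module: the function names, the scalar weight `w`, and the single-channel simplification are assumptions made here (the reference generates beta with a small learned network across channels).

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float, p2: float, beta: float) -> float:
    """ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def meta_beta(channel: list[list[float]], w: float = 1.0) -> float:
    """Hypothetical stand-in for the switching network: beta is the sigmoid
    of a scaled spatial mean of one channel. `w` is an illustrative scalar,
    not a learned layer."""
    count = sum(len(row) for row in channel)
    mean = sum(sum(row) for row in channel) / count
    return sigmoid(w * mean)


def meta_acon_channel(channel: list[list[float]],
                      p1: float = 1.0, p2: float = 0.0,
                      w: float = 1.0) -> list[list[float]]:
    """Apply ACON-C to one feature-map channel with a data-dependent beta."""
    beta = meta_beta(channel, w)
    return [[acon_c(v, p1, p2, beta) for v in row] for row in channel]
```

With beta = 0 the function is linear, (p1 + p2)/2 · x; as beta grows it approaches max(p1·x, p2·x). With p1 = 1 and p2 = 0 it reduces to Swish, x·sigmoid(beta·x).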

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Changhai Li), CN=ArticleExt(id=1200070655686308369, articleId=1200070652263756232, tenantId=1146029695717560320, journalId=1189918454225211397, language=CN, title=基于改进YOLOv7的红外行人目标检测方法, columnId=1200070647847158411, journalTitle=汽车工程师, columnName=智能车辆环境感知与目标检测技术专刊, runingTitle=null, highlight=null, articleAbstract=

针对红外行人目标检测过程中,图像中行人目标特征不显著、小目标密集、背景复杂等因素导致的检测不全、误检率高等问题,提出了一种基于改进YOLOv7的红外行人目标检测算法。首先,以YOLOv7-tiny模型为基础,采用基于通道注意力机制的空间金字塔池化(CASPP)模块替换原始空间金字塔池化(SPP)模块,使模型更注重行人特征的提取;然后,引入基于Meta-ACON激活函数的卷积模块(CBM),进一步抑制背景噪声,保留行人细节;最后,提出一种alpha融合数据增强方法,以丰富样本的多样性,提高模型在复杂环境中的稳定性。基于FLIR数据集的验证结果表明,与YOLOv7-tiny算法相比,所提出的方法精度提高了3%,计算量减少了38%,更适用于红外行人目标检测场景。

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=nAlDjXpBaeY5TA2+eSlL8g==, magXml=GzKk+RtgGdXbQ8P/i+HNRw==, pdfUrl=null, pdf=e41/We9zwvMUn1p+DVs7Sg==, pdfFileSize=7502426, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=a6c7zr9iGj8vD4D5f6zVvA==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=VGDQdLPCA6Hp8agYhHZflg==, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=李长海)}, authors=[Author(id=1200070656051212846, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1200070656172847673, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, authorId=1200070656051212846, language=EN, stringName=Changhai Li, firstName=Changhai, middleName=null, lastName=Li, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=University of Electronic Science and Technology of China, Chengdu 611731, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1200070656323842622, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, authorId=1200070656051212846, language=CN, stringName=李长海, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=电子科技大学, 成都 611731, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1200070655946355237, tenantId=1146029695717560320, 
journalId=1189918454225211397, articleId=1200070652263756232, xref=null, ext=[AuthorCompanyExt(id=1200070655954743845, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, companyId=1200070655946355237, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=University of Electronic Science and Technology of China, Chengdu 611731), AuthorCompanyExt(id=1200070655963132455, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, companyId=1200070655946355237, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=电子科技大学, 成都 611731)])])], keywords=[Keyword(id=1200070656483226186, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, orderNo=1, keyword=Infrared image), Keyword(id=1200070656592278098, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, orderNo=2, keyword=Pedestrian detection), Keyword(id=1200070656684552792, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, orderNo=3, keyword=Attention mechanism), Keyword(id=1200070656852324958, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, orderNo=4, keyword=Meta-ACON), Keyword(id=1200070656961376871, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, orderNo=5, keyword=YOLOv7), Keyword(id=1200070657087205998, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, orderNo=1, keyword=红外图像), Keyword(id=1200070657313698422, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, orderNo=2, keyword=行人检测), Keyword(id=1200070657426944638, tenantId=1146029695717560320, 
journalId=1189918454225211397, articleId=1200070652263756232, language=CN, orderNo=3, keyword=注意力机制), Keyword(id=1200070657561162375, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, orderNo=4, keyword=Meta-ACON), Keyword(id=1200070657670214283, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, orderNo=5, keyword=YOLOv7)], refs=[Reference(id=1200070661168264049, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=1988, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=HARRIS C G, STEPHENS M J, journalName=Alvey Vision Conference, refType=null, unstructuredReference=HARRIS C G, STEPHENS M J. A Combined Corner and Edge Detector[C]// Alvey Vision Conference. Manchester, UK: University of Manchester, 1988., articleTitle=A Combined Corner and Edge Detector, refAbstract=null), Reference(id=1200070661260538748, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2004, volume=60, issue=2, pageStart=91, pageEnd=110, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=LOWE D G, journalName=International Journal of Computer Vision, refType=null, unstructuredReference=LOWE D G. Distinctive Image Features from Scale-Invariant Keypoints[J]. 
International Journal of Computer Vision, 2004, 60(2): 91-110., articleTitle=Distinctive Image Features from Scale-Invariant Keypoints, refAbstract=null), Reference(id=1200070661407339404, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2005, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=DALAL N, TRIGGS B, journalName=2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=DALAL N, TRIGGS B. Histograms of Oriented Gradients for Human Detection[C]// 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE, 2005., articleTitle=Histograms of Oriented Gradients for Human Detection, refAbstract=null), Reference(id=1200070661533168540, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2011, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[4], rfOrder=3, authorNames=RUBLEE E, RABAUD V, KONOLIGE K, journalName=International Conference on Computer Vision. Barcelona, Spain:IEEE, refType=null, unstructuredReference=RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: An Efficient Alternative to SIFT or SURF[C]// International Conference on Computer Vision. 
Barcelona, Spain:IEEE, 2011., articleTitle=ORB: An Efficient Alternative to SIFT or SURF, refAbstract=null), Reference(id=1200070661700940715, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2017, volume=39, issue=6, pageStart=1137, pageEnd=1149, url=null, language=null, rfNumber=[5], rfOrder=4, authorNames=REN S Q, HE K M, GIRSHICK R, journalName=IEEE Transactions on Pattern Analysis and Machine Intelligence, refType=null, unstructuredReference=REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149., articleTitle=Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, refAbstract=null), Reference(id=1200070661864518584, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=40, issue=8, pageStart=126, pageEnd=129, url=null, language=null, rfNumber=[6], rfOrder=5, authorNames=胡均平, 孙希, journalName=传感器与微系统, refType=null, unstructuredReference=胡均平, 孙希. 基于改进Faster R-CNN的近红外夜间行人检测方法[J]. 传感器与微系统, 2021, 40(8): 126-129., articleTitle=基于改进Faster R-CNN的近红外夜间行人检测方法, refAbstract=null), Reference(id=1200070661986153417, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=40, issue=8, pageStart=126, pageEnd=129, url=null, language=null, rfNumber=[6], rfOrder=6, authorNames=HU J P, SUN X, journalName=Sensors and Microsystems, refType=null, unstructuredReference=HU J P, SUN X. Near-Infrared Nighttime Pedestrian Detection Based on Improved Faster R-CNN[J]. 
Sensors and Microsystems, 2021, 40(8): 126-129., articleTitle=Near-Infrared Nighttime Pedestrian Detection Based on Improved Faster R-CNN, refAbstract=null), Reference(id=1200070662162314207, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2016, volume=null, issue=null, pageStart=770, pageEnd=778, url=null, language=null, rfNumber=[7], rfOrder=7, authorNames=HE K M, ZHANG X Y, REN S Q, journalName=2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), refType=null, unstructuredReference=HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 770-778., articleTitle=Deep Residual Learning for Image Recognition, refAbstract=null), Reference(id=1200070662346863603, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=null, pageEnd=null, url=https://arxiv.org/abs/1804.02767, language=null, rfNumber=[8], rfOrder=8, authorNames=REDMON J, FARHADI A, journalName=null, refType=null, unstructuredReference=REDMON J, FARHADI A. YOLOv3:An Incremental Improvement[EB/OL]. (2018-04-08) [2024-05-04]. https://arxiv.org/abs/1804.02767, articleTitle=YOLOv3:An Incremental Improvement, refAbstract=null), Reference(id=1200070662493663236, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=null, pageEnd=null, url=https://arxiv.org/abs/2004.10934, language=null, rfNumber=[9], rfOrder=9, authorNames=BOCHKOVSKIY A, WANG C Y, LIAO H Y M, journalName=null, refType=null, unstructuredReference=BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4:Optimal Speed and Accuracy of Object Detection[EB/OL]. (2020-04-23) [2024-05-04]. 
https://arxiv.org/abs/2004.10934, articleTitle=YOLOv4:Optimal Speed and Accuracy of Object Detection, refAbstract=null), Reference(id=1200070662602715154, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=39, issue=10, pageStart=19, pageEnd=22, url=null, language=null, rfNumber=[10], rfOrder=10, authorNames=刘怡帆, 王旭飞, 周鹏, journalName=数字技术与应用, refType=null, unstructuredReference=刘怡帆, 王旭飞, 周鹏, 等. 基于YOLOv4神经网络的红外图像道路行人检测[J]. 数字技术与应用, 2021, 39(10): 19-22., articleTitle=基于YOLOv4神经网络的红外图像道路行人检测, refAbstract=null), Reference(id=1200070662720155682, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=39, issue=10, pageStart=19, pageEnd=22, url=null, language=null, rfNumber=[10], rfOrder=11, authorNames=LIU Y F, WANG X F, ZHOU P, journalName=Digital Technology and Applications, refType=null, unstructuredReference=LIU Y F, WANG X F, ZHOU P, et al. Infrared Image Road Pedestrian Detection Based on YOLOv4 Neural Network[J]. Digital Technology and Applications, 2021, 39(10): 19-22., articleTitle=Infrared Image Road Pedestrian Detection Based on YOLOv4 Neural Network, refAbstract=null), Reference(id=1200070663940698170, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=11, issue=8, pageStart=31, pageEnd=34, url=null, language=null, rfNumber=[11], rfOrder=12, authorNames=史健婷, 张贵强, 陶金, journalName=智能计算机与应用, refType=null, unstructuredReference=史健婷, 张贵强, 陶金, 等. 改进的YOLOv4红外图像行人检测算法[J]. 
智能计算机与应用, 2021, 11(8): 31-34+41., articleTitle=改进的YOLOv4红外图像行人检测算法, refAbstract=null), Reference(id=1200070664125247568, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=11, issue=8, pageStart=31, pageEnd=34, url=null, language=null, rfNumber=[11], rfOrder=13, authorNames=SHI J T, ZHANG G Q, TAO J, journalName=Intelligent Computers and Applications, refType=null, unstructuredReference=SHI J T, ZHANG G Q, TAO J, et al. Improved Pedestrian Detection Algorithm for YOLOv4 Infrared Images[J]. Intelligent Computers and Applications, 2021, 11(8): 31-34+41., articleTitle=Improved Pedestrian Detection Algorithm for YOLOv4 Infrared Images, refAbstract=null), Reference(id=1200070664381100127, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[12], rfOrder=14, authorNames=HAN K, WANG Y H, TIAN Q, journalName=2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), refType=null, unstructuredReference=HAN K, WANG Y H, TIAN Q, et al. GhostNet:More Features From Cheap Operations[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA: IEEE, 2020., articleTitle=GhostNet:More Features From Cheap Operations[C], refAbstract=null), Reference(id=1200070664523706478, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2023, volume=53, issue=1, pageStart=57, pageEnd=63, url=null, language=null, rfNumber=[13], rfOrder=15, authorNames=王晓红, 陈哲奇, journalName=激光与红外, refType=null, unstructuredReference=王晓红, 陈哲奇. 基于YOLOv5算法的红外图像行人检测研究[J]. 
激光与红外, 2023, 53(1): 57-63., articleTitle=基于YOLOv5算法的红外图像行人检测研究, refAbstract=null), Reference(id=1200070664636952698, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2023, volume=53, issue=1, pageStart=57, pageEnd=63, url=null, language=null, rfNumber=[13], rfOrder=16, authorNames=WANG X H, CHEN Z Q, journalName=Laser and Infrared, refType=null, unstructuredReference=WANG X H, CHEN Z Q. Research on Pedestrian Detection in Infrared Images Based on YOLOv5 Algorithm[J]. Laser and Infrared, 2023, 53(1): 57-63., articleTitle=Research on Pedestrian Detection in Infrared Images Based on YOLOv5 Algorithm, refAbstract=null), Reference(id=1200070664766976138, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2022, volume=12, issue=6, pageStart=33, pageEnd=38, url=null, language=null, rfNumber=[14], rfOrder=17, authorNames=李阳, 赵娟, 严运兵, journalName=智能计算机与应用, refType=null, unstructuredReference=李阳, 赵娟, 严运兵. 基于改进型YoloV5s的热红外道路车辆及行人检测方法[J]. 智能计算机与应用, 2022, 12(6): 33-38., articleTitle=基于改进型YoloV5s的热红外道路车辆及行人检测方法, refAbstract=null), Reference(id=1200070664905388185, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2022, volume=12, issue=6, pageStart=33, pageEnd=38, url=null, language=null, rfNumber=[14], rfOrder=18, authorNames=LI Y, ZHAO J, YAN Y B, journalName=Intelligent Computers and Applications, refType=null, unstructuredReference=LI Y, ZHAO J, YAN Y B. Thermal Infrared Road Vehicle and Pedestrian Detection Based on Improved YoloV5s[J]. 
Intelligent Computers and Applications, 2022, 12(6): 33-38., articleTitle=Thermal Infrared Road Vehicle and Pedestrian Detection Based on Improved YoloV5s, refAbstract=null), Reference(id=1200070665039605930, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2023, volume=null, issue=null, pageStart=7464, pageEnd=7475, url=null, language=null, rfNumber=[15], rfOrder=19, authorNames=WANG C Y, BOCHKOVSKIY A, LIAO H Y M, journalName=2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), refType=null, unstructuredReference=WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, BC, Canada: IEEE, 2023: 7464-7475., articleTitle=YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors, refAbstract=null), Reference(id=1200070665186406582, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2021, volume=null, issue=null, pageStart=8028, pageEnd=8038, url=null, language=null, rfNumber=[16], rfOrder=20, authorNames=MA N N, ZHANG X Y, LIU M, journalName=2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), refType=null, unstructuredReference=MA N N, ZHANG X Y, LIU M, et al. Activate or Not:Learning Customized Activation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 
Nashville, TN, USA: IEEE, 2021: 8028-8038., articleTitle=Activate or Not:Learning Customized Activation, refAbstract=null), Reference(id=1200070665333207232, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2009, volume=6, issue=3, pageStart=27, pageEnd=30, url=null, language=null, rfNumber=[17], rfOrder=21, authorNames=张展宏, journalName=华北科技学院学报, refType=null, unstructuredReference=张展宏. 基于模拟器的驾驶员应急状态下刹车反应时间的研究[J]. 华北科技学院学报, 2009, 6(3): 27-30., articleTitle=基于模拟器的驾驶员应急状态下刹车反应时间的研究, refAbstract=null), Reference(id=1200070665492590795, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2009, volume=6, issue=3, pageStart=27, pageEnd=30, url=null, language=null, rfNumber=[17], rfOrder=22, authorNames=ZHANG Z H, journalName=Journal of North China Institute of Science and Technology, refType=null, unstructuredReference=ZHANG Z H. A Simulator-Based Study of Driver Braking Reaction Time in Emergency Situations[J]. Journal of North China Institute of Science and Technology, 2009, 6(3): 27-30., articleTitle=A Simulator-Based Study of Driver Braking Reaction Time in Emergency Situations, refAbstract=null), Reference(id=1200070665656168659, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, doi=null, pmid=null, pmcid=null, year=2014, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[18], rfOrder=23, authorNames=LIN T Y, MAIRE M, BELONGIE J S, journalName=Microsoft COCO: Common Objects in Context, refType=null, unstructuredReference=LIN T Y, MAIRE M, BELONGIE J S, et al. Microsoft COCO: Common Objects in Context[M]// FLEET D, PAJDLA T, SCHIELE B, et al. Computer Vision - ECCV 2014. 
Cham, Switzerland: Springer International Publishing, 2014., articleTitle=null, refAbstract=null)], funds=null, companyList=[AuthorCompany(id=1200070655946355237, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, xref=null, ext=[AuthorCompanyExt(id=1200070655954743845, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, companyId=1200070655946355237, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=University of Electronic Science and Technology of China, Chengdu 611731), AuthorCompanyExt(id=1200070655963132455, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, companyId=1200070655946355237, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=电子科技大学, 成都 611731)])], figs=[ArticleFig(id=1200070657858957980, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=ZoW6VF6GemWrWWJaHMo9YQ==, figureFileBig=a6c7zr9iGj8vD4D5f6zVvA==, tableContent=null), ArticleFig(id=1200070657968009890, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图1, caption=改进后模型整体结构, figureFileSmall=ZoW6VF6GemWrWWJaHMo9YQ==, figureFileBig=a6c7zr9iGj8vD4D5f6zVvA==, tableContent=null), ArticleFig(id=1200070658257416884, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=2xIUkla1k5rGk7mbMY9XDw==, figureFileBig=In/dg46k6W5CQYVMY1oQ1w==, tableContent=null), ArticleFig(id=1200070658337108670, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图2, caption=基于通道注意力机制的CASPP模块, figureFileSmall=2xIUkla1k5rGk7mbMY9XDw==, figureFileBig=In/dg46k6W5CQYVMY1oQ1w==, 
tableContent=null), ArticleFig(id=1200070659473765070, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=Of5XSiPOYPARWjH2R7xJTA==, figureFileBig=PGgoyUgYvHFHpSiT8qpZ0Q==, tableContent=null), ArticleFig(id=1200070659582816983, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图3, caption=CBM模块, figureFileSmall=Of5XSiPOYPARWjH2R7xJTA==, figureFileBig=PGgoyUgYvHFHpSiT8qpZ0Q==, tableContent=null), ArticleFig(id=1200070659683480291, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=EKvQSKYIuk3jil/WrMWs0w==, figureFileBig=Wut/Be/AAsZ7IeUzo9zTMw==, tableContent=null), ArticleFig(id=1200070659817698032, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图4, caption=alpha融合数据增强算法流程, figureFileSmall=EKvQSKYIuk3jil/WrMWs0w==, figureFileBig=Wut/Be/AAsZ7IeUzo9zTMw==, tableContent=null), ArticleFig(id=1200070659972887292, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=mA/dTSjRKTlrxYcReGc2oA==, figureFileBig=JNpXXs4Yo5G6uznaLTA6BQ==, tableContent=null), ArticleFig(id=1200070660115493637, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图5, caption=alpha融合数据增强效果, figureFileSmall=mA/dTSjRKTlrxYcReGc2oA==, figureFileBig=JNpXXs4Yo5G6uznaLTA6BQ==, tableContent=null), ArticleFig(id=1200070660237128463, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=78x1A73sfWJDjODmsO8ECw==, figureFileBig=0Wv237zCaUHAIz8yYWD8uw==, tableContent=null), ArticleFig(id=1200070660354568991, tenantId=1146029695717560320, 
journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图6, caption=CLAHE对比度增强, figureFileSmall=78x1A73sfWJDjODmsO8ECw==, figureFileBig=0Wv237zCaUHAIz8yYWD8uw==, tableContent=null), ArticleFig(id=1200070660438455077, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=a1RsbHw6ri8afEZi+pRQtg==, figureFileBig=Lc/6z5iNU1lo+ZeR6+qCzQ==, tableContent=null), ArticleFig(id=1200070660564284209, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=图7, caption=损失和精度值的变化过程, figureFileSmall=a1RsbHw6ri8afEZi+pRQtg==, figureFileBig=Lc/6z5iNU1lo+ZeR6+qCzQ==, tableContent=null), ArticleFig(id=1200070660669141820, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=null, figureFileBig=null, tableContent=
Vehicle speed v/(km·h^-1)  Safe distance d/m
>100                       >100
60~100                     v
<60                        ≥(v-10)
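
The three rows above define a piecewise rule mapping vehicle speed to a reference safe distance. A minimal sketch of that rule follows. The function name is hypothetical, the 60 and 100 km/h boundaries are assumed to belong to the middle band, and clamping to zero below 10 km/h is an assumption, since the table does not cover that case. For speeds above 100 km/h the table only says the distance must exceed 100 m, so the function returns that bound.

```python
def reference_safe_distance_m(speed_kmh: float) -> float:
    """Lower bound on the reference safe distance (metres) for a speed in km/h."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    if speed_kmh > 100:
        # Table row ">100 -> >100": the distance must exceed this bound.
        return 100.0
    if speed_kmh >= 60:
        # Table row "60~100 -> v": distance in metres equals speed in km/h.
        return float(speed_kmh)
    # Table row "<60 -> >=(v-10)"; clamped at zero (assumption).
    return max(speed_kmh - 10.0, 0.0)
```

For example, 80 km/h maps to 80 m and 50 km/h to at least 40 m.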
), ArticleFig(id=1200070660803359561, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=表1, caption=

安全距离参考值

, figureFileSmall=null, figureFileBig=null, tableContent=
车速v/km·h-1 安全距离d/m
>100 >100
60~100 v
<60 ≥(v-10)
), ArticleFig(id=1200070660916605779, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=EN, label=null, caption=null, figureFileSmall=null, figureFileBig=null, tableContent=
Model            mAP (IoU=0.5)  Parameters/MB  Floating-point operations/(×10^9 ops·s^-1)
YOLOv3-tiny      0.66           8.67           12.9
YOLOv5s          0.75           7.01           15.8
YOLOv7-tiny      0.75           6.01           13.0
Proposed method  0.78           6.75           8.0
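
The headline figures in the abstract can be checked against the last two rows of this table (YOLOv7-tiny and the proposed method). The short sketch below does that arithmetic. The variable and function names are illustrative, and the values are copied from the table.

```python
# (mAP at IoU=0.5, parameters in MB, floating-point operations in units of 1e9)
BASELINE = (0.75, 6.01, 13.0)   # YOLOv7-tiny row
PROPOSED = (0.78, 6.75, 8.0)    # proposed-method row


def compare(baseline: tuple, proposed: tuple) -> dict:
    """Return the mAP gain in percentage points and relative changes in percent."""
    b_map, b_params, b_flops = baseline
    p_map, p_params, p_flops = proposed
    return {
        "map_gain_points": (p_map - b_map) * 100.0,
        "flops_reduction_pct": (b_flops - p_flops) / b_flops * 100.0,
        "params_change_pct": (p_params - b_params) / b_params * 100.0,
    }


summary = compare(BASELINE, PROPOSED)
```

The mAP gain is 3 percentage points and the computation drops by about 38.5%, consistent with the abstract. The same rows show the parameter size rising by about 12.3% relative to YOLOv7-tiny.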
), ArticleFig(id=1200070661034046305, tenantId=1146029695717560320, journalId=1189918454225211397, articleId=1200070652263756232, language=CN, label=表2, caption=

在FLIR数据集实验结果

, figureFileSmall=null, figureFileBig=null, tableContent=
模型         mAP (IoU=0.5)  参数量/MB  浮点运算量/(×10^9 次·s^-1)
YOLOv3-tiny  0.66           8.67       12.9
YOLOv5s      0.75           7.01       15.8
YOLOv7-tiny  0.75           6.01       13.0
本文方法     0.78           6.75       8.0
Infrared Pedestrian Object Detection Algorithm Based on Improved YOLOv7
Changhai Li
University of Electronic Science and Technology of China, Chengdu 611731
Automotive Engineer | Special Issue on Intelligent Vehicle Environmental Perception and Target Detection Technology, 2024, (8): 15-21
Published: 2024-08-15 doi: 10.20104/j.cnki.1674-6546.20240158

To address the incomplete detection and high false-detection rate caused by weak pedestrian features, dense small targets and complex backgrounds in infrared images, this paper proposes an infrared pedestrian detection algorithm based on an improved YOLOv7. First, taking the YOLOv7-tiny model as the baseline, the original Spatial Pyramid Pooling (SPP) module is replaced by a Channel Attention based Spatial Pyramid Pooling (CASPP) module so that the model focuses more on extracting pedestrian features. Then, a convolution module (CBM) based on the Meta-ACON activation function is introduced to further suppress background noise and preserve pedestrian details. Finally, an alpha fusion data augmentation method is proposed to enrich sample diversity and improve the stability of the model in complex environments. Validation on the FLIR dataset shows that, compared with YOLOv7-tiny, the proposed method improves accuracy by 3 percentage points and reduces computation by 38%, making it better suited to infrared pedestrian detection scenarios.

Infrared image  /  Pedestrian detection  /  Attention mechanism  /  Meta-ACON  /  YOLOv7
Changhai Li. Infrared Pedestrian Object Detection Algorithm Based on Improved YOLOv7[J]. Automotive Engineer, 2024, (8): 15-21. DOI: 10.20104/j.cnki.1674-6546.20240158
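In advanced driving assistance systems (ADAS) and autonomous driving systems, detecting pedestrian positions accurately and in real time is critical, and infrared object detection is of great practical value in such systems.
In recent years, researchers have proposed many representative algorithms and models to improve infrared pedestrian detection. Features extracted by traditional methods[1-4] generalize poorly and do not remain stable in complex outdoor environments. With the availability of large-scale datasets, deep-learning methods have been widely adopted. Current deep-learning methods for infrared pedestrian detection fall into single-stage and two-stage approaches.
Two-stage methods usually follow the Faster Region Convolutional Neural Network (Faster R-CNN)[5] paradigm, which first generates candidate boxes with a Region Proposal Network (RPN). For example, Hu et al.[6] used ResNet-50[7] as the Faster R-CNN backbone to extract more abstract target features, and improved the generated candidate boxes using statistics of the aspect ratios of annotated pedestrian boxes. Because these methods refine coarse candidate boxes in a second step, they generally struggle to balance inference speed and accuracy and are poorly suited to deployment on in-vehicle devices.
Single-stage methods are mainly based on the YOLO (You Only Look Once)[8-9] family, which uses predefined anchor boxes rather than generating candidates inside the network and thus shortens inference time. Liu et al.[10] improved the YOLOv4 network and preprocessed input images with Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance pedestrian features. Shi et al.[11] replaced the YOLOv4 backbone with the lighter GhostNet[12] to increase detection speed. Wang et al.[13] and Li et al.[14] improved the YOLOv5 model and raised its accuracy. Although these methods improve detection performance to some extent, they still handle small targets poorly and produce many false detections on background. Distant pedestrians occupy very few pixels, so if the downsampling factor during feature extraction is too large, a pedestrian in the original image may cover less than one pixel on the feature map. In addition, pedestrian features in infrared images are weak and easily confused with similar-looking background objects, so the model learns inefficiently.
To address these problems, this paper takes YOLOv7-tiny[15] as the baseline and proposes a Channel Attention based Spatial Pyramid Pooling (CASPP) module that replaces the original Spatial Pyramid Pooling (SPP) module and is repositioned within the model, so that the model focuses more on extracting infrared pedestrian features. To further suppress background noise and preserve pedestrian details, a convolution and batch-normalization module based on the Meta-ACON[16] activation function (Convolution, Batch Normalization and Meta-ACON, CBM) is introduced. In addition, a new alpha fusion data augmentation method is proposed to enrich sample diversity and improve the stability of the model in complex environments.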
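The proposed model improves on YOLOv7-tiny; its overall structure is shown in Figure 1. In the backbone, the SPP module at the original P5 layer is removed. Small infrared pedestrian targets are harder to predict than large ones and are mainly handled by the shallow layers of the network, so placing the SPP module at the P5 output layer contributes little to small-target detection. This paper therefore uses a channel attention mechanism to build the CASPP module and embeds it at the P3 output layer, which provides better input to the P4 layer and lets the model learn small-target features efficiently.
In addition, the P5 output layer of YOLOv7-tiny has 1/32 of the input resolution, so distant pedestrians narrower than 32 pixels may lose their features at P5. Upsampling P5 directly and fusing it with the P4 features would introduce many useless background features and degrade model performance. This paper therefore designs the CBM module, which retains the features needed to detect large targets while suppressing background features. P5 features pass through this module before being fused with the P4 features. Through additional learnable parameters, the module adaptively decides whether to activate P5 features, which acts as a special form of attention. The detection head is the same as in YOLOv7-tiny, and a clustering algorithm is iterated over all annotated boxes in the dataset to obtain 9 suitable anchor boxes.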
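The resolution argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: the stride of 32 for P5 comes from the text, while the strides of 8 and 16 for P3 and P4 are assumed from the usual YOLO pyramid layout, and the function name `footprint` is hypothetical.

```python
# Stride 32 for P5 is stated in the text; 8 and 16 for P3 and P4 are assumed
# from the usual YOLO pyramid layout.
STRIDES = {"P3": 8, "P4": 16, "P5": 32}


def footprint(width_px):
    """Return the width, in feature-map cells, of a target that is width_px pixels wide."""
    return {name: width_px / stride for name, stride in STRIDES.items()}


for width in (16, 32, 64):
    # A value below 1.0 means the target covers less than one cell at that level.
    print(width, footprint(width))
```

Under these assumed strides, a pedestrian 16 pixels wide covers two cells at P3 but only half a cell at P5, which is consistent with the decision to strengthen the shallower layer.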
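In thermal infrared images, pedestrian features are weak, lack important texture information, and are easily disturbed by irrelevant background. To learn discriminative pedestrian features, this paper designs the CASPP module, whose basic structure is shown in Figure 2. CASPP first applies a convolution module (CONV) and max-pooling modules (MaxPool2D) to obtain features with several different receptive fields, where $K$, $S$ and $P$ are the pooling kernel size, stride and padding, respectively. The features are then concatenated along the channel dimension (CONCAT) and passed through one convolution layer to produce a feature $F\in\mathbb{R}^{C\times H\times W}$ containing both target and background, where $C$, $H$ and $W$ are the number of channels, height and width of $F$. To highlight pedestrian features while suppressing background features, CASPP computes a weight factor for each channel of $F$:
$M_{\omega}(F)=\sigma\left(f_{ce}\left(f_{cs}\left(f_{avg}(F)\right)\right)\right)$  (1)
where $f_{avg}$ is the global average pooling function, $f_{cs}$ and $f_{ce}$ are fully connected layers with different parameters, and $\sigma$ is the sigmoid function.
The input feature $F$ first passes through $f_{avg}$, which averages all spatial positions within each channel. The fully connected layer $f_{cs}$ then compresses the number of channels to 1/2 of the original, $f_{ce}$ expands it back to the original size, and $\sigma$ normalizes the result to obtain weight factors in the range 0 to 1. The weight factors are multiplied with the input feature $F$ as channel weights; the larger a weight factor, the more relevant the resulting feature is to the target.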
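The channel weighting of Eq. (1) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the fully connected layers are modelled as plain matrices `W_cs` and `W_ce` without biases or an intermediate activation (the text does not specify either), and all names and the random inputs are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_weights(F, W_cs, W_ce):
    """Eq. (1) for a feature F of shape (C, H, W); returns one weight per channel."""
    pooled = F.mean(axis=(1, 2))   # f_avg: average over spatial positions -> (C,)
    squeezed = W_cs @ pooled       # f_cs: compress to C/2 channels (bias-free, assumed)
    expanded = W_ce @ squeezed     # f_ce: expand back to C channels (bias-free, assumed)
    return sigmoid(expanded)       # weights in the open interval (0, 1)


def apply_channel_attention(F, W_cs, W_ce):
    """Multiply every channel of F by its weight factor."""
    return F * channel_weights(F, W_cs, W_ce)[:, None, None]


# Hypothetical sizes and random values, used only to exercise the functions.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
F = rng.standard_normal((C, H, W))
W_cs = rng.standard_normal((C // 2, C)) * 0.1
W_ce = rng.standard_normal((C, C // 2)) * 0.1
out = apply_channel_attention(F, W_cs, W_ce)
```

The output keeps the shape of $F$, so the weighting can be placed after the final convolution of the pooling block without changing the layers that follow.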
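The structure of the CBM module is shown in Figure 3. It consists of a two-dimensional convolution (Conv2D), batch normalization (BatchNorm2d) and the Meta-ACON activation function. Meta-ACON uses learnable parameters to adaptively decide whether to activate a feature. Its general form is Meta-ACON-C, computed as:
$S_{\beta}(p_1 x,p_2 x)=(p_1-p_2)x\cdot\sigma\left[\beta(p_1-p_2)x\right]+p_2 x$  (2)
where $x$ is the input feature, $p_1$ and $p_2$ are two directly learnable parameters, and $\beta$ is an indirectly learnable parameter.
$\beta$ can be computed in several ways. The simplest is to compute it for every feature point, but this requires too many parameters. To reduce the number of model parameters, this paper computes it separately for each channel:
$\beta=\sigma\left(P_1 P_2\sum_{h=1}^{H}\sum_{w=1}^{W}x_{c,h,w}\right)$  (3)
where $P_1\in\mathbb{R}^{C\times(C/2)}$ and $P_2\in\mathbb{R}^{(C/2)\times C}$ are learnable weight parameters, $C$ is the total number of input feature channels, $c$ is the channel index, and $x_{c,h,w}$ is the input feature point of channel $c$ at height $h$ and width $w$.
Eq. (3) resembles the fully connected layers in Eq. (1); the difference is that it operates on the sum of the features in each channel.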
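Eqs. (2) and (3) can be sketched together in NumPy. The code is an illustration under assumptions: `p1`, `p2` and `beta` are treated as per-channel vectors, `P1` and `P2` have the shapes given in the text, and the initial values `p1 = 1`, `p2 = 0` and the random inputs are hypothetical choices made only for the demonstration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def meta_beta(x, P1, P2):
    """Eq. (3): per-channel beta from channel sums; P1 is (C, C/2), P2 is (C/2, C)."""
    channel_sums = x.sum(axis=(1, 2))        # sum over h and w -> (C,)
    return sigmoid(P1 @ (P2 @ channel_sums))


def acon_c(x, p1, p2, beta):
    """Eq. (2) for x of shape (C, H, W) with per-channel p1, p2 and beta."""
    d = (p1 - p2)[:, None, None] * x
    return d * sigmoid(beta[:, None, None] * d) + p2[:, None, None] * x


# Hypothetical sizes and values, used only to exercise the functions.
rng = np.random.default_rng(0)
C, H, W = 4, 3, 3
x = rng.standard_normal((C, H, W))
P1 = rng.standard_normal((C, C // 2)) * 0.1
P2 = rng.standard_normal((C // 2, C)) * 0.1
p1, p2 = np.ones(C), np.zeros(C)
beta = meta_beta(x, P1, P2)
y = acon_c(x, p1, p2, beta)
```

Two limiting cases show why $\beta$ behaves like a switch: with $\beta=0$ the function reduces to the linear map $(p_1+p_2)x/2$, and as $\beta$ grows it approaches $\max(p_1x,p_2x)$, which for $p_1=1$, $p_2=0$ is the ReLU. In Eq. (3) $\beta$ is itself a sigmoid output, so it stays between 0 and 1.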
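At inference and deployment time, Conv2D and BatchNorm2d are further merged into a single new convolution layer to speed up inference. The weight and bias of the new convolution layer are:
$W'=W_{conv}\times\left(\gamma/\sqrt{V_{bn}+\epsilon}\right)$  (4)
$B'=\left[(B_{conv}-M_{bn})/\sqrt{V_{bn}+\epsilon}\right]\times\gamma+\beta$  (5)
where $W_{conv}$ and $B_{conv}$ are the weight and bias of Conv2D; $V_{bn}$ and $M_{bn}$ are the variance and mean accumulated by BatchNorm2d during training with a weighted moving average; $\gamma$ and $\beta$ are the two learnable parameters of BatchNorm2d, used for scaling and shifting, respectively (this $\beta$ is distinct from the $\beta$ in Eq. (2)); and $\epsilon$ is a small constant, set to $\epsilon=10^{-6}$ in this paper, that prevents the variance term from being zero during computation.
$V_{bn}$ and $M_{bn}$ are used as the variance and mean of BatchNorm2d at inference because the inference batch is small, usually a single image, and statistics computed from it would deviate considerably from the training-time values and reduce inference accuracy. They are computed as:
$V_{bn}=\theta\times V_{bn}'+(1-\theta)\times V_{bn}^{t}$  (6)
$M_{bn}=\theta\times M_{bn}'+(1-\theta)\times M_{bn}^{t}$  (7)
where $\theta$ is the weighting factor, set to 0.9 in this paper; $V_{bn}^{t}$ and $M_{bn}^{t}$ are the variance and mean computed from the current batch; and $V_{bn}'$ and $M_{bn}'$ are the accumulated variance and mean computed from all previous batches.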
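The update in Eqs. (6) and (7) is an ordinary weighted moving average and can be traced by hand on scalar statistics. In the sketch below the function name `update_running`, the initial values and the two toy batches are hypothetical; only $\theta=0.9$ comes from the text.

```python
def update_running(previous, current, theta=0.9):
    """Eqs. (6)-(7): blend the accumulated statistic with the current-batch statistic."""
    return theta * previous + (1.0 - theta) * current


# Assumed starting values for the accumulated mean and variance.
running_mean, running_var = 0.0, 1.0

for batch in ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]):
    batch_mean = sum(batch) / len(batch)
    batch_var = sum((v - batch_mean) ** 2 for v in batch) / len(batch)
    running_mean = update_running(running_mean, batch_mean)
    running_var = update_running(running_var, batch_var)

print(running_mean, running_var)
```

In PyTorch's `BatchNorm2d` the `momentum` argument weights the current batch rather than the accumulated value, so $\theta=0.9$ would correspond to `momentum=0.1`.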
$W'$ and $B'$ are obtained from Eq. (4) and Eq. (5); in complex models this effectively reduces the overall floating-point computation.
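That the fused layer reproduces convolution followed by batch normalization can be checked numerically. The sketch below assumes a 1×1 convolution stored as a `(C_out, C_in)` matrix to keep the example short; for a larger kernel the same per-output-channel scale would be broadcast over the kernel's remaining axes. All function names and the random test values are hypothetical.

```python
import numpy as np


def fuse_conv_bn(W_conv, B_conv, gamma, beta, M_bn, V_bn, eps=1e-6):
    """Eqs. (4)-(5) for a 1x1 convolution whose weight has shape (C_out, C_in)."""
    scale = gamma / np.sqrt(V_bn + eps)        # one factor per output channel
    W_fused = W_conv * scale[:, None]          # Eq. (4)
    B_fused = (B_conv - M_bn) * scale + beta   # Eq. (5)
    return W_fused, B_fused


def conv1x1(x, W, B):
    """Apply a 1x1 convolution to x of shape (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", W, x) + B[:, None, None]


def batchnorm_infer(y, gamma, beta, M_bn, V_bn, eps=1e-6):
    """Inference-time batch normalization using stored statistics."""
    scale = gamma / np.sqrt(V_bn + eps)
    return (y - M_bn[:, None, None]) * scale[:, None, None] + beta[:, None, None]


# Hypothetical random layer and input.
rng = np.random.default_rng(0)
c_in, c_out = 3, 5
x = rng.standard_normal((c_in, 4, 4))
W_conv = rng.standard_normal((c_out, c_in))
B_conv = rng.standard_normal(c_out)
gamma, beta = rng.standard_normal(c_out), rng.standard_normal(c_out)
M_bn, V_bn = rng.standard_normal(c_out), rng.random(c_out) + 0.5

two_step = batchnorm_infer(conv1x1(x, W_conv, B_conv), gamma, beta, M_bn, V_bn)
W_fused, B_fused = fuse_conv_bn(W_conv, B_conv, gamma, beta, M_bn, V_bn)
one_step = conv1x1(x, W_fused, B_fused)
print(np.allclose(two_step, one_step))
```

Because the fused weights are computed once before deployment, the per-image cost of the normalization step disappears from the inference path.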
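Image backgrounds in infrared pedestrian datasets are relatively uniform, which may limit the generalization of the model. To enrich the training samples, this paper proposes an alpha fusion data augmentation method that fuses an image containing targets with a target-free background image. The procedure is shown in Figure 4.
First, the target image $T$ and the background image $B$ are resampled to the same width and height, and the target bounding boxes $G_{boxes}$ are adjusted accordingly. Next, all target regions given by the bounding boxes are set as foreground and all other regions as background, producing a binary image $M$. The foreground is left unchanged to avoid distorting pedestrian features, while the background region is a weighted blend of the two images. The weight factor $\alpha$ is generated by a random function in each iteration and ranges from 0 to 1. Finally, the images are fused according to the weight factor and $M$:
$I_{ti}=(1-M)\times T\times\alpha$  (8)
$I_{bi}=(1-M)\times B\times(1-\alpha)$  (9)
$I_{m}=M\times T$  (10)
$I_{f}=I_{ti}+I_{bi}+I_{m}$  (11)
where $I_{ti}$ is the weighted target image excluding the target regions, $I_{bi}$ is the weighted background image excluding the target regions, $I_{m}$ is the target region of the target image, and $I_{f}$ is the augmented image.
The augmentation results are shown in Figure 5, which includes different backgrounds and targets of different sizes. As Figure 5 shows, the fused images contain richer information, and the larger $\alpha$ is, the more information the target image $T$ contributes. During training, varying the weight factor $\alpha$ and the supplied background image provides data augmentation and improves the generalization of the model.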
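Eqs. (8) to (11) translate almost directly into array operations. The sketch below assumes the two images have already been resampled to the same size and are stored as single-channel float arrays, and that boxes are given as integer `(x1, y1, x2, y2)` pixel corners; the function name `alpha_fuse` and the random test images are hypothetical.

```python
import numpy as np


def alpha_fuse(T, B, boxes, alpha):
    """Blend target image T with background image B outside the boxes (Eqs. 8-11)."""
    M = np.zeros_like(T)
    for x1, y1, x2, y2 in boxes:
        M[y1:y2, x1:x2] = 1.0           # foreground mask from the bounding boxes
    I_ti = (1.0 - M) * T * alpha        # Eq. (8)
    I_bi = (1.0 - M) * B * (1.0 - alpha)  # Eq. (9)
    I_m = M * T                         # Eq. (10): targets are copied unchanged
    return I_ti + I_bi + I_m            # Eq. (11)


# Hypothetical 6x8 images with one target box.
rng = np.random.default_rng(0)
T = rng.random((6, 8))
B = rng.random((6, 8))
boxes = [(2, 1, 5, 4)]
alpha = rng.random()                    # drawn anew in each training iteration
fused = alpha_fuse(T, B, boxes, alpha)
```

Inside each box the fused image equals $T$ exactly, so the existing box annotations remain valid for the augmented sample.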
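Infrared images have low contrast, which blurs details such as pedestrian edges and textures. As shown in Figure 6a, pedestrians almost blend into the background and are hard to distinguish, which reduces the accuracy of the features the model learns. In the preprocessing stage, this paper therefore applies the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm to raise the contrast of the input image. CLAHE divides the image into tiles, 8×8 tiles in this paper, padding the image where it does not divide evenly. It then computes a histogram within each tile, limits the contrast according to a preset threshold, and adjusts the histogram distribution. Finally, it combines neighboring tiles and determines the new pixel values by bilinear interpolation, which enhances image contrast. The enhanced image is shown in Figure 6b: compared with the original image, pedestrian contours are clearer, textures are richer, and pedestrians are easier to separate from the background, which helps improve model accuracy.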
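The contrast-limiting step for a single tile can be sketched as follows. This is a simplified illustration, not a full CLAHE: it omits the tiling, padding and bilinear interpolation between neighboring tiles, and it treats `clip_limit` as an absolute count per histogram bin, which is an assumption. The function name and the synthetic tile are hypothetical. In practice a library routine such as OpenCV's `cv2.createCLAHE(clipLimit=..., tileGridSize=(8, 8))` would typically be used.

```python
import numpy as np


def clipped_equalization_lut(tile, clip_limit, bins=256):
    """Build a gray-level lookup table for one uint8 tile with a clipped histogram."""
    hist = np.bincount(tile.ravel(), minlength=bins).astype(np.float64)
    excess = np.maximum(hist - clip_limit, 0.0).sum()       # counts above the limit
    hist = np.minimum(hist, clip_limit) + excess / bins     # redistribute them uniformly
    cdf = np.cumsum(hist)
    return np.round((bins - 1) * cdf / cdf[-1]).astype(np.uint8)


# Synthetic low-contrast tile: gray levels 100..119 only.
tile = np.tile(np.arange(100, 120, dtype=np.uint8), 20).reshape(20, 20)
lut_loose = clipped_equalization_lut(tile, clip_limit=40)   # limit never reached
lut_tight = clipped_equalization_lut(tile, clip_limit=5)    # histogram is clipped
enhanced = lut_tight[tile]                                  # apply the lookup table
```

A lower clip limit stretches the occupied gray levels less, which is how the threshold keeps noise in flat regions from being over-amplified.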
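While driving, a vehicle should keep a safe distance from the vehicle ahead, and speed should be controlled even when there is no vehicle ahead. As shown in Table 1, at a speed of 100 km/h the safe distance should exceed 100 m to leave enough reaction time to avoid an emergency[17]. In real scenes, distant targets are usually very small, so the algorithm must detect them accurately and promptly and compute their distance from the vehicle; if that distance is less than the safe distance, a warning is triggered.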
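The single data point quoted above can be turned into a time budget with a unit conversion. The sketch uses only the 100 km/h and 100 m figures from the text; the function names and the example distance of 80 m are hypothetical, and the warning rule is the simple threshold comparison described in the paragraph.

```python
def time_to_cover(distance_m, speed_kmh):
    """Seconds needed to travel distance_m at a constant speed given in km/h."""
    return distance_m / (speed_kmh / 3.6)


def should_warn(distance_m, safe_distance_m):
    """Trigger a warning when the measured distance falls below the safe distance."""
    return distance_m < safe_distance_m


print(time_to_cover(100, 100))   # time available at 100 km/h with 100 m of headway
print(should_warn(80, 100))      # hypothetical pedestrian detected 80 m ahead
```

At 100 km/h the vehicle covers about 27.8 m each second, so 100 m corresponds to roughly 3.6 s.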
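All experiments were run on Ubuntu 22.04 with an NVIDIA GeForce RTX 4060 Ti GPU with 8 GB of memory. The deep learning framework was PyTorch 2.1, the Python version was 3.9, and the Compute Unified Device Architecture (CUDA) version was 11.8.
The FLIR dataset is used to validate the algorithm. It is divided into training, validation and test subsets and covers 15 categories. The scenes include a variety of adverse weather conditions, such as rain, snow and haze, as well as different illumination conditions such as day and night, which makes pedestrian detection harder. This paper considers only the pedestrian category, and images containing no pedestrians are treated as negative samples. As a result, the usable FLIR data comprise 8,205 training frames, 819 validation frames and 2,231 test frames. The validation set is small and can hardly reflect the true state of model training, so all images are merged and then split 8:2 into the image sets used for training and for evaluating model performance.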
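The merged 8:2 split can be sketched with the standard library. The frame counts come from the text; the random shuffle, the seed and the function name `split_indices` are assumptions, since the text does not say how the images were assigned to the two sets.

```python
import random


def split_indices(n, train_parts=8, total_parts=10, seed=0):
    """Shuffle indices 0..n-1 and split them train_parts : (total_parts - train_parts)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # assumed: random assignment with a fixed seed
    cut = n * train_parts // total_parts
    return indices[:cut], indices[cut:]


total = 8205 + 819 + 2231                  # merged training, validation and test frames
train_idx, eval_idx = split_indices(total)
print(total, len(train_idx), len(eval_idx))
```

With these counts the merged set has 11,255 frames, which gives 9,004 frames for training and 2,251 for evaluation.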
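A detection model for in-vehicle infrared pedestrian detection must be evaluated not only for accuracy but also for real-time performance. Accuracy covers two aspects, precision $P$ and recall $R$:
$P=N_{TP}/(N_{TP}+N_{FP})$  (12)
$R=N_{TP}/(N_{TP}+N_{FN})$  (13)
where $N_{TP}$ is the number of samples predicted as positive that are actually positive, $N_{FP}$ is the number predicted as positive that are actually negative, and $N_{FN}$ is the number predicted as negative that are actually positive.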
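A short numeric example makes the two ratios concrete. The counts below are hypothetical and are not results from the paper; returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def precision_recall(n_tp, n_fp, n_fn):
    """Eqs. (12)-(13); returns 0.0 for a ratio whose denominator is zero (assumed convention)."""
    precision = n_tp / (n_tp + n_fp) if (n_tp + n_fp) else 0.0
    recall = n_tp / (n_tp + n_fn) if (n_tp + n_fn) else 0.0
    return precision, recall


# Hypothetical counts: 80 correct detections, 20 false alarms, 40 missed pedestrians.
p, r = precision_recall(n_tp=80, n_fp=20, n_fn=40)
print(p, r)
```

Here 80 of 100 detections are correct, while 80 of 120 actual pedestrians are found, so the detector is more precise than it is complete.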
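A P-R curve is built with $R$ on the horizontal axis and $P$ on the vertical axis, and the mean average precision (mAP) computed from this curve is used as the accuracy metric[18]:
$p_{interp}(r)=\max_{r'\ge r}p(r')$  (14)
$A_{ap}=\sum_{i=1}^{n-1}(r_{i+1}-r_{i})\,p_{interp}(r_{i+1})$  (15)
$A_{map}=\frac{1}{K}\sum_{i=1}^{K}A_{ap,i}$  (16)
where $r$, $r'$, $r_i$ and $r_{i+1}$ are recall values at interpolation points; $p(r')$ is the precision at $r'$; $p_{interp}(r)$ is the interpolation function, whose value at $r$ is the maximum of $p(r')$ over all $r'\ge r$, which reduces the effect of fluctuations in the curve; $n$ is the total number of interpolation points; and $K$ is the number of target categories. Only pedestrians are detected in this paper, so $K=1$. $A_{ap}$ is the average precision: after interpolation the P-R curve takes the form of a series of bars, the area of each bar is its width $(r_{i+1}-r_i)$ multiplied by its height $p_{interp}(r_{i+1})$, and the sum of all bar areas under the curve is the average precision of the corresponding category. $A_{ap,i}$ is the average precision of category $i$.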
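Eqs. (14) and (15) can be sketched in plain Python. The recall and precision lists below are hypothetical values chosen so the result can be checked by hand, and the sketch assumes the recall points are sorted in ascending order and start at 0.

```python
def average_precision(recalls, precisions):
    """Eqs. (14)-(15) for recall points sorted ascending with aligned precisions."""
    n = len(recalls)
    interp = list(precisions)
    for i in range(n - 2, -1, -1):
        # Running maximum from the right gives max p(r') over all r' >= r_i.
        interp[i] = max(interp[i], interp[i + 1])
    # Sum of bar areas: width (r_{i+1} - r_i) times height p_interp(r_{i+1}).
    return sum((recalls[i + 1] - recalls[i]) * interp[i + 1] for i in range(n - 1))


# Hypothetical P-R points with a dip at recall 0.4.
recalls = [0.0, 0.2, 0.4, 0.6, 0.8]
precisions = [1.0, 1.0, 0.8, 0.9, 0.5]
ap = average_precision(recalls, precisions)
mean_ap = ap / 1          # Eq. (16) with K = 1, as in the paper
print(ap)
```

The dip to 0.8 at recall 0.4 is lifted to 0.9 by the interpolation, so the four bars have heights 1.0, 0.9, 0.9 and 0.5 and the area is 0.66.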
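In addition, this paper uses the number of model parameters and the amount of computation to assess real-time performance; lighter models are better suited to deployment on in-vehicle devices.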
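For a single convolution layer, both quantities follow from the layer shape. The sketch below uses the standard counting rules for a dense 2D convolution and counts multiply-accumulate operations; the function name and the example layer size are hypothetical and are not taken from the paper's model.

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out, bias=True):
    """Return (parameter count, multiply-accumulate count) of a dense k x k convolution."""
    weights = c_in * c_out * k * k
    params = weights + (c_out if bias else 0)
    macs = weights * h_out * w_out       # each output position reuses every weight once
    return params, macs


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 80x80 output map.
params, macs = conv2d_cost(64, 128, 3, 80, 80)
print(params, macs)
```

The parameter count is independent of the feature-map size, while the computation grows with the output resolution, which is why layers on high-resolution maps dominate the total cost.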
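To verify the performance of the proposed method for infrared pedestrian detection, it is compared with commonly used detection algorithms on the FLIR dataset. During training, the input image size is fixed at 640×640, the initial learning rate is 0.001, the decay rate is 0.0005, the mini-batch size is 16, and the momentum is 0.937. No pretrained model is used to initialize parameters in any experiment, and training runs for 200 epochs. Figure 7 shows the training losses and the precision at an Intersection over Union (IoU) threshold of 0.5 as curves.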
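The IoU threshold decides whether a predicted box counts as a correct detection. A minimal sketch of the computation, assuming boxes in `(x1, y1, x2, y2)` corner format, is given here; the function name and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
print(iou(ground_truth, (2, 0, 12, 10)) >= 0.5)   # shifted by 2 px: counted as a match
print(iou(ground_truth, (5, 0, 15, 10)) >= 0.5)   # shifted by 5 px: not a match
```

For small pedestrians a shift of a few pixels already lowers the IoU sharply, which is one reason box regression quality matters for distant targets.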
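As Figure 7 shows, by epoch 25 the objectness loss and classification loss have converged, while the box loss can still be optimized and mAP continues to rise; by epoch 150 the box loss flattens and mAP reaches its maximum. Table 2 compares the proposed method with other common detection algorithms at the same input image size. With very similar parameter counts, the proposed method is 12 percentage points more accurate than YOLOv3-tiny and 3 percentage points more accurate than YOLOv5s and YOLOv7-tiny, which verifies its effectiveness. The proposed method is also markedly faster: its overall computation is about 38% lower than that of YOLOv3-tiny and YOLOv7-tiny and about 50% lower than that of YOLOv5s. The proposed method is therefore better suited to infrared pedestrian detection.
This paper examined the limitations of applying YOLOv7-tiny directly to infrared pedestrian detection. Based on the characteristics of thermal infrared images and real scenes, it uses the CLAHE algorithm to bring out pedestrian details, proposes the CASPP module to strengthen pedestrian feature extraction, embeds the CBM module in the feature fusion path to adaptively select features of different sizes, and proposes an alpha fusion data augmentation algorithm to improve the stability of the model in complex environments. Experiments on the FLIR dataset verify the effectiveness and efficiency of the method: at comparable detection accuracy, computation is markedly faster, making the method better suited to infrared pedestrian detection with complex backgrounds and many small targets.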
References
[1] HARRIS C G, STEPHENS M J. A Combined Corner and Edge Detector[C]// Alvey Vision Conference. Manchester, UK: University of Manchester, 1988.
[2] LOWE D G. Distinctive Image Features from Scale-Invariant Keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[3] DALAL N, TRIGGS B. Histograms of Oriented Gradients for Human Detection[C]// 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE, 2005.
[4] RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: An Efficient Alternative to SIFT or SURF[C]// International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011.
[5] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[6] HU J P, SUN X. Near-Infrared Nighttime Pedestrian Detection Based on Improved Faster R-CNN[J]. Sensors and Microsystems, 2021, 40(8): 126-129.
[7] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 770-778.
[8] REDMON J, FARHADI A. YOLOv3: An Incremental Improvement[EB/OL]. (2018-04-08) [2024-05-04]. https://arxiv.org/abs/1804.02767
[9] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: Optimal Speed and Accuracy of Object Detection[EB/OL]. (2020-04-23) [2024-05-04]. https://arxiv.org/abs/2004.10934
[10] LIU Y F, WANG X F, ZHOU P, et al. Infrared Image Road Pedestrian Detection Based on YOLOv4 Neural Network[J]. Digital Technology and Applications, 2021, 39(10): 19-22.
[11] SHI J T, ZHANG G Q, TAO J, et al. Improved Pedestrian Detection Algorithm for YOLOv4 Infrared Images[J]. Intelligent Computers and Applications, 2021, 11(8): 31-34+41.
[12] HAN K, WANG Y H, TIAN Q, et al. GhostNet: More Features From Cheap Operations[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA: IEEE, 2020.
[13] WANG X H, CHEN Z Q. Research on Pedestrian Detection in Infrared Images Based on YOLOv5 Algorithm[J]. Laser and Infrared, 2023, 53(1): 57-63.
[14] LI Y, ZHAO J, YAN Y B. Thermal Infrared Road Vehicle and Pedestrian Detection Based on Improved YoloV5s[J]. Intelligent Computers and Applications, 2022, 12(6): 33-38.
[15] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, BC, Canada: IEEE, 2023: 7464-7475.
[16] MA N N, ZHANG X Y, LIU M, et al. Activate or Not: Learning Customized Activation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, TN, USA: IEEE, 2021: 8028-8038.
[17] ZHANG Z H. A Simulator-Based Study of Driver Braking Reaction Time in Emergency Situations[J]. Journal of North China Institute of Science and Technology, 2009, 6(3): 27-30.
[18] LIN T Y, MAIRE M, BELONGIE J S, et al. Microsoft COCO: Common Objects in Context[M]// FLEET D, PAJDLA T, SCHIELE B, et al. Computer Vision - ECCV 2014. Cham, Switzerland: Springer International Publishing, 2014.
  • First published online: 2025-11-25
  • Revised: 2024-05-04