Article(id=1209910185787781300, tenantId=1146029695717560320, journalId=1189621681917173762, issueId=1209910182134542453, articleNumber=null, orderNo=null, doi=10.19620/j.cnki.1000-3703.20231097, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=research-article, receivedDate=null, receivedDateStr=null, revisedDate=null, revisedDateStr=null, acceptedDate=null, acceptedDateStr=null, onlineDate=1766394667336, onlineDateStr=2025-12-22, pubDate=1721750400000, pubDateStr=2024-07-24, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1766394667336, onlineIssueDateStr=2025-12-22, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1766394667336, creator=13701087609, updateTime=1766394667336, updator=13701087609, issue=Issue{id=1209910182134542453, tenantId=1146029695717560320, journalId=1189621681917173762, year='2024', volume='', issue='7', pageStart='1', pageEnd='62', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=null, createTime=1766394666465, creator=13701087609, updateTime=1766482240343, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1210277493739753804, tenantId=1146029695717560320, journalId=1189621681917173762, issueId=1209910182134542453, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1210277493739753805, tenantId=1146029695717560320, journalId=1189621681917173762, issueId=1209910182134542453, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=9, endPage=16, ext={EN=ArticleExt(id=1209910186043633861, articleId=1209910185787781300, tenantId=1146029695717560320, journalId=1189621681917173762, language=EN, 
title=Vehicle Tracking Algorithm Based on Transformer’s Improved YOLOv5+DeepSORT, columnId=1209910182801436791, journalTitle=Automobile Technology, columnName=Feature Topic on Motion Planning and Control Techniques, runingTitle=null, highlight=null, articleAbstract=

To address the shortcomings of traditional object detection and tracking algorithms, such as low detection accuracy, poor global perception, and weak recognition of occluded and small targets, this paper proposes a vehicle tracking method based on YOLOv5 and DeepSORT improved with a lightweight Transformer. First, the EfficientFormerV2 model is used to improve the YOLOv5 model and strengthen vehicle detection. Next, the Swin (shifted window) model is used to improve the Re-Identification module of the DeepSORT multi-object tracking algorithm, raising vehicle tracking capability and accuracy. Finally, comparative and ablation experiments are carried out on the KITTI and VeRi datasets. The results show that, under complex conditions, the proposed method performs significantly better on occluded vehicles and small targets: the average accuracy reaches 96.7%, target tracking accuracy increases by 9.547%, and the total number of ID switches falls by 26.4%.

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Shuilong He, Jingjia Zhang, Linjun Zhang, Deyun Mo), CN=ArticleExt(id=1209910189931753833, articleId=1209910185787781300, tenantId=1146029695717560320, journalId=1189621681917173762, language=CN, title=基于Transformer改进的YOLOv5+DeepSORT的车辆跟踪算法*, columnId=1209910182935654522, journalTitle=汽车技术, columnName=智能车辆运动规划与控制技术专题, runingTitle=null, highlight=null, articleAbstract=

针对传统目标检测跟踪算法检测精度低、全局感知能力差、对遮挡和小目标物体的识别能力差等问题,提出了一种基于轻量化Transformer改进的YOLOv5和DeepSORT算法的车辆跟踪方法。首先,利用EfficientFormerV2模型改进YOLOv5算法模型,增强车辆的目标检测能力;然后,利用移位窗口(Swin)模型的优点改进DeepSORT多目标跟踪算法中的重识别(Re-Identification)模块,提高车辆的跟踪能力和精度;最后,通过数据集KITTI和VeRi开展对比试验和消融实验。结果表明,在复杂工况下,该方法的性能在车辆遮挡和小目标识别方面显著提高,平均准确度达到96.7%,目标跟踪准确度提高了9.547%,编号(ID)切换总次数减少了26.4%。

, correspAuthors=null, authorNote=null, correspAuthorsNote=
莫德赟(1983—),男,副教授,主要研究方向为汽车智能驾驶,
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=OfsUAf3nw+G0Sir1U8n8ug==, magXml=dXTQEyDH2XlK9vShJDmyQQ==, pdfUrl=null, pdf=hdtMb2QyeSp3voz1OUc+CQ==, pdfFileSize=14383479, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=/BALcWyTzZMPMP8ApsDORQ==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=b8IsBThSdRBJmlLuS5LQCA==, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=何水龙, 张靖佳, 张林俊, 莫德赟)}, authors=[Author(id=1210277263866720488, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1210277265116623084, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277263866720488, language=EN, stringName=Shuilong He, firstName=Shuilong, middleName=null, lastName=He, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, address=1 Guilin University of Electronic Technology, Guilin 541004
2 Guilin University of Aerospace Technology,Guilin 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1210277265183731950, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277263866720488, language=CN, stringName=何水龙, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, address=1 桂林电子科技大学,桂林 541004
2 桂林航天工业学院,桂林 541001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1210277263669588187, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, xref=1, ext=[AuthorCompanyExt(id=1210277263682171100, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 Guilin University of Electronic Technology, Guilin 541004), AuthorCompanyExt(id=1210277263686365406, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 桂林电子科技大学,桂林 541004)]), AuthorCompany(id=1210277263778640096, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, xref=2, ext=[AuthorCompanyExt(id=1210277263782834402, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263778640096, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 Guilin University of Aerospace Technology,Guilin 541004), AuthorCompanyExt(id=1210277263791223010, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263778640096, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 桂林航天工业学院,桂林 541001)])]), Author(id=1210277265263423730, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, 
authorType=1, ext={EN=AuthorExt(id=1210277265355698421, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265263423730, language=EN, stringName=Jingjia Zhang, firstName=Jingjia, middleName=null, lastName=Zhang, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 Guilin University of Electronic Technology, Guilin 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1210277265435390199, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265263423730, language=CN, stringName=张靖佳, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 桂林电子科技大学,桂林 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1210277263669588187, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, xref=1, ext=[AuthorCompanyExt(id=1210277263682171100, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 Guilin University of Electronic Technology, Guilin 541004), AuthorCompanyExt(id=1210277263686365406, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 桂林电子科技大学,桂林 541004)])]), Author(id=1210277265510887677, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, 
authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1210277265603162370, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265510887677, language=EN, stringName=Linjun Zhang, firstName=Linjun, middleName=null, lastName=Zhang, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 Guilin University of Electronic Technology, Guilin 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1210277265687048453, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265510887677, language=CN, stringName=张林俊, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 桂林电子科技大学,桂林 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1210277263669588187, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, xref=1, ext=[AuthorCompanyExt(id=1210277263682171100, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 Guilin University of Electronic Technology, Guilin 541004), AuthorCompanyExt(id=1210277263686365406, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263669588187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 桂林电子科技大学,桂林 541004)])]), Author(id=1210277265791906056, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, orderNo=3, firstName=null, 
middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=23440217@qq.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1210277265871597835, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265791906056, language=EN, stringName=Deyun Mo, firstName=Deyun, middleName=null, lastName=Mo, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=2, address=2 Guilin University of Aerospace Technology,Guilin 541004, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1210277265963872528, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, authorId=1210277265791906056, language=CN, stringName=莫德赟, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=2, address=2 桂林航天工业学院,桂林 541001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1210277263778640096, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, xref=2, ext=[AuthorCompanyExt(id=1210277263782834402, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263778640096, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 Guilin University of Aerospace Technology,Guilin 541004), AuthorCompanyExt(id=1210277263791223010, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, companyId=1210277263778640096, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 桂林航天工业学院,桂林 541001)])])], keywords=[Keyword(id=1210277266169393429, tenantId=1146029695717560320, 
journalId=1189621681917173762, articleId=1209910185787781300, language=EN, orderNo=1, keyword=YOLOv5), Keyword(id=1210277266244890903, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=EN, orderNo=2, keyword=Vehicle detection), Keyword(id=1210277266374914330, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=EN, orderNo=3, keyword=DeepSORT), Keyword(id=1210277266483966238, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=EN, orderNo=4, keyword=Transformer), Keyword(id=1210277266580435231, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=CN, orderNo=1, keyword=YOLOv5), Keyword(id=1210277266651738401, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=CN, orderNo=2, keyword=车辆检测), Keyword(id=1210277266735624484, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=CN, orderNo=3, keyword=DeepSORT), Keyword(id=1210277266848870693, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, language=CN, orderNo=4, keyword=Transformer)], refs=[Reference(id=1210277271282250102, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2017, volume=null, issue=null, pageStart=2961, pageEnd=2969, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=HE K, GKIOXARI G, DOLLÁR P, journalName=Proceedings of the IEEE International Conference on Computer Vision. Venice, refType=null, unstructuredReference=HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the IEEE International Conference on Computer Vision. 
Venice, Italy: IEEE, 2017: 2961-2969., articleTitle=Mask R-CNN, refAbstract=null), Reference(id=1210277271345164665, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=6105, pageEnd=6114, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=TAN M X, LE Q V, journalName=International Conference on Machine Learning. Long Beach, refType=null, unstructuredReference=TAN M X, LE Q V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[C]// International Conference on Machine Learning. Long Beach, California: PMLR, 2019: 6105-6114., articleTitle=EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, refAbstract=null), Reference(id=1210277271441633659, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2023, volume=34, issue=4, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=SHEN L Z, TAO H F, NI Y Z, journalName=Measurement Science and Technology, refType=null, unstructuredReference=SHEN L Z, TAO H F, NI Y Z, et al. Improved YOLOv3 Model with Feature Map Cropping for Multi-Scale Road Object Detection[J]. Measurement Science and Technology, 2023, 34(4)., articleTitle=Improved YOLOv3 Model with Feature Map Cropping for Multi-Scale Road Object Detection, refAbstract=null), Reference(id=1210277271508742525, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=9259, pageEnd=9266, url=null, language=null, rfNumber=[4], rfOrder=3, authorNames=ZHAO Q J, SHENG T, WANG Y T, journalName=Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, Hawaii, refType=null, unstructuredReference=ZHAO Q J, SHENG T, WANG Y T, et al. 
M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, Hawaii, USA: AAAI, 2019: 9259-9266., articleTitle=M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network, refAbstract=null), Reference(id=1210277271580045694, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2021, volume=21, issue=9, pageStart=3263, pageEnd=null, url=null, language=null, rfNumber=[5], rfOrder=4, authorNames=YU J M, ZHANG W, journalName=Sensors, refType=null, unstructuredReference=YU J M, ZHANG W. Face Mask Wearing Detection Algorithm Based on Improved YOLO-v4[J]. Sensors, 2021, 21(9): 3263., articleTitle=Face Mask Wearing Detection Algorithm Based on Improved YOLO-v4, refAbstract=null), Reference(id=1210277271663931775, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2021, volume=16, issue=10, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[6], rfOrder=5, authorNames=WU W T, LIU H, LI L L, journalName=PLoS One, refType=null, unstructuredReference=WU W T, LIU H, LI L L, et al. Application of Local Fully Convolutional Neural Network Combined with YOLO v5 Algorithm in Small Target Detection of Remote Sensing Image[J]. 
PLoS One, 2021, 16(10)., articleTitle=Application of Local Fully Convolutional Neural Network Combined with YOLO v5 Algorithm in Small Target Detection of Remote Sensing Image, refAbstract=null), Reference(id=1210277271718457729, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=657, pageEnd=668, url=null, language=null, rfNumber=[7], rfOrder=6, authorNames=BHARATI P, PRAMANIK A, journalName=Computational Intelligence in Pattern Recognition, refType=null, unstructuredReference=BHARATI P, PRAMANIK A. Deep Learning Techniques—R-CNN to Mask R-CNN: A Survey[C]// Computational Intelligence in Pattern Recognition. Singapore: Springer, 2020: 657-668., articleTitle=Deep Learning Techniques—R-CNN to Mask R-CNN: A Survey, refAbstract=null), Reference(id=1210277271772983683, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2016, volume=null, issue=null, pageStart=36, pageEnd=42, url=null, language=null, rfNumber=[8], rfOrder=7, authorNames=YU F W, LI W B, LI Q Q, journalName=Computer Vision-ECCV 2016 Workshops. Cham, refType=null, unstructuredReference=YU F W, LI W B, LI Q Q, et al. POI: Multiple Object Tracking with High Performance Detection and Appearance Feature[C]// Computer Vision-ECCV 2016 Workshops. 
Cham, Switzerland: Springer, 2016: 36-42., articleTitle=POI: Multiple Object Tracking with High Performance Detection and Appearance Feature, refAbstract=null), Reference(id=1210277271848481158, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=2022, pageEnd=null, url=null, language=null, rfNumber=[9], rfOrder=8, authorNames=YU Z G, DONG Y Y, CHENG J H, journalName=Security and Communication Networks, refType=null, unstructuredReference=YU Z G, DONG Y Y, CHENG J H, et al. Research on Face Recognition Classification Based on Improved GoogleNet[J]. Security and Communication Networks, 2022, 2022., articleTitle=Research on Face Recognition Classification Based on Improved GoogleNet, refAbstract=null), Reference(id=1210277271936561546, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2020, volume=33, issue=7, pageStart=99, pageEnd=101, url=null, language=null, rfNumber=[10], rfOrder=9, authorNames=谢金龙, 胡勇, journalName=工业控制计算机, refType=null, unstructuredReference=谢金龙, 胡勇. 基于深度学习的车辆检测与跟踪系统[J]. 工业控制计算机, 2020, 33(7): 99-101., articleTitle=基于深度学习的车辆检测与跟踪系统, refAbstract=null), Reference(id=1210277272100139403, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2020, volume=33, issue=7, pageStart=99, pageEnd=101, url=null, language=null, rfNumber=[10], rfOrder=10, authorNames=XIE J L, HU Y, journalName=Industrial Control Computer, refType=null, unstructuredReference=XIE J L, HU Y. Vehicle Detection and Tracking System Based on Deep Learning[J]. 
Industrial Control Computer, 2020, 33(7): 99-101., articleTitle=Vehicle Detection and Tracking System Based on Deep Learning, refAbstract=null), Reference(id=1210277272196608397, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=107, pageEnd=122, url=null, language=null, rfNumber=[11], rfOrder=11, authorNames=WANG Z D, ZHENG L, LIU Y X, journalName=European Conference on Computer Vision. Cham, refType=null, unstructuredReference=WANG Z D, ZHENG L, LIU Y X, et al. Towards Real-Time Multi-Object Tracking[C]// European Conference on Computer Vision. Cham, Switzerland: Springer, 2020: 107-122., articleTitle=Towards Real-Time Multi-Object Tracking, refAbstract=null), Reference(id=1210277272280494479, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2023, volume=13, issue=1, pageStart=4525, pageEnd=null, url=null, language=null, rfNumber=[12], rfOrder=12, authorNames=CHE J, HE Y T, WU J M, journalName=Scientific Reports, refType=null, unstructuredReference=CHE J, HE Y T, WU J M. Pedestrian Multiple-Object Tracking Based on FairMOT and Circle Loss[J]. Scientific Reports, 2023, 13(1): 4525., articleTitle=Pedestrian Multiple-Object Tracking Based on FairMOT and Circle Loss, refAbstract=null), Reference(id=1210277272335020433, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2020, volume=null, issue=null, pageStart=390, pageEnd=391, url=null, language=null, rfNumber=[13], rfOrder=13, authorNames=WANG C Y, LIAO H Y M, WU Y H, journalName=2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, WA, refType=null, unstructuredReference=WANG C Y, LIAO H Y M, WU Y H, et al. 
CSPNet: A New Backbone That Can Enhance Learning Capability of CNN[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, WA, USA: IEEE, 2020: 390-391., articleTitle=CSPNet: A New Backbone That Can Enhance Learning Capability of CNN, refAbstract=null), Reference(id=1210277272418906516, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2015, volume=37, issue=9, pageStart=1904, pageEnd=1916, url=null, language=null, rfNumber=[14], rfOrder=14, authorNames=HE K M, ZHANG X Y, REN S Q, journalName=IEEE Transactions on Pattern Analysis and Machine Intelligence, refType=null, unstructuredReference=HE K M, ZHANG X Y, REN S Q, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916., articleTitle=Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, refAbstract=null), Reference(id=1210277272477626775, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=1998, volume=86, issue=11, pageStart=2278, pageEnd=2324, url=null, language=null, rfNumber=[15], rfOrder=15, authorNames=LECUN Y, BOTTOU L, BENGIO Y, journalName=Proceedings of the IEEE, refType=null, unstructuredReference=LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-Based Learning Applied to Document Recognition[J]. 
Proceedings of the IEEE, 1998, 86(11): 2278-2324., articleTitle=Gradient-Based Learning Applied to Document Recognition, refAbstract=null), Reference(id=1210277272527958426, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2014, volume=null, issue=null, pageStart=740, pageEnd=755, url=null, language=null, rfNumber=[16], rfOrder=16, authorNames=LIN T Y, MAIRE M, BELONGIE S, journalName=13th European Conference on Computer Vision. Zurich, refType=null, unstructuredReference=LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common Objects in Context[C]// 13th European Conference on Computer Vision. Zurich, Switzerland: Springer International Publishing, 2014: 740-755., articleTitle=Microsoft COCO: Common Objects in Context, refAbstract=null), Reference(id=1210277272595067291, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=https://arxiv.org/abs/1804.02767., language=null, rfNumber=[17], rfOrder=17, authorNames=REDMON J, FARHADI A, journalName=null, refType=null, unstructuredReference=REDMON J, FARHADI A. YOLOv3:An Incremental Improvement[EB/OL]. ( 2018-04-08)[2024-01-18]., articleTitle=YOLOv3:An Incremental Improvement, refAbstract=null), Reference(id=1210277272670564764, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=https://arxiv.org/abs/2004.10934., language=null, rfNumber=[18], rfOrder=18, authorNames=BOCHKOVSKIY A, WANG C Y, LIAO H Y M, journalName=null, refType=null, unstructuredReference=BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4:Optimal Speed and Accuracy of Object Detection[EB/OL]. 
( 2020-04-23)[2024-01-18]., articleTitle=YOLOv4:Optimal Speed and Accuracy of Object Detection, refAbstract=null), Reference(id=1210277272758645151, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[19], rfOrder=19, authorNames=LI Y Y, HU J, WEN Y, journalName=Paris, refType=null, unstructuredReference=LI Y Y, HU J, WEN Y, et al.Rethinking Vision Transformers for MobileNet Size and Speed[C]// 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023., articleTitle=Rethinking Vision Transformers for MobileNet Size and Speed, refAbstract=null), Reference(id=1210277272855114143, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2016, volume=null, issue=null, pageStart=3464, pageEnd=3468, url=null, language=null, rfNumber=[20], rfOrder=20, authorNames=BEWLEY A, GE Z Y, OTT L, journalName=Phoenix, AZ, refType=null, unstructuredReference=BEWLEY A, GE Z Y, OTT L, et al. Simple Online and Realtime Tracking[C]// 2016 IEEE International Conference on Image Processing (ICIP). Phoenix, AZ, USA: IEEE, 2016: 3464-3468., articleTitle=Simple Online and Realtime Tracking[C]// 2016 IEEE International Conference on Image Processing (ICIP), refAbstract=null), Reference(id=1210277274071462305, tenantId=1146029695717560320, journalId=1189621681917173762, articleId=1209910185787781300, doi=null, pmid=null, pmcid=null, year=2016, volume=null, issue=null, pageStart=770, pageEnd=778, url=null, language=null, rfNumber=[21], rfOrder=21, authorNames=HE K, ZHANG X, REN S, journalName=Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, refType=null, unstructuredReference=HE K, ZHANG X, REN S, et al. 
Fig. 1 Network framework of the improved YOLOv5
Fig. 2 Network structure of EfficientFormerV2
Fig. 3 Feed-forward network
Fig. 4 Architecture of the network model improved with Swin Transformer
Fig. 5 Comparison of experimental results of YOLOv5 before and after improvement
Fig. 6 Comparison of global target recognition
Fig. 7 Comparison of occluded object recognition
Fig. 8 Comparison of small-target recognition
Fig. 9 Comparison of global perception capability
Fig. 10 Comparison of occluded object recognition
Fig. 11 Comparison of small-target recognition
Fig. 12 Comparison of HOTA metrics
Fig. 13 Comparison of recognition before and after improving the tracking algorithm
Table 1 Experimental parameter configuration

| Parameter | YOLOv5 | EfficientFormerV2 |
| --- | --- | --- |
| Weight file | Yolov5s.pth | eformer_s2_450.pth |
| Epochs | 100 | 100 |
| Batch size | 6 | 6 |
| Initial learning rate | 0.01 | 0.01 |
| Momentum | 0.937 | 0.937 |
| Preset decay coefficient | 0.0005 | 0.0005 |
Table 2 Comparison of DeepSORT improvement experiments (%)

| Item | ResNet-50 | Swin Transformer |
| --- | --- | --- |
| mAP | 71.56 | 79.69 |
| Rank-1 accuracy | 88.83 | 92.18 |
Table 3 Evaluation results of different methods

| Algorithm | MOTA/% | ID switches |
| --- | --- | --- |
| YOLOv5+DeepSORT | 69.137 | 460 |
| Improved YOLOv5+DeepSORT | 77.105 | 424 |
| YOLOv5+improved DeepSORT | 67.723 | 405 |
| Improved YOLOv5+improved DeepSORT | 78.684 | 339 |
Vehicle Tracking Algorithm Based on Transformer’s Improved YOLOv5+DeepSORT*
Shuilong He1, 2, Jingjia Zhang1, Linjun Zhang1, Deyun Mo2
Affiliations
  • 1 Guilin University of Electronic Technology, Guilin 541004
  • 2 Guilin University of Aerospace Technology, Guilin 541004
Corresponding author: Deyun Mo (born 1983), male, associate professor; main research interest: intelligent vehicle driving.
Automobile Technology | Special Topic on Motion Planning and Control Technology for Intelligent Vehicles, 2024, (7): 9-16
Published: 2024-07-24 doi: 10.19620/j.cnki.1000-3703.20231097
To address the low detection accuracy, poor global perception, and weak recognition of occluded and small targets in traditional object detection and tracking algorithms, this paper proposes a vehicle tracking method based on YOLOv5 and DeepSORT improved with lightweight Transformers. First, the EfficientFormerV2 model is used to improve YOLOv5 and strengthen vehicle detection. Then, the Shifted windows (Swin) model is used to improve the Re-Identification module of the DeepSORT multi-object tracking algorithm, raising tracking capability and accuracy. Finally, comparative and ablation experiments are carried out on the KITTI and VeRi datasets. The results show that, under complex conditions, the method performs significantly better on occluded vehicles and small targets: the mean average precision reaches 96.7%, tracking accuracy improves by 9.547 percentage points, and the total number of ID switches falls by 26.4%.

YOLOv5  /  Vehicle detection  /  DeepSORT  /  Transformer
Shuilong He, Jingjia Zhang, Linjun Zhang, Deyun Mo. Vehicle Tracking Algorithm Based on Transformer’s Improved YOLOv5+DeepSORT[J]. Automobile Technology, 2024, (7): 9-16. DOI: 10.19620/j.cnki.1000-3703.20231097
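Object recognition and tracking is one of the core technologies for improving the safety of advanced driver assistance systems. By recognizing and tracking vehicles, pedestrians, road signs, and other targets in real time, it helps a vehicle perceive the surrounding traffic and reduces traffic accidents.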
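Deep learning has advanced steadily in object detection in recent years. In 2017, He et al. [1] proposed the Mask Region-based Convolutional Neural Network (Mask R-CNN), which effectively resolved the misalignment between feature positions in the original image and in the feature map. In 2018, Redmon et al. [2] improved the base network and combined it with a pyramid structure to propose YOLOv3 [3], which captures more useful information about small targets. In 2019, Zhao et al. [4] proposed M2Det to address variation in target scale. After 2020, YOLOv4 [5] and YOLOv5 [6], both built on YOLOv3, raised detection and recognition accuracy while retaining the speed advantage. These methods still have limitations, however. Mask R-CNN is more complex to implement than the Faster Region-based Convolutional Neural Network (Faster R-CNN) [7], needs more computing resources, and, because it uses a two-stage detection pipeline similar to that of Faster R-CNN, is relatively slow. The YOLO family still struggles with small and occluded targets. M2Det must process feature pyramids at multiple scales, so its real-time performance is not ideal.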
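Multi-object tracking algorithms have improved alongside deep learning. Yu et al. [8] proposed a two-stage algorithm that first detects targets with Faster R-CNN and then applies the Hungarian algorithm to associate features extracted by GoogleNet [9], thereby achieving tracking. Xie et al. [10] captured targets with a YOLOv3-based detector and used the DeepSORT (Deep learning based Simple Online and Realtime Tracking) algorithm for trajectory association. However, two-stage algorithms need two computationally intensive networks, so tracking efficiency is low. Many researchers have therefore turned to multi-object tracking based on re-identification (Re-ID) to improve efficiency. Wang et al. [11] were the first to propose a joint model that, by modifying the YOLOv3 detector, performs object detection and Re-ID feature extraction in a single pass and achieves high tracking efficiency on pedestrian datasets. Zhang et al. [12] proposed FairMOT, which extracts features with a deep feature fusion network and thereby improves tracking performance. The backbones of these algorithms, however, are all adapted from detector networks and are deficient at learning Re-ID features.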
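To further improve the accuracy, efficiency, and tracking capability of object tracking, this paper proposes a vehicle tracking method based on YOLOv5 and DeepSORT improved with lightweight Transformers. The method addresses the weak detection of small and occluded targets in the YOLO family and the weak generalization of the Re-ID module in DeepSORT.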
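YOLOv5 is an object detection algorithm based on deep residual and path aggregation networks. Its backbone is based on CSPDarknet53 [13] and combines Feature Pyramid Networks (FPN) [14] with Spatial Pyramid Pooling (SPP) [15] to improve small-target detection accuracy. On the COCO dataset [16], YOLOv5 achieves an excellent mean Average Precision (mAP) [17-18], surpassing the state of the art at the time.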
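YOLOv5 uses CSPDarknet53 to partition the input: a Split Route module divides it into two parts, Cross Stage Partial (CSP) modules connect them, and a large convolutional layer fuses the features to produce the backbone's output feature map. This handles local image features well. However, because YOLOv5 is anchor-based and relies on predefined anchor boxes, it has shortcomings in detecting individual targets: small or occluded objects may be missed or falsely detected. To address this, this paper proposes an improved YOLOv5 detection model, shown in Fig. 1. The model keeps the network's ability to detect larger targets while improving its perception of small-target features and its global perception, so as to raise the recognition rate for occluded objects and the generalization ability. To meet real-time requirements while improving accuracy, the YOLOv5 backbone is modified with EfficientFormerV2 [19], a recent lightweight Transformer model. EfficientFormerV2 uses a global self-attention mechanism; in road-traffic vehicle detection, particularly with heavy background clutter, it can effectively separate the target objects belonging to different regions and achieve better detection. A Spatial Pyramid Pooling-Fast (SPPF) module is connected to the EfficientFormerV2 module. It divides feature maps at different scales into multiple sub-regions and applies max pooling to each sub-region. The pooled results at all scales are finally concatenated into a fixed-length feature vector, which solves the fusion of feature maps at different scales and gives better results for vehicle occlusion and global perception.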
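EfficientFormerV2 is an improved version of the EfficientFormer model. Built on the Transformer self-attention mechanism, it handles object relationships and local image information effectively; its network structure is shown in Fig. 2. This paper uses the lightweight variant EfficientFormerV2-S2, which has only 10.3×10^6 parameters and is suitable for deployment on edge-computing processors.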
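EfficientFormerV2 adopts a four-stage hierarchical design that yields feature maps at {1/4, 1/8, 1/16, 1/32} of the input resolution. To embed the input image more efficiently, it uses small-kernel convolutions instead of non-overlapping patches, which improves computational performance and generalization. This design gives EfficientFormerV2 excellent performance in tasks such as image classification and object detection. The computation is:
$$X_{i|_{i=1},\,j|_{j=1}}^{B,\,C_{j|_{j=1}},\,\frac{H}{4},\,\frac{W}{4}} = \mathrm{stem}\left(\chi_{0}^{B,\,3,\,H,\,W}\right)$$
where $X_{i,j}$ is the feature map of layer $i$ in stage $j$, $j \in \{1, 2, 3, 4\}$; $B$ is the batch size; $C_j$ is the channel dimension of stage $j$ (the network width); $H$ and $W$ are the height and width of the feature map; $\chi_0$ is the input image; and stem is the convolutional downsampling operation.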
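As a quick check on the four-stage layout, the sketch below computes the spatial size of the feature map at each stage for a given input size. The function name `stage_shapes` and the 640×640 input are illustrative assumptions, not values from the paper, and the input size is assumed to be divisible by 32 so that no rounding is involved.

```python
def stage_shapes(height, width, strides=(4, 8, 16, 32)):
    """Return (H/s, W/s) for each stage stride s.

    Assumes height and width are divisible by the largest stride,
    so the integer division is exact.
    """
    if height % max(strides) or width % max(strides):
        raise ValueError("input size must be divisible by the largest stride")
    return [(height // s, width // s) for s in strides]


# Hypothetical 640x640 input: the stem output is at 1/4 resolution.
print(stage_shapes(640, 640))
```

Each later stage halves the resolution of the previous one, which is why the deeper stages can afford global attention on far fewer positions.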
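The first and second stages are designed to capture local information at high resolution, and each layer's feature map is processed by the same feed-forward network (FFN), shown in Fig. 3. This lets EfficientFormerV2 capture more detail in local regions, which helps object detection and image classification accuracy:
$$X_{i+1,j}^{B,\,C_j,\,\frac{H}{2^{j+1}},\,\frac{W}{2^{j+1}}} = S_{i,j} \cdot \mathrm{FFN}^{C_j,\,E_{i,j}}\left(X_{i,j}\right) + X_{i,j}$$
where $S_{i,j}$ is a learnable layer scale, and the FFN has two properties: the stage width $C_j$ and the per-block expansion ratio $E_{i,j}$.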
Note that every FFN uses a residual connection. In the last two stages of the model, both local FFN blocks and global Multi-Head Self-Attention (MHSA) blocks are used.
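The residual FFN update can be illustrated on a single feature vector. The following is a minimal sketch under simplifying assumptions: the FFN is reduced to two linear maps with a GELU in between (any depthwise convolution is omitted), the layer scale is taken to be one factor per channel, and the names `ffn_block`, `w_in`, and `w_out` are hypothetical.

```python
import math


def gelu(x):
    # Exact GELU using the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(weights, vec):
    # weights is a list of rows; returns weights @ vec.
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def ffn_block(x, w_in, w_out, scale):
    """Apply x + scale * FFN(x) to one C-dimensional feature vector.

    w_in has shape (E*C, C) and w_out has shape (C, E*C), where E is the
    expansion ratio; scale holds one learnable factor per channel.
    """
    hidden = [gelu(h) for h in matvec(w_in, x)]
    out = matvec(w_out, hidden)
    return [xi + s * oi for xi, s, oi in zip(x, scale, out)]
```

With the scale set to zero the block reduces to the identity, which is one reason a small initial layer scale tends to keep deep residual stacks stable early in training.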
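This paper wraps four FFN modules in a Sequential container so they can be stacked and reused without repetitive hand-written code. In addition, the Sequential containers at layers 2, 4, and 6 are combined with Batch Normalization. The Sequential container performs local feature extraction and nonlinear transformation on the input sequence, while batch normalization standardizes the output of each container module, reducing internal covariate shift, which speeds up convergence and lowers the risk of overfitting. The output feature vector of the EfficientFormerV2 module is passed to the SPPF module and to the downstream convolutional layers. The SPPF module generates a fixed-length feature vector through pooling for downstream tasks.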
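The pooling core of an SPPF module can be sketched on a single-channel feature map. This is a simplified illustration, not the paper's implementation: it assumes the common YOLOv5 layout of three chained 5×5 max-pooling steps with stride 1 whose outputs are concatenated with the input, and it leaves out the convolutions before and after the pooling. The names `max_pool_same` and `sppf_pool` are hypothetical.

```python
def max_pool_same(fmap, k=5):
    """k x k max pooling with stride 1; windows are clipped at the border,
    so the output has the same size as the input."""
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [
        [
            max(
                fmap[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def sppf_pool(fmap, k=5):
    """Return [x, pool(x), pool(pool(x)), pool(pool(pool(x)))] as channels."""
    outputs = [fmap]
    for _ in range(3):
        outputs.append(max_pool_same(outputs[-1], k))
    return outputs
```

Chaining the same small pooling window widens the effective receptive field step by step, so the concatenated channels mix several context sizes at the cost of one kernel size.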
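Simple Online and Realtime Tracking (SORT) [20] predicts target motion with a Kalman filter, measures the similarity between predicted and detected bounding boxes with Intersection over Union (IoU), and associates data with the Hungarian algorithm to track in real time. DeepSORT builds on SORT by adding a deep network to extract target features and by using cascade matching to reduce identity (ID) switches when targets overlap or are occluded. The algorithm combines motion and appearance features into a cost matrix and matches detections against it; unmatched targets are treated as new targets and assigned new IDs. Cascade matching prioritizes targets by how many times they have been lost and how active their tracks are, which effectively reduces the number of ID switches.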
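The IoU similarity used to compare a predicted box with a detected box can be written in a few lines. This is a minimal sketch that assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function name `iou` is illustrative.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An association step would typically turn this into a cost such as `1 - iou(prediction, detection)` before running the assignment.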
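The main purpose of feature extraction is to obtain features that uniquely identify a target so that it can be re-identified in different positions or poses, which makes tracking possible. In DeepSORT, features are extracted mainly by ResNet-50 [21], a Convolutional Neural Network (CNN), which computes convolutional features from the target image region. For each detection, the target region is cropped, the CNN extracts convolutional features, and a fully connected layer reduces them to a feature vector. This vector captures the target's visual appearance and is robust to changes in position and pose. Because ResNet-50 is pretrained at scale on ImageNet, the extracted feature vectors are more accurate and more discriminative.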
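Once each detection has a feature vector, appearance similarity has to be turned into a number. The text does not name the metric, so the sketch below assumes the cosine distance commonly used with DeepSORT, taking the smallest distance between a new detection and the features stored for a track. The names `cosine_distance` and `min_gallery_distance` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


def min_gallery_distance(track_features, detection_feature):
    """Smallest cosine distance between a detection and a track's stored features."""
    return min(cosine_distance(f, detection_feature) for f in track_features)
```

Keeping several past features per track lets a match succeed even when the most recent appearance was degraded by occlusion.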
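ResNet-50 has some shortcomings, however. First, its very deep structure makes training and inference slow, especially on high-resolution images. Second, its receptive field is large, so when the target is small, key information is easily missed and detection fails. Finally, it is larger than lightweight networks and needs more storage and computing resources. To address these problems, this paper modifies the re-identification module in DeepSORT by replacing the ResNet-50 backbone with the Transformer-based Shifted windows (Swin) model [22], as shown in Fig. 4. With advantages such as distributed training, cross-group deployment, and separation of computation and storage, Swin supports fast training and inference and scales well. Its hierarchical feature extraction and multi-head attention make it more sensitive to small targets than ResNet-50. When computing attention similarity, a relative position bias $B \in \mathbb{R}^{M^2 \times M^2}$ is added in each head:
$$\mathrm{Attention}(Q, K, V) = \mathrm{SoftMax}\left(QK^{T}/\sqrt{d} + B\right)V$$
where $Q, K, V \in \mathbb{R}^{M^2 \times d}$ are the query, key, and value matrices, $d$ is the dimension of the query and key matrices, and $M^2$ is the number of patches in a local window.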
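The attention formula with a relative position bias can be traced by hand on a tiny window. The following is a minimal pure-Python sketch for one head and one window; matrices are plain nested lists, and the names `softmax` and `window_attention` are illustrative rather than taken from any library.

```python
import math


def softmax(row):
    # Subtract the maximum for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def window_attention(q, k, v, bias):
    """SoftMax(Q K^T / sqrt(d) + B) V for one head in one window.

    q, k, v are lists of N rows with d columns (N = M^2 patches);
    bias is the N x N relative position bias B.
    """
    d = len(q[0])
    out = []
    for i, q_row in enumerate(q):
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d) + bias[i][j]
            for j, k_row in enumerate(k)
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v_row[c] for w, v_row in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```

Because the bias is added before the softmax, a large entry in `bias[i][j]` pulls patch `i` toward patch `j` regardless of content, which is how positional preference enters the window.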
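In addition, the algorithm provides multi-level feature responses in both the horizontal and vertical directions. This hierarchical design makes it easy to adjust network depth to the task and effectively avoids problems such as vanishing gradients.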
This paper uses the lightweight YOLOv5s model, which is only 27 MB, to balance accuracy, speed, and cost and to improve runtime performance.
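The experiments use the open-source PyTorch deep learning framework. The CPU is a 12th-generation Intel Core i7-12700H with a clock frequency of 4.70 GHz; the operating system is Ubuntu 20.04 LTS with Python 3.8 and CUDA 12.0; the GPU is a GeForce RTX 3060 with 6 GB of video memory.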
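To adapt to the KITTI dataset, YOLOv5 was retrained with tuned training parameters and batch size, as listed in Table 1, and the open-source weights from Ref. [19] were used to speed up convergence.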
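The model is tested and evaluated on the KITTI dataset [23], a core benchmark for autonomous driving and computer vision that contains multi-sequence, multi-view image data. Because its annotations are not directly compatible with YOLOv5, the data were preprocessed: targets were grouped into six classes, and the annotations were converted to XML and then to YOLOv5 training labels so that the dataset can be used with the model.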
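The label conversion can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing used in the paper: it converts a KITTI label line directly to a YOLO-style line and skips the intermediate XML step, and the six-class mapping in `CLASS_IDS` is hypothetical because the text does not list the classes. It relies on the standard KITTI field order, in which fields 5 to 8 hold the left, top, right, and bottom box coordinates in pixels.

```python
# Hypothetical six-class mapping; the source does not list its classes.
CLASS_IDS = {"Car": 0, "Van": 1, "Truck": 2, "Pedestrian": 3, "Cyclist": 4, "Tram": 5}


def kitti_to_yolo(line, img_w, img_h):
    """Convert one KITTI label line to 'class cx cy w h' (normalized), or None."""
    fields = line.split()
    cls = CLASS_IDS.get(fields[0])
    if cls is None:
        return None  # skip classes outside the mapping, e.g. DontCare
    left, top, right, bottom = map(float, fields[4:8])
    cx = (left + right) / 2.0 / img_w
    cy = (top + bottom) / 2.0 / img_h
    w = (right - left) / img_w
    h = (bottom - top) / img_h
    return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

The image width and height must come from the image itself, since KITTI frames are not all the same size.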
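The VeRi vehicle re-identification dataset [24] is one of the public datasets for vehicle re-identification research. It contains video from 20 camera views and 37,778 images of 576 vehicles, covering multiple viewpoints, varied image quality (including blur and noise), and local vehicle details (such as license plates and lights), and is suitable for training vehicle re-identification models and evaluating their performance.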
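Figure 5 shows the comparison results for the improved YOLOv5 algorithm. At an IoU threshold of 0.5, the mAP of the improved algorithm rises from 95.6% to 96.7%, showing that the proposed method improves vehicle detection.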
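Qualitative results are shown in Fig. 6. The original algorithm fails to recognize the red car in the lower right corner, whereas the improved algorithm recognizes it. The results indicate that the improved YOLOv5 has stronger global perception and generalizes better for vehicle tracking.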
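Recognition of occluded objects is shown in Fig. 7, where a black sedan ahead on the right side of the road blocks a pedestrian. The original algorithm cannot recognize the occluded pedestrian, whereas the improved algorithm recognizes it correctly. The improved YOLOv5 therefore performs well on occluded objects.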
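To verify small-target detection, a further experiment was run; the results are shown in Fig. 8. Because YOLOv5 has low accuracy on small targets, it misses the small pedestrian, whereas the improved algorithm detects it. The comparison shows that the improved algorithm detects small targets markedly better.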
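Each annotation line in the KITTI dataset contains a Truncated field that indicates how far the object extends beyond the image boundary. Its value lies between 0 and 1 and gives the degree of truncation relative to the object's full extent. This information matters for understanding how complete an object is in the image, especially in autonomous driving scenes.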
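The experiment counted the targets in the whole dataset across truncation levels, 4,631 in total, and divided them into several bins, as shown in Fig. 9. The number of targets successfully recognized by the original and improved algorithms was then counted for each truncation level to compare the two. The greater the truncation, the lower the recognition success rate, but the improved algorithm recognizes clearly more targets than the original, which shows that it improves global perception.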
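The per-bin comparison can be sketched as a small grouping routine. The bin edges below are an assumption made for illustration, since the text does not state how the truncation range was divided, and the names `bin_index` and `success_rate_by_bin` are hypothetical.

```python
import bisect
from collections import defaultdict


def bin_index(value, edges):
    """Index of the bin [edges[i], edges[i+1]) containing value; 1.0 joins the last bin."""
    i = bisect.bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


def success_rate_by_bin(records, edges=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """records: iterable of (truncated, detected) pairs.

    Returns {bin index: detected / total} for every non-empty bin.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for truncated, detected in records:
        b = bin_index(truncated, edges)
        totals[b] += 1
        hits[b] += int(detected)
    return {b: hits[b] / totals[b] for b in sorted(totals)}
```

Running the same routine on the detections of the original and the improved model gives two dictionaries that can be compared bin by bin.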
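The Occluded attribute describes how much an object is hidden by other objects; in KITTI annotations it is an integer. A value of 0 means the object is not occluded and is fully visible in the image; 1 means it is partly occluded; 2 means it is largely occluded but still visible; 3 means it is completely occluded and not visible in the image.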
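The total number of targets at each occlusion level was counted from the dataset annotations, as shown in Fig. 10. The improved algorithm recognizes clearly more targets than the original; in the largely occluded case in particular, its recognition success rate is 12.8% higher.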
Following the COCO definition of small targets, this paper defines targets smaller than 32×32 pixels as small targets; 6,756 targets meet this criterion.
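The small-target criterion amounts to an area test on each bounding box. The sketch below assumes pixel corner coordinates and follows the COCO convention that "small" means an area below 32² pixels; the function names are illustrative.

```python
def is_small_target(left, top, right, bottom, limit=32):
    """True if the box area is below limit * limit pixels."""
    return (right - left) * (bottom - top) < limit * limit


def count_small_targets(boxes, limit=32):
    """Count boxes, given as (left, top, right, bottom), that qualify as small."""
    return sum(1 for box in boxes if is_small_target(*box, limit=limit))
```

Testing the area rather than each side separately means a long, thin box such as 20×40 pixels still counts as small.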
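The small-target recognition rate is 84.9% for the original algorithm and 92.82% for the improved algorithm, as shown in Fig. 11, so the improved algorithm has a clear advantage on small targets.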
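These results show that, compared with the original algorithm, the improved YOLOv5 has better global perception, detects occluded objects and small targets better, and achieves higher detection accuracy.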
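For the comparison of re-identification models, this paper uses the re-identification model from the open-source DeepSORT code. DeepSORT uses ResNet-50 as its default network; it was replaced with Swin Transformer while keeping the initialization parameters the same. The results are listed in Table 2. The mAP of the improved model rises by 8.13 percentage points and the Rank-1 accuracy by 3.35 percentage points. This indicates that the Transformer model strengthens the multi-scale feature fusion of the conventional CNN model and extracts multi-scale features better, which raises recognition accuracy.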
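The two re-identification metrics can be computed from ranked gallery lists. This is a simplified sketch: it assumes each query comes with gallery IDs already sorted by ascending feature distance, and it ignores the same-camera filtering that full re-identification protocols usually apply. All function names are hypothetical.

```python
def rank1_accuracy(ranked_ids, query_ids):
    """Fraction of queries whose top-ranked gallery ID equals the query ID."""
    hits = sum(
        1 for ranked, q in zip(ranked_ids, query_ids) if ranked and ranked[0] == q
    )
    return hits / len(query_ids)


def average_precision(ranked, query_id):
    """AP for one query: mean precision at each rank where a match occurs."""
    hits, precisions = 0, []
    for rank, gallery_id in enumerate(ranked, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(ranked_ids, query_ids):
    """Mean of the per-query average precision."""
    aps = [average_precision(r, q) for r, q in zip(ranked_ids, query_ids)]
    return sum(aps) / len(aps)
```

Rank-1 only looks at the best match, while mAP rewards placing every image of the same vehicle near the top, so the two can move by different amounts when the backbone changes.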
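These results demonstrate the effectiveness of the model improvements. The improved algorithm was applied to YOLOv5s+DeepSORT and compared with the original. Fig. 12 shows the results for Higher Order Tracking Accuracy (HOTA), Detection Accuracy (DetA), Association Accuracy (AssA), Detection Precision (DetPr), Association Recall (AssRe), Association Precision (AssPr), and Localization Accuracy (LocA). Here α is the localization threshold at which these metrics are evaluated: a prediction counts as a match only if its overlap with the ground truth is at least α, so a larger α demands more precise localization. As Fig. 12 shows, HOTA rises from 55% to 71% with the improved algorithm, indicating that changing the backbone from a CNN to a Transformer benefits model performance.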
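To further verify the detection performance of the proposed algorithm and examine the contribution of each improvement, three ablation experiments were designed on top of YOLOv5s+DeepSORT. All experiments use the same hyperparameters and training techniques; the results are listed in Table 3.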
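The ablation results show that the improved YOLOv5 markedly raises recognition accuracy, increasing Multiple Object Tracking Accuracy (MOTA) by 7.968 percentage points and lowering the total number of ID switches. The improved DeepSORT alone loses some accuracy, with MOTA falling by 1.414 percentage points, but the total number of ID switches falls by 12%, indicating that the improved re-identification module extracts target features effectively and is robust to pose, occlusion, and illumination. The final version improves MOTA over the original by 9.547 percentage points and reduces the total number of ID switches by 26.4%. The similarity between features in DeepSORT is thus computed more accurately, which lowers the frequency of ID switches.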
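The ablation arithmetic can be reproduced from the MOTA and ID-switch figures in Table 3. The sketch below copies those figures into a dictionary and reports, for each variant, the MOTA change in percentage points and the relative reduction in ID switches against the baseline. The helper name `compare` is hypothetical, and the percentages are recomputed from the rounded table entries, so they may differ slightly in the last digit from the values quoted in the text.

```python
# MOTA (%) and ID switch counts copied from Table 3.
RESULTS = {
    "YOLOv5+DeepSORT": (69.137, 460),
    "Improved YOLOv5+DeepSORT": (77.105, 424),
    "YOLOv5+improved DeepSORT": (67.723, 405),
    "Improved YOLOv5+improved DeepSORT": (78.684, 339),
}


def compare(baseline, variant, results=RESULTS):
    """Return (MOTA change in percentage points, relative ID-switch reduction in %)."""
    base_mota, base_ids = results[baseline]
    var_mota, var_ids = results[variant]
    return var_mota - base_mota, 100.0 * (base_ids - var_ids) / base_ids


for name in RESULTS:
    if name != "YOLOv5+DeepSORT":
        d_mota, d_ids = compare("YOLOv5+DeepSORT", name)
        print(f"{name}: MOTA {d_mota:+.3f} points, ID switches reduced by {d_ids:.1f}%")
```

Separating absolute percentage-point changes from relative percentage changes avoids mixing the two kinds of figures when reading the table.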
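Based on the KITTI dataset, this paper verifies the effectiveness of the improved tracking algorithm, which performs better on small and occluded targets and has stronger global perception. The results are shown in Fig. 13, where the improved algorithm performs better.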
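This paper proposes a vehicle detection and tracking algorithm based on improved YOLOv5 and DeepSORT. The lightweight network EfficientFormerV2 replaces CSPDarknet53, the backbone of the original YOLOv5 model, which reduces the number of model parameters while extracting more latent feature information and making the features more representative. In the tracking stage, the re-identification network in DeepSORT is also optimized: adding regularization and redesigning the backbone with the Swin Transformer model further improves appearance feature extraction and tracking. Experiments show that the method achieves better detection and tracking results on public datasets, with tracking accuracy improved by 9.547 percentage points and the total number of ID switches reduced by 26.4%.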
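Beyond its research value in traffic safety and intelligent transportation, the tracking method developed here can offer new ideas and methods for other detection and tracking tasks. It does not achieve end-to-end tracking, however; future work could pursue end-to-end tracking built on lightweight Transformers to further improve tracking performance.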
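  • *Guangxi Science and Technology Major Project (AA22068001)
  • Guangxi Science and Technology Major Project (AA23062031)
  • Guangxi Key Research and Development Program (AB21196029)
  • Liuzhou Science and Technology Plan Project (2022AAA0102)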
References
[1]
HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2961-2969.
[2]
TAN M X, LE Q V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[C]// International Conference on Machine Learning. Long Beach, California: PMLR, 2019: 6105-6114.
[3]
SHEN L Z, TAO H F, NI Y Z, et al. Improved YOLOv3 Model with Feature Map Cropping for Multi-Scale Road Object Detection[J]. Measurement Science and Technology, 2023, 34(4).
[4]
ZHAO Q J, SHENG T, WANG Y T, et al. M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, Hawaii, USA: AAAI, 2019: 9259-9266.
[5]
YU J M, ZHANG W. Face Mask Wearing Detection Algorithm Based on Improved YOLO-v4[J]. Sensors, 2021, 21(9): 3263.
[6]
WU W T, LIU H, LI L L, et al. Application of Local Fully Convolutional Neural Network Combined with YOLO v5 Algorithm in Small Target Detection of Remote Sensing Image[J]. PLoS One, 2021, 16(10).
[7]
BHARATI P, PRAMANIK A. Deep Learning Techniques—R-CNN to Mask R-CNN: A Survey[C]// Computational Intelligence in Pattern Recognition. Singapore: Springer, 2020: 657-668.
[8]
YU F W, LI W B, LI Q Q, et al. POI: Multiple Object Tracking with High Performance Detection and Appearance Feature[C]// Computer Vision-ECCV 2016 Workshops. Cham, Switzerland: Springer, 2016: 36-42.
[9]
YU Z G, DONG Y Y, CHENG J H, et al. Research on Face Recognition Classification Based on Improved GoogleNet[J]. Security and Communication Networks, 2022, 2022.
[10]
谢金龙, 胡勇. 基于深度学习的车辆检测与跟踪系统[J]. 工业控制计算机, 2020, 33(7): 99-101.
XIE J L, HU Y. Vehicle Detection and Tracking System Based on Deep Learning[J]. Industrial Control Computer, 2020, 33(7): 99-101.
[11]
WANG Z D, ZHENG L, LIU Y X, et al. Towards Real-Time Multi-Object Tracking[C]// European Conference on Computer Vision. Cham, Switzerland: Springer, 2020: 107-122.
[12]
CHE J, HE Y T, WU J M. Pedestrian Multiple-Object Tracking Based on FairMOT and Circle Loss[J]. Scientific Reports, 2023, 13(1): 4525.
[13]
WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: A New Backbone That Can Enhance Learning Capability of CNN[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, WA, USA: IEEE, 2020: 390-391.
[14]
HE K M, ZHANG X Y, REN S Q, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[15]
LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-Based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[16]
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common Objects in Context[C]// 13th European Conference on Computer Vision. Zurich, Switzerland: Springer International Publishing, 2014: 740-755.
[17]
REDMON J, FARHADI A. YOLOv3:An Incremental Improvement[EB/OL]. ( 2018-04-08)[2024-01-18]. https://arxiv.org/abs/1804.02767.
[18]
BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4:Optimal Speed and Accuracy of Object Detection[EB/OL]. ( 2020-04-23)[2024-01-18]. https://arxiv.org/abs/2004.10934.
[19]
LI Y Y, HU J, WEN Y, et al.Rethinking Vision Transformers for MobileNet Size and Speed[C]// 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023.
[20]
BEWLEY A, GE Z Y, OTT L, et al. Simple Online and Realtime Tracking[C]// 2016 IEEE International Conference on Image Processing (ICIP). Phoenix, AZ, USA: IEEE, 2016: 3464-3468.
[21]
HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
[22]
LIU Z, LIN Y, CAO Y, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 10012-10022.
[23]
GEIGER A, LENZ P, STILLER C, et al. Vision Meets Robotics: The KITTI Dataset[J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237.
[24]
LIU X C, LIU W, MA H D, et al. Large-Scale Vehicle Re-Identification in Urban Surveillance Videos[C]// 2016 IEEE International Conference on Multimedia and Expo (ICME). Seattle, WA, USA: IEEE, 2016: 1-6.
2024, Issue 7
doi: 10.19620/j.cnki.1000-3703.20231097
  • First published online: 2025-12-22
  • Published: 2024-07-24
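Author information
    1 Guilin University of Electronic Technology, Guilin 541004
    2 Guilin University of Aerospace Technology, Guilin 541001

Corresponding author:

莫德赟 (born 1983), male, associate professor; main research interest: intelligent vehicle driving.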