Article(id=1149774725922124401, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149774724923880044, articleNumber=null, orderNo=null, doi=10.12404/j.issn.1671-1815.2403707, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=1716048000000, receivedDateStr=2024-05-19, revisedDate=1737993600000, revisedDateStr=2025-01-28, acceptedDate=null, acceptedDateStr=null, onlineDate=1752057256440, onlineDateStr=2025-07-09, pubDate=1745769600000, pubDateStr=2025-04-28, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1752057256440, onlineIssueDateStr=2025-07-09, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1752057256440, creator=13701087609, updateTime=1752057256440, updator=13701087609, issue=Issue{id=1149774724923880044, tenantId=1146029695717560320, journalId=1146123166801305609, year='2025', volume='25', issue='12', pageStart='4827', pageEnd='5272', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=0, createTime=1752057256203, creator=13701087609, updateTime=1768456746933, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1218559174552764785, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149774724923880044, language=EN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1218559174552764786, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149774724923880044, language=CN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=5110, endPage=5118, ext={EN=ArticleExt(id=1149774726106673780, articleId=1149774725922124401, tenantId=1146029695717560320, 
journalId=1146123166801305609, language=EN, title=Vehicle Small Target Detection Algorithm for UAV Remote Sensing Images Based on YOLOv5, columnId=1156262729162810294, journalTitle=Science Technology and Engineering, columnName=Papers·Automation and Computational Technology, runingTitle=null, highlight=null, articleAbstract=

Remote sensing images are characterized by diverse scales, dense arrangement, and small target sizes. To address the heavy background noise in remote sensing images and the difficulty of capturing small vehicle targets, Atiny-YOLO, a vehicle target detection algorithm based on an improved feature fusion method, was proposed. Firstly, an additional detection layer for small targets was introduced into the Neck layer of YOLOv5 to generate a larger-scale feature map and effectively identify the detailed features of small objects. Secondly, a split operation was added to the C3 module to reuse image feature information, and the Swin Transformer module was further optimized to improve the utilization of effective information. Lastly, the feature fusion channels were improved, which raised detection accuracy while reducing the number of model parameters. The Atiny-YOLO algorithm was validated on the AU-AIR (aerial universal autonomous inspection and recognition) dataset. The experimental results show that the average detection accuracy of the Atiny-YOLO algorithm is improved by about 2.9% over the baseline algorithm, reaching 95.5%, and the detection speed reaches 234 frames/s. These results verify that the Atiny-YOLO algorithm meets real-time requirements while greatly improving detection accuracy.

, correspAuthors=Meng-ting WANG, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Jun-qing BAI, Meng-ting WANG, Shou-ting SHEN), CN=ArticleExt(id=1149774761070392137, articleId=1149774725922124401, tenantId=1146029695717560320, journalId=1146123166801305609, language=CN, title=基于YOLOv5的无人机遥感图像车辆小目标检测算法, columnId=1156262729783567290, journalTitle=科学技术与工程, columnName=论文·自动化技术、计算机技术, runingTitle=null, highlight=null, articleAbstract=

遥感图像具有尺度多样、密集排列和目标尺寸小等特点,针对遥感图像中背景噪声多、车辆目标较小难以获取的问题,提出一种基于改善特征融合方法的车辆目标检测算法Atiny-YOLO。首先,在YOLOv5的Neck层中引入针对小目标的额外检测层,从而生成更大规模的特征图,有效识别小物体的细节特征;其次,向C3模块中添加split操作以复用图像特征信息,进一步优化Swin Transformer模块提高有效信息的使用率;最后,通过改善特征融合通道,提升检测精度的同时减少模型参数。在无人机视角数据集(aerial universal autonomous inspection and recognition, AU-AIR)上验证Atiny-YOLO模型的有效性。实验结果表明:Atiny-YOLO算法相较于基线算法的平均检测精度提高了约2.9%,达到95.5%,检测速度达到234帧/s。这些结果验证了Atiny-YOLO算法在满足实时性的同时,模型检测精度大幅提升。

, correspAuthors=王梦婷, authorNote=null, correspAuthorsNote=
* 王梦婷(2000—),女,汉族,河南驻马店人,硕士研究生。研究方向:目标检测。E-mail:
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=pH5RN4mqxQM443Y9hA9jvQ==, magXml=7Hq3K6HNbavLjp3tIkohtQ==, pdfUrl=null, pdf=A3yeBDihZNAPsM3hw4j9+Q==, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=DF+rPCgE/ZDsupxr6B0xDQ==, mapNumber=null, authorCompany=null, fund=null, authors=

白俊卿(1983—),女,汉族,河南商丘人,博士,副教授,硕士研究生导师。研究方向:机器学习、人工智能。E-mail:

, authorsList=白俊卿, 王梦婷, 沈守婷)}, authors=[Author(id=1179790675450147079, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=13636804262@qq.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1179790675517255945, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, authorId=1179790675450147079, language=EN, stringName=Jun-qing BAI, firstName=Jun-qing, middleName=null, lastName=BAI, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1179790675580170506, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, authorId=1179790675450147079, language=CN, stringName=白俊卿, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安石油大学计算机学院, 西安 710065, bio={"content":"

白俊卿(1983—),女,汉族,河南商丘人,博士,副教授,硕士研究生导师。研究方向:机器学习、人工智能。E-mail:

"}, bioImg=null, bioContent=

白俊卿(1983—),女,汉族,河南商丘人,博士,副教授,硕士研究生导师。研究方向:机器学习、人工智能。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1179790675206877443, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, xref=null, ext=[AuthorCompanyExt(id=1179790675211071748, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China), AuthorCompanyExt(id=1179790675219460357, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安石油大学计算机学院, 西安 710065)])]), Author(id=1179790675651473676, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=2570046402@qq.com, emailSecond=null, emailThird=null, correspondingAuthor=1, authorType=1, ext={EN=AuthorExt(id=1179790675714388238, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, authorId=1179790675651473676, language=EN, stringName=Meng-ting WANG, firstName=Meng-ting, middleName=null, lastName=WANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=*, address=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1179790675764719887, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, authorId=1179790675651473676, language=CN, stringName=王梦婷, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, 
authorComment=null, nameInitials=null, affiliation=null, department=null, xref=*, address=西安石油大学计算机学院, 西安 710065, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1179790675206877443, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, xref=null, ext=[AuthorCompanyExt(id=1179790675211071748, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China), AuthorCompanyExt(id=1179790675219460357, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安石油大学计算机学院, 西安 710065)])]), Author(id=1179790675836023057, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1179790675907326227, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, authorId=1179790675836023057, language=EN, stringName=Shou-ting SHEN, firstName=Shou-ting, middleName=null, lastName=SHEN, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1179790675974435092, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, 
authorId=1179790675836023057, language=CN, stringName=沈守婷, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安石油大学计算机学院, 西安 710065, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1179790675206877443, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, xref=null, ext=[AuthorCompanyExt(id=1179790675211071748, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China), AuthorCompanyExt(id=1179790675219460357, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, companyId=1179790675206877443, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安石油大学计算机学院, 西安 710065)])])], keywords=[Keyword(id=1179790676121235733, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=EN, orderNo=1, keyword=remote sensing images), Keyword(id=1179790676179955990, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=EN, orderNo=2, keyword=Swin Transformer), Keyword(id=1179790676238676247, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=EN, orderNo=3, keyword=vehicle detection), Keyword(id=1179790676297396504, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=EN, orderNo=4, keyword=AU-AIR), Keyword(id=1179790676356116761, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=EN, orderNo=5, 
keyword=Atiny-YOLO), Keyword(id=1179790676406448410, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=CN, orderNo=1, keyword=遥感图像), Keyword(id=1179790676465168667, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=CN, orderNo=2, keyword=Swin Transformer), Keyword(id=1179790676515500316, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=CN, orderNo=3, keyword=车辆检测), Keyword(id=1179790676561637661, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=CN, orderNo=4, keyword=AU-AIR), Keyword(id=1179790676611969310, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, language=CN, orderNo=5, keyword=Atiny-YOLO)], refs=[Reference(id=1179790679241797976, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=22, pageStart=96, pageEnd=98, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=欧阳凯, journalName=工程建设与设计, refType=null, unstructuredReference=欧阳凯. 基于测绘工程测量中无人机遥感技术运用[J]. 工程建设与设计, 2022(22): 96-98., articleTitle=基于测绘工程测量中无人机遥感技术运用, refAbstract=null), Reference(id=1179790679300518233, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=22, pageStart=96, pageEnd=98, url=null, language=null, rfNumber=[1], rfOrder=1, authorNames=Ouyang Kai, journalName=Engineering Construction and Design, refType=null, unstructuredReference=Ouyang Kai. Application of UAV remote sensing technology in surveying and mapping based engineering measurement[J]. 
Engineering Construction and Design, 2022(22): 96-98., articleTitle=Application of UAV remote sensing technology in surveying and mapping based engineering measurement, refAbstract=null), Reference(id=1179790679359238491, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=6, pageStart=18, pageEnd=24, url=null, language=null, rfNumber=[2], rfOrder=2, authorNames=胡义强, 杨骥, 荆文龙, journalName=测绘报, refType=null, unstructuredReference=胡义强, 杨骥, 荆文龙. 基于无人机遥感的海岸带生态环境监测研究综述[J]. 测绘报, 2022(6): 18-24., articleTitle=基于无人机遥感的海岸带生态环境监测研究综述, refAbstract=null), Reference(id=1179790679434735967, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=6, pageStart=18, pageEnd=24, url=null, language=null, rfNumber=[2], rfOrder=3, authorNames=Hu Yiqiang, Yang Ji, Jing Wenlong, journalName=Surveying and Mapping Journal, refType=null, unstructuredReference=Hu Yiqiang, Yang Ji, Jing Wenlong. A review of coastal zone ecological environment monitoring research based on UAV remote sensing[J]. Surveying and Mapping Journal, 2022(6): 18-24., articleTitle=A review of coastal zone ecological environment monitoring research based on UAV remote sensing, refAbstract=null), Reference(id=1179790679493456225, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=11, issue=10, pageStart=502, pageEnd=null, url=null, language=null, rfNumber=[3], rfOrder=4, authorNames=Qin J X, Yang W J, Wu T, journalName=ISPRS International Journal of Geo-Information, refType=null, unstructuredReference=Qin J X, Yang W J, Wu T, et al. Incremental road network update method with trajectory data and UAV remote sensing imagery[J]. 
ISPRS International Journal of Geo-Information, 2022, 11(10): 502., articleTitle=Incremental road network update method with trajectory data and UAV remote sensing imagery, refAbstract=null), Reference(id=1179790679552176483, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=142, issue=null, pageStart=104884, pageEnd=null, url=null, language=null, rfNumber=[4], rfOrder=5, authorNames=Li J, Wei X, journalName=Image and Vision Computing, refType=null, unstructuredReference=Li J, Wei X. Research on efficient detection network method for remote sensing images based on self attention mechanism[J]. Image and Vision Computing, 2024, 142: 104884., articleTitle=Research on efficient detection network method for remote sensing images based on self attention mechanism, refAbstract=null), Reference(id=1179790679606702436, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2022, volume=10, issue=null, pageStart=969846, pageEnd=null, url=null, language=null, rfNumber=[5], rfOrder=6, authorNames=Nan H H, Tianyi Z, Tung C Y, journalName=Frontiers in Public Health, refType=null, unstructuredReference=Nan H H, Tianyi Z, Tung C Y, et al. Image segmentation using transfer learning and Fast R-CNN for diabetic foot wound treatments[J]. Frontiers in Public Health, 2022, 10: 969846., articleTitle=Image segmentation using transfer learning and Fast R-CNN for diabetic foot wound treatments, refAbstract=null), Reference(id=1179790679703171430, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=7, issue=1, pageStart=7958815, pageEnd=null, url=null, language=null, rfNumber=[6], rfOrder=7, authorNames=Yang Y, journalName=Journal of Artificial Intelligence Practice, refType=null, unstructuredReference=Yang Y. 
Vehicle target detection algorithm based on improved faster R-CNN for remote sensing images[J]. Journal of Artificial Intelligence Practice, 2024, 7(1): 7958815., articleTitle=Vehicle target detection algorithm based on improved faster R-CNN for remote sensing images, refAbstract=null), Reference(id=1179790679766085992, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=8, issue=5, pageStart=103390, pageEnd=null, url=null, language=null, rfNumber=[7], rfOrder=8, authorNames=Yang X, Xiu J, Liu X, journalName=Drones, refType=null, unstructuredReference=Yang X, Xiu J, Liu X. Research on improved YOLOv5 vehicle target detection algorithm in aerial images[J]. Drones, 2024, 8(5): 103390., articleTitle=Research on improved YOLOv5 vehicle target detection algorithm in aerial images, refAbstract=null), Reference(id=1179790679841583466, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=26, issue=null, pageStart=101190, pageEnd=null, url=null, language=null, rfNumber=[8], rfOrder=9, authorNames=Zhao X, Wang Q, Zhang M, journalName=Internet of Things, refType=null, unstructuredReference=Zhao X, Wang Q, Zhang M, et al. CSFF-YOLOv5: improved YOLOv5 based on channel split and feature fusion in femoral neck fracture detection[J]. 
Internet of Things, 2024, 26: 101190., articleTitle=CSFF-YOLOv5: improved YOLOv5 based on channel split and feature fusion in femoral neck fracture detection, refAbstract=null), Reference(id=1179790679904498028, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=40, issue=2, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[9], rfOrder=10, authorNames=Yuan H, Lu Z, Zhang R, journalName=Computational Intelligence, refType=null, unstructuredReference=Yuan H, Lu Z, Zhang R, et al. An effective graph embedded YOLOv5 model for forest fire detection[J]. Computational Intelligence, 2024, 40(2): DOI: 10.1111/coin.12640., articleTitle=An effective graph embedded YOLOv5 model for forest fire detection, refAbstract=null), Reference(id=1179790679963218286, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=21, issue=2, pageStart=2547, pageEnd=2561, url=null, language=null, rfNumber=[10], rfOrder=11, authorNames=Wang H L, Qi H M, Feng S, journalName=Journal of Real-Time Image Processing, refType=null, unstructuredReference=Wang H L, Qi H M, Feng S, et al. L-SSD: lightweight SSD target detection based on depth-separable convolution[J]. Journal of Real-Time Image Processing, 2024, 21(2): 2547-2561., articleTitle=L-SSD: lightweight SSD target detection based on depth-separable convolution, refAbstract=null), Reference(id=1179790680038715760, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2021, volume=21, issue=8, pageStart=3192, pageEnd=3198, url=null, language=null, rfNumber=[11], rfOrder=12, authorNames=袁小平, 马绪起, 刘赛, journalName=科学技术与工程, refType=null, unstructuredReference=袁小平, 马绪起, 刘赛. 改进YOLOv3的行人车辆目标检测算法[J]. 
科学技术与工程, 2021, 21(8): 3192-3198., articleTitle=改进YOLOv3的行人车辆目标检测算法, refAbstract=null), Reference(id=1179790680110018930, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2021, volume=21, issue=8, pageStart=3192, pageEnd=3198, url=null, language=null, rfNumber=[11], rfOrder=13, authorNames=Yuan Xiaoping, Ma Xuqi, Liu Sai, journalName=Science, Technology and Engineering, refType=null, unstructuredReference=Yuan Xiaoping, Ma Xuqi, Liu Sai. Improved pedestrian-vehicle target detection algorithm for YOLOv3[J]. Science, Technology and Engineering, 2021, 21(8): 3192-3198., articleTitle=Improved pedestrian-vehicle target detection algorithm for YOLOv3, refAbstract=null), Reference(id=1179790680172933492, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=39, issue=9, pageStart=134, pageEnd=137, url=null, language=null, rfNumber=[12], rfOrder=14, authorNames=姜淙文, 金立左, journalName=微型电脑应用, refType=null, unstructuredReference=姜淙文, 金立左. Vehicle-YOLO——一种基于航拍影像的车辆检测模型[J]. 微型电脑应用, 2023, 39(9): 134-137., articleTitle=Vehicle-YOLO——一种基于航拍影像的车辆检测模型, refAbstract=null), Reference(id=1179790680252625270, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=39, issue=9, pageStart=134, pageEnd=137, url=null, language=null, rfNumber=[12], rfOrder=15, authorNames=Jiang Congwen, Jin Lizuo, journalName=Microcomputer Applications, refType=null, unstructuredReference=Jiang Congwen, Jin Lizuo. Vehicle-YOLO: a vehicle detection model based on aerial images[J]. 
Microcomputer Applications, 2023, 39(9): 134-137., articleTitle=Vehicle-YOLO: a vehicle detection model based on aerial images, refAbstract=null), Reference(id=1179790680311345528, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=44, issue=1, pageStart=211, pageEnd=217, url=null, language=null, rfNumber=[13], rfOrder=16, authorNames=涂媛雅, 汤国放, 张建勋, journalName=小型微型计算机系统, refType=null, unstructuredReference=涂媛雅, 汤国放, 张建勋. Lite-YOLOv3轻量级行人与车辆检测网络[J]. 小型微型计算机系统, 2023, 44(1): 211-217., articleTitle=Lite-YOLOv3轻量级行人与车辆检测网络, refAbstract=null), Reference(id=1179790680382648698, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=44, issue=1, pageStart=211, pageEnd=217, url=null, language=null, rfNumber=[13], rfOrder=17, authorNames=Tu Yuanya, Tang Guofang, Zhang Jianxun, journalName=Small Microcomputer Systems, refType=null, unstructuredReference=Tu Yuanya, Tang Guofang, Zhang Jianxun. Lite-YOLOv3 lightweight pedestrian and vehicle detection network[J]. Small Microcomputer Systems, 2023, 44(1): 211-217., articleTitle=Lite-YOLOv3 lightweight pedestrian and vehicle detection network, refAbstract=null), Reference(id=1179790680453951868, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=128, issue=1/2, pageStart=937, pageEnd=951, url=null, language=null, rfNumber=[14], rfOrder=18, authorNames=Dong X, Tong F X, Yang G, journalName=The International Journal of Advanced Manufacturing Technology, refType=null, unstructuredReference=Dong X, Tong F X, Yang G, et al. A detection method of spangle defects on zinc-coated steel surfaces based on improved YOLOv5[J]. 
The International Journal of Advanced Manufacturing Technology, 2023, 128(1/2): 937-951., articleTitle=A detection method of spangle defects on zinc-coated steel surfaces based on improved YOLOv5, refAbstract=null), Reference(id=1179790680512672125, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2024, volume=21, issue=null, pageStart=3111347, pageEnd=null, url=null, language=null, rfNumber=[15], rfOrder=19, authorNames=Yan Y, Liu Z, Xu J, journalName=Mechanical Systems and Signal Processing, refType=null, unstructuredReference=Yan Y, Liu Z, Xu J, et al. A temperature-decoupled impedance-based mass sensing using CBAM-CNN and adaptive weighted average preprocessing with high accuracy[J]. Mechanical Systems and Signal Processing, 2024, 21: 3111347., articleTitle=A temperature-decoupled impedance-based mass sensing using CBAM-CNN and adaptive weighted average preprocessing with high accuracy, refAbstract=null), Reference(id=1179790680579780991, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=20, issue=3, pageStart=101007, pageEnd=null, url=null, language=null, rfNumber=[16], rfOrder=20, authorNames=Jin Y F, Gong D X, Zhao S Y, journalName=Journal of Real-Time Image Processing, refType=null, unstructuredReference=Jin Y F, Gong D X, Zhao S Y, et al. A real-time fire detection method from video for electric vehicle-charging stations based on improved YOLOX-tiny[J]. 
Journal of Real-Time Image Processing, 2023, 20(3): 101007., articleTitle=A real-time fire detection method from video for electric vehicle-charging stations based on improved YOLOX-tiny, refAbstract=null), Reference(id=1179790680667861377, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=99, issue=null, pageStart=373, pageEnd=381, url=null, language=null, rfNumber=[17], rfOrder=21, authorNames=Xue L S, Du S H, Wu H T, journalName=Journal of Manufacturing Processes, refType=null, unstructuredReference=Xue L S, Du S H, Wu H T, et al. Defect signal intelligent recognition of weld radiographs based on YOLOv5-improvement[J]. Journal of Manufacturing Processes, 2023, 99: 373-381., articleTitle=Defect signal intelligent recognition of weld radiographs based on YOLOv5-improvement, refAbstract=null), Reference(id=1179790680743358851, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=28, pageStart=12159, pageEnd=12167, url=null, language=null, rfNumber=[18], rfOrder=22, authorNames=蒲玲玲, 杨柳, journalName=科学技术与工程, refType=null, unstructuredReference=蒲玲玲, 杨柳. 改进YOLOv5的多车辆目标实时检测及跟踪算法[J]. 科学技术与工程, 2023, 23(28): 12159-12167., articleTitle=改进YOLOv5的多车辆目标实时检测及跟踪算法, refAbstract=null), Reference(id=1179790680814662021, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149774725922124401, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=28, pageStart=12159, pageEnd=12167, url=null, language=null, rfNumber=[18], rfOrder=23, authorNames=Pu Lingling, Yang Liu, journalName=Science Technology and Engineering, refType=null, unstructuredReference=Pu Lingling, Yang Liu. Improved YOLOv5 real-time multi-vehicle target detection and tracking algorithm[J]. 
Fig.1 Atiny-YOLO model structure. Conv: convolution; myC3: the improved C3 module proposed in this article; Concat: concatenation module; Upsample: upsampling module; myWinTR: the improved Transformer module proposed in this article; SPPF: spatial pyramid pooling-fast module; DETECT: detection layer.
Fig.2 Small target detection layer. Backbone: backbone network; predict: prediction output.
Fig.3 Structure of C3 module. Conv: convolution; BN: bottleneck layer; Concat: concatenation module.
Fig.4 Structure of myC3 module. Conv: convolution; Split: split operation; BN: bottleneck layer; Concat: concatenation module.
Fig.5 Structure of Swin Transformer module. MLP (multi-layer perceptron): extracts features; LN (layer normalization): normalizes the output; W-MSA (window-based multi-head self-attention) and SW-MSA (shifted-window-based multi-head self-attention): multi-head self-attention modules; $z^l$ and $\hat{z}^l$: the output features of the MLP module and of the W-MSA module in the l-th block, respectively.
Fig.6 Structure of myWinTR module. Conv: convolution; SW: multi-head self-attention module; Concat: concatenation module.
Fig.7 FPN structure. Bottom-up: extracts image information from bottom to top; Top-down: merges feature-map information from top to bottom; Lateral connection: connects feature information laterally; C1, C2, C3: three convolutional layers used for feature extraction; P1, P2, P3: three prediction layers used to generate prediction results.
Fig.8 Experimental data set
Fig.9 Faster R-CNN model detection results
Fig.10 YOLOv5s model detection results
Fig.11 Atiny-YOLO model detection results
Fig.12 Video stream detection results

Table 1 Experimental parameter settings (SGD optimizer)

| Parameter | Value |
| --- | --- |
| Initial learning rate | 0.01 |
| Momentum | 0.937 |
| Weight decay | 0.0005 |
| Number of iterations (epochs) | 50 |
| Batch size | 20 |

Table 2 Comparison of ablation experiment results

| Detection layer | myC3 | myWinTR | Feature fusion | mAP@0.5/% | Params/M | FPS/(frames·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| | | | | 92.6 | 7.0 | 352 |
| | | | | 94.5 (+1.9) | 15.1 | 120 |
| | | | | 94.6 (+0.1) | 16.5 | 134 |
| | | | | 95.1 (+0.5) | 15.9 | 150 |
| | | | | 95.5 (+0.4) | 13.5 | 234 |

Table 3 Comparison of the performance of different models

| Model | P/% | R/% | mAP@0.5/% | mAP@0.5-0.95/% | Params/M | FPS/(frames·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| SSD | 92.3 | 78.2 | 74.1 | 47.8 | 24.1 | 212 |
| Faster R-CNN | 63.5 | 90.8 | 76.5 | 49.5 | 15.7 | 73 |
| EfficientDet-D0 | 75.3 | 85.0 | 71.2 | 46.7 | 3.9 | 29 |
| MobileNet V3L | 88.1 | 84.4 | 80.3 | 48.8 | 4.2 | 89 |
| TPH-YOLOv5 | 91.0 | 83.7 | 89.5 | 32.1 | 45.4 | 208 |
| YOLOv5s | 94.3 | 91.9 | 92.6 | 59.5 | 7.0 | 352 |
| YOLOv5l | 94.6 | 92.6 | 93.5 | 62.8 | 46.2 | 176 |
| YOLOv5n | 94.7 | 90.4 | 93.2 | 55.1 | 1.8 | 256 |
| YOLOv5x | 95.4 | 92.0 | 94.7 | 63.6 | 86.2 | 291 |
| YOLOv6 | 96.5 | 93.0 | 94.4 | 59.8 | 4.1 | 180 |
| Atiny-YOLO | 97.2 | 92.1 | 95.5 | 59.9 | 13.5 | 234 |
Vehicle Small Target Detection Algorithm for UAV Remote Sensing Images Based on YOLOv5
Jun-qing BAI, Meng-ting WANG*, Shou-ting SHEN
Science Technology and Engineering | Papers · Automation and Computational Technology, 2025, 25(12): 5110-5118
Author information
  • School of Computer Science, Xi'an Petroleum University, Xi'an 710065, China
  • Jun-qing BAI (born 1983), female, Han ethnicity, from Shangqiu, Henan; Ph.D., associate professor, master's supervisor. Research interests: machine learning, artificial intelligence. E-mail:

Corresponding author:

* Meng-ting WANG (born 2000), female, Han ethnicity, from Zhumadian, Henan; master's student. Research interest: object detection. E-mail:
Published: 2025-04-28 doi: 10.12404/j.issn.1671-1815.2403707

Remote sensing images are characterized by diverse scales, dense arrangement, and small target sizes. To address the heavy background noise in such images and the difficulty of capturing small vehicle targets, a vehicle detection algorithm based on an improved feature fusion method, Atiny-YOLO, is proposed. First, an additional detection layer for small targets is introduced into the Neck layer of YOLOv5 to generate a larger-scale feature map and capture the detailed features of small objects. Second, a split operation is added to the C3 module to reuse image feature information, and the Swin Transformer module is further optimized to raise the utilization of effective information. Finally, the feature fusion channels are improved, which raises detection accuracy while reducing model parameters. The effectiveness of Atiny-YOLO is verified on the UAV-view dataset AU-AIR (aerial universal autonomous inspection and recognition). The experimental results show that, compared with the baseline algorithm, the mean average precision of Atiny-YOLO improves by about 2.9 percentage points to 95.5%, and the detection speed reaches 234 frames/s. These results indicate that Atiny-YOLO substantially improves detection accuracy while still meeting real-time requirements.

remote sensing images  /  Swin Transformer  /  vehicle detection  /  AU-AIR  /  Atiny-YOLO
Jun-qing BAI, Meng-ting WANG, Shou-ting SHEN. Vehicle Small Target Detection Algorithm for UAV Remote Sensing Images Based on YOLOv5[J]. Science Technology and Engineering, 2025, 25(12): 5110-5118. DOI: 10.12404/j.issn.1671-1815.2403707
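UAV remote sensing technology[1] is simple in system design, low in cost, and fast, macroscopic, and dynamic, which compensates well for the shortcomings of traditional remote sensing in emergency response[2]. Because remote sensing images captured by UAVs carry a large amount of information, contain more complex detail, are highly time-sensitive, and are affected by weather, detecting vehicle targets in them faces the following difficulties. First, as shooting distance and angle change, different types of vehicles appear with markedly different sizes and shapes. Second, the background contains a great deal of information that resembles vehicle targets or otherwise interferes with detection, such as building shadows, road textures, and tree occlusion, which lowers the saliency of vehicle targets. Third, small targets are challenging to detect: vehicle targets in remote sensing images are often only a few pixels in size and have low contrast with the background, so their local details are hard to capture[3].
Neural-network-based vehicle detection methods fall into two groups: two-stage and single-stage detection algorithms. Two-stage algorithms first extract candidate boxes and then recognize the targets inside them. Owing to this structure, they usually need longer detection times and struggle to meet real-time requirements. Representative algorithms are those of the R-CNN (region-based convolutional neural networks) family: R-CNN[4], Fast R-CNN[5], and Faster R-CNN[6]. In contrast, single-stage algorithms omit the candidate-box extraction stage and complete classification and localization in a single stage. Representative single-stage algorithms include the YOLO (you only look once)[7-9] family and the SSD (single shot multibox detector)[10] family.
In recent years, to address the low accuracy of vehicle detection in remote sensing images, researchers in China and abroad have improved object detection methods. Reference [11] replaced the feature extraction network of YOLOv3, optimized the convolutional structure, and introduced dense connections to improve detection accuracy, but could not meet real-time requirements. Reference [12] optimized YOLOv3 by introducing an SPP module and used 4×, 8×, and 16× downsampling to capture more small-target information, improving both accuracy and speed: accuracy rose by 13.29% over YOLOv3 and detection speed reached 39 FPS, but performance on multi-scale small targets remained poor. Reference [13] studied lightweight vehicle and pedestrian detection networks in depth; although the number of parameters dropped sharply, model accuracy still needs improvement. Reference [14] introduced C3Ghost and Ghost modules into the neck of YOLOv5 and the CBAM attention module[15] into the Backbone layer to strengthen the extraction of important information in vehicle detection, and adopted the complete intersection over union loss (CIoU Loss)[16] to improve localization accuracy. Reference [17] integrated dilated convolution into the spatial pyramid pooling module of YOLOv5, which strengthened feature extraction and improved detection accuracy on small targets. Reference [18] added a feature recognition module that raised the detection precision of YOLOv5s, achieved multi-target vehicle tracking, and met the real-time requirement of image detection.
All of these methods improve detection accuracy to some degree. However, when optimizing the Backbone, most of them do not make full use of the multi-scale feature maps it extracts, nor do they consider how much features at different scales contribute to the fused features. As a result, the fused features carry redundant information that weakens the features of small targets. In addition, targets in remote sensing images vary enormously in scale, yet only a few algorithms can detect both small and large targets, so the generalization of networks across target scales still needs improvement.
To address these problems, in particular the low accuracy and frequent missed detections for small vehicle targets, a detection method based on feature fusion, Atiny-YOLO, is proposed. A small-target detection layer is added to the Neck layer, turning three-scale detection into four-scale detection; this strengthens learning on small-scale targets and lowers the miss rate for small vehicles. The C3 modules in the Backbone and Head layers are replaced with a custom myC3 module, and the computation speed of Swin Transformer is further improved to reduce the computational load; the custom myWinTR module is used to cope with adverse factors met in practice, such as dim lighting and distant or small targets. The feature fusion layers are modified so that the outputs of earlier network layers are additionally concatenated and used for recognition, because feature maps that have not passed through many convolution layers are larger in scale and have clearer features, which helps improve recognition accuracy.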
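YOLOv5 is an object detection model that inherits the characteristics of the YOLO family and improves on it in several respects. Thanks to its strong performance, ease of use, and broad applicability, YOLOv5 is widely used in many fields: it can monitor traffic flow in real time and detect traffic violations, and it can serve as a monitoring tool for urban security systems and public facilities, improving urban safety management. YOLOv5 has also been applied to product inspection and defect recognition in agriculture and manufacturing, greatly improving production efficiency.
YOLO follows an end-to-end philosophy: it treats object recognition as a regression problem and obtains the bounding-box coordinates, the objects contained in the image, the confidence, and the class probabilities directly from all pixels of the whole image. Earlier object detection methods relied mainly on a large number of candidate bounding boxes, using them to frame candidate regions and then judging the class probability or confidence of the object in each region. The YOLOv5 model comprises five main processing steps.
(1) Image preprocessing. Every input image is preprocessed (resizing, normalization, augmentation, and so on) to prepare it for the model.
(2) Feature extraction. The preprocessed image is passed through the Backbone layer, which extracts feature information from the image.
(3) Feature-map processing. Building on the Backbone layer, YOLOv5 adds a series of convolution and pooling layers that process the feature maps further to extract higher-level feature information.
(4) Object detection. By running convolutions over the feature maps, YOLOv5 predicts the positions and classes of multiple targets in the image at once. It divides the image into grids and predicts bounding boxes and the corresponding class probabilities for each grid cell.
(5) Post-processing. After detection, YOLOv5 applies post-processing steps such as non-maximum suppression (NMS) to remove heavily overlapping bounding boxes and keep those with higher confidence.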
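The post-processing in step (5) can be illustrated with a minimal greedy NMS sketch. The box format (x1, y1, x2, y2), the IoU threshold of 0.45, and the sample boxes are assumptions made for illustration; the article does not state the settings it used.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels, assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_thres: float = 0.45) -> List[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Retain only candidates whose overlap with the kept box is small enough
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thres]
    return keep


# Two near-duplicate boxes and one separate box (illustrative values)
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.90, 0.80, 0.75]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.82, so it is suppressed and only indices 0 and 2 remain.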
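YOLO-family models are fast, compact, and structurally simple, and they have greatly improved detection accuracy while keeping high detection speed. YOLOv5 combines some features of YOLOv3-SPP and YOLOv4 and performs better still. YOLOv5s is therefore chosen as the base model for small vehicle target detection in the experiments.
In practice, when two or more similar vehicle targets appear in the same image, the YOLOv5 network tends to misjudge them as belonging to the same class. To improve detection accuracy, a vehicle detection algorithm with improved feature fusion, Atiny-YOLO, is proposed. First, the input is downsampled by factors of 4, 8, 16, and 32, yielding four feature maps that contain spatial and semantic information at different scales; these are fused in the Neck layer, which strengthens the detection of multi-scale targets. Second, the proposed myC3 and myWinTR modules are embedded in the Backbone layer: myC3 uses convolution to extract foreground feature information effectively while suppressing background features, and myWinTR uses a window-based technique to further strengthen the representation of foreground features, improving the model's ability to discriminate targets. Third, the network layers are adjusted so that results are concatenated with the initial feature maps before output. Because the initial feature maps contain a large amount of low-level information, this reduces the loss of feature information during propagation, retains more detail in the output, and improves the precision and robustness of the detection results. The structure of the Atiny-YOLO model is shown in Fig.1.
To improve the model's ability to capture the features of small targets, a small-target detection layer is added to the Neck layer of YOLOv5s. The original network has three detection heads, at 20×20 (large targets), 40×40 (medium targets), and 80×80 (small targets); the new small-target detection layer is added above the 80×80 level, at a size of 160×160. Because the original Neck has no 160×160 feature map, the 80×80 feature map is upsampled to 160×160 and then fused with the output of the third layer of the Backbone, with channel count and spatial size matched. The merged feature map is passed, together with the outputs of the other detection layers, to the Head layer for classification and detection. Because the Neck produces feature maps at several scales, each associated with anchors of a different size, the model can detect smaller objects on larger feature maps and larger objects on smaller feature maps, which strengthens its representation of objects of different sizes. The flow of the small-target detection layer is shown in Fig.2.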
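The relation between downsampling factor and feature-map size described above can be checked with a few lines of arithmetic. This is a small sketch assuming a 640×640 input (the image size stated for the experiments); the 10-pixel vehicle width is a hypothetical value chosen only to show why the 160×160 map helps with small targets.

```python
INPUT_SIZE = 640           # side length of the input image in pixels
STRIDES = (4, 8, 16, 32)   # downsampling factors of the four detection scales


def grid_sizes(input_size: int, strides=STRIDES) -> dict:
    """Map each stride to the side length of its feature map."""
    return {s: input_size // s for s in strides}


sizes = grid_sizes(INPUT_SIZE)

# One grid cell covers `stride` input pixels per side, so a hypothetical
# 10-pixel-wide vehicle spans 2.5 cells at stride 4 but about 0.3 cells at stride 32.
VEHICLE_WIDTH_PX = 10
cells_covered = {s: VEHICLE_WIDTH_PX / s for s in STRIDES}
```

The stride-4 branch is the only one in which such a vehicle covers more than one grid cell, which is consistent with assigning the smallest targets to the 160×160 feature map.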
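In the YOLOv5 network, the C3 module used in the Backbone and Head layers replaces the earlier Bottleneck CSP module. A C3 module contains three standard convolution layers and several Bottleneck modules. While learning residual features, it works in two branches: one passes through a standard convolution layer followed by stacked Bottlenecks, and the other passes through a basic convolution module only; the two branches are then concatenated to obtain the target feature map. The structure of the C3 module is shown in Fig.3.
The proposed algorithm adds a split operation to this design: the n BN (bottleneck) layers that follow the convolution layer are split so that each outputs its own feature map, and one additional concatenation is applied. The improved C3 is defined as myC3. The structure of the myC3 module is shown in Fig.4.
The myC3 module concatenates the feature maps output by all the BN layers to fuse details, which enriches the feature information the model receives and significantly improves vehicle recognition accuracy.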
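The difference between concatenating only the final Bottleneck output and concatenating every Bottleneck output can be shown with a toy sketch. This is not the authors' myC3 implementation: feature maps are reduced to short lists with one number per channel, the 1×1 convolutions of the real module are omitted, and the residual function `0.5 * v` is a hypothetical stand-in.

```python
from typing import Callable, List

Feature = List[float]  # toy "feature map": one number per channel


def bottleneck(x: Feature, f: Callable[[float], float]) -> Feature:
    """Residual unit: output = input + f(input), applied per channel."""
    return [v + f(v) for v in x]


def c3_like(x: Feature, n: int) -> Feature:
    """C3-style: only the last Bottleneck output joins the bypass branch."""
    main, bypass = list(x), list(x)
    for _ in range(n):
        main = bottleneck(main, lambda v: 0.5 * v)
    return main + bypass  # list concatenation stands in for channel Concat


def split_concat_like(x: Feature, n: int) -> Feature:
    """Split-style: every intermediate Bottleneck output is kept and concatenated."""
    main, bypass = list(x), list(x)
    kept: List[float] = []
    for _ in range(n):
        main = bottleneck(main, lambda v: 0.5 * v)
        kept += main  # reuse each stage's features instead of discarding them
    return kept + bypass


x = [1.0, 2.0]
plain = c3_like(x, n=3)             # 2 + 2 = 4 channels
reused = split_concat_like(x, n=3)  # 3 * 2 + 2 = 8 channels
```

With three Bottlenecks the split variant passes four times as many main-branch channels into the concatenation, which is one way to read the claim that the module reuses feature information; a real module would normally follow the concatenation with a convolution that restores the channel count.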
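The conventional Transformer framework has two main problems in object detection. First, its performance degrades when visual objects vary markedly across scenes. Second, when image resolution is high and the number of pixels is large, the global self-attention used by the Transformer makes the computation grow sharply. Reference [19] presents Swin Transformer, which builds on the Transformer with a sliding (shifted) window operation and a hierarchical design; the Swin Transformer module is shown in Fig.5.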
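The cost argument above can be made concrete with the complexity estimates commonly quoted for Swin Transformer: global multi-head self-attention costs about 4hwC² + 2(hw)²C, while window attention with M×M windows costs about 4hwC² + 2M²hwC. The sketch below evaluates both; the channel width of 96 and window size of 7 are assumed values, not settings reported in this article.

```python
def msa_flops(h: int, w: int, c: int) -> int:
    """Global multi-head self-attention: 4*h*w*C^2 + 2*(h*w)^2*C."""
    return 4 * h * w * c * c + 2 * (h * w) ** 2 * c


def wmsa_flops(h: int, w: int, c: int, m: int) -> int:
    """Window attention with M x M windows: 4*h*w*C^2 + 2*M^2*h*w*C."""
    return 4 * h * w * c * c + 2 * m * m * h * w * c


h = w = 160  # the largest feature map discussed in this article
c = 96       # assumed channel width
m = 7        # assumed window size
ratio = msa_flops(h, w, c) / wmsa_flops(h, w, c, m)
```

The global term grows with the square of the number of tokens, whereas the windowed term grows linearly, so the ratio increases quickly with feature-map size.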
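After each convolution step, Swin Transformer selects groups of 2×2 feature vectors and merges and compresses them. The proposed algorithm embeds a three-layer convolutional network in the Swin Transformer block and defines the result as myWinTR. The aim is to improve the stability and generalization of the network, avoid overfitting, and reduce the share of weight parameters in the computation, which lowers the computational load and increases speed. The structure of the myWinTR module is shown in Fig.6.
When targets in the image are distant or small, myWinTR identifies and captures them more accurately, which greatly helps the extraction of vehicle features.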
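The 2×2 merging step of Swin Transformer mentioned above can be sketched on nested lists. This shows only the concatenation part, which turns an (H, W, C) grid into (H/2, W/2, 4C); in the usual formulation a normalization and a linear layer reducing 4C to 2C follow, and these are left out here. It does not represent the convolution layers that myWinTR adds.

```python
from typing import List

Grid = List[List[List[float]]]  # grid[row][col] is one token's channel vector


def patch_merge(grid: Grid) -> Grid:
    """Concatenate each 2x2 neighbourhood of tokens into a single token."""
    h, w = len(grid), len(grid[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    merged: Grid = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            # Order: top-left, bottom-left, top-right, bottom-right
            row.append(grid[i][j] + grid[i + 1][j] + grid[i][j + 1] + grid[i + 1][j + 1])
        merged.append(row)
    return merged


# 4x4 grid with one channel per token; token (r, c) holds the value 4*r + c
tokens = [[[float(4 * r + c)] for c in range(4)] for r in range(4)]
merged = patch_merge(tokens)
```

The spatial resolution halves in each direction while the channel count quadruples, which is what gives the hierarchy of feature maps its pyramid shape.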
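YOLOv5 typically uses an FPN structure to enhance and fuse the feature maps extracted by the Backbone so that later target extraction is more accurate. The FPN structure is shown in Fig.7. FPN uses top-down feature fusion to handle multi-scale variation in object recognition, but it often loses much of the feature information of small targets during fusion.
This work extends the original FPN structure: the C3 modules in the FPN framework are replaced with myC3 modules, and myWinTR is added to the subsequent network, which effectively improves the recognition of distant and small objects that are hard to detect.
Treat the image input layer as layer 0, the adjacent convolution as layer 1, and so on. After the output of the myC3 at layer 4 is concatenated with the upsampled feature map at layer 15, the result is fed into a newly added myC3 layer, followed by a further convolution and upsampling. At this point the network has grown to 19 layers. The result is concatenated with the output of the myC3 at layer 2, and the output again passes through a myC3 layer and a convolution, after which a myWinTR layer is added. Finally, the results obtained from the convolutions and concatenations across layers are collected.
In the added network, the first concatenation and the output of layer 2 are carried out at the same time. This is because the feature map output by the myC3 at layer 2 is 160×160 and stays closer to the feature information of the input image: a feature map that has not passed through many convolution layers is larger and has clearer features, which helps identify targets more efficiently. The Detect layer added at the end amounts to attaching, to each myC3 layer whose output is used, a convolution layer that changes the number of channels so that the output has 3×(n+5) channels. Even though the myC3 layers sit at different depths, their outputs then have the same number of channels, which simplifies subsequent processing.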
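The 3×(n+5) channel count can be unpacked as three anchors per grid cell, each predicting n class scores, four box coordinates, and one objectness score. The sketch below assumes that n is the number of classes (the article does not define it explicitly) and that the input is 640×640; it also counts the candidate boxes produced before NMS with three and with four detection scales.

```python
def detect_out_channels(num_classes: int, anchors_per_scale: int = 3) -> int:
    """Channels per detection scale: anchors x (classes + 4 box coords + 1 objectness)."""
    return anchors_per_scale * (num_classes + 5)


def total_predictions(input_size: int = 640,
                      strides=(4, 8, 16, 32),
                      anchors_per_scale: int = 3) -> int:
    """Number of candidate boxes over all scales, before NMS."""
    return sum(anchors_per_scale * (input_size // s) ** 2 for s in strides)


three_scale = total_predictions(strides=(8, 16, 32))    # original three heads
four_scale = total_predictions(strides=(4, 8, 16, 32))  # with the 160x160 head
```

Adding the 160×160 head roughly quadruples the number of candidate boxes, which fits the drop in frame rate that the ablation study reports after the extra detection layer is added.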
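The AU-AIR dataset is chosen as the experimental dataset. AU-AIR (aerial universal autonomous inspection and recognition) is the first multi-modal UAV dataset for object detection, and it meets the requirements of UAV vision and robotics. The images used in the experiments are screenshots taken from the original AU-AIR videos. Images of 640×640 were selected and annotated manually and then rotated in batches to enlarge the training set and increase the diversity and number of samples; rotation simulates image changes under different angles and viewpoints, so that the model adapts better to rotational variation. The final dataset contains 4 980 images, and in each experiment 360 randomly selected images are used for testing. The dataset is shown in Fig.8.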
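When images are rotated for augmentation, the bounding-box labels have to be rotated with them. The article does not state which angles were used, so the sketch below assumes rotations by multiples of 90° and labels in the normalized YOLO format (x_center, y_center, width, height); both are assumptions made for illustration.

```python
from typing import Tuple

YoloBox = Tuple[float, float, float, float]  # (x_center, y_center, width, height) in [0, 1]


def rotate_box_90_cw(box: YoloBox) -> YoloBox:
    """Label of the same object after rotating the image 90 degrees clockwise."""
    xc, yc, w, h = box
    # The top edge becomes the right edge: new x = 1 - old y, new y = old x
    return (1.0 - yc, xc, h, w)


def rotate_box(box: YoloBox, quarter_turns: int) -> YoloBox:
    """Apply the clockwise quarter turn repeatedly (0 to 3 effective turns)."""
    for _ in range(quarter_turns % 4):
        box = rotate_box_90_cw(box)
    return box


original = (0.25, 0.5, 0.1, 0.2)
turned_once = rotate_box(original, 1)
turned_twice = rotate_box(original, 2)
```

For square 640×640 images the rotated image keeps its size, so the normalized labels stay valid without rescaling.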
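The experiments run on Windows 11 with PyTorch and TorchVision as the deep learning framework, an RTX 4090 GPU (24 GB), and an AMD EPYC 7T83 CPU. Weights are updated with stochastic gradient descent (SGD), the learning rate follows a cosine decay schedule, and the solver parameters are listed in Table 1.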
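A cosine decay schedule of the kind mentioned above can be written in a few lines. The initial learning rate and the number of epochs are the values listed in Table 1; the final learning rate, given here as a fraction of the initial one, is not stated in the article and is an assumption.

```python
import math

LR0 = 0.01                 # initial learning rate (Table 1)
EPOCHS = 50                # number of iterations/epochs (Table 1)
LR_FINAL_FRACTION = 0.01   # assumed final-to-initial ratio, not given in the article


def cosine_lr(epoch: int,
              lr0: float = LR0,
              epochs: int = EPOCHS,
              final_fraction: float = LR_FINAL_FRACTION) -> float:
    """Learning rate following half a cosine period from lr0 down to lr0 * final_fraction."""
    lr_min = lr0 * final_fraction
    return lr_min + 0.5 * (lr0 - lr_min) * (1.0 + math.cos(math.pi * epoch / epochs))


schedule = [cosine_lr(e) for e in range(EPOCHS + 1)]
```

The rate falls slowly at first, fastest around the midpoint, and slowly again near the end, which lets the optimizer take large steps early and settle gently at the end of training.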
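For vehicle detection in remote sensing images, the following six performance metrics are used to evaluate the model comprehensively.
(1) Precision P. Measures how many of the vehicle targets detected by the model are true vehicle targets; the higher the precision, the fewer the false detections. It is the ratio of true positives (TP) to all predicted positives (TP+FP).
$$P = \frac{TP}{TP + FP}$$
(2) Recall R. Measures the proportion of all true vehicle targets that are correctly detected; the higher the recall, the fewer positives are missed. It is the ratio of true positives (TP) to all actual positives (TP+FN).
$$R = \frac{TP}{TP + FN}$$
(3) Mean average precision mAP. Evaluates the performance of the model across multiple classes. The AP of each class is computed and the values are averaged over all classes, so mAP reflects the detection results for the different classes.
$$AP = \int_{0}^{1} P(R)\,\mathrm{d}R$$
$$mAP = \frac{\sum_{i=1}^{k} AP_i}{k(classes)}$$
where $AP_i$ is the average precision of the i-th class and $k(classes)$ is the total number of classes.
(4) mAP at IoU thresholds (mAP@0.5 and mAP@0.5-0.95). Because object detection uses the intersection over union (IoU) to judge whether a prediction is correct, mAP can be computed at different IoU thresholds. mAP@0.5 is the mean average precision at an IoU threshold of 0.5, and mAP@0.5-0.95 is the mean of the mAP values over IoU thresholds from 0.5 to 0.95.
(5) Number of parameters (Params). Params measures model size and complexity; a lower parameter count favors a lightweight model.
(6) Frames per second (FPS). FPS is a key measure of processing speed: it gives the number of image frames the model can process per second. The higher the FPS, the better the model can meet real-time detection requirements.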
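The accuracy metrics above can be sketched in plain Python. The counts and the precision-recall points below are hypothetical, and the trapezoidal integration is a simplification: common evaluation tools first smooth the precision-recall curve before integrating, so their AP values can differ slightly.

```python
from typing import List, Sequence, Tuple


def precision(tp: int, fp: int) -> float:
    """P = TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    """R = TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0


def average_precision(pr_points: Sequence[Tuple[float, float]]) -> float:
    """Area under P(R) by the trapezoidal rule.

    pr_points holds (recall, precision) pairs sorted by increasing recall.
    """
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pr_points, pr_points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_ap(ap_per_class: Sequence[float]) -> float:
    """mAP = mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# IoU thresholds 0.50, 0.55, ..., 0.95 over which mAP@0.5-0.95 is averaged
IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Hypothetical counts and curve, for illustration only
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
ap = average_precision([(0.0, 1.0), (0.5, 1.0), (1.0, 0.5)])
```

With these illustrative inputs the precision is 0.9, the recall is 0.75, and the area under the three-point curve is 0.875.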
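To demonstrate the merits of the improved algorithm and to examine how each improvement in Atiny-YOLO contributes to performance, the following ablation experiments were designed; the results are shown in Table 2.
Table 2 shows that adding an extra small-target detection layer in the Head layer preserves more semantic information from low-level features, which helps locate small vehicles; mAP rises by 1.9 percentage points over the original algorithm, while the increase in parameters and computation costs some detection speed, which falls to 120 frames/s. Replacing the C3 module with myC3 gives the model richer gradient information and faster inference: mAP rises by a further 0.1 percentage points and detection speed also increases. Adding the myWinTR module raises the detection rate for targets in complex conditions and improves accuracy again. Improving the feature fusion channels greatly increases detection speed, and the mAP of the final model is 2.9 percentage points higher than that of the baseline model.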
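The step-by-step gains quoted above can be recomputed from the mAP, parameter, and FPS columns of Table 2. The row labels below follow the order in which the text introduces the changes; treating the rows as cumulative in that order is an assumption, since the configuration marks of the table are not available here.

```python
# (label, mAP@0.5 in %, parameters in M, FPS), in table order
ABLATION = [
    ("baseline YOLOv5s", 92.6, 7.0, 352),
    ("+ small-target detection layer", 94.5, 15.1, 120),
    ("+ myC3", 94.6, 16.5, 134),
    ("+ myWinTR", 95.1, 15.9, 150),
    ("+ improved feature fusion", 95.5, 13.5, 234),
]

# mAP gain of each row over the previous one, in percentage points
step_gains = [
    round(curr[1] - prev[1], 1)
    for prev, curr in zip(ABLATION, ABLATION[1:])
]

# Overall change from the baseline to the final model
total_gain = round(ABLATION[-1][1] - ABLATION[0][1], 1)
fps_retained = ABLATION[-1][3] / ABLATION[0][3]
```

The individual gains of 1.9, 0.1, 0.5, and 0.4 points add up to the overall 2.9 points, and the final model keeps about two thirds of the baseline frame rate.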
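To verify the detection performance of Atiny-YOLO, it is compared with the YOLOv5 baseline models YOLOv5s, YOLOv5l, YOLOv5n, and YOLOv5x. Because other mainstream deep learning methods have also achieved notable results in object detection, the single-stage model SSD, the two-stage model Faster R-CNN, the lightweight models EfficientDet-D0 and MobileNet V3L, and the TPH-YOLOv5 model are included in the comparison as well. The algorithms are compared on six metrics: precision, recall, mAP@0.5, mAP@0.5-0.95, number of parameters (Params), and frames per second (FPS). The results are shown in Table 3.
Table 3 shows that Atiny-YOLO reaches a precision of 97.2%, higher than that of every other model compared, which means it identifies vehicle targets more precisely; this is attributed to the extra small-target detection layer, which reduces the chance of misjudgment. The extra small-target layer increases the number of parameters (Params) of Atiny-YOLO, but because YOLOv5s is a lightweight model, the detection speed (FPS) still meets detection requirements despite the added complexity. A marked performance gain is obtained for a modest increase in parameters, which indicates that Atiny-YOLO balances parameter count against detection performance and that the modifications are effective. The mAP of Atiny-YOLO reaches 95.5%, which is 2.9, 1.1, 19.0, and 15.2 percentage points higher than that of YOLOv5s, YOLOv6, Faster R-CNN, and MobileNet V3L, respectively; this clear gain indirectly confirms that the algorithm reduces false detections and demonstrates its effectiveness. TPH-YOLOv5 is designed specifically for capturing target information in UAV scenes, and the proposed algorithm improves on it in both detection accuracy and detection speed. The experiments also showed that the two-stage model Faster R-CNN performs poorly on the validation set, because it captures the features of small targets poorly, which leads to underfitting during training.
To show the performance of the improved algorithm more intuitively, three representative images are selected from the dataset. Image (a) contains clear vehicle targets, with a few vehicle targets at the image edge. Image (b) contains distant targets, and the vehicles are unclear because of the weather. Image (c) was taken in good weather; its upper-right corner contains many densely packed, small targets, which places the highest demands on the detection algorithm. Fig.9 to Fig.11 show how the different algorithms perform.
The Faster R-CNN model can detect and recognize vehicle targets in UAV remote sensing images, but when target information is missing or targets are only partly visible, it detects only some of the targets and still misses many; the accuracy values reported by the model also decrease.
Compared with Faster R-CNN, YOLOv5s shows markedly better detection accuracy and detection results under adverse factors such as small targets, long distances, dim lighting, and partial occlusion. The missed detections seen with Faster R-CNN are clearly reduced, and the precision scores of hard-to-detect targets improve, but YOLOv5s still misses targets when the image contains dense distant targets.
Atiny-YOLO further improves the precision of vehicle recognition and detection. It captures vehicle information at the image edges well and performs especially well in dense-target scenes. Its detection accuracy is better than that of both previous models, which indicates that the improved algorithm has better detection capability than the original algorithm.
For the different types of vehicle in Fig.11(c), including box trucks, cars, and electric mobility scooters, Atiny-YOLO detects all of them; the detection confidences of the two box trucks are 0.87 and 0.84. This shows that Atiny-YOLO captures feature information well and learns effective vehicle features during training, so that it also performs well on unseen data outside the training set; that is, the model generalizes well.
To test how well Atiny-YOLO handles dynamic scenes, an 18 s video shot by a UAV is fed to the model for detection; the results are shown in the video screenshots in Fig.12.
Fig.12 shows that the model produces no false or missed detections on the video: vehicles that are unoccluded and fully visible are all detected. Because vehicle targets change quickly in the video, the detection confidence is lower than for still images, but it is generally sufficient for continuously tracking vehicles in video. This indicates that Atiny-YOLO detects vehicle targets well in both images and video streams.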
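Throughput on a video stream is usually measured by timing a frame loop. The sketch below is a generic illustration, not the measurement procedure of the article: `dummy_detector` is a hypothetical stand-in for model inference, and the frames are placeholder integers rather than decoded images.

```python
import time
from typing import Callable, Iterable


def measure_fps(process_frame: Callable[[object], object],
                frames: Iterable[object]) -> float:
    """Frames processed per second of wall-clock time."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def dummy_detector(frame: object) -> list:
    """Hypothetical stand-in for inference: sleeps about 1 ms and returns no detections."""
    time.sleep(0.001)
    return []


fps = measure_fps(dummy_detector, range(20))
```

In a real measurement the loop would also include preprocessing and NMS if those are meant to count toward the reported frame rate, and a few warm-up frames are normally discarded first.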
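A vehicle detection algorithm that improves on YOLOv5s is proposed. It effectively raises the detection accuracy for small vehicle targets in UAV remote sensing images while lowering the probability of false and missed detections. Comparative experiments show that the improved Atiny-YOLO model detects better in real scenes, with clearly higher accuracy than the original algorithm and no significant impact on detection speed, which largely meets practical requirements. Atiny-YOLO generalizes well and is robust, giving it practical and engineering value. In future work, the existing deep neural network needs further training to improve the granularity of vehicle detection and thereby support better analysis of traffic network conditions.
  • Basic Research Program of the Natural Science Foundation of Shaanxi Province (2023-JC-YB-601)
  • Xi'an Science and Technology Plan, University and Institute Talent Service for Enterprises Project (23GXFW0077)
  • Xi'an Petroleum University Graduate Quality Course Construction Project (2023-X-YKC-003)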
References
[1] Ouyang Kai. Application of UAV remote sensing technology in surveying and mapping based engineering measurement[J]. Engineering Construction and Design, 2022(22): 96-98. (in Chinese)
[2] Hu Yiqiang, Yang Ji, Jing Wenlong. A review of coastal zone ecological environment monitoring research based on UAV remote sensing[J]. Surveying and Mapping Journal, 2022(6): 18-24. (in Chinese)
[3] Qin J X, Yang W J, Wu T, et al. Incremental road network update method with trajectory data and UAV remote sensing imagery[J]. ISPRS International Journal of Geo-Information, 2022, 11(10): 502.
[4] Li J, Wei X. Research on efficient detection network method for remote sensing images based on self attention mechanism[J]. Image and Vision Computing, 2024, 142: 104884.
[5] Nan H H, Tianyi Z, Tung C Y, et al. Image segmentation using transfer learning and Fast R-CNN for diabetic foot wound treatments[J]. Frontiers in Public Health, 2022, 10: 969846.
[6] Yang Y. Vehicle target detection algorithm based on improved faster R-CNN for remote sensing images[J]. Journal of Artificial Intelligence Practice, 2024, 7(1): 7958815.
[7] Yang X, Xiu J, Liu X. Research on improved YOLOv5 vehicle target detection algorithm in aerial images[J]. Drones, 2024, 8(5): 103390.
[8] Zhao X, Wang Q, Zhang M, et al. CSFF-YOLOv5: improved YOLOv5 based on channel split and feature fusion in femoral neck fracture detection[J]. Internet of Things, 2024, 26: 101190.
[9] Yuan H, Lu Z, Zhang R, et al. An effective graph embedded YOLOv5 model for forest fire detection[J]. Computational Intelligence, 2024, 40(2): DOI: 10.1111/coin.12640.
[10] Wang H L, Qi H M, Feng S, et al. L-SSD: lightweight SSD target detection based on depth-separable convolution[J]. Journal of Real-Time Image Processing, 2024, 21(2): 2547-2561.
[11] Yuan Xiaoping, Ma Xuqi, Liu Sai. Improved pedestrian-vehicle target detection algorithm for YOLOv3[J]. Science Technology and Engineering, 2021, 21(8): 3192-3198. (in Chinese)
[12] Jiang Congwen, Jin Lizuo. Vehicle-YOLO: a vehicle detection model based on aerial images[J]. Microcomputer Applications, 2023, 39(9): 134-137. (in Chinese)
[13] Tu Yuanya, Tang Guofang, Zhang Jianxun. Lite-YOLOv3 lightweight pedestrian and vehicle detection network[J]. Small Microcomputer Systems, 2023, 44(1): 211-217. (in Chinese)
[14] Dong X, Tong F X, Yang G, et al. A detection method of spangle defects on zinc-coated steel surfaces based on improved YOLOv5[J]. The International Journal of Advanced Manufacturing Technology, 2023, 128(1/2): 937-951.
[15] Yan Y, Liu Z, Xu J, et al. A temperature-decoupled impedance-based mass sensing using CBAM-CNN and adaptive weighted average preprocessing with high accuracy[J]. Mechanical Systems and Signal Processing, 2024, 21: 3111347.
[16] Jin Y F, Gong D X, Zhao S Y, et al. A real-time fire detection method from video for electric vehicle-charging stations based on improved YOLOX-tiny[J]. Journal of Real-Time Image Processing, 2023, 20(3): 101007.
[17] Xue L S, Du S H, Wu H T, et al. Defect signal intelligent recognition of weld radiographs based on YOLOv5-improvement[J]. Journal of Manufacturing Processes, 2023, 99: 373-381.
[18] Pu Lingling, Yang Liu. Improved YOLOv5 real-time multi-vehicle target detection and tracking algorithm[J]. Science Technology and Engineering, 2023, 23(28): 12159-12167. (in Chinese)
[19] Wang Can, Wu Xinhui, Zhang Yanqing, et al. Weed recognition in corn field scenarios based on shift-window transformer network[J]. Journal of Agricultural Engineering, 2022, 38(15): 133-142. (in Chinese)
2025, Vol. 25, No. 12
Article information
doi: 10.12404/j.issn.1671-1815.2403707
  • Received: 2024-05-19
  • First published online: 2025-07-09
  • Published: 2025-04-28
Publication history
  • Received: 2024-05-19
  • Revised: 2025-01-28
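Funding
Natural Science Basic Research Program of Shaanxi Province (2023-JC-YB-601)
Xi'an Science and Technology Plan, University and Institute Talent Service for Enterprises Project (23GXFW0077)
Xi'an Shiyou University Graduate Excellent Course Construction Project (2023-X-YKC-003)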
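Author information
    School of Computer Science, Xi'an Shiyou University, Xi'an 710065

Corresponding author:

* Wang Mengting (born 2000), female, Han ethnicity, from Zhumadian, Henan; master's student. Research interest: object detection. E-mail: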
Article link: https://castjournals.cast.org.cn/joweb/kxjsygc/CN/10.12404/j.issn.1671-1815.2403707