Article(id=1251505541227430493, tenantId=1146029695717560320, journalId=1251233954884272221, issueId=1251505536634667461, articleNumber=null, orderNo=null, doi=10.13682/j.issn.2095-6533.2025.06.011, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=1728576000000, receivedDateStr=2024-10-11, revisedDate=null, revisedDateStr=null, acceptedDate=null, acceptedDateStr=null, onlineDate=1776311772876, onlineDateStr=2026-04-16, pubDate=1762704000000, pubDateStr=2025-11-10, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1776311772876, onlineIssueDateStr=2026-04-16, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1776311772876, creator=13701087609, updateTime=1776311772876, updator=13701087609, issue=Issue{id=1251505536634667461, tenantId=1146029695717560320, journalId=1251233954884272221, year='2025', volume='30', issue='6', pageStart='1', pageEnd='130', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=1, specialIssue=null, createTime=1776311771782, creator=13701087609, updateTime=1776311824541, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1251505758014226723, tenantId=1146029695717560320, journalId=1251233954884272221, issueId=1251505536634667461, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1251505758014226724, tenantId=1146029695717560320, journalId=1251233954884272221, issueId=1251505536634667461, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=94, endPage=103, ext={EN=ArticleExt(id=1251505541416174182, articleId=1251505541227430493, tenantId=1146029695717560320, journalId=1251233954884272221, 
language=EN, title=Multi-branch fusion self-attention object detection algorithm for remote sensing images, columnId=null, journalTitle=Journal of Xi'an University of Posts and Telecommunications, columnName=null, runingTitle=null, highlight=null, articleAbstract=

To address the challenges of scale variation and dense object distribution in remote sensing imagery caused by varying imaging angles, a novel object detection algorithm based on multi-branch fusion self-attention (MFS) is proposed. A multi-branch module that integrates convolution and self-attention mechanisms is designed to build the feature extraction network, and a fourth detection head is added for small objects to facilitate multi-scale feature fusion. The resulting model is then pruned with the DepGraph method to obtain a lightweight architecture. Experiments on the DOTA and NWPU VHR-10 datasets show that the proposed algorithm achieves mean average precision (mAP) scores of 77.7% and 96.5%, respectively, outperforming detectors of comparable parameter scale. Notably, the pruned version maintains an mAP of 72.9% on DOTA with only 6.64 million parameters.

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Hongbo KANG, Jiazheng WEN, Chunjie YANG, Wenqing WANG), CN=ArticleExt(id=1251505545996354391, articleId=1251505541227430493, tenantId=1146029695717560320, journalId=1251233954884272221, language=CN, title=多元分支融合自注意力的遥感图像目标检测算法, columnId=1251505537641300427, journalTitle=西安邮电大学学报, columnName=人工智能目标检测, runingTitle=null, highlight=null, articleAbstract=

针对遥感拍摄目标角度变化导致图像中的检测目标尺度多样且密集分布难以准确检测的问题,提出一种多元分支融合自注意力(Multi Branch Fusion Self-Attention,MFS)的遥感图像目标检测算法。该算法先设计由卷积和自注意力机制组成的多分支模块,形成特征提取网络,再建立针对小物体的第4个检测头,旨在融合不同尺度的特征。同时,利用DepGraph剪枝方法进行剪枝,降低参数规模使其轻量化。实验结果表明,所提算法在航拍图像(Dataset for Object deTection in Aerial Image,DOTA)数据集和NWPU VHR-10(Northwestern Polytechnical University Very High Resolution-10)数据集的平均准确率分别为77.7%和96.5%,优于同等参数规模的检测算法。特别是在剪枝后,参数规模仅有6.64M的情况下,所提算法对DOTA数据集检测精度可以保持在72.9%。

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=jZrfr0OfFhgVXqgJYFftlQ==, magXml=AG33xe0rsP9i7do41euFwQ==, pdfUrl=null, pdf=In/gbzHqsyjUUJfEYnjvUQ==, pdfFileSize=3964315, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=nBHvHbf0H4vv+56Gzg/qPQ==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=PfZLOm6RRcVMHgxmPIrW1A==, mapNumber=null, authorCompany=null, fund=null, authors=

亢红波(1974-),男,陕西凤翔人,硕士,西安邮电大学副教授,主要研究方向为智能控制与信息集成。E-mail:

温家正(1998-),男,陕西榆林人,西安邮电大学硕士研究生,主要研究方向为遥感图像智能处理。E-mail:

杨春杰(1976-),女,甘肃白银人,硕士,西安邮电大学副教授,主要研究方向为智能控制、物联网技术、嵌入式系统开发等。E-mail:

王文庆(1964-),男,北京房山人,博士后,西安邮电大学教授,主要研究方向为复杂系统结构分析与鲁棒控制、智能信息处理、信息系统分析等。E-mail:

, authorsList=亢红波, 温家正, 杨春杰, 王文庆)}, authors=[Author(id=1251505546357064552, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=khb2000@xupt.edc.cn, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251505546424173420, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546357064552, language=EN, stringName=Hongbo KANG, firstName=Hongbo, middleName=null, lastName=KANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251505546512253811, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546357064552, language=CN, stringName=亢红波, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安邮电大学人工智能学院、自动化学院,陕西西安 710121, bio={"img":"ziQ4n/gnZHOWn9akWWm4Jw==","content":"

亢红波(1974-),男,陕西凤翔人,硕士,西安邮电大学副教授,主要研究方向为智能控制与信息集成。E-mail:

"}, bioImg=ziQ4n/gnZHOWn9akWWm4Jw==, bioContent=

亢红波(1974-),男,陕西凤翔人,硕士,西安邮电大学副教授,主要研究方向为智能控制与信息集成。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251505546227041121, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, xref=null, ext=[AuthorCompanyExt(id=1251505546235429730, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China), AuthorCompanyExt(id=1251505546243818339, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安邮电大学人工智能学院、自动化学院,陕西西安 710121)])]), Author(id=1251505546583556984, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=wjz19980220@163.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251505546667443070, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546583556984, language=EN, stringName=Jiazheng WEN, firstName=Jiazheng, middleName=null, lastName=WEN, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251505546755523456, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546583556984, language=CN, 
stringName=温家正, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安邮电大学人工智能学院、自动化学院,陕西西安 710121, bio={"img":"NuuMuIGV0eyfzl+flRaTOA==","content":"

温家正(1998-),男,陕西榆林人,西安邮电大学硕士研究生,主要研究方向为遥感图像智能处理。E-mail:

"}, bioImg=NuuMuIGV0eyfzl+flRaTOA==, bioContent=

温家正(1998-),男,陕西榆林人,西安邮电大学硕士研究生,主要研究方向为遥感图像智能处理。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251505546227041121, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, xref=null, ext=[AuthorCompanyExt(id=1251505546235429730, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China), AuthorCompanyExt(id=1251505546243818339, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安邮电大学人工智能学院、自动化学院,陕西西安 710121)])]), Author(id=1251505546872963975, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=ycj@xupt.edu.cn, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251505546961044365, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546872963975, language=EN, stringName=Chunjie YANG, firstName=Chunjie, middleName=null, lastName=YANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251505547053319054, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505546872963975, language=CN, 
stringName=杨春杰, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安邮电大学人工智能学院、自动化学院,陕西西安 710121, bio={"img":"iZDJKcbrcrADgXZ8ICWNrw==","content":"

杨春杰(1976-),女,甘肃白银人,硕士,西安邮电大学副教授,主要研究方向为智能控制、物联网技术、嵌入式系统开发等。E-mail:

"}, bioImg=iZDJKcbrcrADgXZ8ICWNrw==, bioContent=

杨春杰(1976-),女,甘肃白银人,硕士,西安邮电大学副教授,主要研究方向为智能控制、物联网技术、嵌入式系统开发等。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251505546227041121, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, xref=null, ext=[AuthorCompanyExt(id=1251505546235429730, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China), AuthorCompanyExt(id=1251505546243818339, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安邮电大学人工智能学院、自动化学院,陕西西安 710121)])]), Author(id=1251505547141399444, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, orderNo=3, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=wwq@xupt.edu.cn, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251505547225285532, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505547141399444, language=EN, stringName=Wenqing WANG, firstName=Wenqing, middleName=null, lastName=WANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251505547304977313, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, authorId=1251505547141399444, language=CN, 
stringName=王文庆, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=西安邮电大学人工智能学院、自动化学院,陕西西安 710121, bio={"img":"wfcIMdDHWHwT5/Wt3e/0aw==","content":"

王文庆(1964-),男,北京房山人,博士后,西安邮电大学教授,主要研究方向为复杂系统结构分析与鲁棒控制、智能信息处理、信息系统分析等。E-mail:

"}, bioImg=wfcIMdDHWHwT5/Wt3e/0aw==, bioContent=

王文庆(1964-),男,北京房山人,博士后,西安邮电大学教授,主要研究方向为复杂系统结构分析与鲁棒控制、智能信息处理、信息系统分析等。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251505546227041121, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, xref=null, ext=[AuthorCompanyExt(id=1251505546235429730, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Artificial Intelligence,School of Automation,Xi'an University of Posts and Telecommunications,Xi'an 710121,China), AuthorCompanyExt(id=1251505546243818339, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, companyId=1251505546227041121, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=西安邮电大学人工智能学院、自动化学院,陕西西安 710121)])])], keywords=[Keyword(id=1251505547430806439, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=EN, orderNo=1, keyword=target detection), Keyword(id=1251505547535664044, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=EN, orderNo=2, keyword=remote sensing images), Keyword(id=1251505547623744433, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=EN, orderNo=3, keyword=multivariate branching), Keyword(id=1251505547707630519, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=EN, orderNo=4, keyword=self-attention), Keyword(id=1251505547787322300, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=EN, orderNo=5, keyword=lightweight), Keyword(id=1251505547862819775, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=CN, orderNo=1, keyword=目标检测), Keyword(id=1251505547946705859, 
tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=CN, orderNo=2, keyword=遥感图像), Keyword(id=1251505548051563462, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=CN, orderNo=3, keyword=多元分支), Keyword(id=1251505548135449545, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=CN, orderNo=4, keyword=自注意力), Keyword(id=1251505548231918539, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, language=CN, orderNo=5, keyword=轻量化)], refs=[Reference(id=1251505550329069625, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2016, volume=117, issue=null, pageStart=11, pageEnd=28, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=CHENG G, HAN J W, journalName=ISPRS Journal of Photogrammetry and Remote Sensing, refType=null, unstructuredReference=CHENG G,HAN J W.A survey on object detection in optical remote sensing images[J].ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 117:11-28., articleTitle=A survey on object detection in optical remote sensing images, refAbstract=null), Reference(id=1251505550425538619, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=36, issue=8, pageStart=269, pageEnd=283, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=WANG H N, LI Y, FANG Y Q, journalName=Chinese Journal of Aeronautics, refType=null, unstructuredReference=WANG H N,LI Y,FANG Y Q,et al.SRS-Net:Training object detectors from scratch for remote sensing images without pretraining[J].Chinese Journal of Aeronautics,2023,36(8):269-283., articleTitle=SRS-Net:Training object detectors from scratch for remote sensing images without pretraining, refAbstract=null), Reference(id=1251505550635253823, 
tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=31, issue=15, pageStart=2295, pageEnd=2318, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=黄泽贤, 吴凡路, 傅瑶, journalName=光学精密工程, refType=null, unstructuredReference=黄泽贤,吴凡路,傅瑶,.基于深度学习的遥感图像舰船目标检测算法综述[J].光学精密工程,2023, 31(15):2295-2318., articleTitle=基于深度学习的遥感图像舰船目标检测算法综述, refAbstract=null), Reference(id=1251505550731722818, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=31, issue=15, pageStart=2295, pageEnd=2318, url=null, language=null, rfNumber=[3], rfOrder=3, authorNames=HUANG Z X, WU F L, FU Y, journalName=Optics and Precision Engineering, refType=null, unstructuredReference=HUANG Z X,WU F L,FU Y,et al.Review of deep learning-based algorithms for ship target detection from remote sensing images[J].Optics and Precision Engineering,2023,31(15):2295-2318.(in Chinese), articleTitle=Review of deep learning-based algorithms for ship target detection from remote sensing images, refAbstract=null), Reference(id=1251505550903689286, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2022, volume=81, issue=13, pageStart=18091, pageEnd=18103, url=null, language=null, rfNumber=[4], rfOrder=4, authorNames=ZHANG Y, SONG C L, ZHANG D W, journalName=Multimedia Tools and Applications, refType=null, unstructuredReference=ZHANG Y,SONG C L,ZHANG D W.Small-scale aircraft detection in remote sensing images based on Faster-RCNN[J].Multimedia Tools and Applications, 2022,81(13):18091-18103., articleTitle=Small-scale aircraft detection in remote sensing images based on Faster-RCNN, refAbstract=null), Reference(id=1251505551004352588, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2022, 
volume=37, issue=9, pageStart=2305, pageEnd=2313, url=null, language=null, rfNumber=[5], rfOrder=5, authorNames=张哲益, 曹卫华, 朱蕊, journalName=控制与决策, refType=null, unstructuredReference=张哲益,曹卫华,朱蕊,.基于脉冲卷积神经网络稀疏表征的高分辨率遥感图像场景分类方法[J].控制与决策,2022,37(9):2305-2313., articleTitle=基于脉冲卷积神经网络稀疏表征的高分辨率遥感图像场景分类方法, refAbstract=null), Reference(id=1251505551088238672, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2022, volume=37, issue=9, pageStart=2305, pageEnd=2313, url=null, language=null, rfNumber=[5], rfOrder=6, authorNames=ZHANG Z Y, CAO W H, ZHU R, journalName=Control and Decision, refType=null, unstructuredReference=ZHANG Z Y,CAO W H,ZHU R,et al.Sparse representation with spike convolutional neural networks for scene classification of remote sensing images of high resolution[J].Control and Decision,2022,37 (9):2305-2313.(in Chinese), articleTitle=Sparse representation with spike convolutional neural networks for scene classification of remote sensing images of high resolution, refAbstract=null), Reference(id=1251505551180513364, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2022, volume=45, issue=4, pageStart=735, pageEnd=747, url=null, language=null, rfNumber=[6], rfOrder=7, authorNames=谢星星, 程塨, 姚艳清, journalName=计算机学报, refType=null, unstructuredReference=谢星星,程塨,姚艳清,.动态特征融合的遥感图像目标检测[J].计算机学报,2022,45(4):735-747., articleTitle=动态特征融合的遥感图像目标检测, refAbstract=null), Reference(id=1251505551260205141, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2022, volume=45, issue=4, pageStart=735, pageEnd=747, url=null, language=null, rfNumber=[6], rfOrder=8, authorNames=XIE X X, CHENG G, YAO Y Q, journalName=Chinese Journal of Computers, refType=null, unstructuredReference=XIE X X,CHENG G,YAO Y Q,et al.Dynamic feature fusion for object detection 
in remote sensing images[J].Chinese Journal of Computers,2022,45(4):735-747.(in Chinese), articleTitle=Dynamic feature fusion for object detection in remote sensing images, refAbstract=null), Reference(id=1251505551356674135, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2024, volume=39, issue=2, pageStart=381, pageEnd=390, url=null, language=null, rfNumber=[7], rfOrder=9, authorNames=周葳楠, 吴治海, 张正道, journalName=控制与决策, refType=null, unstructuredReference=周葳楠,吴治海,张正道,.基于弱特征增强的轻量化小目标检测方法[J].控制与决策,2024,39(2):381-390., articleTitle=基于弱特征增强的轻量化小目标检测方法, refAbstract=null), Reference(id=1251505551432171612, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2024, volume=39, issue=2, pageStart=381, pageEnd=390, url=null, language=null, rfNumber=[7], rfOrder=10, authorNames=ZHOU W N, WU Z H, ZHANG Z D, journalName=Control and Decision, refType=null, unstructuredReference=ZHOU W N,WU Z H,ZHANG Z D,et al.Lightweight small target detection method based on weak feature enhancement[J].Control and Decision,2024, 39(2):381-390.(in Chinese), articleTitle=Lightweight small target detection method based on weak feature enhancement, refAbstract=null), Reference(id=1251505551511863390, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=38, issue=1, pageStart=239, pageEnd=247, url=null, language=null, rfNumber=[8], rfOrder=11, authorNames=严春满, 王铖, journalName=控制与决策, refType=null, unstructuredReference=严春满,王铖.基于特征增强的SAR图像舰船小目标检测算法[J].控制与决策,2023,38(1):239-247., articleTitle=基于特征增强的SAR图像舰船小目标检测算法, refAbstract=null), Reference(id=1251505551595749473, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=38, issue=1, pageStart=239, pageEnd=247, url=null, 
language=null, rfNumber=[8], rfOrder=12, authorNames=YAN C M, WANG C, journalName=Control and Decision, refType=null, unstructuredReference=YAN C M,WANG C.A ship small target detection algorithm based on feature enhancement in SAR image[J].Control and Decision,2023,38(1):239-247.(in Chinese), articleTitle=A ship small target detection algorithm based on feature enhancement in SAR image, refAbstract=null), Reference(id=1251505551679635557, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=48, issue=6, pageStart=104, pageEnd=111, url=null, language=null, rfNumber=[9], rfOrder=13, authorNames=田中原, journalName=测绘科学, refType=null, unstructuredReference=田中原.遥感图像多尺度目标的轻量化检测方法[J].测绘科学,2023,48(6):104-111., articleTitle=遥感图像多尺度目标的轻量化检测方法, refAbstract=null), Reference(id=1251505551759327337, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=48, issue=6, pageStart=104, pageEnd=111, url=null, language=null, rfNumber=[9], rfOrder=14, authorNames=TIAN Z Y, journalName=Science of Surveying and Mapping, refType=null, unstructuredReference=TIAN Z Y.Lightweight detection method for multiscale objects in remote sensing images[J].Science of Surveying and Mapping,2023,48(6):104-111.(in Chinese), articleTitle=Lightweight detection method for multiscale objects in remote sensing images, refAbstract=null), Reference(id=1251505551839019116, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2024, volume=16, issue=1, pageStart=51, pageEnd=68, url=null, language=null, rfNumber=[10], rfOrder=15, authorNames=AHMED M, EL-SHEIMY N, LEUNG H, journalName=Remote Sensing, refType=null, unstructuredReference=AHMED M,EL-SHEIMY N,LEUNG H,et al.Enhancing object detection in remote sensing:A hybrid YOLOv7 and transformer approach with automatic 
model selection[J].Remote Sensing,2024,16(1):51-68., articleTitle=Enhancing object detection in remote sensing:A hybrid YOLOv7 and transformer approach with automatic model selection, refAbstract=null), Reference(id=1251505551943876718, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2024, volume=32, issue=4, pageStart=186, pageEnd=190, url=null, language=null, rfNumber=[11], rfOrder=16, authorNames=赵同祥, 张瑞全, 高树静, journalName=电子设计工程, refType=null, unstructuredReference=赵同祥,张瑞全,高树静,.基于DN-YOLOv5遥感目标快速检测方法[J].电子设计工程,2024,32(4):186-190., articleTitle=基于DN-YOLOv5遥感目标快速检测方法, refAbstract=null), Reference(id=1251505552065511538, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2024, volume=32, issue=4, pageStart=186, pageEnd=190, url=null, language=null, rfNumber=[11], rfOrder=17, authorNames=ZHAO T X, ZHANG R Q, GAO S J, journalName=Electronic Design Engineering, refType=null, unstructuredReference=ZHAO T X,ZHANG R Q,GAO S J,et al.A fast detection method of remote sensing target based on DN-YOLOv5[J].Electronic Design Engineering,2024,32 (4):186-190.(in Chinese), articleTitle=A fast detection method of remote sensing target based on DN-YOLOv5, refAbstract=null), Reference(id=1251505552136814709, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=31, issue=14, pageStart=137, pageEnd=141, url=null, language=null, rfNumber=[12], rfOrder=18, authorNames=庄文华, 唐晓刚, 张斌权, journalName=电子设计工程, refType=null, unstructuredReference=庄文华,唐晓刚,张斌权,.基于改进YOLOv5的遥感图像旋转框目标检测[J].电子设计工程,2023,31 (14):137-141., articleTitle=基于改进YOLOv5的遥感图像旋转框目标检测, refAbstract=null), Reference(id=1251505552224895097, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=31, 
issue=14, pageStart=137, pageEnd=141, url=null, language=null, rfNumber=[12], rfOrder=19, authorNames=ZHUANG W H, TANG X G, ZHANG B Q, journalName=Electronic Design Engineering, refType=null, unstructuredReference=ZHUANG W H,TANG X G,ZHANG B Q,et al. Remote sensing image rotatable bounding box object detection based on improved YOLOv5[J].Electronic Design Engineering,2023,31(14):137-141.(in Chinese), articleTitle=Remote sensing image rotatable bounding box object detection based on improved YOLOv5, refAbstract=null), Reference(id=1251505552308781180, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=30, issue=4, pageStart=6, pageEnd=11, url=null, language=null, rfNumber=[13], rfOrder=20, authorNames=兰旭婷, 郭中华, 石甜甜, journalName=电光与控制, refType=null, unstructuredReference=兰旭婷,郭中华,石甜甜,.融合SPP与FPN的光学遥感图像飞机目标检测[J].电光与控制,2023,30 (4):6-11., articleTitle=融合SPP与FPN的光学遥感图像飞机目标检测, refAbstract=null), Reference(id=1251505552413638784, tenantId=1146029695717560320, journalId=1251233954884272221, articleId=1251505541227430493, doi=null, pmid=null, pmcid=null, year=2023, volume=30, issue=4, pageStart=6, pageEnd=11, url=null, language=null, rfNumber=[13], rfOrder=21, authorNames=LAN X T, GUO Z H, SHI T T, journalName=Electronics Optics & Control, refType=null, unstructuredReference=LAN X T,GUO Z H,SHI T T,et al.Aircraft target detection in optical remote sensing images by fusing SPP and FPN[J]. 
Electronics Optics & Control, 2023, 30(4): 6-11. (in Chinese)
[14] WANG C Y, BOCHKOVSKIY A, LIAO H M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464-7475.
[15] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8759-8768.
[16] DING X H, ZHANG X Y, MA N N, et al. RepVGG: Making VGG-style ConvNets great again[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 13728-13737.
[17] ZHANG H, XU C, ZHANG S J. Inner-IoU: More effective intersection over union loss with auxiliary bounding box[EB/OL]. [2024-10-13]. https://arxiv.org/abs/2311.02877.
[18] FANG G F, MA X Y, SONG M L, et al. DepGraph: Towards any structural pruning[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 16091-16101.
[19] LIU Z, LI J G, SHEN Z Q, et al. Learning efficient convolutional networks through network slimming[C]//2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2755-2763.
[20] XIA G S, BAI X, DING J, et al. DOTA: A large-scale dataset for object detection in aerial images[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 3974-3983.
[21] CHENG G, HAN J W, ZHOU P C, et al. Multi-class geospatial object detection and geographic image classification based on collection of part detectors[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2014, 98: 119-132.
[22] RAN Q, WANG Q, ZHAO B Y, et al. Lightweight oriented object detection using multiscale context and enhanced channel attention in remote sensing images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 5786-5795.
[23] ZHU X K, LYU S C, WANG X, et al. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal: IEEE, 2021: 2778-2788.
[24] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]//Computer Vision-ECCV 2016. Cham: Springer, 2016: 21-37.
[25] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2999-3007.
[26] TERVEN J, CÓRDOVA-ESPARZA D M, ROMERO-GONZÁLEZ J A. A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1680-1716.
[27] HUANGFU P P, DANG L X. A multi-scale pyramid feature fusion-based object detection method for remote sensing images[J]. International Journal of Remote Sensing, 2023, 44(24): 7790-7807.
[28] ZHENG Z, ZHONG Y F, MA A L, et al. HyNet: Hyper-scale object detection network framework for multiple spatial resolution remote sensing imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 166: 1-14.
[29] LYU Y L, LI M, WU Z Q, et al. Object detection in remote sensing images using densely connected recursive feature pyramids[J]. National Remote Sensing Bulletin, 2024, 28(6): 1602-1614. (in Chinese)
[30] GE Z, LIU S T, WANG F, et al. YOLOX: Exceeding YOLO series in 2021[EB/OL]. [2024-10-13]. https://arxiv.org/abs/2107.08430.
[31] LIU J P, ZHENG K Y, LIU X Y, et al. SDSDet: A real-time object detector for small, dense, multi-scale remote sensing objects[J]. Image and Vision Computing, 2024, 142: 104898-104913.
[32] LIU F K, LUO S Y, HE J, et al. FVIT-YOLOv8: Improved YOLOv8 small object detection based on multi-scale fusion attention mechanism[J]. Infrared Technology, 2024, 46(8): 912-922. (in Chinese)
[33] WANG P J, SUN X, DIAO W H, et al. FMSSD: Feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(5): 3377-3390.
[34] MIN L T, FAN Z M, LV Q Y, et al. YOLO-DCTI: Small object detection in remote sensing base on contextual transformer enhancement[J]. Remote Sensing, 2023, 15(16): 3970-3988.

Fig. 1 Architecture of the multi-branch fusion self-attention object detection algorithm for remote sensing images
Fig. 2 Structure of the MFS module
Fig. 3 Structure of the multi-branch fusion self-attention feature extraction network
Fig. 4 Principle of the Inner-IoU loss
Fig. 5 Principle of structured channel pruning
Fig. 6 Distribution of the BN-layer scaling factors γ

Table 1 Ablation experiments for model detection

| Experiment | Precision/% | Recall/% | mAP/% | mAP0.5~0.95/% | Parameters/M | GFLOPs |
|---|---|---|---|---|---|---|
| Experiment 1 | 78.2 | 73.5 | 76.4 | 49.5 | 36.9 | 104.7 |
| Experiment 2 | 78.6 | 73.5 | 76.6 | 50.3 | 36.9 | 104.7 |
| Experiment 3 | 78.6 | 74.3 | 77.3 | 50.0 | 39.1 | 134.3 |
| Experiment 4 | 78.9 | 74.7 | 77.7 | 50.5 | 39.7 | 139.9 |

The source header also has four component columns (baseline model, Inner-IoU, MFS, fourth detection head); their per-experiment marks are not recoverable.

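Table 1 lists Inner-IoU as one of the ablated components, and reference [17] describes it as an IoU loss computed on auxiliary bounding boxes. The sketch below illustrates that idea only; it is not the authors' code. It assumes boxes given as `(cx, cy, w, h)` and a scale factor `ratio` applied around each box center, and the function and parameter names are hypothetical.

```python
def inner_iou(box_a, box_b, ratio=1.0):
    """IoU of two boxes after scaling each around its own center by `ratio`.

    Boxes are (cx, cy, w, h). With ratio == 1.0 this is the ordinary IoU.
    The names and box format are illustrative assumptions.
    """
    def scaled_corners(box):
        cx, cy, w, h = box
        half_w, half_h = w * ratio / 2.0, h * ratio / 2.0
        return cx - half_w, cy - half_h, cx + half_w, cy + half_h

    ax1, ay1, ax2, ay2 = scaled_corners(box_a)
    bx1, by1, bx2, by2 = scaled_corners(box_b)

    # Overlap of the two auxiliary (scaled) boxes, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def inner_iou_loss(pred_box, gt_box, ratio=1.0):
    """Loss form: one minus the IoU of the auxiliary boxes."""
    return 1.0 - inner_iou(pred_box, gt_box, ratio)
```

With `ratio < 1` the auxiliary boxes shrink, so the same pair of boxes overlaps less and the loss is stricter; with `ratio > 1` they grow and the overlap increases.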

Table 2 Model detection accuracy under different λ weights (%)

| λ | Precision | Recall | mAP |
|---|---|---|---|
| 0.00100 | 72.8 | 63.0 | 65.4 |
| 0.00075 | 73.4 | 68.7 | 70.0 |
| 0.00050 | 76.9 | 70.1 | 72.9 |
| 0.00025 | 76.6 | 67.9 | 71.4 |
| 0.00010 | 71.3 | 66.9 | 67.2 |

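Table 2 varies a weight λ, Fig. 6 shows the distribution of the BN-layer scaling factors γ, and the reference list cites network slimming [19]. Network slimming adds an L1 penalty on the BN scaling factors to the training loss and then prunes channels whose γ is small. The sketch below assumes that the λ in Table 2 is this sparsity weight, which the extracted text does not state explicitly. All function names and the threshold value are hypothetical.

```python
def l1_sparsity_penalty(gammas, lam):
    """Extra loss term lam * sum(|gamma|) over BN scaling factors."""
    return lam * sum(abs(g) for g in gammas)


def l1_subgradient(gammas, lam):
    """Subgradient of the penalty with respect to each gamma: lam * sign(gamma)."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [lam * sign(g) for g in gammas]


def keep_mask(gammas, threshold):
    """Channels whose |gamma| falls below the threshold are marked for pruning."""
    return [abs(g) >= threshold for g in gammas]


# Illustrative values only, not taken from the paper.
gammas = [0.9, 0.01, -0.5, 0.002]
lam = 0.0005  # the lambda value with the highest mAP in Table 2
penalty = l1_sparsity_penalty(gammas, lam)
mask = keep_mask(gammas, threshold=0.05)
```

A larger λ pushes more scaling factors toward zero, which makes more channels prunable but can cost accuracy. That matches the trend in Table 2, where mAP peaks at λ = 0.00050 and drops on either side.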

Table 3 Experimental results of different detection algorithms on the DOTA dataset (per-class AP/%, mAP/%, parameters/M)

| Algorithm | Plane | Baseball diamond | Bridge | Ground track field | Small vehicle | Large vehicle | Ship | Tennis court | Basketball court | Storage tank | Soccer field | Roundabout | Harbor | Swimming pool | Helicopter | mAP/% | Parameters/M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TPH-YOLOv5 | 92.0 | 71.0 | 46.6 | 64.7 | 64.4 | 83.3 | 88.1 | 90.7 | 53.7 | 73.4 | 61.5 | 54.2 | 81.9 | 66.0 | 46.5 | 68.5 | 45.4 |
| SSD | 86.2 | 61.5 | 21.1 | 38.5 | 46.5 | 71.5 | 85.4 | 92.0 | 72.9 | 60.2 | 57.7 | 59.6 | 64.9 | 62.8 | 44.2 | 61.6 | 26.1 |
| Retinanet | 75.0 | 67.2 | 68.0 | 53.2 | 27.5 | 61.8 | 70.6 | 83.2 | 74.3 | 50.3 | 58.8 | 64.9 | 68.1 | 56.8 | 50.5 | 62.0 | 37.9 |
| YOLOv8l | 94.7 | 78.3 | 55.0 | 75.0 | 58.2 | 84.2 | 91.1 | 93.6 | 60.0 | 78.2 | 63.6 | 63.8 | 85.4 | 68.3 | 66.6 | 74.4 | 43.7 |
| YOLOv7 | 95.2 | 79.0 | 57.5 | 76.4 | 64.0 | 86.6 | 90.2 | 94.3 | 70.9 | 82.8 | 74.6 | 64.2 | 85.1 | 70.5 | 54.0 | 76.4 | 36.9 |
| Ref. [29]* | 77.0 | 74.1 | 54.2 | 72.7 | 48.1 | 63.1 | 70.6 | 88.0 | 55.5 | 48.4 | 54.8 | 63.1 | 78.5 | 34.2 | 44.9 | 61.8 | 66.3 |
| MPF* | 92.8 | 76.9 | 50.9 | 68.0 | 66.9 | 86.4 | 88.1 | 94.5 | 69.9 | 73.0 | 55.5 | 64.5 | 85.2 | 66.9 | 59.1 | 73.3 | - |
| HyNet* | 86.9 | 58.7 | 43.7 | 54.0 | 64.4 | 80.5 | 87.8 | 85.1 | 53.8 | 60.4 | 41.8 | 47.6 | 76.6 | 48.7 | 40.3 | 62.0 | - |
| Proposed algorithm | 95.2 | 79.3 | 58.7 | 78.8 | 65.4 | 86.5 | 90.1 | 94.4 | 72.9 | 83.9 | 75.0 | 64.7 | 85.5 | 69.9 | 65.6 | 77.7 | 39.7 |


Table 4 Experimental results of different lightweight models on the DOTA dataset

| Model | Precision/% | Recall/% | mAP/% | Parameters/M | GFLOPs |
|---|---|---|---|---|---|
| YOLOv5m | 73.5 | 68.9 | 70.1 | 21.2 | 49.0 |
| YOLOX-s | 69.3 | 60.2 | 67.3 | 9.0 | 26.8 |
| Proposed algorithm | 74.7 | 72.3 | 72.9 | 6.6 | 29.0 |

Two further rows have empty cells in the source, so the column positions of their values are not recoverable. Values in source order: SDSDet*: 67.5, 4.8, 11.3; Ref. [32]*: 50.0, 5.4.


Table 5 Experimental results of different detection algorithms on the NWPU VHR-10 dataset (per-class AP/%, mAP/%)

| Algorithm | Airplane | Ship | Storage tank | Baseball diamond | Tennis court | Basketball court | Ground track field | Harbor | Bridge | Vehicle | mAP/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TPH-YOLOv5 | 99.0 | 85.1 | 99.1 | 97.8 | 88.0 | 66.0 | 98.5 | 93.2 | 80.1 | 76.9 | 88.4 |
| SSD | 99.9 | 90.8 | 98.0 | 99.1 | 85.0 | 89.0 | 98.5 | 93.2 | 85.7 | 88.3 | 92.7 |
| Retinanet | 99.9 | 79.3 | 85.3 | 92.0 | 94.1 | 93.7 | 94.6 | 97.4 | 89.7 | 75.9 | 90.8 |
| YOLOv8l | 99.4 | 88.6 | 92.0 | 97.3 | 97.7 | 97.1 | 99.5 | 96.2 | 89.8 | 89.2 | 94.7 |
| YOLOv7 | 99.5 | 89.7 | 96.6 | 98.8 | 94.7 | 92.7 | 99.5 | 93.2 | 85.0 | 91.2 | 94.1 |
| FMSSD* | 99.7 | 89.9 | 90.3 | 98.2 | 86.0 | 96.8 | 99.6 | 75.6 | 80.1 | 88.2 | 90.4 |
| YOLO-DCTI* | 99.6 | 93.0 | 96.8 | 99.5 | 90.9 | 95.0 | 99.2 | 91.5 | 98.9 | 90.3 | 95.5 |
| Proposed algorithm | 99.3 | 93.5 | 94.9 | 99.3 | 98.3 | 98.3 | 99.5 | 95.7 | 92.3 | 94.3 | 96.5 |

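The mAP column in Table 5 is the mean of the ten per-class AP values. As a quick arithmetic check, the snippet below averages the proposed algorithm's row. The class keys are English renderings of the table headers, and the variable names are illustrative.

```python
# Per-class AP (%) of the proposed algorithm on NWPU VHR-10, copied from Table 5.
per_class_ap = {
    "airplane": 99.3,
    "ship": 93.5,
    "storage tank": 94.9,
    "baseball diamond": 99.3,
    "tennis court": 98.3,
    "basketball court": 98.3,
    "ground track field": 99.5,
    "harbor": 95.7,
    "bridge": 92.3,
    "vehicle": 94.3,
}

# mAP is the unweighted mean over classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP = {mean_ap:.1f}%")
```

The per-class values sum to 965.4, so the mean is 96.54, which rounds to the 96.5% reported in the table and in the abstract.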
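Multi-branch fusion self-attention object detection algorithm for remote sensing images
Hongbo KANG , Jiazheng WEN , Chunjie YANG , Wenqing WANG
Journal of Xi'an University of Posts and Telecommunications | Artificial Intelligence Object Detection 2025, 30(6): 94-103
Affiliations
  • School of Artificial Intelligence, School of Automation, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
  • Hongbo KANG (1974-), male, from Fengxiang, Shaanxi, holds a master's degree and is an associate professor at Xi'an University of Posts and Telecommunications. His main research interests are intelligent control and information integration. E-mail:

    Jiazheng WEN (1998-), male, from Yulin, Shaanxi, is a master's student at Xi'an University of Posts and Telecommunications. His main research interest is intelligent processing of remote sensing images. E-mail:

    Chunjie YANG (1976-), female, from Baiyin, Gansu, holds a master's degree and is an associate professor at Xi'an University of Posts and Telecommunications. Her main research interests are intelligent control, Internet of Things technology and embedded system development. E-mail:

    Wenqing WANG (1964-), male, from Fangshan, Beijing, completed postdoctoral research and is a professor at Xi'an University of Posts and Telecommunications. His main research interests are structural analysis and robust control of complex systems, intelligent information processing and information system analysis. E-mail:

Published: 2025-11-10 doi: 10.13682/j.issn.2095-6533.2025.06.011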

To address the challenges of scale variation and dense object distribution in remote sensing imagery caused by varying imaging angles, a novel object detection algorithm based on multi-branch fusion self-attention (MFS) is proposed. A multi-branch module that integrates convolution and self-attention is designed to build the feature extraction network, and a fourth detection head is added for small objects to fuse features at different scales. The resulting model is then pruned with the DepGraph method to obtain a lightweight architecture. Experiments on the DOTA (Dataset for Object deTection in Aerial Image) and NWPU VHR-10 (Northwestern Polytechnical University Very High Resolution-10) datasets show that the proposed algorithm achieves mean average precision (mAP) of 77.7% and 96.5% respectively, outperforming detectors of similar parameter scale. Notably, the pruned version maintains an mAP of 72.9% on DOTA with only 6.64 million parameters.

target detection  /  remote sensing images  /  multivariate branching  /  self-attention  /  lightweight
Hongbo KANG, Jiazheng WEN, Chunjie YANG, Wenqing WANG. Multi-branch fusion self-attention object detection algorithm for remote sensing images[J]. Journal of Xi'an University of Posts and Telecommunications, 2025, 30(6): 94-103. DOI: 10.13682/j.issn.2095-6533.2025.06.011
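Remote sensing images [1] record terrain and landforms in different spectral bands and provide rich data for scientific exploration. Object detection is one of the fundamental tasks in remote sensing image processing [2] and has significant application value. Given the abundance of remote sensing imagery, extracting useful target information manually is inefficient and costly, and it cannot deliver target information quickly and accurately in practical scenarios. This places higher technical demands on object detection algorithms.
Deep-learning-based object detection algorithms fall into two-stage and single-stage detectors [3]. Two-stage detectors split the task into region proposal and detection. For example, Ref. [4] clusters aircraft data in remote sensing images with K-means, refines the anchor points, and uses an improved VGG16 (Visual Geometry Group 16) to extract localization features of small aircraft. Ref. [5] proposes a scene classification method based on sparse representation with a spike convolutional neural network (SCNN), which removes redundant information unrelated to the scene. Ref. [6] uses dynamic fusion of multi-scale features and emphasizes the influence of object scale on feature fusion. The main advantage of regression-based single-stage detectors is speed, which makes them well suited to tasks with strict real-time requirements. For example, Ref. [7] proposes a one-stage lightweight small-object detector that strengthens weak feature representation, improving the decoding of weak features and small-object detection performance. Ref. [8] proposes a feature-enhanced small-object detection algorithm based on the single-shot multibox detector, further enhancing small-object features. Ref. [9] obtains a lightweight backbone by compressing ResNeXt and improves localization and recognition with a self-distillation head. Refs. [10-12] improve the YOLO (You Only Look Once) detection network to raise detection accuracy on remote sensing images. Ref. [13] proposes a Single Shot Multi-Box Detector (SSD) network fused with an attention mechanism and combines it with dilated convolution to obtain richer feature information.
Although these studies have achieved results in remote sensing image detection, differences in imaging angle cause large feature variation and make object orientation hard to determine [2]. In addition, large parameter counts and computational cost lead to high latency in remote sensing detection networks. This paper therefore proposes a remote sensing image detection algorithm based on Multi Branch Fusion Self-Attention (MFS). A feature extraction network is built by cascading multi-branch structures, and the multi-branch connections are fused with self-attention so that feature information is extracted fully. A fourth detection head dedicated to small objects is added to strengthen the network's ability to detect small targets. Finally, structured channel pruning reduces the parameter count and computation, yielding a lightweight remote sensing object detection algorithm.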
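Built on the YOLOv7 detector, the proposed algorithm consists mainly of a multi-branch fusion self-attention feature extraction network, a path aggregation network, and multi-scale detection heads. The architecture is shown in Fig. 1. An input remote sensing image first passes through a cascade of MFS modules and MPConv (Maxpooling + Conv) blocks for feature extraction, and then enters the SPPCSPC (Spatial Pyramid Pooling and Fully Connected Spatial Pyramid Convolution) module for information aggregation. The extracted features are fed to the feature fusion network, which uses Extended Efficient Layer Aggregation Networks (E-ELAN) modules [14] for feature extraction and a Path Aggregation Network (PANet) structure [15] with bottom-up and top-down paths to aggregate information and fuse features from different layers. Finally, four detectors based on re-parameterized convolution [16] (Re-param Conv, RepConv) perform object detection.
Object scale varies widely in remote sensing images. A single-branch feature extraction module cannot fully extract the features of multi-scale objects, and a single convolutional structure alone struggles to capture multi-scale object features precisely. The multi-branch fusion self-attention feature extraction network consists of three cascaded MFS modules connected to MPConv blocks and ends at the SPPCSPC layer, the last layer of the feature extraction network, where extraction of image features is completed.
The MFS module concatenates, in a residual-like manner, the outputs of several convolutional branches. The convolution weights produce different receptive fields on the feature maps of different branches and give the feature maps an indispensable inductive bias [15]. This increases the diversity of the module's representation of the input image and captures the features of multi-scale objects more completely. A self-attention mechanism is embedded among the branches. It computes the similarity between each input element and the other elements and assigns weights dynamically, so the module attends to global features as well as local ones and captures the most informative features of different image regions. By evaluating the similarity between a target pixel and its surrounding pixels, the attention distribution is adjusted dynamically, reducing missed and false detections as far as possible.
The input feature map $x$ undergoes multi-branch feature extraction: three 1×1 convolutions produce three branches, from top to bottom, with inputs $x_1$, $x_2$ and $x_3$. The structure of the MFS module is shown in Fig. 2.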
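The first branch of the MFS module is a fine-grained feature extraction channel with self-attention. The features of the input feature map $x_1$ are first transformed into two feature matrices, $Q$ and $K$, which are used to compute attention [16]:
$$Q(x_1) = W_Q x_1, \qquad K(x_1) = W_K x_1$$
where $W_Q$ and $W_K$ are the learnable weight matrices of $Q$ and $K$, implemented as 1×1 convolutions.
The transpose of $Q(x_1)$ is multiplied by $K(x_1)$ to obtain the attention matrix $s_{ij}$, which reflects the correlation between pixels. The matrix $s_{ij}$ is then normalized with Softmax. In the normalized expression, $N_a(i,j)$ denotes the region of width $a$ centered at $(i,j)$, and $d$ is the feature dimension of $Q(x_1)$.
The final output of the fine-grained feature extraction channel is computed from the normalized attention and
$$V(x_1) = W_V x_1$$
where $W_V$ is the learnable weight matrix of the feature matrix $V$, implemented as a 1×1 convolution.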
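As a rough illustration of the first branch, the sketch below applies self-attention restricted to a local window, in plain Python. It is not the authors' implementation: the scaling of the scores by $\sqrt{d}$, the clipping of the window at image borders, and all function and argument names (`conv1x1`, `local_self_attention`, `a`) are assumptions made for the example. The 1×1 convolutions are modelled as per-pixel matrix products.

```python
import math


def conv1x1(weight, pixel):
    """Model a 1x1 convolution at one pixel as a matrix-vector product."""
    return [sum(w * x for w, x in zip(row, pixel)) for row in weight]


def local_self_attention(fmap, w_q, w_k, w_v, a=3):
    """fmap is an H x W grid of C-dimensional vectors (nested lists).

    For every pixel (i, j), attention is computed over the a x a window
    centred on it (clipped at the border, an assumption of this sketch).
    """
    height, width = len(fmap), len(fmap[0])
    q = [[conv1x1(w_q, p) for p in row] for row in fmap]
    k = [[conv1x1(w_k, p) for p in row] for row in fmap]
    v = [[conv1x1(w_v, p) for p in row] for row in fmap]
    d = len(w_q)          # feature dimension of Q
    out_dim = len(w_v)
    r = a // 2
    out = []
    for i in range(height):
        out_row = []
        for j in range(width):
            window = [(m, n)
                      for m in range(max(0, i - r), min(height, i + r + 1))
                      for n in range(max(0, j - r), min(width, j + r + 1))]
            # Scaled dot-product similarity between pixel (i, j) and its window.
            scores = [sum(x * y for x, y in zip(q[i][j], k[m][n])) / math.sqrt(d)
                      for m, n in window]
            # Numerically stable softmax over the window.
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            weights = [e / total for e in exps]
            # Attention-weighted sum of the value vectors.
            out_row.append([sum(wt * v[m][n][c] for wt, (m, n) in zip(weights, window))
                            for c in range(out_dim)])
        out.append(out_row)
    return out
```

With `w_q` set to zero every score is equal, so each output is the plain average of the values in the window, which gives a quick sanity check of the windowing logic.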
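In the second branch, the feature map $x_2$ passes through two 3×3 convolutions. This gives a larger receptive field with weight sharing and introduces more nonlinear transformations, further strengthening feature extraction. The second branch produces two outputs, $y_{21}$ and $y_2$. In their expressions, $\mathrm{Up}(\cdot)$ denotes the Upsample operation and $\mathrm{Conv}(\cdot)$ denotes convolution.
For the third branch, the output $y$ of the MFS module is expressed in terms of $\mathrm{Concat}(\cdot)$, the channel concatenation operation, with
$$y_3 = \mathrm{Conv}_{1\times 1}(x_3)$$
By embedding self-attention, the MFS module combines local and global features, giving the model better context awareness and improving the accuracy of feature extraction for small-scale objects. Convolutions form the multi-branch structure, which fuses features at different levels of the input feature map, and the residual-like structure combines the feature information to improve overall representational capacity.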
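The input image passes through three 3×3 convolutional layers (Conv2D, Batch Normalization, Sigmoid-Weighted Linear Unit; Conv2D_BN_SiLU), which obtain a reasonable receptive field with few parameters and yield a feature map whose width, height and channel count are [160, 160, 128]. This feature map is fed into the first MFS module, which learns the distribution of feature information, recalibrates the features and focuses on object locations, and outputs a feature map of size [160, 160, 256].
The MFS output is then cascaded with an MPConv module, which performs a downsampling-like operation, reorganizes the channels of the feature map and enlarges the receptive field. In this step the upper and lower branches are joined by channel concatenation (Concat), giving an output feature map of size [80, 80, 256]. This cascade is repeated three times, with the three MFS modules outputting feature maps of size [160, 160, 128], [80, 80, 512] and [40, 40, 1024] respectively. The result of the last cascade passes through another MFS module and an additional convolution before entering the SPPCSPC module. The SPPCSPC module applies max pooling at several scales to obtain different receptive fields and further capture the features of objects of different sizes.
The convolved outputs of the second and third MFS modules, together with the final output of the SPPCSPC module, serve as the three inputs of the feature fusion network PANet. Remote sensing scenes commonly contain many small objects, and in the shallow layers of the feature extraction network these objects have higher-resolution feature maps with richer feature information, so PANet is modified accordingly. Specifically, the output of the first MFS module is processed by a 3×3 convolution and concatenated with the [160, 160, 32] feature map output by the topmost layer of the PANet bottom-up path. This produces a [160, 160, 64] feature map that is the input to a newly added small-object detection head, the fourth head. The new head is highly sensitive to small objects, and its design mitigates the adverse effect of large differences in object scale on detection accuracy.
The structure of the multi-branch fusion self-attention feature extraction network is shown in Fig. 3.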
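Existing IoU (Intersection over Union) losses still focus on accelerating convergence by adding new loss terms, and they generalize poorly. The Inner-IoU loss [17] is therefore combined with $L_{CIoU}$ (Complete IoU Loss) to compute the IoU loss. Inner-IoU uses a scale ratio factor to control the size of auxiliary bounding boxes. Although the auxiliary boxes differ in size from the actual boxes, the IoU of the auxiliary boxes follows the same trend during regression as the IoU of the actual boxes, so it reflects the quality of the actual regression result. Computing the IoU loss on smaller auxiliary boxes also makes high-IoU samples regress more easily, which accelerates convergence.
The principle of the Inner-IoU loss is illustrated in Fig. 4, where $b^{gt}$ is the ground-truth box and $b$ is the anchor box. The ground-truth box and the inner ground-truth box share the center $(x_c^{gt}, y_c^{gt})$, and the anchor box and the inner anchor box share the center $(x_c, y_c)$. The width and height of the ground-truth box are $w^{gt}$ and $h^{gt}$, and those of the anchor box are $w$ and $h$.
Compared with the IoU loss, when the scale factor is less than 1 the auxiliary box is smaller than the actual box, so its effective regression range is smaller than that of the IoU loss, but the absolute value of its gradient is larger, which speeds up the convergence of high-IoU samples. Conversely, when the scale factor is greater than 1, the larger auxiliary box widens the effective regression range and benefits the regression of low-IoU samples. The scale factor is therefore set to 1 [17].
Combining Inner-IoU with $L_{CIoU}$ gives the bounding-box regression loss, in which $L_{IoU}$ denotes the IoU loss between the predicted box and the ground-truth box and $L_{Inner\text{-}CIoU}$ denotes the Inner-IoU loss.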
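The effect of the scale factor can be seen in a small sketch that computes the IoU of two auxiliary boxes, each obtained by scaling a box about its own center. This is an illustration only: boxes are assumed to be given as `(xc, yc, w, h)`, the names `scaled_iou` and `inner_iou_loss` are hypothetical, and the CIoU term of the combined loss is omitted.

```python
def scaled_iou(box_a, box_b, ratio=1.0):
    """IoU of two (xc, yc, w, h) boxes after scaling both about their centres.

    ratio = 1.0 gives the ordinary IoU; ratio < 1 shrinks the auxiliary boxes,
    ratio > 1 enlarges them.
    """
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    wa, ha, wb, hb = wa * ratio, ha * ratio, wb * ratio, hb * ratio
    # Overlap along each axis (negative means no overlap).
    inter_w = min(xa + wa / 2, xb + wb / 2) - max(xa - wa / 2, xb - wb / 2)
    inter_h = min(ya + ha / 2, yb + hb / 2) - max(ya - ha / 2, yb - hb / 2)
    inter = max(inter_w, 0.0) * max(inter_h, 0.0)
    union = wa * ha + wb * hb - inter
    return inter / union if union > 0 else 0.0


def inner_iou_loss(gt, pred, ratio=1.0):
    """Loss on the auxiliary boxes only: 1 - IoU_inner."""
    return 1.0 - scaled_iou(gt, pred, ratio)


gt, pred = (0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0)
for ratio in (0.5, 1.0, 1.5):
    print(ratio, round(scaled_iou(gt, pred, ratio), 4))
```

For this pair of boxes the shrunken auxiliary boxes no longer overlap, while the enlarged ones overlap more than the originals, which matches the description of a narrower and a wider effective regression range.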
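To balance accuracy against detection efficiency, a DepGraph-based structured pruning method [18] is used to reduce the parameter count and computational complexity, giving a lightweight remote sensing object detection algorithm. The principle of structured channel pruning is shown in Fig. 5. Sparse training is performed first to identify which channels are "unimportant", that is, have small weights, and those channels are then pruned. This markedly increases detection speed while losing as little detection accuracy as possible, and it improves deployability in practical applications.
A penalty term is added to the task loss $l(f(x_i, W), y_i)$ to obtain a sparse model:
$$L = \sum_{i} l\big(f(x_i, W),\, y_i\big) + \lambda R_s(\gamma) \tag{9}$$
where $\lambda$ is the weight coefficient, $\gamma$ is a learnable parameter, and $R_s(\cdot)$ denotes the L1 norm.
The number of learnable parameters in a batch normalization (BN) layer equals the number of filters, so BN layers can serve as "gates" for channel pruning to assess channel importance [19]. The BN layer computes its output as
$$\hat{z} = \frac{z_{in} - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \qquad z_{out} = \gamma \hat{z} + \beta$$
where $z_{in}$ and $z_{out}$ are the input and output of the BN layer; $\mu_B$ and $\sigma_B$ are the mean and standard deviation of the current mini-batch; $\varepsilon$ is a small constant that prevents division by zero; $\gamma$ and $\beta$ are learnable parameters, the scale factor and the shift respectively. The sparsity regularizer uses the L1 norm, $R_s(\gamma) = \|\gamma\|_1$, so sparsity is imposed on the scale factor $\gamma$ of each channel. This drives the scale factors of unimportant channels toward zero and yields a sparse model.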
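The sparse-training idea can be sketched in a few lines of plain Python: an L1 penalty on the BN scale factors, a subgradient update that pulls each factor toward zero, and a selection of the channels whose factor stays above a threshold. The threshold value, the learning rate and the function names are assumptions for illustration. The sketch also ignores the cross-layer dependencies that DepGraph [18] tracks when channels are actually removed.

```python
def l1_penalty(gammas, lam):
    """lambda * ||gamma||_1, the sparsity term added to the task loss."""
    return lam * sum(abs(g) for g in gammas)


def sparse_update(gamma, task_grad, lam, lr):
    """One gradient step on a single scale factor.

    The L1 term contributes lam * sign(gamma) as a subgradient, which pushes
    gamma toward zero regardless of the task gradient.
    """
    sign = (gamma > 0) - (gamma < 0)
    return gamma - lr * (task_grad + lam * sign)


def keep_channels(gammas, threshold):
    """Indices of channels whose |gamma| is at least the (hypothetical) threshold."""
    return [i for i, g in enumerate(gammas) if abs(g) >= threshold]


gammas = [0.9, 0.001, -0.4, 0.0]
print(l1_penalty(gammas, lam=0.0005))
print(keep_channels(gammas, threshold=0.01))
```

In practice the threshold is usually derived from a target pruning ratio rather than fixed by hand; a fixed value is used here only to keep the example short.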
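Experiments were run on Ubuntu 20.04 with the PyTorch 1.10.0 deep learning framework, an AMD EPYC 7542 32-Core Processor CPU and an NVIDIA GeForce RTX A5000 GPU with 24 GiB of memory. The programming language was Python. The initial learning rate was 0.001, and network parameters were updated with stochastic gradient descent (SGD) with momentum 0.9 and weight decay 0.0005.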
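The update implied by these settings can be sketched as a single step of SGD with momentum and L2 weight decay on one scalar weight. The form below follows the default update of `torch.optim.SGD` (no dampening, no Nesterov momentum); that choice is an assumption, since the paper does not say which variant was used, and `sgd_step` and its argument names are illustrative.

```python
def sgd_step(w, grad, velocity, lr=0.001, momentum=0.9, weight_decay=0.0005):
    """One SGD update with momentum and L2 weight decay for a scalar weight.

    Weight decay is added to the gradient, the velocity accumulates the
    decayed gradient, and the weight moves against the velocity.
    """
    g = grad + weight_decay * w
    velocity = momentum * velocity + g
    return w - lr * velocity, velocity


w, v = 1.0, 0.0
for _ in range(3):
    w, v = sgd_step(w, grad=0.5, velocity=v)
    print(w, v)
```

The loop shows how the velocity grows over successive steps when the gradient keeps the same sign, which is the effect of the 0.9 momentum term.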
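Detection performance is evaluated with average precision (AP) and mean average precision (mAP). AP is determined by recall and precision: with recall on the horizontal axis and the maximum precision at each recall on the vertical axis, a Precision-Recall curve is plotted, and the area under the curve, obtained by integration, is the AP. The AP values of the individual classes are averaged to give the mAP, which measures detection performance over all classes. Finally, parameter count and floating-point operations (Giga Floating Point Operations, GFLOPs) are used to assess model selection.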
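A minimal sketch of this metric is given below. It assumes all-point interpolation, in which the precision at each recall is replaced by the maximum precision at any higher recall before integrating; the paper does not state which interpolation it uses, and the function names are illustrative.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve with all-point interpolation."""
    points = sorted(zip(recalls, precisions))
    r = [0.0] + [p[0] for p in points] + [1.0]
    p = [0.0] + [p[1] for p in points] + [0.0]
    # Make precision non-increasing from right to left (the "envelope").
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)


ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(ap)
print(mean_average_precision({"plane": ap, "ship": 0.25}))
```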
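Experiments use the DOTA (Dataset for Object deTection in Aerial Image) dataset [20] and the NWPU VHR-10 (Northwestern Polytechnical University Very High Resolution-10) dataset [21]. DOTA contains 2806 aerial images, with sizes mostly between 800×800 and 4000×4000 pixels. The original images are cropped into 1024×1024-pixel sub-images, and images with insufficient resolution are padded to 1024×1024. The dataset covers 15 classes: plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field and swimming pool. The training, validation and test sets are split 8:1:1. Classes such as bridge, small vehicle, ship, tennis court and basketball court vary markedly in size [22], and the same object can appear at different sizes, which makes precise detection difficult.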
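The cropping step can be illustrated with a short sketch that lists the tile origins for an image and the size it must be padded to. It assumes non-overlapping tiles and padding at the right and bottom edges; the paper does not state whether an overlap was used, and the function names are hypothetical.

```python
def padded_size(width, height, tile=1024):
    """Image size after padding each side up to a multiple of the tile size."""
    def round_up(value):
        return -(-value // tile) * tile  # ceiling division, then scale back
    return round_up(width), round_up(height)


def tile_origins(width, height, tile=1024):
    """Top-left corners of non-overlapping tiles that cover the image."""
    return [(x, y)
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


for size in ((800, 800), (4000, 4000)):
    print(size, padded_size(*size), len(tile_origins(*size)))
```

Under these assumptions an 800×800 image becomes a single padded tile, and a 4000×4000 image is covered by a 4×4 grid of tiles.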
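NWPU VHR-10 contains 800 images, divided into an annotated image set and a negative image set of 150 images. Only the 650 annotated images are used. They cover 10 classes: airplane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge and vehicle. The dataset is randomly split into 350 training images, 150 validation images and 150 test images.
Ablation experiments on the DOTA dataset verify the performance of the proposed algorithm. Models are trained and then evaluated on the test set, in four groups of experiments. With the standard YOLOv7 network as the baseline, detection accuracy and parameter count are compared. The ablation results are listed in Table 1.
Table 1 shows that in Experiment 2 the auxiliary-box IoU loss improves anchor-box accuracy and markedly raises detection accuracy for harbors, with mAP0.5~0.95 0.8% higher than YOLOv7. Experiment 3 uses the multi-branch fusion self-attention network to strengthen feature extraction. Although the parameter count rises, it reaches the best accuracy on classes such as ground track field, and recall and mAP increase by 0.8% and 0.7% respectively. Experiment 4 adds the fourth detection head. Detection accuracy for small vehicles and storage tanks reaches 65.4% and 83.9%, gains of at least 1.2% and 1.1%, and precision, recall and mAP are the highest at 78.9%, 74.7% and 77.7% respectively. Compared with the YOLOv7 baseline, the proposed algorithm therefore achieves better detection accuracy without a large increase in parameter count, which confirms its effectiveness for remote sensing object detection.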
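To control the regularization strength during sparse training more precisely, the weight coefficient λ in Eq. (9) is tuned to obtain a detection model that is both highly sparse and accurate. Detection accuracy under different values of λ is listed in Table 2.
Table 2 shows that at λ = 0.00100 the regularization constraint is strong and the model cannot maintain its detection accuracy, so λ is adjusted step by step in further sparse training. At λ = 0.00050, precision reaches 76.9%, recall reaches 70.1% and mAP reaches 72.9%. λ = 0.00050 is therefore chosen as the weight of the penalty term in the loss function.
The distribution of the BN scale factors γ at λ = 0.00050 is shown in Fig. 6.
In the original model, that is, after 0 epochs of sparse training, γ is approximately normally distributed with its center near 1, and channel importance cannot be judged. As sparse training proceeds, the distribution of γ shifts toward 0. After about 50 epochs of sparse training, most scale factors γ are close to 0 and stable. The closer a BN layer's γ is to 0, the less important the corresponding channel, and such channels are candidates for removal during pruning to compress the model. The resulting pruned MFS-YOLO has 6.64M parameters, a computational cost of 29.05 GFLOPs and an mAP of 72.9%.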
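To verify the superiority of the proposed algorithm and evaluate model performance more fully, it is compared on the DOTA dataset with TPH-YOLOv5s [23] (Transformer Prediction Head YOLOv5 Small), SSD [24], RetinaNet [25], YOLOv8l [26], YOLOv7 [14], MPF [27], HyNet [28] and the algorithm of Ref. [29]. The results are listed in Table 3. In the table, "*" marks results taken directly from the corresponding paper and "-" marks values not provided in the original text.
Table 3 shows that the proposed algorithm achieves an mAP of 77.7%, the best among the compared algorithms, which demonstrates its effectiveness. TPH-YOLOv5 raises detection accuracy by introducing many attention mechanisms. For small objects such as small vehicles its mAP is 64.4%, competitive with the other algorithms, but its ability to detect multi-scale objects such as bridges still needs improvement. YOLOv8l achieves a helicopter mAP of 66.6%, the best among all compared algorithms. Its large parameter count of 43.7M and well-designed network strengthen the separation of objects from background, but training cost rises markedly. SSD reaches an overall mAP of only 61.6%. Because it lacks feature fusion and uses small feature maps, it loses information easily and detects poorly. RetinaNet achieves a bridge mAP of 68.0% and detects objects with large aspect-ratio differences accurately, but it performs worst on small, dense objects, for example an mAP of 27.5% on small vehicles. Although the algorithm of Ref. [29], MPF and HyNet improve the handling of scale variation in remote sensing images, their overall detection accuracies of 61.8%, 73.3% and 62.0% remain lower than those of the other algorithms. By refining the design of its feature extraction network, the proposed algorithm further separates object information from background information and achieves the best detection accuracy on the ground track field and soccer ball field classes, whose object boundaries are blurred, at 78.8% and 75.0% respectively. It also attains the highest detection accuracy for baseball diamond, storage tank and harbor, at 79.3%, 83.9% and 85.5% respectively.
To verify the advantage of the pruned MFS-YOLO over models of comparable size, it is compared on the DOTA dataset with lightweight models including YOLOv5m [26], YOLOX-s [30], SDSDet [31] (Self-Distillation Sampling for Object Detection) and the algorithm of Ref. [32]. The results are listed in Table 4. In the table, "*" marks results taken directly from the corresponding paper and "-" marks values not provided in the original text.
Table 4 shows that the pruned algorithm reaches an mAP of 72.9% with only 6.64M parameters, the best result among the compared algorithms of similar size. The compact model obtained by pruning achieves better detection accuracy than lightweight models obtained through lightweight network design, which demonstrates the effectiveness of the proposed algorithm.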
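To further verify the effectiveness and generalization of the proposed algorithm on aerial datasets, it is compared on the NWPU VHR-10 dataset with TPH-YOLOv5 [23], SSD [24], RetinaNet [25], YOLOv8l [26], YOLOv7 [14], FMSSD [33] (Feature-Merged Single Shot MultiBox Detector) and YOLO-DCTI [34] (You Only Look Once-Dynamic Convolutional Temporal Interpolation). The results are listed in Table 5. In the table, "*" marks results taken directly from the corresponding paper.
Table 5 shows that FMSSD uses dilated convolution and several parallel convolutional layers to unify context from different feature maps. This easily leads to insufficient attention to objects of particular scales, especially when object scale does not match the receptive field; for example, its mAP is 75.6% for harbor and 86.0% for tennis court. YOLO-DCTI designs the DCTI structure to decouple classification and regression and reaches a detection accuracy of 95.5%, but the decoupling is not effective enough, which leaves insufficient information sharing between the tasks. The proposed algorithm achieves the best detection accuracy on NWPU VHR-10, at 96.5%.
The two sets of comparison experiments show that the proposed algorithm achieves the best detection accuracy on both the DOTA and NWPU VHR-10 datasets.
The proposed multi-branch fusion self-attention object detection algorithm for remote sensing images designs a multi-branch module that fuses convolution and self-attention, builds an efficient feature extraction network, and adds a fourth dedicated detection head to improve the perception of small-scale objects, a common difficulty in remote sensing images. Experiments show that the algorithm achieves an mAP of 77.7% on DOTA and 96.5% on NWPU VHR-10. To improve efficiency, structured pruning is used to compress the network. The compressed model has only 6.64M parameters and still maintains an mAP of 72.9% on DOTA, outperforming other detectors of similar size and combining high accuracy with a lightweight design.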
References
[1]
CHENG G,HAN J W.A survey on object detection in optical remote sensing images[J].ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 117:11-28.
[2]
WANG H N,LI Y,FANG Y Q,et al.SRS-Net:Training object detectors from scratch for remote sensing images without pretraining[J].Chinese Journal of Aeronautics,2023,36(8):269-283.
[3]
HUANG Z X,WU F L,FU Y,et al.Review of deep learning-based algorithms for ship target detection from remote sensing images[J].Optics and Precision Engineering,2023,31(15):2295-2318.(in Chinese)
[4]
ZHANG Y,SONG C L,ZHANG D W.Small-scale aircraft detection in remote sensing images based on Faster-RCNN[J].Multimedia Tools and Applications, 2022,81(13):18091-18103.
[5]
ZHANG Z Y,CAO W H,ZHU R,et al.Sparse representation with spike convolutional neural networks for scene classification of remote sensing images of high resolution[J].Control and Decision,2022,37(9):2305-2313.(in Chinese)
[6]
XIE X X,CHENG G,YAO Y Q,et al.Dynamic feature fusion for object detection in remote sensing images[J].Chinese Journal of Computers,2022,45(4):735-747.(in Chinese)
[7]
ZHOU W N,WU Z H,ZHANG Z D,et al.Lightweight small target detection method based on weak feature enhancement[J].Control and Decision,2024,39(2):381-390.(in Chinese)
[8]
YAN C M,WANG C.A ship small target detection algorithm based on feature enhancement in SAR image[J].Control and Decision,2023,38(1):239-247.(in Chinese)
[9]
TIAN Z Y.Lightweight detection method for multiscale objects in remote sensing images[J].Science of Surveying and Mapping,2023,48(6):104-111.(in Chinese)
[10]
AHMED M,EL-SHEIMY N,LEUNG H,et al.Enhancing object detection in remote sensing:A hybrid YOLOv7 and transformer approach with automatic model selection[J].Remote Sensing,2024,16(1):51-68.
[11]
ZHAO T X,ZHANG R Q,GAO S J,et al.A fast detection method of remote sensing target based on DN-YOLOv5[J].Electronic Design Engineering,2024,32(4):186-190.(in Chinese)
[12]
ZHUANG W H,TANG X G,ZHANG B Q,et al.Remote sensing image rotatable bounding box object detection based on improved YOLOv5[J].Electronic Design Engineering,2023,31(14):137-141.(in Chinese)
[13]
LAN X T,GUO Z H,SHI T T,et al.Aircraft target detection in optical remote sensing images by fusing SPP and FPN[J].Electronics Optics & Control,2023,30(4):6-11.(in Chinese)
[14]
WANG C Y, BOCHKOVSKIY A, LIAO H M. YOLOv7:Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Vancouver:IEEE,2023:7464-7475.
[15]
LIU S,QI L,QIN H F,et al.Path aggregation network for instance segmentation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Salt Lake City:IEEE,2018:8759-8768.
[16]
DING X H, ZHANG X Y, MA N N, et al. RepVGG:Making VGG-style ConvNets great again[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Nashville:IEEE,2021:13728-13737.
[17]
ZHANG H,XU C,ZHANG S J.Inner-IoU:More effective intersection over union loss with auxiliary bounding box[EB/OL].[2024-10-13].https://arxiv.org/abs/2311.02877.
[18]
FANG G F,MA X Y,SONG M L,et al.DepGraph:Towards any structural pruning[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Vancouver:IEEE,2023:16091-16101.
[19]
LIU Z,LI J G,SHEN Z Q,et al.Learning efficient convolutional networks through network slimming[C]//2017 IEEE International Conference on Computer Vision.Venice:IEEE,2017:2755-2763.
[20]
XIA G S,BAI X,DING J,et al.DOTA:A large-scale dataset for object detection in aerial images[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Salt Lake City:IEEE,2018:3974-3983.
[21]
CHENG G,HAN J W,ZHOU P C,et al.Multi-class geospatial object detection and geographic image classification based on collection of part detectors[J].ISPRS Journal of Photogrammetry and Remote Sensing, 2014,98:119-132.
[22]
RAN Q,WANG Q,ZHAO B Y,et al.Lightweight oriented object detection using multiscale context and enhanced channel attention in remote sensing images[J].IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14:5786-5795.
[23]
ZHU X K, LYU S C, WANG X, et al. TPH-YOLOv5:Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal:IEEE,2021:2778-2788.
[24]
LIU W,ANGUELOV D,ERHAN D,et al.SSD:Single shot MultiBox detector[C]//Computer Vision-ECCV 2016.Cham:Springer,2016:21-37.
[25]
LIN T Y,GOYAL P,GIRSHICK R,et al.Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision. Venice:IEEE, 2017:2999-3007.
[26]
TERVEN J,CÓRDOVA-ESPARZA D M,ROMERO-GONZÁLEZ J A.A comprehensive review of YOLO architectures in computer vision:From YOLOv1 to YOLOv8 and YOLO-NAS[J].Machine Learning and Knowledge Extraction,2023,5(4):1680-1716.
[27]
HUANGFU P P,DANG L X.A multi-scale pyramid feature fusion-based object detection method for remote sensing images[J].International Journal of Remote Sensing,2023,44(24):7790-7807.
[28]
ZHENG Z,ZHONG Y F,MA A L,et al.HyNet:Hyper-scale object detection network framework for multiple spatial resolution remote sensing imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing,2020,166:1-14.
[29]
LYU Y L,LI M,WU Z Q,et al.Object detection in remote sensing images using densely connected recursive feature pyramids[J].National Remote Sensing Bulletin,2024,28(6):1602-1614.(in Chinese)
[30]
GE Z,LIU S T,WANG F,et al.YOLOX:Exceeding YOLO series in 2021[EB/OL].[2024-10-13].https://arxiv.org/abs/2107.08430.
[31]
LIU J P,ZHENG K Y,LIU X Y,et al.SDSDet:A real-time object detector for small,dense,multi-scale remote sensing objects[J].Image and Vision Computing,2024,142:104898-104913.
[32]
LIU F K,LUO S Y,HE J,et al.FVIT-YOLOv8:Improved YOLOv8 small object detection based on multi-scale fusion attention mechanism[J].Infrared Technology,2024,46(8):912-922.(in Chinese)
[33]
WANG P J,SUN X,DIAO W H,et al.FMSSD:Feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery[J].IEEE Transactions on Geoscience and Remote Sensing, 2020,58(5):3377-3390.
[34]
MIN L T,FAN Z M,LV Q Y,et al.YOLO-DCTI:Small object detection in remote sensing base on contextual transformer enhancement[J].Remote Sensing, 2023,15(16):3970-3988.
Article information
doi: 10.13682/j.issn.2095-6533.2025.06.011
  • Received: 2024-10-11
  • First published online: 2026-04-16
  • Published: 2025-11-10