Article(id=1263819611607282051, tenantId=1146029695717560320, journalId=1263530845441638439, issueId=1263818962224165389, articleNumber=null, orderNo=null, doi=10.19693/j.issn.1673-3185.04311, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=1734019200000, receivedDateStr=2024-12-13, revisedDate=1743955200000, revisedDateStr=2025-04-07, acceptedDate=null, acceptedDateStr=null, onlineDate=1779247676040, onlineDateStr=2026-05-20, pubDate=1777478400000, pubDateStr=2026-04-30, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1779247676040, onlineIssueDateStr=2026-05-20, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1779247676040, creator=13041195026, updateTime=1779247676040, updator=13041195026, issue=Issue{id=1263818962224165389, tenantId=1146029695717560320, journalId=1263530845441638439, year='2026', volume='21', issue='2', pageStart='1', pageEnd='444', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=null, createTime=1779247521215, creator=13041195026, updateTime=1779247861438, updator=13041195026, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1263820389638070544, tenantId=1146029695717560320, journalId=1263530845441638439, issueId=1263818962224165389, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1263820389638070545, tenantId=1146029695717560320, journalId=1263530845441638439, issueId=1263818962224165389, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=424, endPage=434, ext={EN=ArticleExt(id=1263819613486330248, articleId=1263819611607282051, tenantId=1146029695717560320, 
journalId=1263530845441638439, language=EN, title=MITD-YOLO: an improved YOLOv8n-based method for maritime infrared target detection, columnId=1263819608264458931, journalTitle=Chinese Journal of Ship Research, columnName=Weapon, Electronic and Information System, runingTitle=null, highlight=null, articleAbstract=
Objective

Complex backgrounds, significant target size variations, and severe sea clutter in maritime infrared imagery often result in missed or false detections. To address this challenge, an improved method based on YOLOv8n, termed maritime infrared target detection-YOLO (MITD-YOLO), is proposed to enhance target detection accuracy in maritime infrared images.

Method

MITD-YOLO incorporates a diverse branch block (DBB) and enhanced multi-scale convolution (EMSConv), which apply convolutions at several scales so that the model captures complex features more effectively. A triple attention mechanism is employed to facilitate spatial and channel-wise feature interaction, thereby improving key feature extraction. Additionally, the Powerful-IoUv2 (PIoUv2) loss function replaces the original loss to address the anchor box expansion problem, improving detection accuracy and model robustness.
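
As a rough illustration of the PIoUv2 loss named above, the following Python sketch computes the loss for a single pair of axis-aligned boxes. It assumes the formulation published in the Powerful-IoU paper (reference 15): a penalty P that averages the four edge offsets normalized by the target width and height, a base loss L_PIoU = (1 - IoU) + (1 - exp(-P^2)), and a non-monotonic focusing factor 3(λq)exp(-(λq)^2) with q = exp(-P). The function name `piou_v2_loss`, the (x1, y1, x2, y2) box layout, and the default λ = 1.3 are illustrative assumptions, not details taken from the MITD-YOLO implementation.

```python
import math


def piou_v2_loss(pred, target, lam=1.3):
    """Sketch of a PIoU v2 loss for one box pair (hypothetical helper).

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    lam is the focusing hyperparameter; 1.3 is an assumed default.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU between the predicted and target boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Penalty P: mean absolute edge offset, normalized by the TARGET size
    # only, so enlarging the predicted box cannot reduce the penalty.
    tw, th = tx2 - tx1, ty2 - ty1
    p = (abs(px1 - tx1) / tw + abs(px2 - tx2) / tw
         + abs(py1 - ty1) / th + abs(py2 - ty2) / th) / 4.0

    # Base PIoU loss: IoU loss plus the bounded penalty term.
    l_piou = (1.0 - iou) + (1.0 - math.exp(-p * p))

    # Non-monotonic focusing factor applied on top of the base loss.
    q = math.exp(-p)
    focus = 3.0 * (lam * q) * math.exp(-((lam * q) ** 2))
    return focus * l_piou


if __name__ == "__main__":
    print(piou_v2_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(piou_v2_loss((1, 1, 3, 3), (0, 0, 2, 2)))  # shifted prediction
```

Because P depends only on the target box dimensions, a prediction that grows larger is still penalized for every edge it moves away from the target. This appears to be the property the abstract describes as addressing anchor box expansion.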

Results

Experimental results show that the improved model enhances maritime infrared target detection performance, with precision increasing by 2.3% and recall by 1.7%. The model achieves an average precision of 88.9% at 132.8 FPS, outperforming the original model.

Conclusion

MITD-YOLO enhances maritime infrared target detection performance and provides a more reliable target detection technology for applications such as maritime surveillance and ship navigation, contributing to the advancement of intelligent maritime systems.

, correspAuthors=Xuefeng YANG, authorNote=null, correspAuthorsNote=null, copyrightStatement=Copyright © 2026 Chinese Journal of Ship Research. All rights reserved., copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Xuefeng YANG, Jiayao LIU, Changhua ZHOU), CN=ArticleExt(id=1263819680813297831, articleId=1263819611607282051, tenantId=1146029695717560320, journalId=1263530845441638439, language=CN, title=MITD-YOLO:改进YOLOv8n的海上红外目标检测方法, columnId=1263819609539527351, journalTitle=中国舰船研究, columnName=武器与电子信息系统, runingTitle=null, highlight=null, articleAbstract=
目的

海上红外图像背景复杂、目标尺寸变化大、海浪杂波干扰严重,容易导致目标漏检和误检。为提高红外图像中的目标检测准确率,提出一种基于YOLOv8n的海上红外目标检测方法——面向海上红外目标检测任务的YOLO(MITD-YOLO)。

方法

首先引入多样化分支模块(DBB)和多尺度卷积(EMSConv),利用多个不同尺度的卷积使模型能够更好地捕捉复杂特征。然后,采用三重注意力机制(triple attention)实现空间和通道维度的特征交互,强化关键特征提取。最后,使用Powerful-IoUv2(PIoUv2)对原模型的损失函数进行改进,以解决锚框扩展问题,提高检测精度并增强模型的鲁棒性。

结果

实验结果表明,改进的MITD-YOLO模型对海上红外图像目标的检测效果有所提升:准确率提升2.3%,召回率提升1.7%;平均准确率达88.9%,帧率(FPS)达到132.8,优于原模型。

结论

该方法可提高海上红外目标检测效果,为海上安全监控和船舶导航等领域提供更可靠的目标检测技术,助力智能海洋系统发展。

, correspAuthors=杨雪锋, authorNote=null, correspAuthorsNote=
* 杨雪锋
, copyrightStatement=版权所有 © 《中国舰船研究》编辑部 2026, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=xFjXEJo/mnm9W71ssXFgNg==, magXml=njzXHx3XKefyjYKEgZMRuw==, pdfUrl=null, pdf=ZAMMC8RjGtlPb2PGPiRZ4A==, pdfFileSize=5160000, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=85n6A/st42T/qRJwiSRqAQ==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=Nn9pQlIftm5Ga1iwDj+0cQ==, mapNumber=null, authorCompany=null, fund=null, authors=

杨雪锋,男,1987年生,博士,讲师。研究方向:水上智能运输系统,船舶航行环境感知,航迹规划。E-mail:

, authorsList=杨雪锋, 刘佳尧, 周昌华)}, authors=[Author(id=1263819682302275775, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=yangxuefeng536@cqjtu.edu.cn, emailSecond=null, emailThird=null, correspondingAuthor=1, authorType=1, ext={EN=AuthorExt(id=1263819682532962501, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819682302275775, language=EN, stringName=Xuefeng YANG, firstName=Xuefeng, middleName=null, lastName=YANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=*, 1, 2, address=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China
2National Engineering Laboratory of Transport Safety and Emergency Informatics, Beijing 100011, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1263819682679763144, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819682302275775, language=CN, stringName=杨雪锋, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=*, 1, 2, address=1重庆交通大学 航运与船舶工程学院,重庆 400074
2交通安全应急信息技术国家工程实验室,北京 100011, bio={"content":"

杨雪锋,男,1987年生,博士,讲师。研究方向:水上智能运输系统,船舶航行环境感知,航迹规划。E-mail:

"}, bioImg=null, bioContent=

杨雪锋,男,1987年生,博士,讲师。研究方向:水上智能运输系统,船舶航行环境感知,航迹规划。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1263819681245311152, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, xref=1, ext=[AuthorCompanyExt(id=1263819681295642801, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China), AuthorCompanyExt(id=1263819681329197234, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1重庆交通大学 航运与船舶工程学院,重庆 400074)]), AuthorCompany(id=1263819681744433336, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, xref=2, ext=[AuthorCompanyExt(id=1263819681773793465, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681744433336, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2National Engineering Laboratory of Transport Safety and Emergency Informatics, Beijing 100011, China), AuthorCompanyExt(id=1263819681790570682, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681744433336, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2交通安全应急信息技术国家工程实验室,北京 100011)])]), Author(id=1263819682906255565, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, 
emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1263819685074710740, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819682906255565, language=EN, stringName=Jiayao LIU, firstName=Jiayao, middleName=null, lastName=LIU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1263819685418643672, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819682906255565, language=CN, stringName=刘佳尧, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1重庆交通大学 航运与船舶工程学院,重庆 400074, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1263819681245311152, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, xref=1, ext=[AuthorCompanyExt(id=1263819681295642801, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China), AuthorCompanyExt(id=1263819681329197234, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1重庆交通大学 航运与船舶工程学院,重庆 400074)])]), Author(id=1263819685745799390, tenantId=1146029695717560320, journalId=1263530845441638439, 
articleId=1263819611607282051, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1263819685963903203, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819685745799390, language=EN, stringName=Changhua ZHOU, firstName=Changhua, middleName=null, lastName=ZHOU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1263819686278476007, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, authorId=1263819685745799390, language=CN, stringName=周昌华, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1重庆交通大学 航运与船舶工程学院,重庆 400074, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1263819681245311152, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, xref=1, ext=[AuthorCompanyExt(id=1263819681295642801, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China), AuthorCompanyExt(id=1263819681329197234, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, companyId=1263819681245311152, language=CN, country=null, province=null, city=null, postcode=null, 
companyName=null, departmentName=null, remark=1重庆交通大学 航运与船舶工程学院,重庆 400074)])])], keywords=[Keyword(id=1263819686790181102, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=EN, orderNo=1, keyword=target detection), Keyword(id=1263819686995702002, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=EN, orderNo=2, keyword=infrared target detection), Keyword(id=1263819687134114039, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=EN, orderNo=3, keyword=multi-scale convolution), Keyword(id=1263819687398355194, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=EN, orderNo=4, keyword=triple attention mechanism), Keyword(id=1263819689197711613, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=CN, orderNo=1, keyword=目标检测), Keyword(id=1263819689612947714, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=CN, orderNo=2, keyword=红外目标检测), Keyword(id=1263819689814274309, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=CN, orderNo=3, keyword=多尺度卷积), Keyword(id=1263819690229510410, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, language=CN, orderNo=4, keyword=三重注意力机制)], refs=[Reference(id=1263819704838271359, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=1, rfOrder=0, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHU D Y, TANG J W, FU X X, et al. Detection of infrared small target based on background subtraction local contrast measure and Gaussian structural similarity[J]. 
Heliyon, 2023, 9(6): e16998., articleTitle=null, refAbstract=null), Reference(id=1263819704972489089, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=2, rfOrder=1, authorNames=null, journalName=null, refType=null, unstructuredReference=LU Z L, LIU S X, YILAHUN H, et al. Infrared small target detection based on background estimation and scale fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1–5., articleTitle=null, refAbstract=null), Reference(id=1263819705119289732, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=3, rfOrder=2, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHANG Z Z, DING C, GAO Z S, et al. ANLPT: self-adaptive and non-local patch-tensor model for infrared small target detection[J]. Remote Sensing, 2023, 15(4): 1021., articleTitle=null, refAbstract=null), Reference(id=1263819705278673287, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=4, rfOrder=3, authorNames=null, journalName=null, refType=null, unstructuredReference=GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH: IEEE, 2014: 580−587. 
DOI:10.1109/CVPR.2014.81., articleTitle=null, refAbstract=null), Reference(id=1263819705400308106, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=5, rfOrder=4, authorNames=null, journalName=null, refType=null, unstructuredReference=GUPTA V, MANDLOI A, PAWAR S, et al. Deep learning based object detection using mask RCNN[M]//AWASTHI S, SINGH NARUKA M, PRAKASH YADAV S, et al. AI and IoT−based Intelligent Health Care & Sanitation. Singapore: Bentham Science Publishers, 2023: 207−221., articleTitle=null, refAbstract=null), Reference(id=1263819707115778443, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=6, rfOrder=5, authorNames=null, journalName=null, refType=null, unstructuredReference=OU J H, WANG J G, XUE J, et al. Infrared image target detection of substation electrical equipment using an improved faster R-CNN[J]. IEEE Transactions on Power Delivery, 2023, 38(1): 387–396., articleTitle=null, refAbstract=null), Reference(id=1263819707380019598, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=7, rfOrder=6, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHAO X F, XIA Y T, XU M Y, et al. An infrared small vehicle target detection method based on deep learning[C]//Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022). Shanghai: SPIE, 2023. 
DOI:10.1117/12.2667313., articleTitle=null, refAbstract=null), Reference(id=1263819707640066449, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=8, rfOrder=7, authorNames=null, journalName=null, refType=null, unstructuredReference=陈德海, 邵恒, 张军令. 改进YOLOv3的船舶检测算法研究[J]. 现代电子技术, 2023, 46(2): 101–106., articleTitle=null, refAbstract=null), Reference(id=1263819708034331028, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=8, rfOrder=8, authorNames=null, journalName=null, refType=null, unstructuredReference=CHEN D H, SHAO H, ZHANG J L. Research on improved YOLOv3 ship detection algorithm[J]. Modern Electronics Technique, 2023, 46(2): 101–106 (in Chinese)., articleTitle=null, refAbstract=null), Reference(id=1263819708575396247, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=9, rfOrder=9, authorNames=null, journalName=null, refType=null, unstructuredReference=WANG D, DU H Q, MA Z F. Object detection in infrared images using modified YOLOV4 models and an image enhancement module[C]//Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022). Nanjing: SPIE, 2023. 
DOI:10.1117/12.2680173., articleTitle=null, refAbstract=null), Reference(id=1263819708898357658, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=10, rfOrder=10, authorNames=null, journalName=null, refType=null, unstructuredReference=张炳焱, 张闯, 石振男, 等. 基于YOLO-FNC模型的轻量化船舶检测方法[J]. 中国舰船研究, 2024, 19(5): 180–187., articleTitle=null, refAbstract=null), Reference(id=1263819709330370973, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=10, rfOrder=11, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHANG B Y, ZHANG C, SHI Z N, et al. Lightweight ship detection method based on YOLO-FNC model[J]. Chinese Journal of Ship Research, 2024, 19(5): 180–187 (in both Chinese and English)., articleTitle=null, refAbstract=null), Reference(id=1263819709527503265, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=11, rfOrder=12, authorNames=null, journalName=null, refType=null, unstructuredReference=HU S M, ZHAO F, LU H Z, et al. Improving YOLOv7-tiny for infrared and visible light image object detection on drones[J]. Remote Sensing, 2023, 15(13): 3214., articleTitle=null, refAbstract=null), Reference(id=1263819709636555172, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=12, rfOrder=13, authorNames=null, journalName=null, refType=null, unstructuredReference=张瑶, 陈姚节. 
改进YOLOv8的水面小目标检测算法[J]. 计算机系统应用, 2024, 33(4): 152–161., articleTitle=null, refAbstract=null), Reference(id=1263819709938545061, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=12, rfOrder=14, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHANG Y, CHEN Y J. Improved YOLOv8 algorithm for small surface object detection on water surface[J]. Computer Systems & Applications, 2024, 33(4): 152–161 (in Chinese)., articleTitle=null, refAbstract=null), Reference(id=1263819711662404006, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=13, rfOrder=15, authorNames=null, journalName=null, refType=null, unstructuredReference=DING X H, ZHANG X Y, HAN J G, et al. Diverse branch block: building a convolution as an inception-like unit[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, TN: IEEE, 2021: 10881−10890. DOI:10.1109/CVPR46437.2021.01074., articleTitle=null, refAbstract=null), Reference(id=1263819711897285034, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=14, rfOrder=16, authorNames=null, journalName=null, refType=null, unstructuredReference=MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]//2021 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa, HI: IEEE, 2021: 3138−3147. 
DOI:10.1109/WACV48630.2021.00318., articleTitle=null, refAbstract=null), Reference(id=1263819712153137580, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=15, rfOrder=17, authorNames=null, journalName=null, refType=null, unstructuredReference=LIU C, WANG K G, LI Q, et al. Powerful-IoU: more straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. Neural Networks, 2024, 170: 276–284., articleTitle=null, refAbstract=null), Reference(id=1263819712685814192, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=16, rfOrder=18, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHENG Z H, WANG P, REN D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574–8586., articleTitle=null, refAbstract=null), Reference(id=1263819713021358513, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=17, rfOrder=19, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993−13000. 
DOI:10.1609/aaai.v34i07.6999., articleTitle=null, refAbstract=null), Reference(id=1263819713256239542, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=18, rfOrder=20, authorNames=null, journalName=null, refType=null, unstructuredReference=REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA: IEEE, 2019: 658−666. DOI:10.1109/CVPR.2019.00075., articleTitle=null, refAbstract=null), Reference(id=1263819713449177529, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=19, rfOrder=21, authorNames=null, journalName=null, refType=null, unstructuredReference=GEVORGYAN Z. SIoU loss: more powerful learning for bounding box regression[Z/OL]. (2022-05-25)[2024-07-26]. https://arxiv.org/abs/2205.12740., articleTitle=null, refAbstract=null), Reference(id=1263819713675669948, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=20, rfOrder=22, authorNames=null, journalName=null, refType=null, unstructuredReference=ZHANG H, XU C, ZHANG S J. Inner-IoU: more effective intersection over union loss with auxiliary bounding box[Z/OL]. (2023-11-14)[2024-07-25]. 
https://arxiv.org/abs/2311.02877., articleTitle=null, refAbstract=null), Reference(id=1263819713910550974, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=21, rfOrder=23, authorNames=null, journalName=null, refType=null, unstructuredReference=REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149., articleTitle=null, refAbstract=null), Reference(id=1263819714137043392, tenantId=1146029695717560320, journalId=1263530845441638439, articleId=1263819611607282051, doi=null, pmid=null, pmcid=null, year=null, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=22, rfOrder=24, authorNames=null, journalName=null, refType=null, unstructuredReference=HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[Z/OL]. (2018-01-24)[2025-02-26]. 
https://arxiv.org/abs/1703.06870.
Fig. 1 Network structure of MITD-YOLOv8
Fig. 2 Network structure of C2f_DBB
Fig. 3 Network structure of DBB
Fig. 4 Feature extraction comparison between C2f convolution and DBB convolution
Fig. 5 Network structure of the detection head
Fig. 6 Network structure of the triple attention mechanism
Fig. 7 Example images of the supplemental dataset
Fig. 8 Comparison of improved effects
Fig. 9 Comparative experiment of scenario 1
Fig. 10 Comparative experiment of scenario 2
Fig. 11 Comparative experiment of scenario 3
Fig. 12 Comparative experiment of scenario 4
Tab. 1 Training parameters setting
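Parameter | Value
Input resolution | 640×640
Learning rate factor (lrf) | 0.01
Momentum | 0.937
Weight decay | 0.0005
Batch size | 8
Number of workers | 4
Training epochs | 100
Optimizer | SGD
Anchor box IoU | 0.7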
Tab. 2 Results of the comparative experiments
Model | P/% | R/% | mAP@0.5/% | FPS
YOLOv8n | 89.4 | 80.7 | 86.7 | 135.7
MITD-YOLO | 91.7 | 82.4 | 88.9 | 125.5
Tab. 3 Model performance comparison after improving the loss function
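Loss function | P/% | R/% | mAP@0.5/% | FPS
CIoU | 89.4 | 80.7 | 86.7 | 135.7
DIoU | 89.9 | 80.9 | 86.5 | 139.7
GIoU | 89.9 | 80.7 | 86.2 | 134.0
SIoU | 89.9 | 81.6 | 86.9 | 136.5
MPDIoU | 89.8 | 80.3 | 86.0 | 136.4
Inner-CIoU | 90.4 | 80.5 | 86.0 | 132.6
PIoUv2 | 90.7 | 82.0 | 88.3 | 135.5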
Tab. 4 Detection performance comparison of different models
Model | P/% | R/% | mAP@0.5/% | FPS
Faster-RCNN | 87.7 | 77.0 | 83.6 | 94.35
Mask-RCNN | 88.0 | 79.5 | 85.8 | 82.70
YOLOv5 | 88.9 | 81.9 | 87.5 | 129.80
YOLOv7 | 91.3 | 88.7 | 89.0 | 86.75
YOLOv8n | 89.4 | 80.7 | 86.7 | 135.70
YOLOv10 | 82.6 | 75.2 | 82.5 | 112.65
YOLOv11 | 89.3 | 79.3 | 85.8 | 96.90
MITD-YOLO | 91.7 | 82.4 | 88.9 | 125.5
Tab. 5 Results of the ablation experiments
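Improvement modules (C2f_DBB, EMSConv, Triple Attention, PIoUv2; × = not used) | P/% | R/% | mAP@0.5/% | FPS
×××× | 89.4 | 80.7 | 86.7 | 135.7
××× | 90.6 | 79.7 | 86.2 | 135.1
××× | 89.5 | 80.6 | 87.0 | 136.9
××× | 90.7 | 81.2 | 87.1 | 125.1
××× | 90.7 | 82.0 | 88.3 | 135.5
×× | 90.2 | 80.4 | 85.9 | 130.9
× | 91.0 | 81.8 | 88.1 | 124.6
 | 91.7 | 82.4 | 88.9 | 125.5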
MITD-YOLO: an improved YOLOv8n-based method for maritime infrared target detection
Xuefeng YANG*, 1, 2, Jiayao LIU 1, Changhua ZHOU 1
Chinese Journal of Ship Research | Weapon, Electronic and Information System, 2026, 21(2): 424-434
Affiliations
  • 1 School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China
  • 2 National Engineering Laboratory of Transport Safety and Emergency Informatics, Beijing 100011, China
  • Xuefeng YANG, male, born in 1987, Ph.D., lecturer. Research interests: intelligent waterborne transportation systems, ship navigation environment perception, and trajectory planning. E-mail:

Corresponding author: * Xuefeng YANG
Published: 2026-04-30 doi: 10.19693/j.issn.1673-3185.04311
Objective

Complex backgrounds, significant target size variations, and severe sea clutter in maritime infrared imagery often result in missed or false detections. To address this challenge, an improved method based on YOLOv8n, termed maritime infrared target detection-YOLO (MITD-YOLO), is proposed to enhance target detection accuracy in maritime infrared images.

Method

MITD-YOLO incorporates a diverse branch module (DBB) and enhanced multi-scale convolution (EMSConv) to leverage multi-scale convolutions, enabling the model to more effectively capture complex features. A triple attention mechanism is employed to facilitate spatial and channel-wise feature interaction, thereby improving key feature extraction. Additionally, the powerful-IoUv2 (PIoUv2) loss function is introduced to address the anchor box expansion problem, leading to improved detection accuracy and enhanced model robustness.

Results

Experimental results show that the improved model enhances maritime infrared target detection performance, with a 2.3% increase in precision and a 1.7% increase in recall. The model achieves a mean average precision (mAP@0.5) of 88.9% and 132.8 FPS, outperforming the original model.

Conclusion

MITD-YOLO enhances maritime infrared target detection performance and provides a more reliable target detection technology for applications such as maritime surveillance and ship navigation, contributing to the advancement of intelligent maritime systems.

target detection  /  infrared target detection  /  multi-scale convolution  /  triple attention mechanism
Xuefeng YANG, Jiayao LIU, Changhua ZHOU. MITD-YOLO: an improved YOLOv8n-based method for maritime infrared target detection[J]. Chinese Journal of Ship Research, 2026, 21(2): 424-434. DOI: 10.19693/j.issn.1673-3185.04311
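In recent years, infrared imaging has attracted growing research attention because it can detect targets in complete darkness or in adverse weather. In maritime environments, infrared imaging can penetrate sea fog, reduce interference from sea-surface reflections, and improve the detection of small, distant targets. It is widely used in ship navigation environment perception, maritime supervision, search-and-rescue patrols, and naval operations. However, because maritime infrared images have complex backgrounds, large variations in target size, and severe wave clutter, detection accuracy for targets in such images remains low.
Current infrared target detection methods fall into three categories: background subtraction, optimization-based methods, and deep learning methods. Background subtraction[1-2] models the image background and subtracts it from the original image; regions with a gray value of 0 are background and nonzero regions are targets. This approach assumes that the background is predictable. It works well for static backgrounds without noticeable illumination changes but adapts poorly to dynamic backgrounds or moving targets, and the wave clutter in maritime infrared backgrounds is difficult to model this way. Optimization-based methods[3] recast target detection as a sparse and low-rank matrix optimization problem: they seek the best sparse representation of the original image so that the difference between the reconstructed and original images is minimized while the target features are represented accurately. Performance can be improved through model optimization, but the low-rank background assumption demands a highly uniform background, so sea clutter reduces detection accuracy. Deep learning methods build a deep network, train it on large amounts of data to mine image features, and then use the trained network to detect targets. They offer high accuracy and fast detection and are currently a major direction in infrared image target detection.
Deep learning-based detection methods are mainly divided into two-stage and single-stage methods. Two-stage methods first generate candidate regions and then classify and regress each region. Common two-stage methods include regions with CNN features (R-CNN)[4] and mask R-CNN. Gupta[5] applied Mask R-CNN to pedestrian detection, demonstrating its strengths in high-quality image segmentation and detection. Ou et al.[6] proposed an improved Faster R-CNN model for infrared image detection of substation electrical equipment; optimizing the feature extraction network and adding anchor box ratios improved both accuracy and speed. Single-stage methods treat detection directly as a regression problem; common examples are the YOLO series and the single shot multibox detector (SSD). Zhao et al.[7] proposed an improved SSD algorithm that replaces max pooling layers with strided convolutions to enhance shallow feature information and introduces residual units and MSRA weight initialization, markedly improving detection accuracy for infrared vehicle targets. Chen et al.[8] added a shallow detection scale to the YOLOv3 DarkNet-53 backbone and used 8× upsampling to fuse it with the last detection scale, which effectively improved small-target detection. Wang et al.[9] designed a detail enhancement module that combines spatial attention to strengthen image texture and detail, and modified YOLOv4 with the Alpha-IoU loss and weighted non-maximum suppression (NMS), markedly improving detection accuracy. Zhang et al.[10] replaced the C3 module in YOLO with a FasterNeXt module derived from FasterNet and added the neural attention module (NAM), increasing running speed without affecting accuracy. Hu et al.[11] proposed an improved algorithm based on YOLOv7-tiny that assigns anchor boxes by aspect ratio and uses a hard-sample mining loss, markedly improving small-target detection accuracy in UAV images. Zhang et al.[12] proposed an improved YOLOv8 algorithm for small targets on water; it builds a C2fBF module on the BiFormer bi-level routing attention mechanism, adds a small-target detection head, and replaces the loss function with multi-perspective distance intersection over union (MPD IoU), effectively addressing noise interference and missed small targets in water-surface detection.
The above analysis shows that improvements to YOLO-series models in object detection center on five dimensions: network structure optimization, multi-scale feature fusion, attention mechanisms, loss function improvement, and anchor box optimization strategies, and that the specific method must match the target characteristics of the application scenario. Maritime infrared images suffer from complex backgrounds, large variations in target size, and severe wave clutter, so existing methods achieve low accuracy on maritime infrared targets and produce false and missed detections; the detection model therefore needs to be optimized for the characteristics of maritime infrared targets. This paper accordingly runs maritime infrared target detection experiments with the YOLOv8 model, combines the detection results with the characteristics of infrared images to design MITD-YOLO, a maritime infrared target detection method based on YOLOv8n, and verifies experimentally how well the proposed improvements raise detection accuracy.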
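Maritime infrared images typically have complex backgrounds and low contrast, contain many ship types with large scale differences, and show occlusion and overlap. Because of these characteristics, the original YOLOv8n model suffers from missed detections, false detections, and low accuracy on maritime infrared ship datasets. To address these problems, this paper makes the following improvements to YOLOv8n:
1) Replace the C2f module with the C2f diverse branch module C2f_DBB[13]. Diverse convolutions in parallel branches improve detection accuracy at a comparable computational cost.
2) Improve the detection head with EMSConv. Streamlined multi-scale convolutions capture infrared target features at different scales and reduce the number of parameters, improving detection accuracy and speed.
3) Introduce the triple attention mechanism (triplet attention)[14]. In complex scenes and for small targets, this mechanism strengthens feature representation and separates useful features from noise, improving the robustness and reliability of the model.
4) Replace the loss function with PIoUv2[15]. This function evaluates bounding box overlap more accurately, handles dense and overlapping targets, adapts better to targets of different shapes and sizes, and reduces false and missed detections.
The improved network is shown in Fig. 1, where Contact denotes concatenation, Upsample denotes upsampling, k is the convolution kernel size, s is the convolution stride, and p is the number of padding pixels.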
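In YOLOv8n, feature extraction and fusion use the C2f module. The Bottleneck convolutions in this module use fixed kernels and lack flexibility, which limits the receptive field so that the network captures only local information about a target. Because maritime infrared targets vary widely in scale, fixed kernels cannot effectively capture target features across scales or adapt to target scale, and detection suffers. To address this, this paper builds the diverse branch module C2f_DBB to replace the C2f module in YOLOv8n; its structure is shown in Fig. 2.
DBB uses a multi-branch topology comprising multi-scale convolution (conv), sequential 1×1 and k×k convolutions, average pooling, batch normalization (batch norm), and branch addition; its structure is shown in Fig. 3.
In DBB, the 1×1 convolution quickly aggregates channel information and the k×k convolution enlarges the receptive field, so the branches have different receptive fields and paths of different complexity. Combined with convolutions at different scales, this enriches the feature space and lets the model capture features from local detail to broader regions, improving recognition against complex marine backgrounds. Average pooling in DBB is usually combined with convolution as a downsampling step that helps the model reduce redundant features while retaining key information; in maritime infrared target detection, pooling helps reduce the influence of sea-surface background noise. DBB can also be converted equivalently into a single convolution at inference time. DBB can therefore replace regular convolution layers during training to build the C2f_DBB module and be converted back to the original convolution structure once training is complete, which lowers computational cost. DBB does not derive the merged parameters before every forward pass; instead, the model is converted once after training, and only the converted model is saved and used. Although DBB and a regular convolution layer have the same structure at inference, applying DBB improves the feature representation capability and robustness of the model and offers high flexibility and extensibility. Fig. 4 compares feature extraction by C2f convolution and DBB convolution.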
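The conversion of parallel branches into a single convolution rests on the linearity of convolution: the sum of the branch outputs equals the output of one convolution whose kernel is the sum of the (zero-padded) branch kernels. The following is a minimal plain-Python sketch of that idea only, assuming a single channel, stride 1, zero padding, and no batch normalization; the function names are illustrative and are not taken from the authors' code.

```python
def conv2d(image, kernel):
    """Single-channel 2-D cross-correlation with zero padding and stride 1."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def pad_kernel(kernel, size):
    """Embed a small odd-sized kernel at the centre of a size x size kernel of zeros."""
    k = len(kernel)
    off = (size - k) // 2
    padded = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            padded[i + off][j + off] = kernel[i][j]
    return padded


def merge_branches(kernels, size):
    """Sum parallel branch kernels into one equivalent size x size kernel."""
    merged = [[0.0] * size for _ in range(size)]
    for kern in kernels:
        p = pad_kernel(kern, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += p[i][j]
    return merged


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]  # 3x3 branch
k1 = [[2.0]]                                               # 1x1 branch

# Training-time form: run both branches and add their outputs.
a, b = conv2d(image, k3), conv2d(image, k1)
two_branch = [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

# Inference-time form: one convolution with the merged kernel.
single = conv2d(image, merge_branches([k3, k1], 3))
```

Here `two_branch` and `single` hold the same values, which is why the merged layer costs no more than a regular convolution at inference.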
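The YOLOv8n detection head uses two separate convolutional branches, cv2 and cv3, for regression and classification: cv2 handles regression and outputs the predicted bounding box, and cv3 handles classification and outputs the class of each object. Each of the cv2 and cv3 branches has two 3×3 convolutions and one 1×1 convolution, and all three detection heads must be traversed during training, so the detection head accounts for a large share of the computation in YOLOv8n, as shown in Fig. 5(a). This paper therefore uses the EMSConv module to build an efficient detection head with multi-scale information and shared parameters; its structure is shown in Fig. 5(b).
EMSConv builds on grouped convolution and extracts convolutional features group by group over the channels. It first splits the channels into two halves: one half produces cheap feature maps, and the other half is split again into two groups, one using 3×3 convolution and the other 5×5 convolution. These operations are performed on individual feature channels, so information in different channels stays independent. Finally, a 1×1 pointwise convolution exchanges information across channels and fuses the feature channels with dimensionality reduction, achieving multi-scale feature extraction.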
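A rough weight count illustrates why this layout is lighter than a dense convolution. The sketch below is an estimate under stated assumptions, not the authors' layer: it assumes the 3×3 and 5×5 groups act per channel (depthwise), the cheap half carries no weights, the pointwise convolution maps C channels to C channels, and biases and normalization layers are ignored.

```python
def emsconv_params(channels):
    """Weights in the assumed EMSConv layout (biases and norm layers ignored)."""
    if channels % 4 != 0:
        raise ValueError("channels must be divisible by 4 for this split")
    quarter = channels // 4
    depthwise_3x3 = quarter * 3 * 3   # one 3x3 filter per channel
    depthwise_5x5 = quarter * 5 * 5   # one 5x5 filter per channel
    pointwise = channels * channels   # 1x1 convolution mixing all channels
    return depthwise_3x3 + depthwise_5x5 + pointwise


def standard_conv_params(channels, k=3):
    """Weights in a dense k x k convolution with equal input and output channels."""
    return channels * channels * k * k


c = 64
ems, dense = emsconv_params(c), standard_conv_params(c)
print(ems, dense, round(ems / dense, 3))
```

Under these assumptions the pointwise convolution dominates the cost, so the multi-scale 3×3 and 5×5 groups add comparatively few weights.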
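The attention mechanisms currently applied to YOLOv8 image detection are mainly channel attention, spatial attention, and hybrid attention. They extract features poorly under low contrast and high noise and increase model complexity. To address these problems, this paper introduces the triple attention mechanism, which captures cross-dimensional information through three branches and adaptively highlights key features.
The input tensor is passed to each branch of the triple attention module. The first branch rotates the input tensor $\boldsymbol{\chi}$ by 90° counterclockwise about the H axis to obtain $\hat{\boldsymbol{\chi}}_1$ of shape $(W \times H \times C)$. Z-pooling reduces it to $\hat{\boldsymbol{\chi}}_1^*$ of shape $(2 \times H \times C)$, a $k \times k$ convolution layer and batch normalization reduce it to $(1 \times H \times C)$, and a Sigmoid activation layer $(\sigma)$ generates the attention weights. The result is finally rotated clockwise about the H axis to restore the original input shape. The second branch rotates $\boldsymbol{\chi}$ counterclockwise about the W axis to obtain $\hat{\boldsymbol{\chi}}_2$ of shape $(H \times C \times W)$; Z-pooling reduces it to $\hat{\boldsymbol{\chi}}_2^*$ of shape $(2 \times C \times W)$, and convolution and batch normalization output a tensor of shape $(1 \times C \times W)$. An activation layer then generates the attention weights, and the result is rotated back to the shape of $\boldsymbol{\chi}$. In the third branch, Z-pooling compresses $\boldsymbol{\chi}$ to $\hat{\boldsymbol{\chi}}_3$ of shape $(2 \times H \times W)$; the compressed tensor passes through a standard convolution layer with kernel size k and a batch normalization layer, and an activation layer generates attention weights of shape $(1 \times H \times W)$. The structure of the triple attention mechanism is shown in Fig. 6.
The refined tensors of shape $(C \times H \times W)$ produced by the three branches are aggregated by simple averaging. The output tensor y is
$ y = \frac{1}{3}(\overline {{{\hat \chi }_1}\sigma ({\psi _1}(\hat \chi _1^ * ))} + \overline {{{\hat \chi }_2}\sigma ({\psi _2}(\hat \chi _2^ * ))} + \chi \sigma ({\psi _3}({\hat \chi _3}))) $
where σ is the Sigmoid activation function, and ${\psi _1}$, ${\psi _2}$, and ${\psi _3}$ are the standard two-dimensional convolution layers with kernel size k in the three branches of triple attention.
In the above process, the Z-pooling layer reduces dimension 0 of the tensor to 2 by concatenating the max-pooled feature $\text{MaxPool}_{0d}(\chi)$ and the average-pooled feature $\text{AvgPool}_{0d}(\chi)$ along that dimension. The layer thus retains a rich representation of the actual tensor while shrinking its depth, making subsequent computation lighter. It is defined as
$ Z\text{-pool}(\chi) = [\text{MaxPool}_{0d}(\chi), \text{AvgPool}_{0d}(\chi)] $
where the subscript $0d$ denotes dimension 0, over which max pooling and average pooling are performed. For example, a tensor of shape $(C \times H \times W)$ becomes a tensor of shape $(2 \times H \times W)$ after Z-pooling.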
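The Z-pooling step can be sketched in a few lines of plain Python. This is an illustrative version that treats the tensor as a nested list of shape C×H×W; the function name `z_pool` is hypothetical, and a practical implementation would use a tensor library instead.

```python
def z_pool(tensor):
    """Reduce dimension 0 of a C x H x W nested list to 2: [max map, mean map]."""
    c = len(tensor)
    h, w = len(tensor[0]), len(tensor[0][0])
    max_map = [[max(tensor[ch][i][j] for ch in range(c)) for j in range(w)]
               for i in range(h)]
    avg_map = [[sum(tensor[ch][i][j] for ch in range(c)) / c for j in range(w)]
               for i in range(h)]
    return [max_map, avg_map]


# A 3 x 2 x 2 example tensor (C=3, H=2, W=2).
x = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 0.0], [1.0, 8.0]],
    [[3.0, 4.0], [2.0, 0.0]],
]
pooled = z_pool(x)  # shape 2 x 2 x 2, regardless of C
```

The output always has two slices along dimension 0, so the convolution that follows sees a fixed, small input depth however many channels the feature map has.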
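In object detection, localization error is the positional deviation between the predicted and ground-truth bounding boxes, and the bounding box regression loss measures and reduces it. Compared with the traditional L1 or L2 loss, intersection over union (IoU) has become the mainstream criterion for evaluating the overlap between predicted and ground-truth boxes. The IoU loss function $L_{\rm{IoU}}$ is computed as
$ {L_{\rm{IoU}}} = \frac{{\left| {{B^{{\text{pred}}}} \cap {B^{{\text{gt}}}}} \right|}}{{\left| {{B^{{\text{pred}}}} \cup {B^{{\text{gt}}}}} \right|}} $
where $ {B^{{\text{pred}}}} $ is the predicted box (predicted region) and $ {B^{{\text{gt}}}} $ is the ground-truth box (ground-truth region).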
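As a worked illustration of the ratio above, the following sketch computes the IoU of two axis-aligned boxes. The corner format (x1, y1, x2, y2) and the function name are assumptions made for the example.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0.0, 0.0, 4.0, 4.0)
gt = (2.0, 2.0, 6.0, 6.0)
value = iou(pred, gt)  # intersection area 4, union area 28
```

Two 4×4 boxes offset by 2 in each direction overlap in a 2×2 square, so the ratio is 4/28, about 0.143.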
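Many refinements of IoU are available for YOLOv8, such as distance IoU (DIoU), complete IoU (CIoU), and generalized IoU (GIoU). Optimizing the bounding box loss lets the model determine target position and size more precisely. YOLOv8n uses $L_{\rm{CIoU}}$[16], defined as
$ {L_{\rm{CIoU}}} = {L_{\rm{IoU}}} - \left(\frac{{{\rho ^2}({B^{{\text{pred}}}},{B^{{\text{gt}}}})}}{{{c^2}}} + \alpha v\right) $
where c is the diagonal length of the smallest box enclosing the predicted and ground-truth boxes, α is a balance parameter, v measures the difference in aspect ratio, and $ {\rho ^2}({B^{{\text{pred}}}},{B^{{\text{gt}}}}) $ is the squared Euclidean distance between the center points of the predicted and ground-truth boxes.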
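However, because these IoU-based losses contain a constant penalty factor, the prior box expands during regression, which markedly slows convergence; they also fail to fully reflect the difference between the anchor box and the target box, and they perform poorly on targets with large scale differences and on occluded targets. Powerful-IoU (PIoU) uses a penalty factor that adapts to target size and a gradient adjustment function based on prior box quality to guide the prior box along a more efficient regression path. In maritime infrared target detection, PIoU, unlike other IoU variants, directly minimizes the distances between the four edges of the anchor box and the target box, improving detection efficiency. PIoUv2 adds a non-monotonic attention mechanism and an adaptive penalty factor, which help detection in complex scenes such as occluded and densely populated environments. This paper therefore adopts PIoUv2.
PIoU is defined as
$ {L_{\rm{PIoU}}} = {L_{\rm{IoU}}} - \left( {1 - {{\text{e}}^{ - {{P'}^2}}}} \right) $
where $P'$ is a penalty factor that adapts to target size, defined as
$ P' = \left(\frac{{{\text{d}}{w_1}}}{{{w_{{\text{gt}}}}}} + \frac{{{\text{d}}{w_2}}}{{{w_{{\text{gt}}}}}} + \frac{{{\text{d}}{h_1}}}{{{h_{{\text{gt}}}}}} + \frac{{{\text{d}}{h_2}}}{{{h_{{\text{gt}}}}}}\right)/{\text{4}} $
where ${\text{d}}{w_1}$, ${\text{d}}{w_2}$, ${\text{d}}{h_1}$, and ${\text{d}}{h_2}$ are the absolute distances between corresponding edges of the predicted and target boxes, and ${w_{{\text{gt}}}}$ and ${h_{{\text{gt}}}}$ are the width and height of the target box.
Because the denominators of $P'$ depend only on the size of the target box and not on the size of the smallest box enclosing the anchor and target boxes, using $P'$ as the penalty factor in the loss does not cause anchor box enlargement. $P'$ also adapts to target size and does not degenerate to zero unless the anchor box completely overlaps the target box.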
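The penalty factor is simple to evaluate for a concrete pair of boxes. The sketch below follows the definition of $P'$ given above, with boxes in an assumed (x1, y1, x2, y2) corner format; the function names are illustrative only.

```python
import math


def piou_penalty(pred, gt):
    """P': mean edge distance, normalised by the target box width and height."""
    w_gt = gt[2] - gt[0]
    h_gt = gt[3] - gt[1]
    dw1, dw2 = abs(pred[0] - gt[0]), abs(pred[2] - gt[2])  # left and right edges
    dh1, dh2 = abs(pred[1] - gt[1]), abs(pred[3] - gt[3])  # top and bottom edges
    return (dw1 / w_gt + dw2 / w_gt + dh1 / h_gt + dh2 / h_gt) / 4.0


def penalty_term(p):
    """The term 1 - exp(-P'^2) that appears in the PIoU definition."""
    return 1.0 - math.exp(-p * p)


gt = (10.0, 10.0, 30.0, 20.0)                            # 20 wide, 10 high
aligned = piou_penalty(gt, gt)                           # identical boxes
shifted = piou_penalty((12.0, 11.0, 32.0, 21.0), gt)     # small offset
enlarged = piou_penalty((0.0, 0.0, 40.0, 30.0), gt)      # anchor grown around target
```

In this example the penalty is zero only for the aligned box, and enlarging the anchor around the target raises $P'$ rather than lowering it, which is the property that discourages anchor enlargement.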
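Adding an attention layer to the PIoU loss yields the PIoUv2 loss, which focuses more strongly on medium- and high-confidence anchor boxes and improves detector performance. The attention function and the PIoUv2 loss are defined as
$ u(x) = 3x \cdot {{\text{e}}^{ - {x^2}}} $
$ q = {{\text{e}}^{ - P'}},{\text{ }}q \in (0,1] $
$ {L_{{\text{PIoUv2}}}} = 1 - u(\lambda q) \cdot {L_{\text{PIoU}}} = 1 - 3 \cdot (\lambda q) \cdot {{\text{e}}^{ - {{(\lambda q)}^2}}} \cdot \left( {1 - {L_{\rm{PIoU}}}} \right) $
where $u(\lambda q)$ is the attention function, in which the quality factor q, which measures anchor box quality, replaces the penalty factor $P'$; q lies in $(0,1]$. When $q = 1$, $P' = 0$, meaning that the anchor box is perfectly aligned with the target box. As $P'$ increases, q decreases, indicating a lower-quality anchor box. The non-monotonic attention function $u(\lambda q)$ strengthens the ability of PIoU to focus on medium-quality anchor boxes.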
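The non-monotonic shape of the attention function can be checked numerically. The sketch below evaluates $u(\lambda q)$ for several values of $P'$; the value of λ used here is a hypothetical choice for illustration, since the text does not state it.

```python
import math


def attention(x):
    """Non-monotonic attention function u(x) = 3x * exp(-x^2)."""
    return 3.0 * x * math.exp(-x * x)


def quality(p):
    """Anchor quality q = exp(-P'), which lies in (0, 1]."""
    return math.exp(-p)


lam = 1.3  # hypothetical value of lambda, chosen only for this illustration
for p in (0.0, 0.5, 1.0, 2.0, 4.0):
    q = quality(p)
    print(f"P'={p:.1f}  q={q:.3f}  u(lambda*q)={attention(lam * q):.3f}")
```

Since $u'(x) = 3e^{-x^2}(1 - 2x^2)$, the function peaks at $x = 1/\sqrt{2}$, so anchors of intermediate quality receive the largest weight while both perfectly aligned and very poor anchors are down-weighted.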
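The experimental environment is as follows. Hardware: an AMD Ryzen 7 5700X 8-Core Processor @ 3.40 GHz CPU and an NVIDIA GeForce RTX 3070 GPU. Software: the PyTorch 2.2.1 deep learning framework, CUDA 11.8, and Python 3.8.18. When MITD-YOLO is trained to detect maritime ship infrared images, the hyperparameters are left at their default values, as listed in Tab. 1.
This paper uses the infrared maritime ship dataset publicly released by Yantai IRay Technology Co., Ltd. The dataset contains 8 402 infrared images covering seven typical target classes: cruise ships, bulk carriers, warships, sailboats, small boats, container ships, and fishing boats. It is split 7:2:1 into training, validation, and test sets of 5 881, 1 680, and 841 images, respectively.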
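The subset sizes follow from the 7:2:1 ratio. The sketch below reproduces them under one assumed rounding convention (floor the training and validation shares and give the remainder to the test set); the text does not say how the authors rounded.

```python
def split_counts(total, ratios=(7, 2, 1)):
    """Floor the first two shares; the last subset takes the remainder."""
    whole = sum(ratios)
    train = total * ratios[0] // whole
    val = total * ratios[1] // whole
    test = total - train - val
    return train, val, test


print(split_counts(8402))
```

With 8 402 images this convention yields 5 881, 1 680, and 841 images, matching the reported subset sizes.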
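To verify the effectiveness of the algorithm and test the generalization ability of the model, additional maritime infrared images were collected to extend the maritime ship infrared dataset. The supplementary images have three characteristics: the infrared targets are small, occupying only a few to a few dozen pixels; the backgrounds are complex, including busy waters such as ports and narrow channels; and a single image contains multiple targets of different scales. A total of 387 images were added, of which 308 contain small targets and 79 have complex backgrounds (Fig. 7). The supplementary dataset is used only to validate model performance and is not used for training.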
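The evaluation metrics are precision, recall, and mean average precision (mAP).
Precision P is the proportion of samples predicted as positive that are actually positive:
$ P = \frac{{{T_P}}}{{{T_P} + {F_P}}} $
where $T_P$ is the number of correctly predicted targets in the images and $F_P$ is the number of incorrectly predicted targets.
Recall R is the proportion of all positive samples that the model correctly predicts as positive:
$ R = \frac{{{T_P}}}{{{T_P} + {F_N}}} $
where $F_N$ is the number of missed targets in the images.
Mean average precision mAP is the mean of the average precision AP over all classes; it reflects the average precision and recall of the model across classes and thresholds:
$ AP = \int_0^1 {P(R)} \cdot {\text{d}}R $
$ mAP = \frac{1}{n}\left(\sum\limits_{i = 1}^n {A{P_i}} \right) $
where n is the total number of classes and $AP_i$ is the average precision of class i.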
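The metric definitions above translate directly into code. The sketch below uses made-up counts and a three-point precision-recall curve purely for illustration, and it approximates the AP integral with the trapezoidal rule; detection toolkits typically use an interpolated curve instead, so this is a simplification rather than the evaluation procedure used in the paper.

```python
def precision(tp, fp):
    """P = TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """R = TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0


def average_precision(recalls, precisions):
    """Trapezoidal approximation of the integral of P(R); recalls must be increasing."""
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2.0
    return area


def mean_average_precision(ap_per_class):
    """mAP = mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Illustrative numbers only, not results from the paper.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.4])
m_ap = mean_average_precision([ap, 0.9])
```

For these illustrative inputs, precision is 0.9, recall is 0.75, the AP of the sample curve is 0.75, and the two-class mAP is 0.825.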
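Frames per second (FPS) is used to measure running speed, that is, the number of images the model processes per unit time.
To verify the advantage of the improved algorithm on the maritime infrared ship dataset, comparative experiments were carried out using the above metrics, with training on the same platform and with the same training, validation, and test sets.
Tab. 2 compares the results with the original YOLOv8n model. MITD-YOLO improves detection: precision P rises by 2.3 percentage points, recall R by 1.7 percentage points, and mAP@0.5 by 2.2 percentage points. Its FPS is 125.5, which is 7.5% lower than that of YOLOv8n.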
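The reported changes can be reproduced from the Tab. 2 values. The short check below treats the accuracy gains as absolute differences in percentage points and the speed change as a relative drop with respect to the baseline; the variable names are illustrative.

```python
baseline = {"P": 89.4, "R": 80.7, "mAP@0.5": 86.7, "FPS": 135.7}  # YOLOv8n
improved = {"P": 91.7, "R": 82.4, "mAP@0.5": 88.9, "FPS": 125.5}  # MITD-YOLO

# Accuracy metrics: absolute differences in percentage points.
gains = {k: round(improved[k] - baseline[k], 1) for k in ("P", "R", "mAP@0.5")}

# Speed: relative change with respect to the baseline frame rate.
fps_drop_pct = round((baseline["FPS"] - improved["FPS"]) / baseline["FPS"] * 100, 1)

print(gains, fps_drop_pct)
```

The frame rate falls by 10.2 FPS from a baseline of 135.7 FPS, which is where the 7.5% figure comes from.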
YOLOv8n模型和MITD-YOLO模型的训练过程如图8所示,可见改进模型的整体性能更优,拟合效果更好。
将PIoUv2损失函数和常用的损失函数CIoU,DIoU[17],GIoU[18],SIoU[19]和Inner-CIoU[20]进行实验对比,结果如表3所示。由表可见,相比使用CIoU损失函数的原YOLOv8n模型,使用PIoUv2损失函数的YOLOv8n模型检测准确率更高,其精确率P、召回率R和平均精度mAP@0.5分别提升1.3%,1.3%和1.6%,FPS基本保持不变。这说明使用PIoUv2损失函数能够使模型对边界框的回归更加稳定,预测精度更高。
为进一步验证所提模型的优越性,与当前主流目标检测模型进行了比较,包括双阶段目标检测算法中的Faster-RCNN[21]和Mask-RCNN[22]算法,以及单阶段目标检测YOLO系列算法中的YOLOv5,YOLOv7,YOLOv8n,YOLOv10和最新YOLOv11模型。具体检测指标如表4所示。
Faster-RCNN使用区域提议网络(RPN)高效生成目标提议,由于其计算复杂性,FPS相对较低。Mask-RCNN在Faster-RCNN的基础上增加一个分支用于生成目标的掩码,适用于实例分割任务,召回率和平均精度均值表现更优,但处理速度(FPS)略低。YOLOv5是YOLO系列中的快速检测模型,其采用多尺度训练、更好的锚定机制和更高效的网络结构,在保持较高检测精度的同时大幅提升了处理速度。YOLOv7使用更深更复杂的网络结构和更细致的特征提取策略,在召回率和平均精度均值上表现最好,但牺牲了一定的速度。YOLOv8n是一种超级轻量模型,在保持极高推理速度的同时,仍能提供相对不错的检测精度。YOLOv10由于模型在特定数据集或配置下的适应性不如其他版本,虽然其FPS表现良好,但精确度和召回率相对较低,可能需要进一步优化模型架构或训练过程来提升准确度。YOLOv11在精确度、召回率和平均精度均值上均表现稳定,但其设计更注重在多种性能指标上取得平衡,而不是单一指标的极端优化,适合需要较均衡性能的应用场景。
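Although YOLOv7 achieves relatively high accuracy, it has more parameters and a slower detection speed. The improved model outperforms YOLOv7 in detection speed (FPS) and outperforms YOLOv8n in recall.
To verify the effect of each improved module on target detection performance for maritime ship infrared images, ablation experiments were conducted on the maritime infrared ship dataset. The results are shown in Table 5.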
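1) After C2f-DBB is added, detection accuracy improves relative to the YOLOv8n model: the YOLOv8n + C2f-DBB model reaches a precision P of 90.6%, a recall R of 79.7%, and an mAP@0.5 of 86.2%, with an FPS of 135.1. This indicates that the multi-branch structure and multi-scale feature extraction of the C2f-DBB module provide richer feature representations and stronger generalization. Through diversity enhancement and feature fusion, the module captures target features better and improves detection accuracy for complex scenes and diverse targets in maritime ship infrared images.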
2) After the detection-head convolution is replaced with EMSConv, precision P decreases, but recall R and mAP@0.5 increase.
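3) After the Triplet Attention mechanism is introduced, the model computes attention weights through cross-dimension interaction using a three-branch structure, which improves the stability of target detection in maritime ship infrared images and clearly improves model performance.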
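4) After the loss function is replaced with PIoUv2, the model's IoU measure is more reasonable, the bounding-box regression is more accurate, and smoother gradient information is provided. On the test set of the maritime ship infrared dataset, precision P, recall R, and mAP@0.5 reach 90.7%, 82%, and 88.3%, respectively, and FPS reaches 135.5. This shows that the PIoUv2 loss function improves the accuracy and stability of detection in complex maritime ship infrared images.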
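The multi-branch structure in item 1) builds on the diverse branch block of reference [13], which relies on the linearity of convolution: parallel branches that are summed can be merged into a single equivalent kernel. The sketch below demonstrates this property in one dimension with plain Python lists. It is a simplified illustration only; the function `conv1d` and all kernel values are hypothetical, batch normalization and the other branch types of [13] are omitted, and this section does not state how C2f-DBB is deployed at inference time.

```python
def conv1d(x, kernel):
    """'Same' 1D cross-correlation with zero padding (odd kernel length)."""
    k = len(kernel) // 2
    padded = [0.0] * k + list(x) + [0.0] * k
    return [sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(len(x))]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
k3 = [0.5, -1.0, 0.25]  # 3-tap branch
k1 = [2.0]              # 1-tap branch (1D analogue of a 1x1 convolution)

# Multi-branch form: two parallel branches whose outputs are summed
two_branch = [a + b for a, b in zip(conv1d(x, k3), conv1d(x, k1))]

# Merged form: zero-pad the 1-tap kernel to 3 taps and add the kernels
merged_kernel = [k3[0], k3[1] + k1[0], k3[2]]
one_branch = conv1d(x, merged_kernel)

print(two_branch == one_branch)  # True
```

The two forms give the same output, which is why a multi-branch block can enrich training without necessarily adding convolution layers to the deployed network.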
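In summary, with the C2f-DBB module, EMSConv convolution, triplet attention mechanism, and PIoUv2 loss function used together, the MITD-YOLO model obtained after 100 training epochs improves precision P, recall R, and mAP@0.5 over the original YOLOv8n model by 2.3%, 1.7%, and 2.2%, respectively, with a detection speed of 125.5 FPS.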
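To verify the performance of the improved MITD-YOLO model, Figs. 9 to 12 show the detection results of the YOLOv8n model and the improved MITD-YOLO model, trained under the same experimental conditions and parameters, for maritime infrared targets in different scenes.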
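The original YOLOv8n model detects the warship and the bulk carrier with confidences of 75% and 85%, respectively. As shown in Fig. 9, the MITD-YOLO model detects the warship with a confidence of 82% and the bulk carrier with a confidence of 91%. This shows that MITD-YOLO improves not only overall detection accuracy but also the detection of different ship types.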
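As shown in Fig. 10, the YOLOv8n model misses many small targets in the maritime infrared ship images, whereas the MITD-YOLO model identifies these tiny ship targets. It markedly improves small-target detection and locates targets of different scales in maritime ship infrared images more precisely.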
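Fig. 11 shows false detections by the YOLOv8n model on the maritime infrared ship dataset: limitations of its loss function cause errors in anchor-box regression, so the same target is detected multiple times. The improved MITD-YOLO model optimizes the loss function and effectively reduces the probability of false detections of ship targets.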
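The backgrounds in Figs. 11 and 12 are complex, containing background objects that are easily confused in complex maritime environments as well as port buildings. As the figures show, the YOLOv8n model confuses background information, mistaking background clutter for ship targets and mislabeling ship types, whereas the MITD-YOLO model effectively distinguishes background information from ship targets. Under complex background conditions, the improved model detects and identifies ship targets more accurately, which is important for multi-scene requirements in practical applications.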
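This paper proposes the MITD-YOLO model by introducing C2f-DBB and EMSConv convolution, adding a triplet attention mechanism, and replacing the loss function with PIoUv2. Experimental results show that the algorithm significantly improves the detection and recognition accuracy of ship targets against complex backgrounds and reduces the false-detection and missed-detection rates for small targets in ship infrared images.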
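1) Using convolutions of multiple scales together with a multi-branch parallel structure significantly improves target recognition in maritime infrared images with large scale differences, low contrast, and obvious noise.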
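2) Improving the regression efficiency of the prior (anchor) boxes in the loss function improves the accuracy and stability of detection boxes in complex maritime ship infrared images.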
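3) Compared with the YOLOv8n algorithm, the MITD-YOLO model constructed in this paper improves precision and recall by 2.3% and 1.7%, respectively, and mAP@0.5 by 2.2%, while maintaining a detection speed of 125.5 FPS, and therefore has strong practical value.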
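The proposed MITD-YOLO model provides technical support for maritime intelligent perception systems and helps improve the navigation safety of ships at sea. Future work could study multimodal data fusion to further improve detection performance and explore model compression techniques to suit resource-constrained maritime equipment.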
References
1
ZHU D Y, TANG J W, FU X X, et al. Detection of infrared small target based on background subtraction local contrast measure and Gaussian structural similarity[J]. Heliyon, 2023, 9(6): e16998.
2
LU Z L, LIU S X, YILAHUN H, et al. Infrared small target detection based on background estimation and scale fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1–5.
3
ZHANG Z Z, DING C, GAO Z S, et al. ANLPT: self-adaptive and non-local patch-tensor model for infrared small target detection[J]. Remote Sensing, 2023, 15(4): 1021.
4
GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH: IEEE, 2014: 580–587. DOI:10.1109/CVPR.2014.81.
5
GUPTA V, MANDLOI A, PAWAR S, et al. Deep learning based object detection using mask RCNN[M]//AWASTHI S, SINGH NARUKA M, PRAKASH YADAV S, et al. AI and IoT-based Intelligent Health Care & Sanitation. Singapore: Bentham Science Publishers, 2023: 207–221.
6
OU J H, WANG J G, XUE J, et al. Infrared image target detection of substation electrical equipment using an improved faster R-CNN[J]. IEEE Transactions on Power Delivery, 2023, 38(1): 387–396.
7
ZHAO X F, XIA Y T, XU M Y, et al. An infrared small vehicle target detection method based on deep learning[C]//Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022). Shanghai: SPIE, 2023. DOI:10.1117/12.2667313.
8
CHEN D H, SHAO H, ZHANG J L. Research on improved YOLOv3 ship detection algorithm[J]. Modern Electronics Technique, 2023, 46(2): 101–106 (in Chinese).
9
WANG D, DU H Q, MA Z F. Object detection in infrared images using modified YOLOV4 models and an image enhancement module[C]//Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022). Nanjing: SPIE, 2023. DOI:10.1117/12.2680173.
10
ZHANG B Y, ZHANG C, SHI Z N, et al. Lightweight ship detection method based on YOLO-FNC model[J]. Chinese Journal of Ship Research, 2024, 19(5): 180–187 (in both Chinese and English).
11
HU S M, ZHAO F, LU H Z, et al. Improving YOLOv7-tiny for infrared and visible light image object detection on drones[J]. Remote Sensing, 2023, 15(13): 3214.
12
ZHANG Y, CHEN Y J. Improved YOLOv8 algorithm for small surface object detection on water surface[J]. Computer Systems & Applications, 2024, 33(4): 152–161 (in Chinese).
13
DING X H, ZHANG X Y, HAN J G, et al. Diverse branch block: building a convolution as an inception-like unit[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, TN: IEEE, 2021: 10881–10890. DOI:10.1109/CVPR46437.2021.01074.
14
MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]//2021 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa, HI: IEEE, 2021: 3138–3147. DOI:10.1109/WACV48630.2021.00318.
15
LIU C, WANG K G, LI Q, et al. Powerful-IoU: more straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. Neural Networks, 2024, 170: 276–284.
16
ZHENG Z H, WANG P, REN D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574–8586.
17
ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993–13000. DOI:10.1609/aaai.v34i07.6999.
18
REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA: IEEE, 2019: 658–666. DOI:10.1109/CVPR.2019.00075.
19
GEVORGYAN Z. SIoU loss: more powerful learning for bounding box regression[Z/OL]. (2022-05-25)[2024-07-26]. https://arxiv.org/abs/2205.12740.
20
ZHANG H, XU C, ZHANG S J. Inner-IoU: more effective intersection over union loss with auxiliary bounding box[Z/OL]. (2023-11-14)[2024-07-25]. https://arxiv.org/abs/2311.02877.
21
REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149.
22
HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[Z/OL]. (2018-01-24)[2025-02-26]. https://arxiv.org/abs/1703.06870.
2026, Vol. 21, No. 2
doi: 10.19693/j.issn.1673-3185.04311
  • Received: 2024-12-13
  • First published online: 2026-05-20
  • Published: 2026-04-30
  • Manuscript received: 2024-12-13
  • Revised: 2025-04-07
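Author affiliations
    1 School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China
    2 National Engineering Laboratory for Transportation Safety and Emergency Informatics, Beijing 100011, China

Corresponding author: YANG Xuefeng (杨雪锋)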