Article(id=1149776903252435686, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149776900194791454, articleNumber=null, orderNo=null, doi=10.12404/j.issn.1671-1815.2403508, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=research-article, receivedDate=1715443200000, receivedDateStr=2024-05-12, revisedDate=1722441600000, revisedDateStr=2024-08-01, acceptedDate=null, acceptedDateStr=null, onlineDate=1752057775557, onlineDateStr=2025-07-09, pubDate=1744905600000, pubDateStr=2025-04-18, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1752057775557, onlineIssueDateStr=2025-07-09, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1752057775557, creator=13701087609, updateTime=1752057775557, updator=13701087609, issue=Issue{id=1149776900194791454, tenantId=1146029695717560320, journalId=1146123166801305609, year='2025', volume='25', issue='11', pageStart='4397', pageEnd='4826', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=0, createTime=1752057774827, creator=13701087609, updateTime=1768456666677, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1218558837930512931, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149776900194791454, language=EN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1218558837930512932, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149776900194791454, language=CN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=4647, endPage=4655, ext={EN=ArticleExt(id=1149776903457956584, articleId=1149776903252435686, tenantId=1146029695717560320, 
journalId=1146123166801305609, language=EN, title=Open World Object Detection Based on Shape Perception and Class Balance Optimization, columnId=1156262729162810294, journalTitle=Science Technology and Engineering, columnName=Papers·Automation and Computational Technology, runingTitle=null, highlight=null, articleAbstract=

An open world object detection method based on shape perception and class balance optimization was proposed to address the poor prediction performance on unknown-class objects in open world object detection. Unknown classes were classes that were not labeled during the training phase; without guidance from labels, detecting unknown-class objects was a challenging task. An unknown-class enhanced detector was constructed as the unknown-class detection branch. During training, this detector was supervised using only known-class labels, allowing it to learn the features shared by known-class objects and generalize them to unknown-class objects. To improve the detector's sensitivity to unknown classes, the ability of the region proposal network (RPN) module to distinguish foreground from background was utilized: a specific filtering method selected results with “unknown class potential” from the RPN output, and these were used as pseudo labels in the detector's training process. Because unknown-class predictions lacked confidence scores, traditional non-maximum suppression (NMS) was difficult to apply when post-processing unknown objects. Therefore, a suppressor for redundant unknown-class boxes was designed, consisting of a center point-based grouping strategy and a shape perception-based redundancy score matrix. The center point-based grouping strategy comprised three grouping methods based on unknown-class center points, which determined the suppression range. A redundancy score matrix was then constructed from the redundancy scores of each prediction box within a group to suppress highly redundant predictions. Experimental results on open world object detection datasets demonstrated that the proposed method achieved high prediction precision for unknown classes while maintaining unknown-class recall. 
The method effectively addressed the challenges of open world scenarios and avoided generating a large number of useless predictions.

, correspAuthors=Yan-min ZHU, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Yang XU, Shu-zhi SU, Yan-min ZHU, Chao WANG), CN=ArticleExt(id=1149776930590908707, articleId=1149776903252435686, tenantId=1146029695717560320, journalId=1146123166801305609, language=CN, title=基于形状感知与类平衡优化的开放世界目标检测, columnId=1156262729783567290, journalTitle=科学技术与工程, columnName=论文·自动化技术、计算机技术, runingTitle=null, highlight=null, articleAbstract=

针对开放世界目标检测中未知类目标预测性能不佳的问题,提出了一种基于形状感知与类平衡优化的开放世界目标检测方法。未知类指在训练阶段未标注的类别,由于缺乏标签的指导,未知类目标的检测是一个具有挑战性的任务。构建了一种未知类增强探测器,作为未知类检测分支,在训练阶段只利用已知类标签进行监督,让探测器学习已知类目标特征的相似性,进而推广到未知类目标。为了提高探测器对未知类的敏感度,利用区域生成网络(region proposal network, RPN)模块区分前景和背景的特性,使用特定筛选方式,从RPN输出中选择“具有未知类潜力”的结果作为伪标签参与探测器训练过程。由于缺乏置信度得分,传统非极大值抑制(non-maximum suppression, NMS)方法难以应用于未知目标的后处理,因此设计了一种冗余未知类目标框抑制器,该抑制器由基于中心点的分组策略和基于形状感知冗余度得分矩阵构成。其中基于中心点的分组策略包含三种根据未知类中心点的分组方法,用于确定抑制范围。接着根据组内每一个预测框的冗余度得分构建冗余度得分矩阵,从而抑制高冗余预测结果。在开放世界目标检测数据集上的实验结果表明基于形状感知与类平衡优化的开放世界目标检测方法在保证未知类召回率的同时,具有较高的未知类预测精度。基于形状感知与类平衡优化的开放世界目标检测方法能有效应对开放世界的难题,避免产生大量的无用预测结果。

, correspAuthors=朱彦敏, authorNote=null, correspAuthorsNote=
* 朱彦敏(1988—),女,汉族,山东泰安人,博士,讲师。研究方向:图像处理与模式识别。E-mail:
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=4u5LkjTE5DMOOwFQoP6Eeg==, magXml=0XHUqjpwyqmCjwLJmm6yaw==, pdfUrl=null, pdf=aIA1Wr/lV+GXRihfA0T3Nw==, pdfFileSize=15356888, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=D5IWOM+VUsIM/kX6bjDcNw==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=hfY2Hev8NhBjDnrpny0V2A==, mapNumber=null, authorCompany=null, fund=null, authors=

徐阳(2000—),男,汉族,江苏淮安人,硕士研究生。研究方向:目标检测。E-mail:

, authorsList=徐阳, 苏树智, 朱彦敏, 王超)}, authors=[Author(id=1218843900459008249, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=yangxu_my@163.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218843900572254464, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843900459008249, language=EN, stringName=Yang XU, firstName=Yang, middleName=null, lastName=XU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218843900668723463, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843900459008249, language=CN, stringName=徐阳, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 安徽理工大学计算机科学与工程学院, 淮南 232001, bio={"content":"

徐阳(2000—),男,汉族,江苏淮安人,硕士研究生。研究方向:目标检测。E-mail:

"}, bioImg=null, bioContent=

徐阳(2000—),男,汉族,江苏淮安人,硕士研究生。研究方向:目标检测。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218843900224127209, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, xref=1, ext=[AuthorCompanyExt(id=1218843900240904426, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China), AuthorCompanyExt(id=1218843900253487340, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 安徽理工大学计算机科学与工程学院, 淮南 232001)])]), Author(id=1218843900819718417, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218843900916187419, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843900819718417, language=EN, stringName=Shu-zhi SU, firstName=Shu-zhi, middleName=null, lastName=SU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218843901008462116, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843900819718417, language=CN, stringName=苏树智, firstName=null, middleName=null, 
lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 安徽理工大学计算机科学与工程学院, 淮南 232001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218843900224127209, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, xref=1, ext=[AuthorCompanyExt(id=1218843900240904426, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China), AuthorCompanyExt(id=1218843900253487340, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 安徽理工大学计算机科学与工程学院, 淮南 232001)])]), Author(id=1218843901209788713, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=zyanmin1988@163.com, emailSecond=null, emailThird=null, correspondingAuthor=1, authorType=1, ext={EN=AuthorExt(id=1218843901293674800, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843901209788713, language=EN, stringName=Yan-min ZHU, firstName=Yan-min, middleName=null, lastName=ZHU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=2, *, address=2 School of Mechanical Engineering, Anhui University of Science & Technology, Huainan 232001, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), 
CN=AuthorExt(id=1218843901406921019, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843901209788713, language=CN, stringName=朱彦敏, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=2, *, address=2 安徽理工大学机械工程学院, 淮南 232001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218843900349956339, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, xref=2, ext=[AuthorCompanyExt(id=1218843900358344947, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900349956339, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 School of Mechanical Engineering, Anhui University of Science & Technology, Huainan 232001, China), AuthorCompanyExt(id=1218843900362539252, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900349956339, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2 安徽理工大学机械工程学院, 淮南 232001)])]), Author(id=1218843901478224195, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, orderNo=3, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218843901595664715, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843901478224195, language=EN, stringName=Chao WANG, firstName=Chao, middleName=null, lastName=WANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 
School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218843901696328020, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, authorId=1218843901478224195, language=CN, stringName=王超, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1 安徽理工大学计算机科学与工程学院, 淮南 232001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218843900224127209, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, xref=1, ext=[AuthorCompanyExt(id=1218843900240904426, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China), AuthorCompanyExt(id=1218843900253487340, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, companyId=1218843900224127209, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1 安徽理工大学计算机科学与工程学院, 淮南 232001)])])], keywords=[Keyword(id=1218843901998317934, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=EN, orderNo=1, keyword=open world object detection), Keyword(id=1218843902090592631, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=EN, orderNo=2, keyword=unknown class object detection), Keyword(id=1218843902182867329, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, 
language=EN, orderNo=3, keyword=center point-based grouping strategy), Keyword(id=1218843902287724938, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=EN, orderNo=4, keyword=shape perception), Keyword(id=1218843902392582545, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=EN, orderNo=5, keyword=redundancy score), Keyword(id=1218843902514217373, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=CN, orderNo=1, keyword=开放世界目标检测), Keyword(id=1218843902614880678, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=CN, orderNo=2, keyword=未知类目标检测), Keyword(id=1218843902728126895, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=CN, orderNo=3, keyword=基于中心点的分组策略), Keyword(id=1218843902820401591, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=CN, orderNo=4, keyword=形状感知), Keyword(id=1218843902950425027, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, language=CN, orderNo=5, keyword=冗余度得分)], refs=[Reference(id=1218843907723543374, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=5, pageStart=2051, pageEnd=2058, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=吴晨曦, 应保胜, 许小伟, journalName=科学技术与工程, refType=null, unstructuredReference=吴晨曦, 应保胜, 许小伟, 等. 基于改进单步多框目标检测的道路小目标检测算法[J]. 
科学技术与工程, 2023, 23(5): 2051-2058., articleTitle=基于改进单步多框目标检测的道路小目标检测算法, refAbstract=null), Reference(id=1218843907920675676, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=5, pageStart=2051, pageEnd=2058, url=null, language=null, rfNumber=[1], rfOrder=1, authorNames=Wu Chenxi, Ying Baosheng, Xu Xiaowei, journalName=Science Technology and Engineering, refType=null, unstructuredReference=Wu Chenxi, Ying Baosheng, Xu Xiaowei, et al. Road small target detection algorithm based on improved SSD[J]. Science Technology and Engineering, 2023, 23(5): 2051-2058., articleTitle=Road small target detection algorithm based on improved SSD, refAbstract=null), Reference(id=1218843908101030756, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2021, volume=null, issue=null, pageStart=5830, pageEnd=5840, url=null, language=null, rfNumber=[2], rfOrder=2, authorNames=Joseph K J, Khan S, Khan F S, journalName=Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, refType=null, unstructuredReference=Joseph K J, Khan S, Khan F S, et al. Towards open world object detection[C]// Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. New York: IEEE, 2021: 5830-5840., articleTitle=Towards open world object detection, refAbstract=null), Reference(id=1218843908205888366, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=3961, pageEnd=3970, url=null, language=null, rfNumber=[3], rfOrder=3, authorNames=Zheng J, Li W, Hong J, journalName=Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=Zheng J, Li W, Hong J, et al. 
Towards open-set object detection and discovery[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 3961-3970., articleTitle=Towards open-set object detection and discovery, refAbstract=null), Reference(id=1218843908373660544, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2020, volume=9, issue=1, pageStart=8, pageEnd=15, url=null, language=null, rfNumber=[4], rfOrder=4, authorNames=Jaiswal A, Babu A R, Zadeh M Z, journalName=Technologies, refType=null, unstructuredReference=Jaiswal A, Babu A R, Zadeh M Z, et al. A survey on contrastive self-supervised learning[J]. Technologies, 2020, 9(1): 8-15., articleTitle=A survey on contrastive self-supervised learning, refAbstract=null), Reference(id=1218843908528849807, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2021, volume=9, issue=null, pageStart=82146, pageEnd=82168, url=null, language=null, rfNumber=[5], rfOrder=5, authorNames=Schmarje L, Santarossa M, Schröder S M, journalName=IEEE Access, refType=null, unstructuredReference=Schmarje L, Santarossa M, Schröder S M, et al. A survey on semi-, self-and unsupervised learning for image classification[J]. IEEE Access, 2021, 9: 82146-82168., articleTitle=A survey on semi-, self-and unsupervised learning for image classification, refAbstract=null), Reference(id=1218843908642096026, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=3, pageStart=1160, pageEnd=1167, url=null, language=null, rfNumber=[6], rfOrder=6, authorNames=郑涵, 田猛, 赵延峰, journalName=科学技术与工程, refType=null, unstructuredReference=郑涵, 田猛, 赵延峰, 等. 基于改进Faster R-CNN的手部位姿估计方法[J]. 
科学技术与工程, 2023, 23(3): 1160-1167., articleTitle=基于改进Faster R-CNN的手部位姿估计方法, refAbstract=null), Reference(id=1218843908805673892, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=23, issue=3, pageStart=1160, pageEnd=1167, url=null, language=null, rfNumber=[6], rfOrder=7, authorNames=Zheng Han, Tian Meng, Zhao Yanfeng, journalName=Science Technology and Engineering, refType=null, unstructuredReference=Zheng Han, Tian Meng, Zhao Yanfeng, et al. Research on hand pose estimation based on improved Faster R-CNN method[J]. Science Technology and Engineering, 2023, 23(3): 1160-1167., articleTitle=Research on hand pose estimation based on improved Faster R-CNN method, refAbstract=null), Reference(id=1218843909002806197, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=165, pageEnd=168, url=null, language=null, rfNumber=[7], rfOrder=8, authorNames=Gavrilescu R, Zet C, Fosalău C, journalName=2018 International Conference and Exposition on Electrical and Power Engineering (EPE), refType=null, unstructuredReference=Gavrilescu R, Zet C, Fosalău C, et al. Faster R-CNN: an approach to real-time object detection[C]// 2018 International Conference and Exposition on Electrical and Power Engineering (EPE). New York: IEEE, 2018: 165-168., articleTitle=Faster R-CNN: an approach to real-time object detection, refAbstract=null), Reference(id=1218843909149606856, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2020, volume=20, issue=5, pageStart=2776, pageEnd=2791, url=null, language=null, rfNumber=[8], rfOrder=9, authorNames=Zhang Y, Dai L, Wong E W M, journalName=IEEE Transactions on Wireless Communications, refType=null, unstructuredReference=Zhang Y, Dai L, Wong E W M. 
Optimal BS deployment and user association for 5G millimeter wave communication networks[J]. IEEE Transactions on Wireless Communications, 2020, 20(5): 2776-2791., articleTitle=Optimal BS deployment and user association for 5G millimeter wave communication networks, refAbstract=null), Reference(id=1218843909250270167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=626, pageEnd=630, url=null, language=null, rfNumber=[9], rfOrder=10, authorNames=Yu J, Ma L, Li Z, journalName=2022 IEEE International Conference on Image Processing (ICIP), refType=null, unstructuredReference=Yu J, Ma L, Li Z, et al. Open-world object detection via discriminative class prototype learning[C]// 2022 IEEE International Conference on Image Processing (ICIP). New York: IEEE, 2022: 626-630., articleTitle=Open-world object detection via discriminative class prototype learning, refAbstract=null), Reference(id=1218843909376099297, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=193, pageEnd=210, url=null, language=null, rfNumber=[10], rfOrder=11, authorNames=Wu Z, Lu Y, Chen X, journalName=European Conference on Computer Vision, refType=null, unstructuredReference=Wu Z, Lu Y, Chen X, et al. UC-OWOD: unknown-classified open world object detection[C]// European Conference on Computer Vision. 
Cham: Springer Nature Switzerland, 2022: 193-210., articleTitle=UC-OWOD: unknown-classified open world object detection, refAbstract=null), Reference(id=1218843909506122734, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=133, issue=null, pageStart=109027, pageEnd=null, url=null, language=null, rfNumber=[11], rfOrder=12, authorNames=Zhang Y, Zhang X Y, Shi H, journalName=Pattern Recognition, refType=null, unstructuredReference=Zhang Y, Zhang X Y, Shi H. OW-TAL: learning unknown human activities for open-world temporal action localization[J]. Pattern Recognition, 2023, 133: 109027., articleTitle=OW-TAL: learning unknown human activities for open-world temporal action localization, refAbstract=null), Reference(id=1218843909648729084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=33, issue=5, pageStart=1201, pageEnd=1215, url=null, language=null, rfNumber=[12], rfOrder=13, authorNames=Zhao X, Ma Y, Wang D, journalName=IEEE Transactions on Circuits and Systems for Video Technology, refType=null, unstructuredReference=Zhao X, Ma Y, Wang D, et al. Revisiting open world object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(5): 1201-1215., articleTitle=Revisiting open world object detection, refAbstract=null), Reference(id=1218843909770362889, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2023, volume=null, issue=null, pageStart=3230, pageEnd=3239, url=null, language=null, rfNumber=[13], rfOrder=14, authorNames=Liang W, Xue F, Liu Y, journalName=Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=Liang W, Xue F, Liu Y, et al. 
Unknown sniffer for object detection: don't turn a blind eye to unknown objects[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2023: 3230-3239., articleTitle=Unknown sniffer for object detection: don't turn a blind eye to unknown objects, refAbstract=null), Reference(id=1218843909988466726, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=9235, pageEnd=9244, url=null, language=null, rfNumber=[14], rfOrder=15, authorNames=Gupta A, Narayan S, Joseph K J, journalName=Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=Gupta A, Narayan S, Joseph K J, et al. Owdetr: open-world detection transformer[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 9235-9244., articleTitle=Owdetr: open-world detection transformer, refAbstract=null), Reference(id=1218843910261096507, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149776903252435686, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[15], rfOrder=16, authorNames=Zhu X, Su W, Lu L, journalName=arXiv preprint arXiv: 2010.04159, refType=null, unstructuredReference=Zhu X, Su W, Lu L, et al. Deformable detr: deformable transformers for end-to-end object detection[J]. 
Fig.1 Network structure
Fig.2 Schematic diagram of feature space
Fig.3 Pseudo label generation process
Fig.4 Training process of unknown class enhanced detector
Fig.5 Traditional classification methods
Fig.6 Classification based on center point adjacency graph
Fig.7 Classification based on the grid where the center point is located
Fig.8 Adaptive hierarchical classification
Fig.9 Visualization of inference results

Table 1 Test dataset
Dataset      Images   Known objects per image   Unknown objects per image
VOC-Test     4,952    5.09                      0
COCO-OOD     504      0                         3.28
COCO-Mixed   897      2.96                      2.82

Table 2 Experimental results on the VOC dataset
Method            mAP
Faster R-CNN      0.487
OW-DETR           0.420
VOS[21]           0.491
ORE               0.243
UnSniffer         0.462
Proposed method   0.466

Table 3 Experimental results on the COCO-OOD dataset
Method          APu     F1u     Pu      Rc
Faster R-CNN    -       -       -       -
OW-DETR         0.033   0.056   0.030   0.380
VOS             0.234   0.296   0.277   0.319
ORE             0.214   0.255   0.153   0.782
UnSniffer       0.396   0.475   0.487   0.464
Proposed (v1)   0.367   0.485   0.602   0.407
Proposed (v2)   0.392   0.498   0.619   0.417
Proposed (v3)   0.402   0.507   0.617   0.430

Table 4 Experimental results on the COCO Mix dataset
Method          mAP     APu     F1u     Pu      Rc
Faster R-CNN    -       -       -       -       -
OW-DETR         0.414   0.007   0.025   0.014   0.161
VOS             0.364   0.117   0.164   0.191   0.144
ORE             0.213   0.140   0.175   0.103   0.592
UnSniffer       0.362   0.121   0.252   0.201   0.338
Proposed (v1)   0.375   0.102   0.234   0.227   0.242
Proposed (v2)   0.375   0.098   0.228   0.225   0.233
Proposed (v3)   0.375   0.138   0.252   0.243   0.263

Table 5 Results of ablation experiments on the COCO-OOD dataset
Method            RUOBS variant   Pu      Rc      F1u
VOS               none            0.277   0.319   0.296
VOS               v1              0.507   0.301   0.377
VOS               v2              0.476   0.302   0.369
VOS               v3              0.512   0.298   0.376
ORE               none            0.153   0.782   0.255
ORE               v1              0.317   0.721   0.440
ORE               v2              0.320   0.723   0.443
ORE               v3              0.328   0.717   0.450
UnSniffer         none            0.487   0.468   0.477
UnSniffer         v1              0.492   0.455   0.472
UnSniffer         v2              0.488   0.466   0.476
UnSniffer         v3              0.492   0.457   0.473
Proposed method   none            0.582   0.437   0.499
Proposed method   v1              0.607   0.433   0.505
Proposed method   v2              0.612   0.428   0.503
Proposed method   v3              0.617   0.430   0.507

Table 6 Results of ablation experiments on the COCO Mix dataset
Method            RUOBS variant   Pu      Rc      F1u
VOS               none            0.191   0.144   0.164
VOS               v1              0.257   0.130   0.172
VOS               v2              0.266   0.129   0.174
VOS               v3              0.265   0.129   0.173
ORE               none            0.103   0.592   0.175
ORE               v1              0.152   0.580   0.240
ORE               v2              0.177   0.562   0.269
ORE               v3              0.165   0.578   0.257
UnSniffer         none            0.201   0.338   0.252
UnSniffer         v1              0.217   0.332   0.262
UnSniffer         v2              0.212   0.336   0.259
UnSniffer         v3              0.227   0.336   0.270
Proposed method   none            0.205   0.275   0.234
Proposed method   v1              0.237   0.270   0.252
Proposed method   v2              0.225   0.267   0.244
Proposed method   v3              0.243   0.263   0.252

Open World Object Detection Based on Shape Perception and Class Balance Optimization
Yang XU1, Shu-zhi SU1, Yan-min ZHU2,*, Chao WANG1
Science Technology and Engineering | Papers·Automation and Computational Technology 2025, 25(11): 4647-4655
Affiliations
  • 1 School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China
  • 2 School of Mechanical Engineering, Anhui University of Science & Technology, Huainan 232001, China
  • Yang XU (born 2000), male, Han ethnicity, from Huai'an, Jiangsu; master's student. Research interest: object detection. E-mail:

Corresponding author:

* Yan-min ZHU (born 1988), female, Han ethnicity, from Tai'an, Shandong; Ph.D., lecturer. Research interests: image processing and pattern recognition. E-mail:
Published: 2025-04-18 doi: 10.12404/j.issn.1671-1815.2403508

An open world object detection method based on shape perception and class balance optimization is proposed to address the poor prediction performance on unknown-class objects in open world object detection. Unknown classes are classes that are not labeled during training; without guidance from labels, detecting unknown-class objects is a challenging task. An unknown class enhanced detector is constructed as the unknown-class detection branch. During training it is supervised with known-class labels only, so that it learns what the features of known-class objects have in common and generalizes to unknown-class objects. To improve the detector's sensitivity to unknown classes, the ability of the region proposal network (RPN) to distinguish foreground from background is exploited: a specific filtering procedure selects results with "unknown class potential" from the RPN output, and these are used as pseudo labels during training. Because confidence scores are unavailable, traditional non-maximum suppression (NMS) is difficult to apply when post-processing unknown objects. A redundant unknown object box suppressor is therefore designed, consisting of a center point-based grouping strategy and a redundancy score matrix based on shape perception. The grouping strategy comprises three methods that group predictions by the center points of unknown-class boxes and thereby determine the suppression range. A redundancy score matrix is then built from the redundancy scores of the prediction boxes within each group and used to suppress highly redundant predictions. Experimental results on open world object detection datasets show that the proposed method maintains the recall of unknown classes while achieving high prediction precision for unknown classes. The method copes effectively with open world scenarios and avoids generating a large number of useless predictions.

open world object detection  /  unknown class object detection  /  center point-based grouping strategy  /  shape perception  /  redundancy score
Yang XU, Shu-zhi SU, Yan-min ZHU, Chao WANG. Open World Object Detection Based on Shape Perception and Class Balance Optimization[J]. Science Technology and Engineering, 2025, 25(11): 4647-4655. DOI: 10.12404/j.issn.1671-1815.2403508
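Object detection [1] is a fundamental computer vision task that requires locating and recognizing the objects of interest in an image. In conventional object detection these objects belong to a closed world: their categories and number are fixed, so conventional detectors struggle to meet the demands of open environments. In autonomous driving, for example, a conventional detector can only detect the objects it has learned and ignores the danger posed by unknown objects it has never seen. Open world object detection was first proposed by Joseph et al. [2]. Its main task is to detect potential "unknown objects" without class-label supervision; in the real world, as more and more categories are learned, unknown-class objects gradually become known-class objects. This process resembles how humans come to know new things: as learning deepens, fewer and fewer things are marked as unknown. Because its target is the open world, open world object detection has broader application prospects and greater practical value than conventional object detection.
Current open world object detection methods leave considerable room for improvement in detecting unknown-class objects. The root cause is that training provides no labels for these objects, so the model can hardly detect them accurately. During detection, potential unknown-class objects are easily overlooked, while some non-object regions are mistaken for unknown-class objects and distort the results. Moreover, redundant predictions on the same unknown-class object cannot be removed with non-maximum suppression and similar methods as in conventional object detection. As a result, open world object detection performs poorly, and all of its metrics remain low.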
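Open-set recognition [3] aims to handle unknown samples encountered in classification or detection tasks. It assumes that the knowledge obtained from the training set is incomplete, that is, new unknown classes may be encountered in practice. To address this problem, researchers have explored self-supervised learning [4] and reconstruction-based unsupervised learning [5] for open-set recognition.
The open world object detection (OWOD) task was proposed by Joseph et al. [2] and has attracted much attention because of its potential real-world impact. In their work, Joseph et al. [2] introduced ORE (open world object detector), which combines a Faster R-CNN (region-based convolutional network) model [6] with contrastive clustering in feature space, an RPN-based unknown detector [7], and an energy based unknown identifier (EBUI) [8]. Yu et al. [9] extended ORE by setting the number of feature clusters to the number of classes, which minimizes the overlap between the distributions of known and unknown classes in the embedding feature space and reduces confusion between known and unknown objects. Zhang et al. [10] extended ORE with a second, localization-based detection head and reported gains in unknown-object recall, motivating the use of objectness in OWOD. UC-OWOD (unknown-classified open world object detection) [11] additionally classifies unknown objects and obtains better results than ORE when measuring unknown classes. Zhao et al. [12] corrected automatically labeled proposals through selective search and calibrated over-confident activation boundaries with a class-specific expelling function. However, the auto-labeling step generates many pseudo-unknown samples that do not actually represent unknown objects, which limits the ability to transfer knowledge from known to unknown classes. Liang et al. [13] proposed a generalized object confidence to better distinguish unknown-class objects and achieved good recall on unknown classes. However, they ignored the prediction precision for unknown classes, which leads to unbalanced predictions for unknown classes.
Gupta et al. [14] applied the Deformable DETR (detection transformer) model [15] to open world object detection and proposed OW-DETR (open world detection transformer). Recently, Transformer-based [16] methods have shown great potential in open world object detection. OW-DETR uses a pseudo-labeling scheme to supervise unknown object detection, in which unmatched object proposals with high backbone activation are selected as unknown objects. This line of OWOD work has encouraged the use of Transformer-based models and integrated objectness to achieve strong OWOD performance.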
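In the definition of the open world object detection task, assume a known class set $C_k=\{1,2,\dots,N\}$ and an unknown class set $C_u=\{N+1,N+2,N+3,\dots\}$. During training, only objects of the classes in $C_k$ have class labels. During testing, the model must detect objects of the classes in $C_k$ and must also be able to detect objects of the classes in $C_u$.
The network structure of the proposed method is shown in Fig. 1. It consists of seven parts: the backbone network, the RPN, RoI Pooling, the unknown class enhanced detector, the object classification head, the object regression head, and the redundant unknown object box suppressor (RUOBS). The method first extracts features with the backbone network and then uses the RPN to obtain potential object regions. RoI Pooling pools the RPN outputs into feature maps of uniform size. The object classification head and the object regression head together produce the known-class predictions. The unknown class enhanced detector, working with the object regression head, then filters out non-object results and keeps the unknown-class detections, yielding the unknown-class predictions. Finally, RUOBS processes the redundant prediction boxes of unknown-class objects to give the final unknown-class predictions.
Because training provides no labels for them, unknown-class objects are hard to detect effectively. A new branch dedicated to unknown-class object detection is therefore built. Unlike known-class detection, this branch does not rely on unknown-class labels. It works purely from the features of objects of different classes: by learning what those features have in common, it decides whether a region contains a potential object and thereby detects unknown-class objects. As shown in Fig. 2, although unknown-class objects differ from known-class objects, once the number of known classes is large enough, unknown-class objects are distributed around the known-class objects in feature space. Non-object regions, which lack complete semantic information, lie relatively far in feature space from both known classes and potential unknown classes. Exploiting this property, the unknown-class detection branch acts like a filter: it passes every object that carries complete semantic information and outputs it as an unknown-class prediction. Functionally, the unknown class enhanced detector is a binary-classification filter. Unlike the prediction of known-class objects, it no longer assigns objects to individual categories; every object in every candidate region is assigned to one of two broad classes, referred to here as the open-world class and the non-open-world class.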
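To further strengthen the unknown class detector, the RPN's ability to separate foreground from background is used to select part of the RPN outputs as pseudo labels for training. These pseudo labels are treated as an extension of the open-world class, so the detector learns the features of more object categories and detects more potential unknown objects. The pseudo-label selection process is shown in Fig. 3. ① Compute the intersection over union (IoU) between the ground truth (GT) labels and the candidate regions generated by the RPN; ② keep the candidate regions whose IoU [17] is below the threshold k1; ③ compute the box area of the remaining candidate regions and keep those whose area exceeds the threshold k2 as potential objects; ④ apply non-maximum suppression to the potential objects and select the best ones as pseudo labels. With pseudo labels added to training, the unknown class detector continually learns the characteristics of objects of many categories and therefore detects potential unknown objects better.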
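The four selection steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: boxes are `(x1, y1, x2, y2)` tuples, the function name `select_pseudo_labels`, the default values of `k1`, `k2`, and `nms_thr`, and the use of RPN objectness scores to order the greedy NMS are all hypothetical choices that the text does not specify.

```python
def box_area(b):
    """Area of an (x1, y1, x2, y2) box."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a, b):
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(proposals, scores, gt_boxes,
                         k1=0.3, k2=1000.0, nms_thr=0.5):
    """Return indices of RPN proposals kept as pseudo labels.

    k1, k2 and nms_thr are illustrative values (assumptions).
    """
    # Steps 1-3: low overlap with every GT box and a large enough area.
    candidates = [
        i for i, p in enumerate(proposals)
        if max((iou(p, g) for g in gt_boxes), default=0.0) < k1
        and box_area(p) > k2
    ]
    # Step 4: greedy NMS, ordered by objectness score (assumption).
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        if all(iou(proposals[i], proposals[j]) <= nms_thr for j in keep):
            keep.append(i)
    return keep
```

A proposal that coincides with a GT box, or that is very small, is dropped before NMS, so only large foreground-like regions away from labeled objects survive.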
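The training process of the unknown class enhanced detector is shown in Fig. 4. The generated pseudo-label samples are merged with the positive samples obtained by GT sampling to form the final positive samples, and all remaining samples serve as negative samples in the loss computation. The unknown class enhanced detector uses Focal Loss [18] as its loss function, expressed as
$$L_{de} = -\alpha_t (1 - p_t)^{\gamma} \lg p_t \tag{1}$$
In Eq. (1), $p_t$ is the model's prediction, $\alpha_t$ is the weighting coefficient, and $\gamma$ is a hyperparameter. $p_t$ and $\alpha_t$ are given by
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & y \neq 1 \end{cases}$$
$$\alpha_t = \begin{cases} \alpha, & y = 1 \\ 1 - \alpha, & y \neq 1 \end{cases}$$
where $y$ is the class label, $p \in [0,1]$ is the probability the model predicts for the label $y=1$, and $\alpha \in [0,1]$ is the weighting factor.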
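The loss in Eq. (1) can be evaluated for a single sample as in the sketch below. The function name `focal_loss`, the default values `alpha=0.25` and `gamma=2.0`, and the clamping constant `eps` are assumptions added for illustration. The text writes lg, so the sketch uses a base-10 logarithm, which differs from a natural logarithm only by a constant factor.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss of Eq. (1) for one sample.

    p: predicted probability of the label y = 1
    y: class label (1 = open-world class, anything else = negative)
    alpha, gamma, eps: illustrative values (assumptions)
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log10(p_t)
```

The factor $(1 - p_t)^{\gamma}$ shrinks the loss of well-classified samples, so a positive predicted at 0.9 contributes far less than one predicted at 0.6.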
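Confidence scores [19] are not applicable to unknown objects, and the number of unknown objects is uncertain, so conventional non-maximum suppression (NMS) cannot determine the best prediction box. A grouping strategy based on the center-point positions of the prediction boxes is therefore proposed to determine the range over which redundant unknown-class boxes are suppressed. The strategy puts prediction boxes that are close to each other into the same group, which simplifies the later computation of redundancy scores within each group. Ideally, all prediction boxes on one unknown-class object would form one group. The difficulty is that the number of unknown classes is not known at prediction time, so conventional clustering methods such as the K-means family cannot be given the number of classes, and the problem shown in Fig. 5 easily arises: given three prediction boxes and no class count, it is hard to decide whether they should form one, two, or three groups, whereas in practice they are expected to fall into a single group.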
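To address the difficulty of grouping when the number of classes is uncertain, a grouping method based on center-point density is designed. The center point of every unknown-class prediction box is connected to all other center points to form an adjacency graph. The graph is then traversed by depth-first search, and points whose Euclidean distance to the current node is less than the threshold k are placed in the same group, giving the final grouping. This method needs no prior knowledge of the number of classes, but it is prone to deviation propagation: dense points easily cause grouping errors. The grouping result is shown in Fig. 6.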
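A minimal sketch of this depth-first grouping follows, assuming center points are `(x, y)` tuples; the function name `group_by_dfs` is hypothetical.

```python
import math


def group_by_dfs(centers, k):
    """Assign a group id to every center point.

    Two points share a group when a chain of points links them in which
    each consecutive pair is closer than the threshold k.
    """
    n = len(centers)
    group = [-1] * n
    next_id = 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = next_id
        stack = [start]
        while stack:  # iterative depth-first search
            u = stack.pop()
            for v in range(n):
                if group[v] == -1 and math.dist(centers[u], centers[v]) < k:
                    group[v] = next_id
                    stack.append(v)
        next_id += 1
    return group
```

With `centers = [(0, 0), (4, 0), (8, 0), (50, 50)]` and `k = 5`, the first three points end up in one group although the first and third are 8 apart. This chaining is the deviation propagation noted above.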
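To address the problem described in Section 2.4.1, a grouping method based on the grid cell containing each center point is designed. As shown in Fig. 7, the input image is divided into N×M grid cells according to its aspect ratio; the cell containing each prediction box's center point is computed from the center coordinates, and prediction boxes whose center points fall in the same cell form a group. This method tolerates dense points well and does not produce deviation propagation, but because the grid is cut strictly according to the image proportions, points near cell edges are easily assigned to the wrong group.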
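The grid assignment reduces to an integer division of the center coordinates by the cell size. The sketch below assumes pixel coordinates with the origin at the top-left corner; the function name `group_by_grid` and the parameter names `n_cols` and `n_rows` are hypothetical, since the text does not say which of N and M counts rows.

```python
def group_by_grid(centers, img_w, img_h, n_cols, n_rows):
    """Map each (row, col) grid cell to the indices of the boxes
    whose center point lies inside it."""
    cell_w = img_w / n_cols
    cell_h = img_h / n_rows
    groups = {}
    for idx, (cx, cy) in enumerate(centers):
        # Clamp so that points on the right or bottom border stay inside.
        col = min(int(cx // cell_w), n_cols - 1)
        row = min(int(cy // cell_h), n_rows - 1)
        groups.setdefault((row, col), []).append(idx)
    return groups
```

For a 200×100 image split into two columns, centers at x = 99 and x = 101 land in different cells even though they are only 2 pixels apart, which illustrates the edge problem described above.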
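The agglomerative algorithm of hierarchical clustering computes the distance between clusters of data points, merges the two closest clusters, and iterates until all points form a single cluster, producing a dendrogram. However, it still relies on the number of classes as a reference, so an adaptive hierarchical classification method is proposed. During the hierarchical clustering iterations, the method compares the intra-class distances with a threshold, keeps adjusting the number of classes, and finally settles on the grouping. The workflow of center-point-based adaptive hierarchical classification is as follows. ① Set the number of classes to 1; ② perform hierarchical clustering; ③ compute the intra-class distance of every class; ④ compare the intra-class distances with the threshold: if all of them are below the threshold, the grouping ends; otherwise, increase the number of classes by 1. Steps ② to ④ are repeated until the grouping ends. The intra-class distance is the average Euclidean distance: the Euclidean distance from each point to every other point in the same class is computed, and the sum of these distances is divided by the total number of edges. Fig. 8 shows an example of adaptive hierarchical classification, in which an intra-class distance matrix is built for every class after each round until all intra-class distances are below the threshold. The method avoids the missing-class-count problem and also copes with dense points and with the misassignment of edge points.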
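The workflow can be sketched in pure Python as follows. The linkage criterion is not stated in the text, so average linkage is an assumption here, as are the function names `intra_distance`, `agglomerate`, and `adaptive_hierarchical`.

```python
import math
from itertools import combinations


def intra_distance(cluster, pts):
    """Mean Euclidean distance over all point pairs (edges) of a class."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0  # a single point has no edges
    return sum(math.dist(pts[i], pts[j]) for i, j in pairs) / len(pairs)


def agglomerate(pts, n_clusters):
    """Bottom-up clustering down to n_clusters (average linkage, assumed)."""
    clusters = [[i] for i in range(len(pts))]

    def linkage(a, b):
        total = sum(math.dist(pts[i], pts[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] += clusters[y]  # x < y, so deleting y keeps x valid
        del clusters[y]
    return clusters


def adaptive_hierarchical(pts, threshold):
    """Increase the class count until every intra-class distance
    falls below the threshold (steps 1 to 4 of the workflow)."""
    for n in range(1, len(pts) + 1):
        clusters = agglomerate(pts, n)
        if all(intra_distance(c, pts) < threshold for c in clusters):
            return clusters
    return [[i] for i in range(len(pts))]
```

Re-running the clustering for each class count keeps the sketch close to the listed steps; a practical version would cut a single dendrogram at successive levels instead.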
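For unknown-class prediction boxes in the same group, a shape-aware redundancy score (RDS) is designed. It is obtained from how similar two prediction boxes are in IoU and in shape, and can be expressed as
$$\mathrm{RDS}(a,b) = \lambda\,\mathrm{IoU}(a,b) + \rho\,\mathrm{sim}(a,b)$$
$$\mathrm{IoU}(a,b) = \frac{|a \cap b|}{|a \cup b|}$$
$$\mathrm{sim}(a,b) = 1 - \sqrt{[\mathrm{AD}(a,b)]^2 + [\mathrm{ARD}(a,b)]^2}$$
$$\mathrm{AD}(a,b) = \frac{A_a - A_b}{\max(A_a, A_b)}$$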
$$\mathrm{ARD}(a,b) = \frac{H_a}{H_b} - \frac{W_a}{W_b}$$
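where $a$ and $b$ are the boxes whose RDS is computed; $|a \cap b|$ is the intersection area of the two boxes; $|a \cup b|$ is their union area; $\lambda$ and $\rho$ are balancing hyperparameters; $A$ is the area of a prediction box; $H$ is its height; and $W$ is its width.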
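A small numerical sketch of the redundancy score is given below. It rests on several assumptions that should be read as such: boxes are `(x1, y1, x2, y2)` tuples with positive width and height, the similarity term uses a square root over the two squared differences, ARD is taken as the height ratio minus the width ratio, and the weights `lam = rho = 0.5` are illustrative values not given in the text.

```python
import math


def _area(b):
    return (b[2] - b[0]) * (b[3] - b[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    return inter / (_area(a) + _area(b) - inter)


def rds(a, b, lam=0.5, rho=0.5):
    """Shape-aware redundancy score of box a with respect to box b."""
    ad = (_area(a) - _area(b)) / max(_area(a), _area(b))   # area difference
    h_a, w_a = a[3] - a[1], a[2] - a[0]
    h_b, w_b = b[3] - b[1], b[2] - b[0]
    ard = h_a / h_b - w_a / w_b          # aspect-ratio difference (assumed form)
    sim = 1.0 - math.sqrt(ad ** 2 + ard ** 2)
    return lam * iou(a, b) + rho * sim
```

Two identical boxes give IoU = 1 and sim = 1, so their score equals `lam + rho`; a box with the same corner but twice the height scores much lower because both the overlap and the shape term drop.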
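The grouping strategy based on prediction-box center points aims to group concentrated unknown-class prediction boxes as far as possible. The shape-aware redundancy score matrix then reflects how redundant each prediction box in a group is with respect to every other box in that group. Among boxes with high redundancy scores, the model keeps the box with the smaller index first. The purpose of this design is to filter out highly redundant boxes among relatively concentrated predictions. This helps the model handle redundant prediction boxes and thus improves the prediction precision for unknown classes.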
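The suppression step within one group might look like the sketch below. It takes a precomputed redundancy score matrix and keeps the earlier-indexed box of every highly redundant pair; the function name `suppress_by_rds` and the cut-off `thr` for what counts as "high" are assumptions, since the text gives no threshold.

```python
def suppress_by_rds(group, rds_matrix, thr):
    """Return the box indices of a group that survive suppression.

    group:      box indices belonging to one group
    rds_matrix: rds_matrix[i][j] is the redundancy score of boxes i and j
    thr:        score above which two boxes count as redundant (assumption)
    """
    keep = []
    for i in sorted(group):  # smaller indices are considered first
        if all(rds_matrix[i][j] <= thr for j in keep):
            keep.append(i)
    return keep
```

With a 4×4 matrix in which boxes 0 and 1 score 0.9 against each other and boxes 2 and 3 score 0.85, a threshold of 0.8 keeps boxes 0 and 2.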
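The experiments use the open world dataset proposed with the UnSniffer method. Training uses only the Pascal VOC dataset [20], which contains just 20 classes; at test time the model is evaluated separately on known-class recognition and unknown-class recognition.
As shown in Table 1, the test set has three parts. The VOC test set consists of 4,952 images and contains annotations only for the 20 learned classes. COCO-OOD consists of 504 images and contains annotations only for 1,655 unknown-class objects. COCO-Mixed contains both known and unknown classes; it consists of 897 images with annotations for 2,533 unknown objects and 2,658 known objects. The original MS-COCO [21] labels that overlap with the Pascal VOC classes are used as the labels of known objects, and all labels consist of the original MS-COCO labels plus extended labels defined on MS-COCO. Evaluating reliability in detecting both unknown and known objects is more challenging on COCO-Mixed, because each of its images contains more object instances with complex classes and concentrated positions.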
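The object counts quoted here can be related to the known and unknown columns of Table 1 with a one-line calculation. The helper name `per_image` is hypothetical; the counts are the ones stated in the paragraph above.

```python
def per_image(num_objects, num_images):
    """Average number of annotated objects per image, to two decimals."""
    return round(num_objects / num_images, 2)


coco_ood_unknown = per_image(1655, 504)    # COCO-OOD, unknown objects
coco_mixed_unknown = per_image(2533, 897)  # COCO-Mixed, unknown objects
coco_mixed_known = per_image(2658, 897)    # COCO-Mixed, known objects
```

These evaluate to 3.28, 2.82, and 2.96, which match the table entries, so the two numeric columns of Table 1 read as average objects per image.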
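Open world object detection must detect both known and unknown objects, so the evaluation metrics are likewise split into known-object and unknown-object metrics. Detection performance on known objects is evaluated with the mean average precision (mAP), computed as
$$P = \frac{TP}{TP + FP}$$
$$R = \frac{TP}{TP + FN}$$
$$AP = \int_0^1 P(R)\,\mathrm{d}R$$
$$mAP = \frac{\sum_{i=1}^{N} AP_i}{N}$$
where $TP$ is the number of true positives among the known classes, $FP$ the number of false positives, and $FN$ the number of false negatives; $P$ (precision) is the ratio of true positives to all samples the model predicts as positive; $R$ (recall) is the ratio of true positives to all actual positive samples; $AP$ is the area under the P-R curve; and $N$ is the number of known classes. Detection performance on unknown objects is evaluated with the unknown average precision ($AP_u$), unknown recall ($R_u$), unknown F1 score ($F1_u$), and unknown precision ($P_u$), computed as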
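$$P_u = \frac{TP_u}{TP_u + FP_u}$$
$$R_u = \frac{TP_u}{TP_u + FN_u}$$
$$F1_u = \frac{2 P_u R_u}{P_u + R_u}$$
$$AP_u = \int_0^1 P_u(R_u)\,\mathrm{d}R_u$$
where $TP_u$ is the number of true positives among the unknown classes, $FP_u$ the number of false positives, and $FN_u$ the number of false negatives; $R_u$ is the ratio of true positives to all actual unknown-class objects; $P_u$ is the ratio of true positives to all samples predicted as unknown; and $F1_u$ is the harmonic mean of $P_u$ and $R_u$.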
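The metric definitions translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the integral for AP is approximated with a plain trapezoidal rule over sampled recall and precision pairs, which is an assumption because the text does not state the interpolation scheme used.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from counts of TP, FP and FN."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r, f1_score(p, r)


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def average_precision(recalls, precisions):
    """Trapezoidal approximation of the area under the P-R curve."""
    pts = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

As a consistency check against Table 3, `f1_score(0.617, 0.430)` rounds to 0.507, the F1u value listed for the third variant of the proposed method.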
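Table 2 reports the experiments on the VOC-test dataset. The proposed method is not the best algorithm, but it trails the best one by only 2.5 percentage points of mAP. VOC-test contains only known classes, and every method except OW-DETR relies on the Faster R-CNN model to predict known-class objects, so the methods differ little in mean average precision on known classes.
The results in Table 3 show that the proposed method performs well on APu, F1u, and Pu, and especially on Pu, where it exceeds the runner-up by 13 percentage points.
The results in Table 4 show that the proposed method also performs well on the more complex COCO-Mix dataset. Its unknown-class recall Rc is only moderate compared with ORE, but ORE generates a large number of useless prediction boxes and therefore performs poorly in unknown-class precision (Pu).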
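To verify the effectiveness of the three versions of the RUOBS module, an ablation study of this module was designed. The results are shown in Tables 5 and 6.
Tables 5 and 6 show that every method improves to some degree once RUOBS is added. On the COCO-OOD dataset, VOS benefits the most: its best unknown-object $P_u$ rises by 23.5% and its $F1_u$ by 8%. With RUOBS, the $P_u$ of ORE rises by 17.5% and its $F1_u$ by 19.5%, while the $P_u$ of UnSniffer rises by 0.5%. On the COCO-Mixed dataset, the best $P_u$ of VOS rises by 7.5% and its $F1_u$ by 1%; the best $P_u$ of ORE rises by 7.4% and its $F1_u$ by 9.4%; and the best $P_u$ of UnSniffer rises by 2.6% and its $F1_u$ by 1.8%. It follows that the RUOBS module effectively suppresses redundant predicted boxes for unknown objects and thus improves the precision of unknown-object prediction.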
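Figure 9 shows the detection results of the various algorithms in open-world settings. The proposed algorithm detects well across different environments: although it misses some small unknown-class objects, it produces the fewest false detections of unknown-class objects.
Although ORE and VOS have an advantage in unknown-class recall, their detection results contain a large number of invalid predicted boxes, which greatly reduces the precision of unknown-object prediction. Because Faster R-CNN can only predict objects of known classes, its predictions in the third and fourth rows of the figure are empty.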
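For the open-world object detection task, an open-world object detection method based on shape perception and class balance optimization is proposed. By combining an unknown-class enhanced detector, a grouping strategy based on predicted-box center positions, and a shape-aware redundancy score matrix, the method addresses the challenges of redundant unknown-class predicted boxes and unknown-class object detection. Extensive experiments on open-world object detection datasets show that the method performs competitively across the evaluated metrics and, in particular, far exceeds the other methods in the precision of unknown-class prediction.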
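  • National Natural Science Foundation of China (52374155)
  • Natural Science Research Project of Higher Education Institutions of Anhui Province (2022AH040113)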
References
[1] Wu Chenxi, Ying Baosheng, Xu Xiaowei, et al. Road small target detection algorithm based on improved SSD[J]. Science Technology and Engineering, 2023, 23(5): 2051-2058.
[2] Joseph K J, Khan S, Khan F S, et al. Towards open world object detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2021: 5830-5840.
[3] Zheng J, Li W, Hong J, et al. Towards open-set object detection and discovery[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 3961-3970.
[4] Jaiswal A, Babu A R, Zadeh M Z, et al. A survey on contrastive self-supervised learning[J]. Technologies, 2020, 9(1): 8-15.
[5] Schmarje L, Santarossa M, Schröder S M, et al. A survey on semi-, self- and unsupervised learning for image classification[J]. IEEE Access, 2021, 9: 82146-82168.
[6] Zheng Han, Tian Meng, Zhao Yanfeng, et al. Research on hand pose estimation based on improved Faster R-CNN method[J]. Science Technology and Engineering, 2023, 23(3): 1160-1167.
[7] Gavrilescu R, Zet C, Fosalău C, et al. Faster R-CNN: an approach to real-time object detection[C]// 2018 International Conference and Exposition on Electrical and Power Engineering (EPE). New York: IEEE, 2018: 165-168.
[8] Zhang Y, Dai L, Wong E W M. Optimal BS deployment and user association for 5G millimeter wave communication networks[J]. IEEE Transactions on Wireless Communications, 2020, 20(5): 2776-2791.
[9] Yu J, Ma L, Li Z, et al. Open-world object detection via discriminative class prototype learning[C]// 2022 IEEE International Conference on Image Processing (ICIP). New York: IEEE, 2022: 626-630.
[10] Wu Z, Lu Y, Chen X, et al. UC-OWOD: unknown-classified open world object detection[C]// European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022: 193-210.
[11] Zhang Y, Zhang X Y, Shi H. OW-TAL: learning unknown human activities for open-world temporal action localization[J]. Pattern Recognition, 2023, 133: 109027.
[12] Zhao X, Ma Y, Wang D, et al. Revisiting open world object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(5): 1201-1215.
[13] Liang W, Xue F, Liu Y, et al. Unknown sniffer for object detection: don't turn a blind eye to unknown objects[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2023: 3230-3239.
[14] Gupta A, Narayan S, Joseph K J, et al. OW-DETR: open-world detection transformer[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 9235-9244.
[15] Zhu X, Su W, Lu L, et al. Deformable DETR: deformable transformers for end-to-end object detection[J]. arXiv preprint arXiv: 2010.04159, 2022.
[16] Wenkel S, Alhazmi K, Liiv T, et al. Confidence score: the forgotten dimension of object detection performance evaluation[J]. Sensors, 2021, 21(13): 4350.
[17] Zhou D, Fang J, Song X, et al. IoU loss for 2D/3D object detection[C]// 2019 International Conference on 3D Vision (3DV). New York: IEEE, 2019: 85-94.
[18] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]// Proceedings of the IEEE International Conference on Computer Vision. New York: IEEE, 2017: 2980-2988.
[19] Wenkel S, Alhazmi K, Liiv T, et al. Confidence score: the forgotten dimension of object detection performance evaluation[J]. Sensors, 2021, 21(13): 4350.
[20] Tong K, Wu Y. Rethinking PASCAL-VOC and MS-COCO dataset for small object detection[J]. Journal of Visual Communication and Image Representation, 2023, 93: 103830.
[21] Bideaux M, Phe A, Chaouch M, et al. 3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules[J]. arXiv preprint arXiv: 2404.05641, 2024.
Volume 25, Issue 11, 2025
Article information
doi: 10.12404/j.issn.1671-1815.2403508
  • Received: 2024-05-12
  • First published online: 2025-07-09
  • Published: 2025-04-18
  • Revised: 2024-08-01
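Author information
    1 School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001
    2 School of Mechanical Engineering, Anhui University of Science and Technology, Huainan 232001

Corresponding author:

* Zhu Yanmin (朱彦敏, b. 1988), female, Han ethnicity, from Tai'an, Shandong; Ph.D., lecturer. Research interests: image processing and pattern recognition. E-mail: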