Article(id=1149780468662498136, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149780466032669506, articleNumber=null, orderNo=null, doi=10.12404/j.issn.1671-1815.2403871, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=research-article, receivedDate=1716480000000, receivedDateStr=2024-05-24, revisedDate=1736784000000, revisedDateStr=2025-01-14, acceptedDate=null, acceptedDateStr=null, onlineDate=1752058625616, onlineDateStr=2025-07-09, pubDate=1744041600000, pubDateStr=2025-04-08, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1752058625616, onlineIssueDateStr=2025-07-09, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1752058625616, creator=13701087609, updateTime=1752058625616, updator=13701087609, issue=Issue{id=1149780466032669506, tenantId=1146029695717560320, journalId=1146123166801305609, year='2025', volume='25', issue='10', pageStart='3969', pageEnd='4395', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=-1, specialIssue=0, createTime=1752058624990, creator=13701087609, updateTime=1768456644259, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1218558743898411553, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149780466032669506, language=EN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1218558743898411554, tenantId=1146029695717560320, journalId=1146123166801305609, issueId=1149780466032669506, language=CN, specialIssueTitle=, coverIllustrator=, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=4229, endPage=4238, ext={EN=ArticleExt(id=1149780468926739290, articleId=1149780468662498136, tenantId=1146029695717560320, 
journalId=1146123166801305609, language=EN, title=A Garbage Image Classification Algorithm Based on the Improved EfficientNetV2 Network, columnId=1156262729162810294, journalTitle=Science Technology and Engineering, columnName=Papers·Automation and Computational Technology, runingTitle=null, highlight=null, articleAbstract=

Mainstream garbage image classification algorithms suffer from poor dataset generality, a small number of recognizable garbage categories, and applicability limited to specific environments. To address these problems while meeting the requirements for both classification speed and accuracy, a garbage image classification algorithm based on an improved EfficientNetV2 network is proposed. The algorithm uses EfficientNetV2 as the baseline model, adds the SK (selective kernel) attention mechanism to increase classification speed, and applies a transfer learning strategy to improve classification accuracy. Because a deep learning framework processes the garbage images, no manual feature extraction from the dataset images is required, and the number of recognizable garbage categories is increased. Experimental results show that the proposed algorithm achieves an accuracy of 99.71% on a self-built dataset, an improvement of at least 4.77% over other algorithms such as GoogleNet. In terms of time, it improves on algorithms such as VggNet19 by at least 50%. The improved EfficientNetV2 network thus enables more accurate and faster garbage classification, providing a scientific and efficient solution to the growing garbage problem.

, correspAuthors=Lu ZENG, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Zhen-li ZHANG, Yuan CHEN, Hao FU, Lu ZENG), CN=ArticleExt(id=1149780504968393285, articleId=1149780468662498136, tenantId=1146029695717560320, journalId=1146123166801305609, language=CN, title=基于改进EfficientNetV2网络的垃圾图像分类算法, columnId=1156262729783567290, journalTitle=科学技术与工程, columnName=论文·自动化技术、计算机技术, runingTitle=null, highlight=null, articleAbstract=

目前主流垃圾图像分类算法中存在数据集普适性差、垃圾识别种类少、分类算法局限于特定环境等问题。针对这些问题,结合垃圾图像分类的快速性与准确率的要求,提出了一种基于改进EfficientNetV2网络的垃圾图像分类算法。该算法以EfficientNetV2网络作为基准模型,通过添加SK(selective kernel)注意力机制提升分类的快速性,使用迁移学习策略提升分类的准确率。该算法利用深度学习模型框架对垃圾图像进行处理,无需对数据集图像特征进行人工提取,在实现对垃圾图像快速准确分类的同时增加了垃圾识别的种类。实验表明,新的算法在自建数据集上的准确率为99.71%,相较于GoogleNet等其他算法,提升了至少4.77%。在时间上相较于VggNet19算法等,提升了至少50%。通过改进EfficientNetV2网络,实现了更为准确快速的垃圾分类,为日益激增的垃圾问题提供了一种科学高效的解决方案。

, correspAuthors=曾璐, authorNote=null, correspAuthorsNote=
* 曾璐(1981—),女,汉族,江西赣州人,硕士,副教授。研究方向:智能控制技术。E-mail:
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=6LPcVyPEfDm3pXnOTAjd8w==, magXml=UufCIPyAAIZo7EcDJgcxuQ==, pdfUrl=null, pdf=RVeQ2I3uDx824pMS1jRDMQ==, pdfFileSize=13155522, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=JKrAaHfuKXIAWWk4tNHcwg==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=pgu2iNX+vMv5qgiI+PxxqA==, mapNumber=null, authorCompany=null, fund=null, authors=

张振利(1976—),男,汉族,河北滦县人,硕士,副教授。研究方向:检测技术与控制。E-mail:

, authorsList=张振利, 陈源, 付豪, 曾璐)}, authors=[Author(id=1218525105395777695, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=47717770@qq.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218525105529995443, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525105395777695, language=EN, stringName=Zhen-li ZHANG, firstName=Zhen-li, middleName=null, lastName=ZHANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218525105689379018, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525105395777695, language=CN, stringName=张振利, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=江西理工大学电气工程与自动化学院, 赣州 341000, bio={"content":"

张振利(1976—),男,汉族,河北滦县人,硕士,副教授。研究方向:检测技术与控制。E-mail:

"}, bioImg=null, bioContent=

张振利(1976—),男,汉族,河北滦县人,硕士,副教授。研究方向:检测技术与控制。E-mail:

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218525105269948557, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, xref=null, ext=[AuthorCompanyExt(id=1218525105278337167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China), AuthorCompanyExt(id=1218525105290920084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=江西理工大学电气工程与自动化学院, 赣州 341000)])]), Author(id=1218525105815208150, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218525105966203109, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525105815208150, language=EN, stringName=Yuan CHEN, firstName=Yuan, middleName=null, lastName=CHEN, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218525106075255030, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525105815208150, language=CN, stringName=陈源, firstName=null, 
middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=江西理工大学电气工程与自动化学院, 赣州 341000, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218525105269948557, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, xref=null, ext=[AuthorCompanyExt(id=1218525105278337167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China), AuthorCompanyExt(id=1218525105290920084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=江西理工大学电气工程与自动化学院, 赣州 341000)])]), Author(id=1218525106171724036, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1218525106251415822, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525106171724036, language=EN, stringName=Hao FU, firstName=Hao, middleName=null, lastName=FU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), 
CN=AuthorExt(id=1218525106343690526, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525106171724036, language=CN, stringName=付豪, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=江西理工大学电气工程与自动化学院, 赣州 341000, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218525105269948557, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, xref=null, ext=[AuthorCompanyExt(id=1218525105278337167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China), AuthorCompanyExt(id=1218525105290920084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=江西理工大学电气工程与自动化学院, 赣州 341000)])]), Author(id=1218525106427576619, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, orderNo=3, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=zenglu@jxust.edu.cn, emailSecond=null, emailThird=null, correspondingAuthor=1, authorType=1, ext={EN=AuthorExt(id=1218525106595348804, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525106427576619, language=EN, stringName=Lu ZENG, firstName=Lu, middleName=null, lastName=ZENG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, 
department=null, xref=*, address=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1218525106775703893, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, authorId=1218525106427576619, language=CN, stringName=曾璐, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=*, address=江西理工大学电气工程与自动化学院, 赣州 341000, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1218525105269948557, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, xref=null, ext=[AuthorCompanyExt(id=1218525105278337167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China), AuthorCompanyExt(id=1218525105290920084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=江西理工大学电气工程与自动化学院, 赣州 341000)])])], keywords=[Keyword(id=1218525107018973552, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, orderNo=1, keyword=garbage classification), Keyword(id=1218525107169968511, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, orderNo=2, keyword=deep learning), Keyword(id=1218525107253854601, tenantId=1146029695717560320, journalId=1146123166801305609, 
articleId=1149780468662498136, language=EN, orderNo=3, keyword=EfficientNetV2), Keyword(id=1218525107341934997, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, orderNo=4, keyword=convolutional neural network), Keyword(id=1218525107438404005, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, orderNo=5, keyword=SK (selective kernel) attention mechanism), Keyword(id=1218525107539067319, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, orderNo=1, keyword=垃圾分类), Keyword(id=1218525107706839491, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, orderNo=2, keyword=深度学习), Keyword(id=1218525107832668623, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, orderNo=3, keyword=EfficientNetV2), Keyword(id=1218525107924943320, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, orderNo=4, keyword=卷积神经网络), Keyword(id=1218525108038189544, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, orderNo=5, keyword=SK注意力机制)], refs=[Reference(id=1218525112672895874, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2020, volume=43, issue=5, pageStart=755, pageEnd=780, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=徐冰冰, 岑科廷, 黄俊杰, journalName=计算机学报, refType=null, unstructuredReference=徐冰冰, 岑科廷, 黄俊杰, 等. 图卷积神经网络综述[J]. 
计算机学报, 2020, 43(5): 755-780., articleTitle=图卷积神经网络综述, refAbstract=null), Reference(id=1218525112773559177, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2020, volume=43, issue=5, pageStart=755, pageEnd=780, url=null, language=null, rfNumber=[1], rfOrder=1, authorNames=Xu Bingbing, Cen Keting, Huang Junjie, journalName=Chinese Journal of Computers, refType=null, unstructuredReference=Xu Bingbing, Cen Keting, Huang Junjie, et al. A survey on graph convolutional neural network[J]. Chinese Journal of Computers, 2020, 43(5): 755-780., articleTitle=A survey on graph convolutional neural network, refAbstract=null), Reference(id=1218525112924554132, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=21, issue=21, pageStart=8970, pageEnd=8975, url=null, language=null, rfNumber=[2], rfOrder=2, authorNames=陈亚宇, 孙骥晟, 李建龙, journalName=科学技术与工程, refType=null, unstructuredReference=陈亚宇, 孙骥晟, 李建龙, 等. 基于深度学习与图像处理的废弃物分类与定位方法[J]. 科学技术与工程, 2021, 21(21): 8970-8975., articleTitle=基于深度学习与图像处理的废弃物分类与定位方法, refAbstract=null), Reference(id=1218525113054577564, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=21, issue=21, pageStart=8970, pageEnd=8975, url=null, language=null, rfNumber=[2], rfOrder=3, authorNames=Chen Yayu, Sun Jisheng, Li Jianlong, journalName=Science Technology and Engineering, refType=null, unstructuredReference=Chen Yayu, Sun Jisheng, Li Jianlong, et al. Research on waste classification and location method based on deep learning and image processing[J]. 
Science Technology and Engineering, 2021, 21(21): 8970-8975., articleTitle=Research on waste classification and location method based on deep learning and image processing, refAbstract=null), Reference(id=1218525113184600996, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2024, volume=18, issue=null, pageStart=5121, pageEnd=5136, url=null, language=null, rfNumber=[3], rfOrder=4, authorNames=Xia Z X, Zhou H, Yu H, journalName=SIViP, refType=null, unstructuredReference=Xia Z X, Zhou H, Yu H, et al. YOLO-MTG: a lightweight YOLO model for multi-target garbage detection[J]. SIViP, 2024, 18: 5121-5136., articleTitle=YOLO-MTG: a lightweight YOLO model for multi-target garbage detection, refAbstract=null), Reference(id=1218525113306235821, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2022, volume=14, issue=22, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[4], rfOrder=5, authorNames=Yang Z, Xia Z, Yang G, journalName=Sustainability, refType=null, unstructuredReference=Yang Z, Xia Z, Yang G, et al. A garbage classification method based on a small convolution neural network[J]. Sustainability, 2022, 14(22). DOI: 10.3390/su142214735., articleTitle=A garbage classification method based on a small convolution neural network, refAbstract=null), Reference(id=1218525113402704822, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2023, volume=414, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[5], rfOrder=6, authorNames=Chen Y, Luo A, Cheng M, journalName=Journal of Cleaner Production, refType=null, unstructuredReference=Chen Y, Luo A, Cheng M, et al. Classification and recycling of recyclable garbage based on deep learning[J]. 
Journal of Cleaner Production, 2023, 414. DOI: 10.1016/j.jclepro.2023.137558., articleTitle=Classification and recycling of recyclable garbage based on deep learning, refAbstract=null), Reference(id=1218525113511756733, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=164, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[6], rfOrder=7, authorNames=Mao W L, Chen W C, Wang C T, journalName=Resources, Conservation and Recycling, refType=null, unstructuredReference=Mao W L, Chen W C, Wang C T, et al. Recycling waste classification using optimized convolutional neural network[J]. Resources, Conservation and Recycling, 2021, 164. DOI: 10.1016/j.resconrec.2020.105132., articleTitle=Recycling waste classification using optimized convolutional neural network, refAbstract=null), Reference(id=1218525113637585861, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2024, volume=12, issue=null, pageStart=44799, pageEnd=44807, url=null, language=null, rfNumber=[7], rfOrder=8, authorNames=Tian X, Shi L, Luo Y, journalName=IEEE Access, refType=null, unstructuredReference=Tian X, Shi L, Luo Y, et al. Garbage classification algorithm based on improved MobileNetV3[J]. IEEE Access, 2024, 12: 44799-44807., articleTitle=Garbage classification algorithm based on improved MobileNetV3, refAbstract=null), Reference(id=1218525113792775116, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=39, issue=2, pageStart=110, pageEnd=115, url=null, language=null, rfNumber=[8], rfOrder=9, authorNames=袁建野, 南新元, 蔡鑫, journalName=环境工程, refType=null, unstructuredReference=袁建野, 南新元, 蔡鑫, 等. 基于轻量级残差网路的垃圾图片分类方法[J]. 
环境工程, 2021, 39(2): 110-115., articleTitle=基于轻量级残差网路的垃圾图片分类方法, refAbstract=null), Reference(id=1218525113956352978, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=39, issue=2, pageStart=110, pageEnd=115, url=null, language=null, rfNumber=[8], rfOrder=10, authorNames=Yuan Jianye, Nan Xinyuan, Cai Xin, journalName=Environmental Engineering, refType=null, unstructuredReference=Yuan Jianye, Nan Xinyuan, Cai Xin, et al. Garbage image classification by lightweight residual network[J]. Environmental Engineering, 2021, 39(2): 110-115., articleTitle=Garbage image classification by lightweight residual network, refAbstract=null), Reference(id=1218525114086376410, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2020, volume=8, issue=null, pageStart=140019, pageEnd=140029, url=null, language=null, rfNumber=[9], rfOrder=11, authorNames=Kang Z, Yang J, Li G, journalName=IEEE Access, refType=null, unstructuredReference=Kang Z, Yang J, Li G, et al. An automatic garbage classification system based on deep learning[J]. IEEE Access, 2020, 8: 140019-140029., articleTitle=An automatic garbage classification system based on deep learning, refAbstract=null), Reference(id=1218525114166068192, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=781, issue=3, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[10], rfOrder=12, authorNames=Gu Y, Ge B, journalName=IOP Conference Series: Earth and Environmental Science, refType=null, unstructuredReference=Gu Y, Ge B. Research on lightweight convolutional neural network in garbage classification[J]. IOP Conference Series: Earth and Environmental Science, 2021, 781(3). 
DOI: 10.1088/1755-1315/781/3/032011., articleTitle=Research on lightweight convolutional neural network in garbage classification, refAbstract=null), Reference(id=1218525114241565671, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2024, volume=10, issue=9, pageStart=e29966, pageEnd=null, url=null, language=null, rfNumber=[11], rfOrder=13, authorNames=Wang J, journalName=Heliyon, refType=null, unstructuredReference=Wang J. Application research of image classification algorithm based on deep learning in household garbage sorting[J]. Heliyon, 2024, 10(9): e29966., articleTitle=Application research of image classification algorithm based on deep learning in household garbage sorting, refAbstract=null), Reference(id=1218525114325451759, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2019, volume=19, issue=3, pageStart=136, pageEnd=141, url=null, language=null, rfNumber=[12], rfOrder=14, authorNames=汤伟, 刘思洋, 高涵, journalName=科学技术与工程, refType=null, unstructuredReference=汤伟, 刘思洋, 高涵, 等. 基于视觉的水面垃圾清理机器人目标检测算法[J]. 科学技术与工程, 2019, 19(3): 136-141., articleTitle=基于视觉的水面垃圾清理机器人目标检测算法, refAbstract=null), Reference(id=1218525114480641014, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2019, volume=19, issue=3, pageStart=136, pageEnd=141, url=null, language=null, rfNumber=[12], rfOrder=15, authorNames=Tang Wei, Liu Siyang, Gao Han, journalName=Science Technology and Engineering, refType=null, unstructuredReference=Tang Wei, Liu Siyang, Gao Han, et al. A target detection algorithm for surface cleaning robot based on machine vision[J]. 
Science Technology and Engineering, 2019, 19(3): 136-141., articleTitle=A target detection algorithm for surface cleaning robot based on machine vision, refAbstract=null), Reference(id=1218525114593887229, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=41, issue=2, pageStart=498, pageEnd=512, url=null, language=null, rfNumber=[13], rfOrder=16, authorNames=高明, 陈玉涵, 张泽慧, journalName=系统工程理论与实践, refType=null, unstructuredReference=高明, 陈玉涵, 张泽慧, 等. 基于新型空间注意力机制和迁移学习的垃圾图像分类算法[J]. 系统工程理论与实践, 2021, 41(2): 498-512., articleTitle=基于新型空间注意力机制和迁移学习的垃圾图像分类算法, refAbstract=null), Reference(id=1218525114715521025, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=41, issue=2, pageStart=498, pageEnd=512, url=null, language=null, rfNumber=[13], rfOrder=17, authorNames=Gao Ming, Chen Yuhan, Zhang Zehui, journalName=Systems Engineering Theory and Practice, refType=null, unstructuredReference=Gao Ming, Chen Yuhan, Zhang Zehui, et al. Classification algorithm of garbage images based on novel spatial attention mechanism and transfer learning[J]. Systems Engineering Theory and Practice, 2021, 41(2): 498-512., articleTitle=Classification algorithm of garbage images based on novel spatial attention mechanism and transfer learning, refAbstract=null), Reference(id=1218525114816184328, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=6105, pageEnd=6114, url=null, language=null, rfNumber=[14], rfOrder=18, authorNames=Tan M, Le Q, journalName=International Conference on Machine Learning, PMLR, refType=null, unstructuredReference=Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks[C]// International Conference on Machine Learning, PMLR. 
Long Beach: PMLR, 2019: 6105-6114., articleTitle=EfficientNet: rethinking model scaling for convolutional neural networks, refAbstract=null), Reference(id=1218525114887487505, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2021, volume=null, issue=null, pageStart=10096, pageEnd=10106, url=null, language=null, rfNumber=[15], rfOrder=19, authorNames=Tan M, Le Q, journalName=International Conference on Machine Learning, PMLR, Virtual Event, refType=null, unstructuredReference=Tan M, Le Q. EfficientNetV2: smaller models and faster training[C]// International Conference on Machine Learning, PMLR, Virtual Event. Long Beach: PMLR, 2021: 10096-10106., articleTitle=EfficientNetV2: smaller models and faster training, refAbstract=null), Reference(id=1218525115004928023, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=7132, pageEnd=7141, url=null, language=null, rfNumber=[16], rfOrder=20, authorNames=Hu J, Shen L, Sun G, journalName=Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, refType=null, unstructuredReference=Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 7132-7141., articleTitle=Squeeze-and-excitation networks, refAbstract=null), Reference(id=1218525115105591326, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2022, volume=33, issue=6, pageStart=065023, pageEnd=null, url=null, language=null, rfNumber=[17], rfOrder=21, authorNames=Qu H, Yang J, Shen M, journalName=Measurement Science and Technology, refType=null, unstructuredReference=Qu H, Yang J, Shen M, et al. 
Fault diagnosis of rolling bearing under time-varying speed conditions based on EfficientNetv2[J]. Measurement Science and Technology, 2022, 33(6): 065023., articleTitle=Fault diagnosis of rolling bearing under time-varying speed conditions based on EfficientNetv2, refAbstract=null), Reference(id=1218525115235614759, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, doi=null, pmid=null, pmcid=null, year=2022, volume=null, issue=null, pageStart=2098, pageEnd=2107, url=null, language=null, rfNumber=[18], rfOrder=22, authorNames=Zhang Y, Li D, Law K L, journalName=Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), refType=null, unstructuredReference=Zhang Y, Li D, Law K L, et al. Idr: self-supervised image denoising via iterative data refinement[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). New Orleans: IEEE, 2022: 2098-2107., articleTitle=Idr: self-supervised image denoising via iterative data refinement, refAbstract=null)], funds=[Fund(id=1218525112458986348, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, awardId=62363013, language=CN, fundingSource=国家自然科学基金(62363013), fundOrder=null, country=null)], companyList=[AuthorCompany(id=1218525105269948557, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, xref=null, ext=[AuthorCompanyExt(id=1218525105278337167, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, companyId=1218525105269948557, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China), AuthorCompanyExt(id=1218525105290920084, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, 
companyId=1218525105269948557, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=江西理工大学电气工程与自动化学院, 赣州 341000)])], figs=[ArticleFig(id=1218525108356956681, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.1, caption=Model scaling schematic, figureFileSmall=AlZNoQgJxfVpGmDG6UrADA==, figureFileBig=jYuZJtoYfV/mcdJbW1xyAg==, tableContent=null), ArticleFig(id=1218525108474397205, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图1, caption=模型缩放示意图, figureFileSmall=AlZNoQgJxfVpGmDG6UrADA==, figureFileBig=jYuZJtoYfV/mcdJbW1xyAg==, tableContent=null), ArticleFig(id=1218525108646363686, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.2, caption=The structure diagram of EfficientNetV2 network, figureFileSmall=QAXjrIiMEvIvLIbujfgNZw==, figureFileBig=dSdKtZ2NzxdIt0eZvxWsew==, tableContent=null), ArticleFig(id=1218525108767998514, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图2, caption=EfficientNetV2网络的结构图, figureFileSmall=QAXjrIiMEvIvLIbujfgNZw==, figureFileBig=dSdKtZ2NzxdIt0eZvxWsew==, tableContent=null), ArticleFig(id=1218525108931576383, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.3, caption=Structure of Fused-MBConv and MBConv, figureFileSmall=21zARycxCtQRRoDMiLIffA==, figureFileBig=iIWKnz245YsZ3dz8ei2xPg==, tableContent=null), ArticleFig(id=1218525109036433992, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图3, caption=Fused-MBConv模块结构和MBConv模块结构

H为图像或特征图的高度,即垂直方向上的像素数量;W为图像或特征图的宽度,即水平方向上的像素数量;C为图像或特征图的通道数;Conv(Convolution)为卷积;Depthwise Conv为卷积操作的一种变体,是轻量化神经网络中常用的高效卷积操作

, figureFileSmall=21zARycxCtQRRoDMiLIffA==, figureFileBig=iIWKnz245YsZ3dz8ei2xPg==, tableContent=null), ArticleFig(id=1218525109162263132, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.4, caption=SKNet network architecture diagram, figureFileSmall=XOXYFTh4ZFy68sLG8c/A5g==, figureFileBig=MsbsEqtJL0wOgihD/oJLtw==, tableContent=null), ArticleFig(id=1218525109317452397, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图4, caption=SKNet网络结构图

Kernel为一个小尺寸的权重矩阵,在卷积操作中作为滑动窗口,用于在输入数据上提取局部特征;Split 为处理多尺度特征的重要步骤;Softmax 主要用于生成每个感受野(或每个尺度)的权重;Select 过程是从生成的多尺度特征中选择最相关的特征,并为每个特征分配一个权重,从而增强最终的输出特征

, figureFileSmall=XOXYFTh4ZFy68sLG8c/A5g==, figureFileBig=MsbsEqtJL0wOgihD/oJLtw==, tableContent=null), ArticleFig(id=1218525109439087226, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.5, caption=Principle of MBConv structure, figureFileSmall=AnxvmfcHIzSH8yH9bT67CA==, figureFileBig=5hmh9QosRGnRFm6D77wQAA==, tableContent=null), ArticleFig(id=1218525109577499273, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图5, caption=MBConv结构原理, figureFileSmall=AnxvmfcHIzSH8yH9bT67CA==, figureFileBig=5hmh9QosRGnRFm6D77wQAA==, tableContent=null), ArticleFig(id=1218525109694939796, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.6, caption=SK-MBConv structure diagram, figureFileSmall=+ENbAaoiqTNimaWZU/J57Q==, figureFileBig=HRkJXNMqLrwk7IwCvraCQA==, tableContent=null), ArticleFig(id=1218525109841740453, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图6, caption=SK-MBConv结构图, figureFileSmall=+ENbAaoiqTNimaWZU/J57Q==, figureFileBig=HRkJXNMqLrwk7IwCvraCQA==, tableContent=null), ArticleFig(id=1218525109980152498, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.7, caption=Improved algorithm flowchart, figureFileSmall=n3LBPVFdha9F0zcQV44ISQ==, figureFileBig=2EBOtEv5SnklOI81tdB9xw==, tableContent=null), ArticleFig(id=1218525110080815802, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图7, caption=改进后新算法的算法流程图

Input为经过预处理后的图像数据;Stem用于处理输入图像并生成初始的特征表示;Convolutional Layer为卷积层;Batch Normalization Layer用于加速训练速度、提高模型稳定性;Activation Layer用于引入非线性特性;Blocks是构建神经网络的基础模块,用于表示网络中各个层级的特定运算单元;Head为网络的最后部分,用于对提取的高级特征进行处理,输出最终的分类或回归结果;Dropout Layer为正则化技术,用于减少神经网络的过拟合现象;Dense Layer连接上一层的所有神经元到当前层的每个神经元;Global Average Pooling Layer为池化层;Output为模型的最终输出层,直接生成模型的预测结果

, figureFileSmall=n3LBPVFdha9F0zcQV44ISQ==, figureFileBig=2EBOtEv5SnklOI81tdB9xw==, tableContent=null), ArticleFig(id=1218525110244393674, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.8, caption=Recognition Accuracy, figureFileSmall=LCSE2Jqnu+m2j63nXCD6TA==, figureFileBig=iVlLVP1ZJS0ReHD2h9gesw==, tableContent=null), ArticleFig(id=1218525110357639893, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图8, caption=识别准确率, figureFileSmall=LCSE2Jqnu+m2j63nXCD6TA==, figureFileBig=iVlLVP1ZJS0ReHD2h9gesw==, tableContent=null), ArticleFig(id=1218525110496051937, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.9, caption=Training loss values, figureFileSmall=P42rIJpnH45Srp+ZBlbEQw==, figureFileBig=pvr6vZKnDNSvHSfunFP/sw==, tableContent=null), ArticleFig(id=1218525110634463976, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图9, caption=训练损失值, figureFileSmall=P42rIJpnH45Srp+ZBlbEQw==, figureFileBig=pvr6vZKnDNSvHSfunFP/sw==, tableContent=null), ArticleFig(id=1218525110768681716, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Fig.10, caption=Results of garbage image recognition, figureFileSmall=l1Q9GdYwgTIWZqDNP/ZqGw==, figureFileBig=Psrp8ibvSM71jXsDjTAC4Q==, tableContent=null), ArticleFig(id=1218525110877733629, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=CN, label=图10, caption=垃圾图像识别结果, figureFileSmall=l1Q9GdYwgTIWZqDNP/ZqGw==, figureFileBig=Psrp8ibvSM71jXsDjTAC4Q==, tableContent=null), ArticleFig(id=1218525111091643144, tenantId=1146029695717560320, journalId=1146123166801305609, articleId=1149780468662498136, language=EN, label=Table 1, caption=

Overall architecture of EfficientNetV2-S model

, figureFileSmall=null, figureFileBig=null, tableContent=
阶段 操作 步长 通道数 层数
1 Conv3×3 2 24 1
2 Fused-MBConv1,k3×3 1 24 2
3 Fused-MBConv4,k3×3 2 48 4
4 Fused-MBConv4,k3×3 2 64 4
5 MBConv4,k3×3,SE0.25 2 128 6
6 MBConv6,k3×3,SE0.25 1 160 9
7 MBConv6,k3×3,SE0.25 2 256 15
8 Conv1×1 &Pooling & FC 1 280 1

Table 2 Detailed data in the experimental dataset
| Category | Training set (Train) | Test set (Val) | Total |
|---|---|---|---|
| Battery | 4,253 | 472 | 4,725 |
| Biological | 4,433 | 492 | 4,925 |
| Brown glass | 2,732 | 303 | 3,035 |
| Cardboard | 4,010 | 445 | 4,455 |
| Clothes | 23,963 | 2,662 | 26,625 |
| Green glass | 2,831 | 314 | 3,145 |
| Metal | 3,461 | 384 | 3,845 |
| Paper | 4,725 | 525 | 5,250 |
| Plastic | 3,893 | 432 | 4,325 |
| Shoes | 8,897 | 988 | 9,885 |
| Trash | 3,137 | 348 | 3,485 |
| White glass | 3,488 | 387 | 3,875 |

Table 3 Comparison of experimental results from different models
| Model | Recognition accuracy/% | Number of model parameters | Total run time/s |
|---|---|---|---|
| AlexNet | 88.88 | 14,606,028 | 1,629 |
| GoogleNet | 94.94 | 10,340,180 | 9,197 |
| ResNet50 | 94.44 | 25,671,628 | 4,893 |
| MobileNet-V2 | 93.83 | 2,273,356 | 2,623 |
| VggNet19 | 93.89 | 75,627,596 | 28,673 |
| Swin Transformer | 74.20 | 27,557,394 | 273,294 |
| ConvNeXt | 89.72 | 27,829,356 | 268,279 |
| Ours | 99.71 | 20,346,732 | 10,856 |

Table 4 Ablation experiment 1
| Scheme | Transfer learning strategy | Recognition accuracy/% | Total run time/s |
|---|---|---|---|
| MBConv | Yes | 98.81 | 14,294 |
| MBConv+SGE | Yes | 34.31 | 15,885 |
| MBConv+ECA | Yes | 61.11 | 16,050 |
| MBConv+CBAM | Yes | 50.51 | 17,744 |
| MBConv+SK | Yes | 99.71 | 10,856 |

Table 5 Ablation experiment 2
| Baseline model | SK attention mechanism | Transfer learning | Recognition accuracy/% | Total run time/s |
|---|---|---|---|---|
| ✓ | | | 46.51 | 14,844 |
| ✓ | ✓ | | 48.34 | 11,337 |
| ✓ | | ✓ | 98.81 | 14,294 |
| ✓ | ✓ | ✓ | 99.71 | 10,856 |
A Garbage Image Classification Algorithm Based on the Improved EfficientNetV2 Network
Zhen-li ZHANG, Yuan CHEN, Hao FU, Lu ZENG*
Science Technology and Engineering | Papers·Automation and Computational Technology, 2025, 25(10): 4229-4238
Affiliations
  • School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China
  • Zhen-li ZHANG (born 1976), male, Han nationality, from Luan County, Hebei; master's degree, associate professor. Research interests: detection technology and control.

Corresponding author:

* Lu ZENG (born 1981), female, Han nationality, from Ganzhou, Jiangxi; master's degree, associate professor. Research interests: intelligent control technology.
Published: 2025-04-08 doi: 10.12404/j.issn.1671-1815.2403871

Mainstream garbage image classification algorithms suffer from poor dataset universality, a small number of recognizable garbage types, and algorithms that only work in specific environments. To address these problems while meeting the speed and accuracy requirements of garbage image classification, a garbage image classification algorithm based on an improved EfficientNetV2 network is proposed. The algorithm uses the EfficientNetV2 network as the baseline model, adds the SK (selective kernel) attention mechanism to speed up classification, and uses a transfer learning strategy to improve classification accuracy. Because it processes garbage images with a deep learning framework, no manual feature extraction from the dataset images is needed, and the number of recognizable garbage types is increased while classification remains fast and accurate. Experiments show that the new algorithm reaches an accuracy of 99.71% on a self-built dataset, at least 4.77 percentage points higher than other algorithms such as GoogleNet. In terms of time, it is at least 50% faster than algorithms such as VggNet19. The improved EfficientNetV2 network thus achieves more accurate and faster garbage classification and offers a scientific, efficient solution to the growing garbage problem.

garbage classification  /  deep learning  /  EfficientNetV2  /  convolutional neural network  /  SK attention mechanism
Zhen-li ZHANG, Yuan CHEN, Hao FU, Lu ZENG. A Garbage Image Classification Algorithm Based on the Improved EfficientNetV2 Network[J]. Science Technology and Engineering, 2025, 25(10): 4229-4238. DOI: 10.12404/j.issn.1671-1815.2403871
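With rapid social development, more and more garbage is produced. Handling this growing volume scientifically, effectively, and conveniently has become an urgent problem, with great significance for people's physical and mental health, quality of life, environmental protection, and sustainable development. Common garbage includes waste paper, plastic, glass, metal, and kitchen waste, and the speed and accuracy of garbage image recognition are the focus of current algorithms. In today's end-of-pipe garbage sorting, image processing is the most important step. Mathematical models and dedicated algorithms can be trained on garbage image datasets to obtain reasonably reliable classification results. In recent years, with advances in data processing, convolutional neural networks have proven to be an effective architecture for complex vision tasks[1]. A convolutional neural network is a set of interconnected neurons that processes the system input and assigns weights to the data according to its importance and the response of the other parameters. Convolutional neural networks are now widely used in image classification, including garbage image classification[2].
Existing garbage image classification methods include a lightweight YOLO model[3], which achieves high-precision detection and adapts to real scenes through lightweight design and efficient feature extraction. A method based on a small convolutional neural network (CNN)[4] optimizes the preprocessing algorithm and uses a lightweight design, giving it broad practical potential. A method based on ShuffleNet[5] responds quickly thanks to its lightweight design and suits garbage classification on resource-constrained devices. A method based on DenseNet[6] uses data augmentation and genetic-algorithm optimization to markedly improve classification accuracy and model interpretability. Methods based on MobileNet[7-8] substantially reduce the parameter count and computation while improving accuracy, which suits real-time garbage classification on embedded and mobile devices. Methods based on ResNet[9-10] classify garbage images with high accuracy and efficiency. Other studies use CapSA to optimize CNN hyperparameters together with a hybrid ECOC and ANN learning model for classification[11]; mean-shift filtering, a color-difference grayscale model, and an improved OTSU method[12]; or a pixel-level spatial attention mechanism (PSATT) combined with a label-smoothing regularized loss function and a stepped OneCycle learning-rate schedule[13]. Each of these methods has its own strengths and enables efficient, automated processing of garbage images. Although deep-learning-based garbage image classification has advanced considerably in recent years, shortcomings remain: generalization ability, dataset universality, real-time performance, long-term stability, hardware compatibility, and computing resource requirements all need further research, validation, and optimization to ensure effectiveness and reliability in real-world scenarios.
This paper proposes a new deep-learning-based garbage image classification algorithm. It uses the lightweight convolutional neural network EfficientNetV2 as the backbone, introduces the SK (selective kernel) attention mechanism to update the mobile inverted bottleneck convolution (MBConv) module and thereby speed up classification, and uses a transfer learning strategy to improve accuracy. The improved algorithm extracts image features automatically and classifies garbage images quickly and accurately.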
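Traditional neural network models obtain a more complex network by scaling any one of three dimensions: input image resolution, number of network channels, or network depth, as shown in Fig.1. Tan et al.[14] proposed the EfficientNet network, which uses compound scaling: the input resolution, network depth, and number of channels are all enlarged by suitable ratios to find the best parameters, maximize recognition accuracy, and obtain a better-performing model. In use, EfficientNet shows the following problems: the training images are large, so training takes a very long time; depthwise convolutions in the shallow layers train slowly and cannot make full use of existing accelerators; and EfficientNetV1 scales width and depth equally in every stage, although the stages differ in parameter count and structure, so uniform scaling is not optimal. To address these problems, Tan et al.[15] proposed the EfficientNetV2 network.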
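The compound scaling rule can be sketched in a few lines. This is a minimal illustration, not the authors' code: the constants ALPHA, BETA, and GAMMA are the values reported in the EfficientNet paper [14] rather than in this article, and the base resolution of 224 is an assumed default.

```python
# Compound scaling sketch. ALPHA, BETA, GAMMA are the constants reported in the
# EfficientNet paper [14]; they are assumptions here, not values from this article.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # bases for depth, width, resolution


def compound_scale(phi, base_resolution=224):
    """Return (depth multiplier, width multiplier, resolution) for coefficient phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution


def flops_factor(phi):
    """Approximate growth in FLOPs: depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
              f"resolution {r}, FLOPs x{flops_factor(phi):.2f}")
```

With these constants, each unit increase of the coefficient roughly doubles the computation, which is why a single coefficient is enough to trade accuracy against cost.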
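EfficientNetV2 outperforms earlier networks in both training speed and parameter count. The study verified that different image sizes call for different regularization, and the network adjusts its regularization dynamically according to image size, which improves prediction accuracy and also speeds up training[16]. EfficientNetV2 is therefore chosen for garbage classification. Fig.2 shows the structure of the EfficientNetV2 network.
The basic building blocks of the EfficientNetV2 model are the Fused-MBConv module and the MBConv module, both shown in Fig.3. Both modules use a BN layer and the SE (squeeze-and-excitation) module proposed in SENet[17]. The BN layer normalizes the input of every hidden-layer neuron in a deep network, and the SE module strengthens the model's ability to extract key information.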
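EfficientNetV2 introduces several architectural improvements that further reduce computational complexity while maintaining performance.
  • Initial convolution layer (stem): the stem is designed to process the input image more efficiently and reduce the number of operations.
  • Inverted residual blocks: these blocks use inverted residuals and linear bottlenecks, which reduce parameters and floating-point operations.
  • Efficient block design: the structure of each block is optimized to minimize redundant operations, for example through the squeeze-and-excitation mechanism and improved normalization.
  • Scaling coefficient: a coefficient Φ adjusts the width, depth, and resolution of the network together.
Through these innovations, EfficientNetV2 reduces computational complexity while maintaining or improving accuracy and speed.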
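Attention mechanisms play a vital role in deep learning, especially for image and sequence data. They help the model focus on the most relevant parts of an image, which improves classification accuracy and efficiency. Deciding where to add an attention mechanism requires considering the overall model architecture, the type of data being processed, and the specific needs of the target task.
SKNet uses a dynamic selection mechanism that adaptively chooses the receptive field according to the size of the target object. The mechanism acts on the convolution kernels: it adjusts kernel sensitivity dynamically, generates kernels whose size matches the image content, and searches for the most suitable kernel, with the aim of improving the model's perception of targets of different sizes. This scale adaptivity lets SKNet handle detection tasks whose targets vary in scale. The SKNet structure is shown in Fig.4. Because neurons can adjust their receptive fields in response to different stimuli, SKNet has an advantage when processing multi-scale targets.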
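The SKNet module works in the following three steps.
Step 1: Split. Two convolution kernels of different sizes are applied to the input feature map, using operations such as group convolution, for example 2×2 and 8×8 kernels in parallel. The smaller kernel (such as 2×2) is better at capturing local detail, while the larger kernel (such as 8×8) is better at capturing global context. This yields two different feature maps, U1 and U2. The step probes how sensitive each kernel is to the target in order to improve detection accuracy.
Step 2: Fuse. The two feature maps U1 and U2 from step 1 are merged into one fused feature map by element-wise addition or concatenation. Global average pooling then converts the fused map into a global feature vector S that embeds global information, and fully connected layers produce the vector Z, which enables accurate adaptive selection. Here Fgp denotes global average pooling, and Ffc denotes two fully connected layers that first reduce and then restore the dimension. The feature vector S obtained this way is reused in the following step.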
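$$S_c = F_{gp}(U_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} U_c(i,j)$$
$$Z = F_{fc}(S) = \delta[B(WS)]$$
where $U_c(i,j)$ is the value of feature map $U_c$ at position $(i,j)$; $H$ and $W$ are the height and width of $U_c$; $c$ is the channel index; and $S_c$ is the global average pooling result for channel $c$, so that $S$ is a vector with one entry per channel.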
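The choice of channel number directly affects model performance and computational complexity. First, the dimension of the input features must be considered: the channel number cannot exceed it, otherwise the computation fails or behaves abnormally. Second, more channels mean more computation, so the channel number should be chosen within the available computing resources to balance performance against efficiency. Finally, different tasks may need different channel numbers; complex tasks may require more channels to capture more complex feature interactions. W is the weight matrix of a fully connected layer, which applies a linear transformation to the input S and maps it into another feature space. In the gating mechanism, the ReLU function (δ) provides the activation and B (batch normalization) provides the normalization. The feature dimension d is set to max(c/r, L), where r is the reduction ratio and L is the minimum value of d. The resulting feature vector is used in the next step.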
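As a small worked example of the rule d = max(c/r, L), the sketch below computes the reduced dimension for a few channel counts. The defaults r = 16 and L = 32 are typical settings from the SKNet literature and are assumptions here; the article does not state the values it used.

```python
def reduced_dim(c, r=16, L=32):
    """Reduced feature dimension d = max(c / r, L).

    r (reduction ratio) and L (lower bound on d) default to assumed,
    commonly used values; integer division keeps d a whole number.
    """
    return max(c // r, L)


if __name__ == "__main__":
    for channels in (128, 256, 1024):
        print(channels, "->", reduced_dim(channels))
```

For narrow layers the lower bound L dominates, so the fully connected layer never shrinks below a fixed width; only for wide layers does the ratio r take effect.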
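Step 3: Select. A softmax operation on the vector Z gives the weights a_c and b_c. The original features U1 and U2 are multiplied element-wise by their weights a_c and b_c, and the results are summed pixel by pixel to give the output feature V. V is therefore the weighted sum of the branch feature maps U1 and U2, with the weights set dynamically by the softmax. Because a_c and b_c sum to 1, they act as the weight assigned to each branch.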
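$$a_c = \frac{e^{A_c Z}}{e^{A_c Z} + e^{B_c Z}}$$
$$b_c = \frac{e^{B_c Z}}{e^{A_c Z} + e^{B_c Z}}$$
$$V_c = a_c \cdot U_{1c} + b_c \cdot U_{2c}$$
where $A, B \in \mathbb{R}^{c \times d}$; $A_c \in \mathbb{R}^{d}$ is the $c$-th row of $A$; $a_c$ is the $c$-th element of $a$; and $B_c$ and $b_c$ are defined in the same way.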
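The Fuse and Select computations can be traced with a small pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: feature maps are nested lists of shape [C][H][W], the function and argument names are hypothetical, and batch normalization B is omitted for brevity.

```python
import math


def global_avg_pool(U):
    """Average each channel of U ([C][H][W]) to get the descriptor S (length C)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in U]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sk_fuse_select(U1, U2, W, A, B):
    """Fuse two branches and compute the selected output V.

    W is d x C; A and B are C x d. Batch normalization is omitted (assumption).
    """
    # Fuse: element-wise sum of the branches, then global average pooling -> S
    U = [[[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(c1, c2)]
         for c1, c2 in zip(U1, U2)]
    S = global_avg_pool(U)
    # Z = ReLU(W S)
    Z = [max(0.0, z) for z in matvec(W, S)]
    # Select: per-channel softmax across the two branches
    a, b = [], []
    for la, lb in zip(matvec(A, Z), matvec(B, Z)):
        m = max(la, lb)  # subtract the max for numerical stability
        ea, eb = math.exp(la - m), math.exp(lb - m)
        a.append(ea / (ea + eb))
        b.append(eb / (ea + eb))
    # V_c = a_c * U1_c + b_c * U2_c
    V = [[[a[c] * x + b[c] * y for x, y in zip(r1, r2)]
          for r1, r2 in zip(U1[c], U2[c])] for c in range(len(U1))]
    return V, a, b
```

Subtracting the larger logit before exponentiating leaves the softmax unchanged but avoids overflow for large values of Z.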
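The EfficientNetV2 network is divided into 8 stages, each with its own structure and function. The overall structure of the model is given in Table 1. Stage 1 is an ordinary convolution layer with a 3×3 kernel and stride 2, followed by a batch normalization (BN) layer and the Swish activation function. Stages 2 to 4 are built by stacking Fused-MBConv modules; the layer count in the table is the number of times the Fused-MBConv structure is repeated in that stage. The Fused-MBConv structure is designed to reduce the parameter count and computation of the network and so improve its efficiency. Stages 5 to 7 are built by stacking MBConv structures in the same way, and the layer count in the table is again the number of repetitions. Stage 8 consists of a 1×1 ordinary convolution layer, a pooling layer, and a fully connected layer, which together complete the final feature processing and classification. By using the Fused-MBConv and MBConv structures, EfficientNetV2 provides strong feature extraction while keeping the parameter count and computation low.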
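To make the stage layout concrete, the sketch below traces the spatial size of a 150×150 input through stages 1 to 7 of Table 1. It assumes "same"-style padding, so that each stage outputs ceil(input / stride), and that the stride is applied once per stage; both are assumptions made for illustration and are not stated in the article.

```python
import math

# (stage, operator, stride, channels, layers), transcribed from Table 1
STAGES = [
    (1, "Conv3x3", 2, 24, 1),
    (2, "Fused-MBConv1, k3x3", 1, 24, 2),
    (3, "Fused-MBConv4, k3x3", 2, 48, 4),
    (4, "Fused-MBConv4, k3x3", 2, 64, 4),
    (5, "MBConv4, k3x3, SE0.25", 2, 128, 6),
    (6, "MBConv6, k3x3, SE0.25", 1, 160, 9),
    (7, "MBConv6, k3x3, SE0.25", 2, 256, 15),
]


def trace_resolution(size):
    """Spatial size after each stage, assuming out = ceil(in / stride)."""
    sizes = []
    for _, _, stride, _, _ in STAGES:
        size = math.ceil(size / stride)
        sizes.append(size)
    return sizes


def total_stride():
    """Overall downsampling factor of stages 1 to 7."""
    return math.prod(stride for _, _, stride, _, _ in STAGES)


if __name__ == "__main__":
    print(trace_resolution(150))
    print("total stride:", total_stride())
    print("stacked layers:", sum(layers for *_, layers in STAGES))
```

Under these assumptions the feature map entering stage 8 is only a few pixels wide, which is why a global pooling layer followed by a single fully connected layer is enough for the classification head.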
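In EfficientNetV2, the MBConv[18] structure is a refinement of the inverted residual structure of the MobileNetV2 network. Unlike a conventional residual structure, the inverted residual first expands and then reduces the channel dimension, which lowers the risk of information loss and produces a sparser feature representation. Two adjustments were made. First, a different activation function is used: the MBConv of EfficientNetV2 replaces ReLU6 with Swish, which provides better nonlinear activation. Second, a channel attention mechanism is built into every MBConv: an SE structure added after the depthwise convolution layer computes channel weights and strengthens the expressive power of the model. With these adjustments, the MBConv structure keeps the advantages of the inverted residual while improving feature extraction and representation. The MBConv structure is shown in Fig.5.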
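The two activation functions mentioned above are easy to compare directly. The following sketch uses the standard definitions, ReLU6(x) = min(max(0, x), 6) and Swish(x) = x · sigmoid(x); the function names are illustrative.

```python
import math


def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


def swish(x):
    """Swish (SiLU): x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


if __name__ == "__main__":
    for x in (-6.0, -1.0, 0.0, 1.0, 8.0):
        print(f"x={x:5.1f}  relu6={relu6(x):.4f}  swish={swish(x):.4f}")
```

Unlike ReLU6, Swish is smooth, lets small negative values through, and is not capped at 6.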
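The MBConv structure consists of the following key components.
  • 1×1 ordinary convolution layer: the first part of MBConv, with a 1×1 kernel and stride 1, followed by batch normalization (BN) and the SiLU (Swish) activation function. Its main role is to expand the feature dimension.
  • k×k depthwise convolution layer: placed after the 1×1 convolution, with k usually 3 or 5. A smaller k preserves more local features and suits capturing image detail; a larger k helps capture wider context. The right k depends on task complexity and resource limits. The stride can be 1 or 2. With stride 1 the output feature map has about the same size as the input, which preserves more spatial information; with stride 2 the feature map is halved, which reduces computation and enlarges the receptive field. The stride should be chosen with the input feature map size and the network depth in mind. This step greatly reduces computation and parameter count and improves network efficiency.
  • SE module: placed after the depthwise convolution. It uses global pooling and a fully connected layer to compute a weight for every channel, which strengthens the expression of key information and helps the network capture important features.
  • 1×1 ordinary convolution layer: placed after the SE module, with a 1×1 kernel and a batch normalization (BN) layer. It reduces the feature dimension.
  • Dropout layer: the last step of MBConv, used for stochastic depth. Stochastic depth drops the entire MBConv module with a probability that is a linear function of the inverted residual layer index l, which reduces the effective depth of the network. It does not use conventional dropout of individual activations; instead the whole module is skipped with a given probability.
The MBConv module is designed to improve both the efficiency and the expressive power of the network. Through expansion, depthwise convolution, the SE module, reduction, and stochastic depth, it reduces computation and parameter count while improving feature representation and generalization.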
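Two of the points above can be checked numerically: how much a depthwise convolution saves over a standard convolution, and how a linear stochastic-depth schedule assigns drop probabilities. The sketch is illustrative only; the function names and the maximum drop probability of 0.2 are hypothetical and not taken from the article.

```python
def depthwise_weights(channels, k=3):
    """Weights in a k x k depthwise convolution: one k x k filter per channel."""
    return k * k * channels


def standard_conv_weights(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def drop_probability(block_index, total_blocks, max_drop=0.2):
    """Linear stochastic-depth schedule: deeper blocks are dropped more often.

    max_drop is a hypothetical value chosen for illustration.
    """
    return max_drop * block_index / total_blocks


if __name__ == "__main__":
    c = 128 * 4  # e.g. 128 input channels with expansion ratio 4
    print("depthwise:", depthwise_weights(c))
    print("standard: ", standard_conv_weights(c, c))
    print([round(drop_probability(i, 10), 2) for i in (0, 5, 10)])
```

For the same number of channels, the depthwise layer needs fewer weights by a factor equal to the channel count, which is the saving the text refers to.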
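Integrating the SK attention mechanism into the MBConv class is a new way to improve the performance of the garbage image classification model. It exploits the defining property of SK attention: feature maps are adjusted dynamically to fit features of different scales in the image. The improved MBConv is named the SK-MBConv module, and its structure is shown in Fig.6.
Benefits of integrating SK into MBConv. The integration adds, between the depthwise convolution and the SE (squeeze-and-excitation) module, a layer that dynamically adjusts the importance of feature maps. MBConv can then handle relationships between channels through the SE module and relationships between features of different sizes through the SK layer. This helps the network adapt to features of different scales and so improves recognition and classification. Compared with the conventional MBConv, the change makes the model more flexible and accurate on multi-scale features.
Effect of the improved MBConv structure. With the SK layer, the network processes and interprets image features of different scales more effectively. The dynamic selection mechanism adapts to both large-scale structural features and small-scale detail features. This is particularly suitable for garbage classification, because garbage images often contain objects of many sizes and types.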
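The new algorithm starts with the Stem module, which applies a 3×3 convolution to extract initial features, followed by batch normalization and an activation function. A series of MBConv modules follows. Each module first expands the input channels with a 1×1 convolution, then extracts spatial features with a depthwise separable convolution, optionally applies the SE module to strengthen channel attention, and finally projects the features to a lower-dimensional space with a 1×1 convolution. The Head module completes the process with global average pooling and a fully connected layer that produce the final class prediction. Fig.7 shows the flowchart of the new algorithm.
In summary, integrating the SK attention mechanism into the MBConv class markedly improves the performance of the EfficientNetV2 network on garbage image classification. The model handles features of different sizes more precisely and achieves higher overall classification accuracy. With this improvement, EfficientNetV2 shows higher accuracy and robustness on complex and variable image data, and the structural change opens new possibilities for deep learning in complex image processing tasks.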
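Transfer learning is a machine learning technique that uses knowledge learned on one task to improve performance on another, related task. In deep learning, a pretrained model usually serves as the initial model, and its parameters are adjusted to fit the new task. The core idea is to transfer the feature learning and generalization ability of the pretrained model to the new task. Sharing and transferring what the model learned on the source task speeds up convergence, reduces the need for large amounts of labeled data, and improves performance on the new task.
Specifically, transfer learning involves the following steps: select a suitable pretrained model and freeze the parameters of the initial model; define the new task and adapt the model structure to it; fine-tune and train on the data of the new task.
Transfer learning makes it possible to reuse existing knowledge and experience to build and train a model for a new task quickly, improving its performance and generalization. The advantage is especially clear when data are limited or the tasks are similar.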
The experiments ran in the following environment. Operating system: Windows 11 Home (Chinese edition); CPU: 12th Gen Intel(R) Core(TM) i7-12700H 2.30 GHz; GPU: NVIDIA GeForce RTX 3060 Laptop GPU; CUDA Version: 12.2; RAM: 16,384 MB; VRAM: 6,009 MB; Env: PyCharm 2020.1 x64.
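The garbage classification dataset used in this paper is the public Garbage Classification dataset, which covers garbage commonly found in daily life: 15,515 images in 12 classes. In practice the images obtained are far from ideal; for example, they differ in sharpness, some are rotated, and some are missing key parts. The prepared dataset was therefore preprocessed[18] to match real conditions more closely, and all images were resized to 150×150. The images are resized to a common size for the following reasons: (1) uniform input size: deep learning models require a fixed input size, which simplifies design and training; (2) computational efficiency: a fixed size favors parallel computation and improves overall performance; (3) avoiding size bias: large size differences can harm generalization; (4) data consistency: a common size makes the data more consistent and helps the model learn key features. The dataset was augmented by sharpening, random rotation, vertical flipping, and Gaussian blur. The augmented dataset contains 77,575 images, 62,060 more than the original 15,515, which improves the universality of the dataset and brings it closer to real conditions. The prepared dataset was split into a training set (90%) and a test set (10%). In summary, uniform image size and image augmentation improve the quality and consistency of the dataset, reduce the negative effect of image quality differences on the model, increase data diversity, and make the dataset more reliable. Enlarging the dataset also widens its coverage and improves the generalization of the model. The number of images in each class is given in Table 2.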
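The dataset arithmetic can be checked against Table 2 with a short script. The per-class totals are transcribed from the table; the rounding rule (test share = total // 10, the rest goes to training) is inferred from the table values and is an assumption, not something the authors state.

```python
# Per-class totals after augmentation, transcribed from Table 2
TOTALS = {
    "Battery": 4725, "Biological": 4925, "Brown glass": 3035,
    "Cardboard": 4455, "Clothes": 26625, "Green glass": 3145,
    "Metal": 3845, "Paper": 5250, "Plastic": 4325,
    "Shoes": 9885, "Trash": 3485, "White glass": 3875,
}

ORIGINAL_IMAGES = 15515  # size of the dataset before augmentation


def split_90_10(total):
    """Return (train, test) with the test share rounded down (inferred rule)."""
    test = total // 10
    return total - test, test


if __name__ == "__main__":
    grand_total = sum(TOTALS.values())
    print("augmented total:", grand_total)
    print("added images:", grand_total - ORIGINAL_IMAGES)
    for name, total in TOTALS.items():
        train, test = split_90_10(total)
        print(f"{name:12s} train={train:6d} test={test:5d}")
```

The totals show that the augmented set is exactly five times the original, and that the Clothes class alone accounts for about a third of all images, so the classes are noticeably imbalanced.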
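Eight models were compared: AlexNet, GoogleNet, ResNet50, MobileNet-V2, VggNet19, Swin Transformer, ConvNeXt, and the improved algorithm (named Ours). Each model was trained on the GPU with batch_size=8 and epoch=30. Three evaluation metrics were used: training-set recognition accuracy (hereafter recognition accuracy), number of model parameters, and total run time. The recognition accuracy is the best value over the 30 epochs. The results are given in Table 3. Fig.8 shows the recognition accuracy of the models over the 30 epochs, and Fig.9 shows the training loss curves.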
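Table 3 shows that the accuracy of the new model is far higher than that of the other seven algorithms. Among the eight models, the new model does not have the largest memory requirement. MobileNet-V2 needs only 2,273,356 parameters, but both its training-set and test-set recognition accuracy are lower than those of the new model. In terms of run time, the new model is not the slowest. Taking parameter count and total run time together, the new model has practical value for garbage classification.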
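The comparisons drawn from Table 3 can be reproduced with a few lines of arithmetic. The values below are transcribed from the table; the helper names are illustrative.

```python
# model -> (accuracy in %, parameter count, total run time in s), from Table 3
RESULTS = {
    "AlexNet": (88.88, 14_606_028, 1_629),
    "GoogleNet": (94.94, 10_340_180, 9_197),
    "ResNet50": (94.44, 25_671_628, 4_893),
    "MobileNet-V2": (93.83, 2_273_356, 2_623),
    "VggNet19": (93.89, 75_627_596, 28_673),
    "Swin Transformer": (74.20, 27_557_394, 273_294),
    "ConvNeXt": (89.72, 27_829_356, 268_279),
    "Ours": (99.71, 20_346_732, 10_856),
}


def margin_over_best_baseline(results, ours="Ours"):
    """Accuracy gap (percentage points) between `ours` and the best other model."""
    best = max(acc for name, (acc, _, _) in results.items() if name != ours)
    return round(results[ours][0] - best, 2)


def time_saving(results, baseline, ours="Ours"):
    """Fraction of the baseline's run time saved by `ours` (negative if slower)."""
    return 1 - results[ours][2] / results[baseline][2]


if __name__ == "__main__":
    print("margin over best baseline:", margin_over_best_baseline(RESULTS))
    for name in RESULTS:
        if name != "Ours":
            print(f"{name:17s} time saving: {time_saving(RESULTS, name):+.1%}")
```

The script makes the trade-off explicit: the new model is faster than VggNet19, Swin Transformer, and ConvNeXt, but slower than AlexNet, GoogleNet, ResNet50, and MobileNet-V2.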
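Combining the EfficientNetV2 network with the SK attention mechanism makes the new algorithm more stable than before the combination, in the following respects.
(1) Stability of feature extraction. By scaling network depth, width, and resolution, EfficientNetV2 achieves a good performance balance across tasks and datasets. This strategy controls the complexity of the feature representation and avoids instability and overfitting during feature extraction. The SK attention mechanism further strengthens feature representation by dynamically selecting the kernel size to fit the input features, which improves adaptation to different scenes and input distributions. This adaptivity makes feature extraction more stable and lets the model capture and express key features more accurately.
(2) Generalization ability. The optimized structure of EfficientNetV2 maintains good generalization across datasets and tasks. The SK attention mechanism further improves adaptation to the input data distribution, so the model can adjust flexibly to different vision tasks and generalize better.
(3) Adversarial resistance and robustness. Besides improving feature representation, the SK attention mechanism strengthens the adversarial resistance and robustness of the model. By selecting the kernel size dynamically, it concentrates attention on important features and suppresses irrelevant information. This helps the model focus on key visual features in complex scenes and improves robustness to complex inputs. More precise feature selection and representation also helps the model avoid overfitting and generalize beyond the training data. Stable feature extraction of this kind is one of the key factors for stability in practical applications.
In summary, the combination of the EfficientNetV2 network and the SK attention mechanism helps model stability in theory and also performs well on the practical garbage classification task, allowing the model to handle a wide range of complex inputs more stably and reliably.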
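To test whether adding the SK attention mechanism to the MBConv module of the baseline model is the best strategy, a comparison experiment was designed on the effect of different attention mechanisms on model performance. The results are given in Table 4. With the transfer learning strategy used in every case, the combination of the SK attention mechanism and the MBConv module is the best in both recognition accuracy and total run time.
To verify the contribution of the individual modules and methods, an ablation experiment was designed with EfficientNetV2 as the baseline model. The results are given in Table 5. Introducing only the SK attention mechanism into the MBConv module raises recognition accuracy by 1.83 percentage points over the baseline. Using only the transfer learning strategy gives a recognition accuracy of 98.81%. With or without transfer learning, adding the SK attention mechanism reduces the total run time. When both the SK attention mechanism and the transfer learning strategy are added to the baseline network, recognition accuracy rises by 53.20 percentage points and total run time falls by 26.87% relative to the baseline.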
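The ablation figures quoted above follow directly from Table 5, as the following sketch shows. The values are transcribed from the table; the dictionary layout and helper names are illustrative.

```python
# (SK attention, transfer learning) -> (accuracy in %, total run time in s), from Table 5
ABLATION = {
    (False, False): (46.51, 14_844),  # baseline only
    (True, False): (48.34, 11_337),   # baseline + SK attention
    (False, True): (98.81, 14_294),   # baseline + transfer learning
    (True, True): (99.71, 10_856),    # baseline + both
}


def accuracy_gain(config, base=(False, False)):
    """Accuracy difference in percentage points between two configurations."""
    return round(ABLATION[config][0] - ABLATION[base][0], 2)


def time_reduction_pct(config, base=(False, False)):
    """Relative reduction of total run time, in percent."""
    t_base, t_new = ABLATION[base][1], ABLATION[config][1]
    return round((t_base - t_new) / t_base * 100, 2)


if __name__ == "__main__":
    print("SK only, accuracy gain:", accuracy_gain((True, False)))
    print("both, accuracy gain:   ", accuracy_gain((True, True)))
    print("both, time reduction %:", time_reduction_pct((True, True)))
```

The same helpers also show where the accuracy comes from: most of the gain is due to transfer learning, while the SK attention mechanism contributes mainly to the shorter run time.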
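The test results show that both the SK attention mechanism and the transfer learning strategy improve model performance, and that using them together works best. The proposed model is therefore well suited to garbage recognition.
To analyze the classification results of the improved algorithm more intuitively, its predictions were visualized: a set of garbage images common in real scenes was fed to the model for recognition, with the results shown in Fig.10.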
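To improve the accuracy of garbage classification and reduce its time, a garbage image classification algorithm based on an improved EfficientNetV2 network was proposed. Experiments on the same dataset with several conventional models show that the new algorithm performs very well on garbage classification, reaching a recognition accuracy of 99.71%, far higher than AlexNet (88.88%), GoogleNet (94.94%), ResNet50 (94.44%), MobileNet-V2 (93.83%), VggNet19 (93.89%), Swin Transformer (74.20%), and ConvNeXt (89.72%). The ablation experiments compared the model with and without the SK attention mechanism and with and without the transfer learning strategy, and also compared different attention mechanisms in the MBConv module with transfer learning enabled. The new algorithm extracts features efficiently and shows good generalization and robustness. It also has limitations. First, like other deep learning models, its performance depends heavily on large, high-quality training data. This matters especially for garbage classification, where the dataset must cover all garbage classes and scenes to avoid model bias. Second, the complexity of deep learning models limits their interpretability, which is a particular concern in applications where classification results must be explained or acted upon; improving interpretability is an important direction for future research. Future work will explore techniques for improving interpretability, such as visualization methods or explanations generated with generative adversarial networks. It will also examine how to optimize model performance when data are scarce or unlabeled, and how environment awareness and adaptive learning can make the model more robust under different environmental conditions.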
  • 国家自然科学基金(62363013)
Volume 25, Issue 10, 2025
Article information
doi: 10.12404/j.issn.1671-1815.2403871
  • Received: 2024-05-24
  • First published online: 2025-07-09
  • Publication date: 2025-04-08
  • Revised: 2025-01-14
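Author information
    School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000

Corresponding author:

* Zeng Lu (曾璐) (b. 1981), female, Han ethnicity, from Ganzhou, Jiangxi; master's degree, associate professor. Research interest: intelligent control technology. E-mail: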