Article(id=1251226692853908040, tenantId=1146029695717560320, journalId=1251194772300279900, issueId=1251226682309423223, articleNumber=null, orderNo=null, doi=10.20079/j.issn.1001-893x.240618002, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=1718640000000, receivedDateStr=2024-06-18, revisedDate=1722787200000, revisedDateStr=2024-08-05, acceptedDate=null, acceptedDateStr=null, onlineDate=1776245290243, onlineDateStr=2026-04-15, pubDate=1764259200000, pubDateStr=2025-11-28, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1776245290243, onlineIssueDateStr=2026-04-15, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1776245290243, creator=13041195026, updateTime=1776245290243, updator=13041195026, issue=Issue{id=1251226682309423223, tenantId=1146029695717560320, journalId=1251194772300279900, year='2025', volume='65', issue='11', pageStart='1729', pageEnd='1954', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=1, specialIssue=null, createTime=1776245287729, creator=13041195026, updateTime=1776246742124, updator=13041195026, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1251232782568080068, tenantId=1146029695717560320, journalId=1251194772300279900, issueId=1251226682309423223, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1251232782568080069, tenantId=1146029695717560320, journalId=1251194772300279900, issueId=1251226682309423223, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=1836, endPage=1843, ext={EN=ArticleExt(id=1251226693193646686, articleId=1251226692853908040, tenantId=1146029695717560320, 
journalId=1251194772300279900, language=EN, title=Scalable Subspace Learning for Clustering Data Streams, columnId=1251226683223781499, journalTitle=Telecommunication Engineering, columnName=Application Fundamental Research and Advanced Technology, runingTitle=null, highlight=null, articleAbstract=

Traditional data stream clustering methods lack online dimensionality reduction capabilities for high-dimensional data, which limits their clustering performance. To address this issue, a Scalable Subspace Learning for Clustering Data Streams (S2LCStream) method is proposed. First, the method establishes a projection relationship between historical data and new data through scalable subspace learning, projecting the new data into the subspace spanned by the historical data to obtain their cluster assignments in real time. Second, to keep the cluster assignments accurate over time, the method checks the continuously arriving data stream for consistency of data distribution, capturing concept drift and adjusting the cluster assignments through a backtracking mechanism to adapt to dynamically changing data distributions. Finally, the proposed method is validated on multiple real-world datasets, demonstrating its effectiveness in handling high-dimensional data streams. Specifically, S2LCStream maintains high clustering accuracy while efficiently handling concept drift.

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Hongwei YIN, Yuzhou NI, Wenjun HU), CN=ArticleExt(id=1251226699304747843, articleId=1251226692853908040, tenantId=1146029695717560320, journalId=1251194772300279900, language=CN, title=基于可扩展子空间学习的数据流聚类方法, columnId=1251226683383165054, journalTitle=电讯技术, columnName=应用基础与前沿技术, runingTitle=null, highlight=null, articleAbstract=

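Traditional data stream clustering methods lack online dimensionality reduction for high-dimensional data, which limits their clustering performance. To address this problem, a data stream clustering method based on scalable subspace learning (Scalable Subspace Learning for Clustering Data Streams, S2LCStream) is proposed. First, scalable subspace learning establishes a projection relationship between historical data and newly arrived data, and the new data are projected into the subspace spanned by the historical data to obtain their cluster assignments in real time. Second, to keep the cluster assignments accurate across time steps, the continuously arriving data stream is checked for consistency of data distribution to capture concept drift, and a backtracking mechanism adjusts the cluster assignments to adapt to the dynamically changing data distribution. Finally, tests on multiple real-world datasets verify the effectiveness of the proposed method in handling high-dimensional data streams. The proposed method handles concept drift in data streams efficiently while maintaining high clustering performance.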

, correspAuthors=null, authorNote=null, correspAuthorsNote=
Wenjun HU Email:
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=W8ZWSpl0mBSTl4K7J9Bdxg==, magXml=zVpWIi1BCquBQGumlIj6AA==, pdfUrl=null, pdf=d6fN3zSvq81pAp6sCiBiwA==, pdfFileSize=3449909, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=gMBAWS7LqZgbYjksGqF4XQ==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=4IeFfsp+p1Ufb8mUdpNPFA==, mapNumber=null, authorCompany=null, fund=null, authors=

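Hongwei YIN, male, was born in Susong, Anhui, in 1990. He received his Ph.D. in 2019 and is currently an associate professor. His main research interests include machine learning, data mining, and cluster analysis.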

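Yuzhou NI, male, was born in Suzhou, Jiangsu, in 1999. He received his bachelor's degree in 2018 and is currently a master's student. His main research interest is cluster analysis.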

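Wenjun HU, male, was born in Jixi, Anhui, in 1977. He received his Ph.D. in 2012 and is currently a professor. His main research interests include machine learning, pattern recognition, data mining, and intelligent systems.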

, authorsList=尹宏伟, 倪钰洲, 胡文军)}, authors=[Author(id=1251226699866784626, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251226699988419451, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226699866784626, language=EN, stringName=Hongwei YIN, firstName=Hongwei, middleName=null, lastName=YIN, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, 3, address=1School of Information Engineering,Huzhou University,Huzhou 31300,China
2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China
3Huzhou Key Laboratory of Aquatic Robot Technology,Huzhou 313000,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251226700076499842, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226699866784626, language=CN, stringName=尹宏伟, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, 3, address=1湖州师范学院 信息工程学院,浙江 湖州 313000
2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000
3湖州市水域机器人技术重点实验室,浙江 湖州 313000, bio={"content":"

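Hongwei YIN, male, was born in Susong, Anhui, in 1990. He received his Ph.D. in 2019 and is currently an associate professor. His main research interests include machine learning, data mining, and cluster analysis.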

"}, bioImg=null, bioContent=

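Hongwei YIN, male, was born in Susong, Anhui, in 1990. He received his Ph.D. in 2019 and is currently an associate professor. His main research interests include machine learning, data mining, and cluster analysis.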

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251226699543823187, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=1, ext=[AuthorCompanyExt(id=1251226699548017491, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Information Engineering,Huzhou University,Huzhou 31300,China), AuthorCompanyExt(id=1251226699552211796, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1湖州师范学院 信息工程学院,浙江 湖州 313000)]), AuthorCompany(id=1251226699619320666, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=2, ext=[AuthorCompanyExt(id=1251226699623514969, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China), AuthorCompanyExt(id=1251226699631903578, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000)]), AuthorCompany(id=1251226699719983968, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=3, ext=[AuthorCompanyExt(id=1251226699724178273, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, 
companyId=1251226699719983968, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=3Huzhou Key Laboratory of Aquatic Robot Technology,Huzhou 313000,China), AuthorCompanyExt(id=1251226699732566883, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699719983968, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=3湖州市水域机器人技术重点实验室,浙江 湖州 313000)])]), Author(id=1251226700181357448, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251226700294603663, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226700181357448, language=EN, stringName=Yuzhou NI, firstName=Yuzhou, middleName=null, lastName=NI, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, address=1School of Information Engineering,Huzhou University,Huzhou 31300,China
2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251226700374295446, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226700181357448, language=CN, stringName=倪钰洲, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, address=1湖州师范学院 信息工程学院,浙江 湖州 313000
2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000, bio={"content":"

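Yuzhou NI, male, was born in Suzhou, Jiangsu, in 1999. He received his bachelor's degree in 2018 and is currently a master's student. His main research interest is cluster analysis.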

"}, bioImg=null, bioContent=

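Yuzhou NI, male, was born in Suzhou, Jiangsu, in 1999. He received his bachelor's degree in 2018 and is currently a master's student. His main research interest is cluster analysis.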

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251226699543823187, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=1, ext=[AuthorCompanyExt(id=1251226699548017491, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Information Engineering,Huzhou University,Huzhou 31300,China), AuthorCompanyExt(id=1251226699552211796, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1湖州师范学院 信息工程学院,浙江 湖州 313000)]), AuthorCompany(id=1251226699619320666, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=2, ext=[AuthorCompanyExt(id=1251226699623514969, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China), AuthorCompanyExt(id=1251226699631903578, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000)])]), Author(id=1251226700462375837, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, 
email=hoowenjun@foxmail.com, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251226700596593573, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226700462375837, language=EN, stringName=Wenjun HU, firstName=Wenjun, middleName=null, lastName=HU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, 3, address=1School of Information Engineering,Huzhou University,Huzhou 31300,China
2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China
3Huzhou Key Laboratory of Aquatic Robot Technology,Huzhou 313000,China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251226700684673968, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, authorId=1251226700462375837, language=CN, stringName=胡文军, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, 2, 3, address=1湖州师范学院 信息工程学院,浙江 湖州 313000
2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000
3湖州市水域机器人技术重点实验室,浙江 湖州 313000, bio={"content":"

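Wenjun HU, male, was born in Jixi, Anhui, in 1977. He received his Ph.D. in 2012 and is currently a professor. His main research interests include machine learning, pattern recognition, data mining, and intelligent systems.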

"}, bioImg=null, bioContent=

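Wenjun HU, male, was born in Jixi, Anhui, in 1977. He received his Ph.D. in 2012 and is currently a professor. His main research interests include machine learning, pattern recognition, data mining, and intelligent systems.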

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251226699543823187, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=1, ext=[AuthorCompanyExt(id=1251226699548017491, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Information Engineering,Huzhou University,Huzhou 31300,China), AuthorCompanyExt(id=1251226699552211796, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699543823187, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1湖州师范学院 信息工程学院,浙江 湖州 313000)]), AuthorCompany(id=1251226699619320666, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=2, ext=[AuthorCompanyExt(id=1251226699623514969, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000,China), AuthorCompanyExt(id=1251226699631903578, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699619320666, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2浙江省现代农业资源智慧管理与应用研究重点实验室,浙江 湖州 313000)]), AuthorCompany(id=1251226699719983968, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, xref=3, ext=[AuthorCompanyExt(id=1251226699724178273, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, 
companyId=1251226699719983968, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=3Huzhou Key Laboratory of Aquatic Robot Technology,Huzhou 313000,China), AuthorCompanyExt(id=1251226699732566883, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, companyId=1251226699719983968, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=3湖州市水域机器人技术重点实验室,浙江 湖州 313000)])])], keywords=[Keyword(id=1251226700831474618, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=EN, orderNo=1, keyword=data stream clustering), Keyword(id=1251226700974080968, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=EN, orderNo=2, keyword=subspace learning), Keyword(id=1251226701062161360, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=EN, orderNo=3, keyword=scalable subspace learning), Keyword(id=1251226701154436058, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=EN, orderNo=4, keyword=concept drift detection), Keyword(id=1251226701267682272, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=CN, orderNo=1, keyword=数据流聚类), Keyword(id=1251226701376734184, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=CN, orderNo=2, keyword=子空间学习), Keyword(id=1251226701481591790, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=CN, orderNo=3, keyword=可扩展子空间学习), Keyword(id=1251226701569672180, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, language=CN, orderNo=4, keyword=概念漂移检测)], refs=[Reference(id=1251226705222910071, tenantId=1146029695717560320, 
journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2023, volume=213, issue=null, pageStart=1, pageEnd=17, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=SUÁREZ-CETRULO A L, QUINTANA D, CERVANTES A, journalName=Expert Systems with Applications, refType=null, unstructuredReference=SUÁREZ-CETRULO A L, QUINTANA D, CERVANTES A. A survey on machine learning for recurring concept drifting data streams[J]. Expert Systems with Applications, 2023, 213:1-17., articleTitle=A survey on machine learning for recurring concept drifting data streams, refAbstract=null), Reference(id=1251226705306796159, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2024, volume=36, issue=5, pageStart=1857, pageEnd=1873, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=GAO Y J, FANG Z Q, XU J C, journalName=IEEE Transactions on Knowledge and Data Engineering, refType=null, unstructuredReference=GAO Y J, FANG Z Q, XU J C, et al. An efficient and distributed framework for real-time trajectory stream clustering[J]. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(5):1857-1873., articleTitle=An efficient and distributed framework for real-time trajectory stream clustering, refAbstract=null), Reference(id=1251226705407459461, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2018, volume=68, issue=null, pageStart=756, pageEnd=764, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=LIN C C, CHEN C S, CHEN A P, journalName=Applied Soft Computing, refType=null, unstructuredReference=LIN C C, CHEN C S, CHEN A P. Using intelligent computing and data stream mining for behavioral finance associated with market profile and financial physics[J]. 
Applied Soft Computing, 2018, 68:756-764., articleTitle=Using intelligent computing and data stream mining for behavioral finance associated with market profile and financial physics, refAbstract=null), Reference(id=1251226705529094283, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2022, volume=49, issue=增刊1, pageStart=544, pageEnd=554, url=null, language=null, rfNumber=[4], rfOrder=3, authorNames=庞兴龙, 朱国胜, journalName=计算机科学, refType=null, unstructuredReference=庞兴龙, 朱国胜.基于半监督学习的网络流量分析研究[J].计算机科学, 2022, 49(增刊1):544-554., articleTitle=基于半监督学习的网络流量分析研究, refAbstract=null), Reference(id=1251226705633951888, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2022, volume=614, issue=null, pageStart=1, pageEnd=18, url=null, language=null, rfNumber=[5], rfOrder=4, authorNames=KASHANI E S, BAGHERI SHOURAKI S, NOROUZI Y, journalName=Information Sciences, refType=null, unstructuredReference=KASHANI E S, BAGHERI SHOURAKI S, NOROUZI Y. Evolving data stream clustering based on constant false clustering probability[J]. 
Information Sciences, 2022, 614:1-18., articleTitle=Evolving data stream clustering based on constant false clustering probability, refAbstract=null), Reference(id=1251226705738809493, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2011, volume=51, issue=9, pageStart=65, pageEnd=68, url=null, language=null, rfNumber=[6], rfOrder=5, authorNames=张国毅, 王晓峰, 张旭洲, journalName=电讯技术, refType=null, unstructuredReference=张国毅, 王晓峰, 张旭洲.基于数据流聚类的动态信号分选框架[J].电讯技术, 2011, 51(9):65-68., articleTitle=基于数据流聚类的动态信号分选框架, refAbstract=null), Reference(id=1251226705860444313, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2021, volume=51, issue=1, pageStart=91, pageEnd=102, url=null, language=null, rfNumber=[7], rfOrder=6, authorNames=BEZDEK J C, KELLER J M, journalName=IEEE Transactions on Systems,Man,and Cybernetics:Systems, refType=null, unstructuredReference=BEZDEK J C, KELLER J M. Streaming data analysis:clustering or classification?[J]. IEEE Transactions on Systems,Man,and Cybernetics:Systems, 2021, 51(1):91-102., articleTitle=Streaming data analysis:clustering or classification?, refAbstract=null), Reference(id=1251226705977884829, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2024, volume=18, issue=4, pageStart=1, pageEnd=22, url=null, language=null, rfNumber=[8], rfOrder=7, authorNames=LI J P, YU H, ZHANG Z Y, journalName=ACM Transactions on Knowledge Discovery from Data, refType=null, unstructuredReference=LI J P, YU H, ZHANG Z Y, et al. Concept drift adaptation by exploiting drift type[J]. 
ACM Transactions on Knowledge Discovery from Data, 2024, 18(4):1-22., articleTitle=Concept drift adaptation by exploiting drift type, refAbstract=null), Reference(id=1251226706086936736, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2013, volume=46, issue=1, pageStart=1, pageEnd=31, url=null, language=null, rfNumber=[9], rfOrder=8, authorNames=SILVA J A, FARIA E R, BARROS R C, journalName=ACM Computing Surveys, refType=null, unstructuredReference=SILVA J A, FARIA E R, BARROS R C, et al. Data stream clustering: a survey[J]. ACM Computing Surveys, 2013, 46(1):1-31., articleTitle=Data stream clustering: a survey, refAbstract=null), Reference(id=1251226706187600035, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2003, volume=null, issue=null, pageStart=81, pageEnd=92, url=null, language=null, rfNumber=[10], rfOrder=9, authorNames=AGGARWAL C C, HAN J W, WANG J Y, journalName=null, refType=null, unstructuredReference=AGGARWAL C C, HAN J W, WANG J Y, et al. A framework for clustering evolving data streams[C]//The 29th International Conference on very Large Data Bases. Berlin:Morgan Kaufmann, 2003:81-92., articleTitle=A framework for clustering evolving data streams, refAbstract=null), Reference(id=1251226706275680424, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2006, volume=null, issue=null, pageStart=328, pageEnd=339, url=null, language=null, rfNumber=[11], rfOrder=10, authorNames=CAO F, ESTERT M, QIAN W N, journalName=null, refType=null, unstructuredReference=CAO F, ESTERT M, QIAN W N, et al. Density-based clustering over an evolving data stream with noise[C]//The 2006 SIAM International Conference on Data Mining. 
Bethesda: Society for Industrial and Applied Mathematics, 2006:328-339., articleTitle=Density-based clustering over an evolving data stream with noise, refAbstract=null), Reference(id=1251226706401509548, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2017, volume=120, issue=null, pageStart=99, pageEnd=117, url=null, language=null, rfNumber=[12], rfOrder=11, authorNames=XU J, WANG G Y, LI T R, journalName=Knowledge-Based Systems, refType=null, unstructuredReference=XU J, WANG G Y, LI T R, et al. Fat node leading tree for data stream clustering with density peaks[J]. Knowledge-Based Systems, 2017, 120:99-117., articleTitle=Fat node leading tree for data stream clustering with density peaks, refAbstract=null), Reference(id=1251226706510561457, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2017, volume=67, issue=null, pageStart=228, pageEnd=238, url=null, language=null, rfNumber=[13], rfOrder=12, authorNames=DE ANDRADE J, HRUSCHKA E R, GAMA J, journalName=Expert Systems with Applications, refType=null, unstructuredReference=DE ANDRADE J, HRUSCHKA E R, GAMA J. An evolutionary algorithm for clustering data streams with a variable number of clusters[J]. Expert Systems with Applications, 2017, 67:228-238., articleTitle=An evolutionary algorithm for clustering data streams with a variable number of clusters, refAbstract=null), Reference(id=1251226706615419062, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2017, volume=4, issue=1, pageStart=64, pageEnd=74, url=null, language=null, rfNumber=[14], rfOrder=13, authorNames=PUSCHMANN D, BARNAGHI P, TAFAZOLLI R, journalName=IEEE Internet of Things Journal, refType=null, unstructuredReference=PUSCHMANN D, BARNAGHI P, TAFAZOLLI R. Adaptive clustering for dynamic IoT data streams[J]. 
IEEE Internet of Things Journal, 2017, 4(1):64-74., articleTitle=Adaptive clustering for dynamic IoT data streams, refAbstract=null), Reference(id=1251226706707693755, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2013, volume=25, issue=6, pageStart=1410, pageEnd=1424, url=null, language=null, rfNumber=[15], rfOrder=14, authorNames=WANG C D, LAI J H, HUANG D, journalName=IEEE Transactions on Knowledge and Data Engineering, refType=null, unstructuredReference=WANG C D, LAI J H, HUANG D, et al. SVStream: a support vector-based algorithm for clustering data streams[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(6):1410-1424., articleTitle=SVStream: a support vector-based algorithm for clustering data streams, refAbstract=null), Reference(id=1251226708305723587, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2022, volume=49, issue=7, pageStart=25, pageEnd=30, url=null, language=null, rfNumber=[16], rfOrder=15, authorNames=陈圆圆, 王志海, journalName=计算机科学, refType=null, unstructuredReference=陈圆圆, 王志海.基于聚类分区的多维数据流概念漂移检测方法[J].计算机科学, 2022, 49(7):25-30., articleTitle=基于聚类分区的多维数据流概念漂移检测方法, refAbstract=null), Reference(id=1251226708381221064, tenantId=1146029695717560320, journalId=1251194772300279900, articleId=1251226692853908040, doi=null, pmid=null, pmcid=null, year=2023, volume=16, issue=1, pageStart=29, pageEnd=44, url=null, language=null, rfNumber=[17], rfOrder=16, authorNames=ZUBAROGLU A, ATALAY V, journalName=Statistical Analysis and Data Mining: the ASA Data Science Journal, refType=null, unstructuredReference=ZUBAROGLU A, ATALAY V. Online embedding and clustering of evolving data streams[J]. 
Statistical Analysis and Data Mining: the ASA Data Science Journal, 2023, 16(1):29-44.
[18] PENG X, TANG H J, ZHANG L, et al. A unified framework for representation-based subspace clustering of out-of-sample and large-scale data[J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(12):2499-2512.
[19] 刘博, 谢博鋆, 朱杰. 快速可扩展的子空间聚类算法[J]. 模式识别与人工智能, 2016, 29(1):11-21. (in Chinese)
[20] 朱林, 雷景生, 毕忠勤. 一种基于数据流的软子空间聚类算法[J]. 软件学报, 2013, 24(11):2610-2627. (in Chinese)
[21] 陈金立, 付善腾, 朱熙铖. 阵元失效下基于核范数和SCAD惩罚的MIMO雷达DOA估计[J]. 电讯技术, 2023, 63(1):39-46. (in Chinese)
[22] CUTURI M. UCI machine learning repository[EB/OL]. [2024-05-25]. https://doi.org/10.24432/C52G70.
[23] MARTINEZ A, BENAVENTE R. The AR face database[R]. Columbus: Ohio State University, 1998:318-323.
[24] GEORGHIADES A S, BELHUMEUR P N, KRIEGMAN D J. From few to many: illumination cone models for face recognition under variable lighting and pose[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6):643-660.
[25] GROSS R, MATTHEWS I, COHN J, et al. Multi-PIE[J]. Image and Vision Computing, 2010, 28(5):807-813.
[26] CHUA T S, TANG J H, HONG R C, et al. NUS-WIDE: a real-world web image database from National University of Singapore[C]//The 8th ACM International Conference on Image and Video Retrieval. Santorini: ACM, 2009:1-9.
[27] 陈圆圆. 数据流概念漂移检测及自适应聚类算法研究[D]. 北京: 北京交通大学, 2022. (in Chinese)
[28] CHEN J, WANG Z, YANG S X, et al. Two-stage sparse representation clustering for dynamic data streams[J]. IEEE Transactions on Cybernetics, 2023, 53(10):6408-6420.
[29] CHEN J, YANG S X, FAHY C, et al. Online sparse representation clustering for evolving data streams[J]. IEEE Transactions on Neural Networks and Learning Systems, 2025, 36(1):525-539.

Figure 1. Main framework of the proposed method
Figure 2. Clustering accuracy on each dataset
Figure 3. Clustering mutual information on each dataset
Figure 4. Parameter experiments for λ on different datasets
Figure 5. Number of concept drifts and clustering accuracy on each dataset
Figure 6. Running-time comparison on different datasets

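| Dataset | Samples | Clusters | Feature dimension |
|---|---|---|---|
| PEMS-SF[22] | 440 | 7 | 138672 |
| AR[23] | 1400 | 100 | 2200 |
| ExYaleB[24] | 2414 | 38 | 2016 |
| MPIE[25] | 8916 | 286 | 115 |
| NusWide[26] | 30000 | 31 | 639 |
| Electricity[27] | 45312 | 2 | 8 |
| 2023_1_WX | 1518 | 2 | 125 |
| 2022_WX | 11741 | 4 | 773 |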
Table 1. Dataset information

Mean accuracy ± standard deviation

| Dataset (window size h) | CluStream | DenStream | EmCStream | SVStream | OSRC | TSSRC | S2LCStream |
|---|---|---|---|---|---|---|---|
| PEMS-SF[22] (h=50) | 0.433±0.041 | 0.278±0.064 | 0.338±0.033 | 0.215±0.028 | 0.37±0.028 | 0.41±0.041 | 0.478±0.053 |
| AR[23] (h=200) | 0.289±0.022 | 0.379±0.023 | 0.275±0.100 | 0.151±0.040 | 0.552±0.022 | 0.504±0.035 | 0.671±0.080 |
| ExYaleB[24] (h=200) | 0.156±0.016 | 0.311±0.046 | 0.175±0.049 | 0.093±0.005 | 0.285±0.028 | 0.244±0.019 | 0.443±0.120 |
| MPIE[25] (h=500) | 0.212±0.011 | 0.233±0.014 | 0.082±0.045 | 0.050±0.010 | 0.578±0.003 | 0.482±0.057 | 0.513±0.135 |
| NusWide[26] (h=500) | 0.135±0.016 | 0.162±0.021 | 0.105±0.009 | 0.202±0.059 | 0.269±0.043 | 0.258±0.042 | 0.184±0.015 |
| Electricity[27] (h=500) | 0.556±0.04 | 0.199±0.033 | 0.531±0.038 | 0.482±0.033 | 0.589±0.063 | 0.612±0.056 | 0.574±0.023 |
| 2023_1_WX (h=100) | 0.957±0.026 | 0.401±0.167 | 0.721±0.184 | 0.721±0.184 | 0.93±0.02 | 0.97±0.019 | 0.966±0.038 |
| 2022_WX (h=500) | 0.664±0.056 | 0.418±0.204 | 0.670±0.119 | 0.886±0.048 | 0.884±0.048 | 0.884±0.048 | 0.769±0.138 |
Table 2. Per-window mean clustering accuracy and standard deviation

Mean mutual information ± standard deviation

| Dataset (window size h) | CluStream | DenStream | EmCStream | SVStream | OSRC | TSSRC | S2LCStream |
|---|---|---|---|---|---|---|---|
| PEMS-SF[22] (h=50) | 0.441±0.040 | 0.147±0.136 | 0.334±0.042 | 0.002±0.023 | 0.239±0.062 | 0.335±0.042 | 0.114±0.073 |
| AR[23] (h=200) | 0.734±0.011 | 0.469±0.047 | 0.583±0.145 | 0.179±0.092 | 0.816±0.004 | 0.825±0.01 | 0.882±0.037 |
| ExYaleB[24] (h=200) | 0.377±0.046 | 0.189±0.069 | 0.177±0.081 | 0.509±0.147 | 0.547±0.018 | 0.525±0.014 | 0.673±0.098 |
| MPIE[25] (h=500) | 0.756±0.010 | 0.585±0.026 | 0.346±0.201 | 0.136±0.052 | 0.83±0.004 | 0.824±0.014 | 0.833±0.061 |
| NusWide[26] (h=500) | 0.026±0.018 | 0.080±0.018 | 0.023±0.053 | 0.006±0.006 | 0.194±0.018 | 0.198±0.016 | 0.176±0.046 |
| Electricity[27] (h=500) | 0.013±0.015 | 0.060±0.029 | 0.005±0.017 | 0.0001±0.007 | 0.009±0.017 | 0.02±0.025 | 0.015±0.003 |
| 2023_1_WX (h=100) | 0.004±0.004 | 0.040±0.028 | 0.055±0.146 | 0.0007±0.0006 | 0.042±0.071 | 0.013±0.355 | 0.075±0.257 |
| 2022_WX (h=500) | 0.020±0.015 | 0.098±0.030 | 0.146±0.138 | 0.006±0.004 | 0.059±0.046 | 0.024±0.021 | 0.039±0.053 |
Table 3. Per-window mean clustering mutual information and standard deviation

Scalable Subspace Learning for Clustering Data Streams
Hongwei YIN 1, 2, 3, Yuzhou NI 1, 2, Wenjun HU 1, 2, 3
Telecommunication Engineering (电讯技术) | Application Fundamental Research and Advanced Technology, 2025, 65(11): 1836-1843
Published: 2025-11-28 doi: 10.20079/j.issn.1001-893x.240618002
Affiliations
  • 1 School of Information Engineering, Huzhou University, Huzhou 313000, China
  • 2 Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000, China
  • 3 Huzhou Key Laboratory of Aquatic Robot Technology, Huzhou 313000, China
Author information
  • Hongwei YIN, male, born in Susong, Anhui, in 1990. He received his Ph.D. in 2019 and is now an associate professor. His research interests include machine learning, data mining, and cluster analysis.

    Yuzhou NI, male, born in Suzhou, Jiangsu, in 1999. He received his bachelor's degree in 2018 and is now a master's student. His research interest is cluster analysis.

    Wenjun HU, male, born in Jixi, Anhui, in 1977. He received his Ph.D. in 2012 and is now a professor. His research interests include machine learning, pattern recognition, data mining, and intelligent systems.

Corresponding author:

Wenjun HU Email:

Traditional data stream clustering methods lack online dimensionality reduction for high-dimensional data, which limits their clustering performance. To address this issue, a method called Scalable Subspace Learning for Clustering Data Streams (S2LCStream) is proposed. First, scalable subspace learning establishes a projection relationship between historical data and new data, and the new data are projected into the subspace spanned by the historical data to obtain their clustering assignment in real time. Second, to keep the clustering assignment accurate over time, the continuously arriving stream is checked for consistency of the data distribution; concept drift is captured, and a backtracking mechanism adjusts the clustering assignment to the changing distribution. Finally, tests on multiple real-world datasets validate the effectiveness of the proposed method on high-dimensional data streams: S2LCStream maintains high clustering performance while handling concept drift efficiently.

data stream clustering  /  subspace learning  /  scalable subspace learning  /  concept drift detection
Hongwei YIN, Yuzhou NI, Wenjun HU. Scalable Subspace Learning for Clustering Data Streams[J]. Telecommunication Engineering, 2025, 65(11): 1836-1843. DOI: 10.20079/j.issn.1001-893x.240618002
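With the rapid development of information technology, data streams are becoming a central object of data processing and analysis. As data that arrive continuously and change dynamically, data streams are widespread in traffic supervision, e-commerce, social media, medical diagnosis, and other fields[1]. For example, analyzing traffic data streams makes it possible to predict congestion patterns and optimize traffic-signal control[2]; in financial markets, analyzing real-time transaction streams makes it possible to catch anomalous trades and reduce investment risk[3]; and analyzing user-behavior streams makes it possible to build user profiles for personalized content recommendation[4]. To discover latent regularities and associations in massive data streams in real time, and to extract key knowledge from them, under limited resources and without supervision, unsupervised data stream clustering has become one of the important tasks in machine learning and data mining[5-7].
Unlike traditional clustering methods, which target closed, static data, data stream clustering targets open, dynamic streaming data. "Open" means that the number of samples keeps growing over time; "dynamic" means that the underlying data distribution also changes over time. To cope with the growing sample size, data stream clustering must perform efficient online clustering of the continuously arriving new data. To cope with the changing data distribution, known as concept drift[8], it must also be able to detect and adapt to concept drift so that the changing distribution is described accurately[7-8].
To obtain clustering assignments online, existing data stream clustering methods usually adopt a two-stage approach, also called the online-offline two-layer framework[9]; examples include CluStream[10] and DenStream[11]. In these methods, summary information about the stream is generated online and stored in a dedicated data structure. When new samples arrive, the summary is updated in real time so that it continues to describe the data distribution accurately. In the offline stage, a clustering algorithm is run periodically on the generated summaries. To further improve real-time clustering performance, fully online clustering has been proposed in recent years; it re-clusters each arriving data instance so as to keep the clustering result up to date, and includes DPClust[12], FEAC-Stream[13], and Adaptive Stream k-means[14].
Establishing a concept drift detection mechanism for the data stream can effectively improve the accuracy of the clustering assignment. DenStream[11] and CluStream[10] discard part of the historical data so that the online component keeps its data summaries up to date and thereby adapts to the changing data distribution. However, such methods cannot capture concept drift explicitly. SVStream[15] builds its data stream clustering framework on support vector description and iteratively maintains minimal enclosing spheres of the data to track the boundaries of the clusters in the stream, but it still cannot capture concept drift explicitly. To capture concept drift explicitly, a concept drift detection method based on equal-density partitioning[16] has been proposed. It partitions the data into equal-density regions and applies a chi-square test to the per-partition statistics to detect changes in the data distribution. In [17], uniform manifold approximation and projection (UMAP) is used for online embedding and clustering of evolving data streams; it captures concept drift adaptively and thereby improves clustering performance.
Although the above methods achieve fairly good results in online clustering and concept drift detection, they lack an efficient online dimensionality reduction mechanism, which greatly limits their ability to handle high-dimensional data streams and makes it hard to capture the concept drift in such streams. To solve this problem, this paper proposes a new data stream clustering method based on scalable subspace learning (Scalable Subspace Learning for Clustering Data Streams, S2LCStream). First, the algorithm uses scalable subspace learning[18] to establish a projection relationship between historical data and new data, projecting new samples into the subspace spanned by the historical data to obtain their clustering assignment in real time. Second, to keep the clustering assignment accurate over time, it checks the continuously arriving stream for consistency of the data distribution, captures the windows in which concept drift occurs, and uses a backtracking mechanism to adjust the clustering assignment to the changing distribution. Experiments on several simulated datasets and real data streams verify that the proposed algorithm outperforms current data stream clustering algorithms in clustering performance and can efficiently capture concept drift in high-dimensional data streams.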
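In traditional self-representation subspace learning[19-21], for a dataset $X=\{x_1,x_2,\dots,x_n\}\in\mathbb{R}^{d\times n}$, the dataset itself is used as the dictionary and each sample is expressed as an affine combination of the other samples, which captures the geometric structure of the data in low-dimensional subspaces. The objective function is as follows: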
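Assuming the standard self-representation model, and using the symbols $Z$, $E$, and $\lambda$ that the text defines immediately afterwards, the objective of Equation (1) can be sketched as below. Applying the same generic norm to $E$ is an assumption; the text leaves the choice of norm open.

```latex
\begin{equation}
\min_{Z,\,E}\ \|Z\| + \lambda \|E\| \quad \text{s.t.}\quad X = XZ + E
\tag{1}
\end{equation}
```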
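where $Z\in\mathbb{R}^{n\times n}$ is the self-representation matrix of the original data, $E\in\mathbb{R}^{d\times n}$ is the noise perturbation matrix, and $\lambda$ is a trade-off parameter. Depending on the structure to be preserved, $\|\cdot\|$ can be any of several matrix norms that constrain the self-representation matrix. For example, the $\ell_0$ norm captures the local structure of the data; because the $\ell_0$ norm leads to an NP-hard problem, it is usually replaced by the $\ell_1$ norm, its tightest convex approximation. Imposing a low-rank constraint on the self-representation matrix through the nuclear norm strengthens the preservation of the global structure of the original data and reduces the influence of outliers.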
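In the model of Equation (1), the learned self-representation matrix $Z$ is treated as an adjacency matrix that encodes the pairwise similarities among all samples in $X$. Applying spectral clustering or a similar method to it yields the clustering assignment of the original data. However, traditional self-representation subspace learning can learn only from historical data and cannot be extended to new data. Let $Y=\{y_1,y_2,\dots,y_l\}\in\mathbb{R}^{d\times l}$ denote the new data. By establishing a projection relationship between historical and new data, each sample $y_i$ is projected into the subspace spanned by the $n$ samples of $X$ to obtain its cluster label. The objective function of scalable subspace learning is as follows: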
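A ridge-regression form consistent with the description that follows (an $\ell_2$ data-fitting term plus a quadratic penalty weighted by $\gamma$) is sketched below as Equation (2). The squared norms are an assumption, chosen so that the problem admits a closed-form solution.

```latex
\begin{equation}
\min_{c_i}\ \|y_i - X c_i\|_2^2 + \gamma \|c_i\|_2^2, \qquad i = 1, \dots, l
\tag{2}
\end{equation}
```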
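Here, the projection between historical and new data is established by minimizing the $\ell_2$ norm. In addition, a quadratic constraint on the projection vector prevents overfitting, and $\gamma$ is a trade-off parameter. The solution of this optimization problem is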
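If the objective is the ridge-regression problem $\min_{c_i}\|y_i - Xc_i\|_2^2+\gamma\|c_i\|_2^2$ (an assumed form that matches the description above), setting its gradient to zero gives the closed-form solution of Equation (3):

```latex
\begin{align}
-2X^{\top}(y_i - X c_i) + 2\gamma c_i &= 0 \notag\\
c_i^{*} &= \left(X^{\top}X + \gamma I\right)^{-1} X^{\top} y_i \tag{3}
\end{align}
```

For $\gamma>0$ the $n\times n$ matrix $X^{\top}X+\gamma I$ is always invertible, and it depends only on the historical data, so it can be factorized once and reused for every new sample.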
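To obtain the cluster label of a new sample, after the optimal projection vector $c_i$ is obtained, let $\delta_j(c_i)$ denote the $j$-th nonzero element of this vector; the residual of sample $y_i$ with respect to the subspace spanned by $X$ in the $j$-th dimension is then defined as follows: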
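One reading of this definition, assumed here and borrowed from sparse-representation classification, is that $\delta_j(c_i)$ keeps the entries of $c_i$ that belong to historical samples of cluster $j$ and sets all other entries to zero, with $k$ the number of clusters. Under that assumption the residual of Equation (4) would read:

```latex
\begin{equation}
r_j(y_i) = \left\| y_i - X\,\delta_j(c_i) \right\|_2, \qquad j = 1, \dots, k
\tag{4}
\end{equation}
```

Some formulations additionally divide the residual by $\|\delta_j(c_i)\|_2$; whether that normalization is intended here cannot be determined from the text.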
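By comparing the residuals of sample $y_i$ across dimensions, the dimension with the smallest residual is taken as the cluster label $f(y_i)$ of the sample: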
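The labeling rule amounts to $f(y_i)=\arg\min_j r_j(y_i)$. The sketch below illustrates the whole out-of-sample step under stated assumptions: the projection is taken to be ridge regression, the residual for cluster $j$ uses only the coefficients of historical samples labeled $j$, and the code is kept dependency-free (a linear-algebra library would normally be used). The names `solve_linear`, `project`, and `assign_label` are hypothetical.

```python
# Sketch only: ridge projection of a new sample onto labelled historical
# samples, followed by minimum-residual label assignment.

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def project(samples, y, gamma):
    """Ridge coefficients c = (X^T X + gamma I)^{-1} X^T y; samples are the columns of X."""
    n = len(samples)
    gram = [[sum(p * q for p, q in zip(samples[i], samples[j]))
             + (gamma if i == j else 0.0) for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(s, y)) for s in samples]
    return solve_linear(gram, rhs)


def assign_label(samples, labels, y, gamma=0.1):
    """Return the cluster whose historical samples reconstruct y with the smallest residual."""
    c = project(samples, y, gamma)
    best_label, best_res = None, float("inf")
    for lab in sorted(set(labels)):
        # delta_j(c): keep only the coefficients of samples in cluster `lab`
        recon = [0.0] * len(y)
        for s, coef, l in zip(samples, c, labels):
            if l == lab:
                recon = [r + coef * v for r, v in zip(recon, s)]
        res = sum((a - b) ** 2 for a, b in zip(y, recon)) ** 0.5
        if res < best_res:
            best_label, best_res = lab, res
    return best_label


history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # four 2-D samples
history_labels = [0, 0, 1, 1]
print(assign_label(history, history_labels, [0.8, 0.2]))  # close to cluster 0
print(assign_label(history, history_labels, [0.2, 0.8]))  # close to cluster 1
```

The Gram matrix depends only on the historical samples, so in a streaming setting it would be built once per historical set rather than once per new sample as in this simplified version.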
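Consider a data stream $X=\{x_1,x_2,\dots,x_n\}$ with feature dimension $d$, whose samples arrive continuously and whose total number tends to infinity. To obtain the clustering assignment of new data in real time, scalable subspace learning is used to establish the projection relationship between historical and new data. First, a sliding window of size $h$ represents the data arriving at different times; let $h_t$ denote the set of samples in the window at time $t$, so that the sets obtained at times 1 and 2 are $h_1$ and $h_2$. Following the principle of scalable subspace learning, $h_1$ and $h_2$ are combined to form the historical data, and their cluster labels $H$ are obtained through Equation (1).
When new data $h_t$ arrive, they are projected into the subspace spanned by the historical data through Equation (2), and their cluster labels $H_t$ are then obtained in real time through Equation (5).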
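A minimal sketch of the windowing step, assuming non-overlapping windows of $h$ samples (the text does not say whether consecutive windows overlap); `windows` is a hypothetical helper name.

```python
def windows(stream, h):
    """Yield consecutive, non-overlapping windows of h samples from an iterable."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == h:
            yield buf
            buf = []
    if buf:  # trailing partial window, if the stream ends mid-window
        yield buf


it = windows(range(10), 4)
history = next(it) + next(it)  # h_1 and h_2 form the initial historical data
later = list(it)               # h_3, h_4, ... are projected as they arrive
print(history, later)
```

Because the helper is a generator, it never holds more than one window in memory, which suits a stream whose total length is unbounded.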
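Because the data distribution keeps changing over time, a model learned from historical data cannot adapt to the distribution of the new data. To keep the clustering assignment accurate, the continuously arriving stream must be checked for consistency of the data distribution so that any concept drift is captured; this is concept drift detection. To this end, a concept drift detection period $p$ is set during clustering, and the new data acquired within a period are clustered twice: once by scalable subspace learning, which yields the assignment of each window, and once by traditional subspace representation learning, which clusters all new data in the period independently. The Adjusted Rand Index (ARI) between the two assignments determines whether concept drift occurred within the period.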
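The ARI-based check can be sketched as follows. The ARI formula is the standard pair-counting one; the function names and the example threshold are illustrative assumptions, since the text does not give a threshold value.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two labelings of the same samples."""
    n = len(a)
    if n < 2:
        return 1.0
    sum_ij = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def drift_detected(projected_labels, reclustered_labels, theta):
    """Flag drift when the two labelings agree less than the threshold theta."""
    return adjusted_rand_index(projected_labels, reclustered_labels) < theta


print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, renamed labels
print(drift_detected([0, 0, 1, 1], [0, 1, 0, 1], theta=0.5))
```

ARI is invariant to how the clusters are numbered, which matters here because the projected labels and the independently re-clustered labels need not use the same label identifiers.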
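As shown in Figure 1, take the first drift detection period as an example. Windows $h_1$ and $h_2$ are combined to form the historical data, and their cluster labels $H_{1:2}$ are obtained through Equation (1). When the new windows $h_3$, $h_4$, and $h_5$ arrive, they are projected into the subspace spanned by the historical data through Equation (2), and their cluster labels $H_{3:5}$ are obtained in real time through Equation (5); combining the two gives the labels $H_{1:5}$. Windows $h_1$ to $h_5$ are then combined and clustered through Equation (1) to obtain the labels $Q_{1:5}$. The same data thus have two sets of labels, $H_{1:5}$ and $Q_{1:5}$, where $Q_{1:5}$ is obtained without any prior knowledge. The consistency between them is then measured: let $\zeta=\mathrm{ARI}(H_{1:5},Q_{1:5})$ denote the ARI between the two labelings and let $\theta$ denote the concept drift threshold. If the consistency is below the threshold, that is, $\zeta<\theta$, the data characteristics have changed and concept drift is present. In that case the detection period $p$ is adaptively reduced, the threshold $\theta$ is adaptively adjusted, and the algorithm returns to the initialization stage and recomputes the ARI between the labels $H_{1:4}$ and $Q_{1:4}$, repeating until $\zeta\ge\theta$. If the consistency is not below the threshold, that is, $\zeta\ge\theta$, the characteristics of the input data are unchanged and no concept drift has occurred; the algorithm outputs the clustering result for the current period and proceeds to the new data of the next period.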
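Based on the above, this paper proposes a data stream clustering method based on scalable subspace learning (S2LCStream). S2LCStream starts with an initialization stage, then continuously projects data window by window and checks for concept drift periodically. When a check period is reached, the algorithm clusters the most recent segment of data and tests for concept drift. If there is no drift, it outputs the clustering result and continues projecting data until the next period; if drift is detected, it backtracks to the previous checkpoint and re-initializes.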
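The S2LCStream algorithm is described as follows:
Input: X: data stream; k: number of clusters; h: window size; nDim: feature dimension.
Output: cluster labels of X.
1 Initialize the algorithm: establish the projection relationship between historical and new data through scalable subspace learning.
2 Obtain the cluster labels H of the historical data through Equation (1), then obtain the cluster labels Ht of the new data in real time through Equation (5).
3 Obtain the cluster labels Qt of the new data through Equation (1).
4 When the concept drift detection period is reached, compute ζ = ARI(Ht, Qt) to test for concept drift.
5 If concept drift is detected, return to step 1.
6 If no concept drift is detected, output the clustering result and continue with the new data.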
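The control flow of steps 1 to 6 can be sketched as below. This is not the authors' implementation: the clustering, projection, and agreement routines are passed in as callables, and several details that the text leaves open are filled with loudly labeled assumptions (shrinking the period by one window per drift, multiplying the threshold by a hypothetical `decay` factor, falling back to the re-clustered labels when a single window still disagrees, and using the accepted block as the next historical data).

```python
def s2lc_stream_sketch(windows, cluster, project, agreement,
                       theta=0.6, period=3, decay=0.9):
    """Control-flow sketch of steps 1-6; returns (labels, number_of_drift_events).

    cluster(data) -> labels, project(hist, hist_labels, new) -> labels and
    agreement(a, b) -> float are caller-supplied stand-ins (hypothetical names).
    """
    windows = list(windows)
    hist = windows[0] + windows[1]            # step 1: h_1 and h_2 form the history
    hist_labels = list(cluster(hist))         # step 2: labels H of the history
    output, drifts, i = list(hist_labels), 0, 2
    while i < len(windows):
        p, th = min(period, len(windows) - i), theta
        while True:
            new = [x for w in windows[i:i + p] for x in w]
            proj = list(project(hist, hist_labels, new))  # step 2: projected labels H_t
            q = list(cluster(hist + new))                 # step 3: independent labels Q
            zeta = agreement(hist_labels + proj, q)       # step 4: consistency score
            if zeta >= th:                                # step 6: no drift, keep H_t
                accepted = proj
                break
            drifts += 1                                   # step 5: drift detected
            if p == 1:                                    # nothing left to shrink:
                accepted = q[len(hist):]                  # fall back to re-clustered labels
                break
            p, th = p - 1, th * decay                     # assumed adaptation rule
        output.extend(accepted)
        hist, hist_labels = new, accepted                 # assumed: accepted block becomes history
        i += p
    return output, drifts


# Toy stand-ins on 1-D data, only to exercise the control flow.
def toy_cluster(data):
    return [0 if x < 5 else 1 for x in data]


def toy_project(hist, hist_labels, new):
    def nearest(x):
        return min(range(len(hist)), key=lambda j: abs(hist[j] - x))
    return [hist_labels[nearest(x)] for x in new]


def toy_agreement(a, b):
    return sum(u == v for u, v in zip(a, b)) / len(a)  # stands in for ARI


stream = [[1, 2], [8, 9], [1, 9], [2, 8], [3, 7]]
labels, drifts = s2lc_stream_sketch(stream, toy_cluster, toy_project, toy_agreement)
print(labels, drifts)
```

Passing the three routines as arguments keeps the loop independent of any particular subspace solver, so the same skeleton can be exercised with toy functions, as here, or with real clustering and ARI code.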
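S2LCStream processes the input data window by window, where $h$ is the window size, i.e., the size of the sample set in the sliding window. After a fixed number of windows, S2LCStream checks once for concept drift and outputs the result; this span is called the drift check period $p$ and is set to $3\times h$. When concept drift is detected, the threshold $\theta$ and the period $p$ are adjusted adaptively. Within one period, S2LCStream performs projection clustering three times on windows of size $h$ and, for the drift check, once on the data of size $p$.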
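The scalable subspace learning algorithm performs projection clustering three times on windows of size $h$ and once on the data of size $p$. Its time complexity is $O(t_1(hm^2+h^3)+nh^2+t_2h^3)$, where $h$ is the number of instances in a window, $m$ is the feature dimension, $n$ is the total number of data points, $t_1$ is the number of iterations of the augmented Lagrange multiplier method (ALM), and $t_2$ is the number of k-means iterations.
In each period, the scalable subspace learning algorithm runs three times on windows of size $h$ and once on the data of size $p$, so the complexity of a single period is $O(4(t_1(hm^2+h^3)+nh^2+t_2h^3))$. For the whole stream, with $N$ periods, the overall complexity is $O(N\times 4\times(t_1(hm^2+h^3)+nh^2+t_2h^3))$. It depends on the data dimension $m$, the window size $h$, the total number of data points $n$, the number of k-means iterations $t_2$, and the number of ALM iterations $t_1$.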
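To get a feel for which term dominates, the expression inside the big-O can be evaluated directly. The sketch below treats it as a rough operation count with all constants set to 1; the iteration counts `t1=50` and `t2=20` are purely illustrative assumptions, while `h`, `m`, and `n` are the AR values from Tables 1 and 2.

```python
def period_cost(h, m, n, t1, t2):
    """Rough operation count 4 * (t1*(h*m^2 + h^3) + n*h^2 + t2*h^3) for one drift-check period."""
    return 4 * (t1 * (h * m**2 + h**3) + n * h**2 + t2 * h**3)


def stream_cost(periods, h, m, n, t1, t2):
    """Rough operation count for a stream of `periods` drift-check periods."""
    return periods * period_cost(h, m, n, t1, t2)


# AR-like setting: h=200, m=2200, n=1400; t1 and t2 are assumed values.
total = period_cost(h=200, m=2200, n=1400, t1=50, t2=20)
subspace_term = 4 * 50 * 200 * 2200**2
print(total, subspace_term / total)
```

With these numbers the $t_1 h m^2$ term accounts for almost the entire count, which illustrates why the feature dimension $m$ drives the running time on high-dimensional data.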
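The experiments are designed to verify the performance and efficiency of S2LCStream on 8 datasets, comparing it with 2 static clustering algorithms and 6 data stream clustering algorithms. They are implemented in Python with the associated data processing and machine learning libraries. Each experiment is repeated independently: the whole algorithm is run several times, each time with a different random seed so that the data order differs, and the mean and standard deviation over the runs are reported to assess stability and accuracy. The experiments comprise a comparison of the algorithms on each dataset, a parameter study of how key parameters affect clustering quality, and an efficiency evaluation that records running times. Two common clustering metrics are used: normalized mutual information (NMI) and accuracy (ACC). ACC lies in [0, 1]; values close to 1 indicate that the clustering result agrees closely with the labels. NMI measures clustering quality from an information-theoretic perspective. It normalizes the mutual information by the entropies of the clusterings so that its value lies between 0 and 1, which makes different clustering results comparable; 0 means the two clusterings are completely unrelated and 1 means they are identical. The larger the NMI, the closer the clustering is to the true classes. These experiments provide an in-depth analysis of S2LCStream's ability to adapt to concept drift and of its efficiency in high-dimensional data stream clustering.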
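Both metrics can be sketched in a few lines. Two assumptions are made here: ACC is computed under the best one-to-one mapping between predicted and true labels (found by brute force, so suitable only for a handful of clusters), and NMI is normalized by the arithmetic mean of the two entropies, which is one of several common conventions; the text does not say which one is used.

```python
from collections import Counter
from itertools import permutations
from math import log


def clustering_accuracy(truth, pred):
    """Best accuracy over one-to-one mappings from predicted to true labels (brute force)."""
    t_ids, p_ids = sorted(set(truth)), sorted(set(pred))
    k = max(len(t_ids), len(p_ids))
    targets = t_ids + [None] * (k - len(t_ids))  # pad so every predicted id has a target
    counts = Counter(zip(pred, truth))
    best = 0
    for perm in permutations(targets, len(p_ids)):
        best = max(best, sum(counts[(p, t)] for p, t in zip(p_ids, perm)))
    return best / len(truth)


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def nmi(truth, pred):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    mi = sum(c / n * log(c * n / (ct[t] * cp[p]))
             for (t, p), c in Counter(zip(truth, pred)).items())
    denom = (entropy(truth) + entropy(pred)) / 2
    return mi / denom if denom > 0 else 1.0


truth = [0, 0, 1, 1]
print(clustering_accuracy(truth, [1, 1, 0, 0]), nmi(truth, [1, 1, 0, 0]))
print(clustering_accuracy(truth, [0, 1, 0, 1]), nmi(truth, [0, 1, 0, 1]))
```

For datasets with many clusters, such as AR with 100, the brute-force search would be replaced by an assignment solver such as the Hungarian algorithm.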
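To verify the effectiveness of the proposed method, experiments are carried out on 8 real-world datasets, listed in Table 1. The PEMS-SF[22] dataset contains 440 traffic-flow time series describing the occupancy rates of different freeway lanes in the San Francisco Bay Area; each day is a separate time series with a feature dimension of 138,672. The task on this dataset is to assign each day to the correct day of the week, giving 7 classes. The AR[23] dataset contains 1,400 face images of 100 subjects, with an image feature dimension of 2,200. The ExYaleB[24] dataset contains 2,414 face images of 38 subjects, with a feature dimension of 2,016. The MPIE[25] dataset contains 8,916 face images of 286 individuals under different conditions, processed with principal component analysis to retain 98% of the information. The NusWide[26] dataset contains 30,000 web images in 31 classes, with a feature dimension of 639. The Electricity[27] dataset provides 45,312 records of market price fluctuations for electrical energy, with 8 price-affecting factors and 2 classes. In addition, two real-world elevator datasets are used to validate the clustering performance of the proposed method on real data streams. 2023_1_WX collects 125 kinds of fault information from 1,218 elevators in Wuxing District, Huzhou, in January 2023, divided into 2 risk levels. 2022_WX collects 773 kinds of fault information from 11,741 elevators in Wuxing District, Huzhou, in 2022, divided into 4 risk levels.
The baselines are 2 static clustering algorithms, k-means and SLRR[18], and 6 data stream clustering algorithms. CluStream[10] performs data stream clustering by combining online micro-cluster computation with offline macro-cluster analysis. DenStream[11] extends density-based clustering, strengthens outlier detection during stream clustering, and divides the process into online micro-cluster updating and offline micro-cluster processing. EmCStream[17] uses UMAP to obtain a two-dimensional embedding of the data online and detects concept drift with a sliding-window model. SVStream[15] is based on support vector description and iteratively maintains the minimum enclosing sphere of the data to obtain the boundary of each cluster. TSSRC[28] uses an accurate number of clusters to evaluate the effective relationships among the data objects in a landmark window, which effectively transfers previously learned knowledge over time to the current landmark window. OSRC[29] introduces a low-dimensional projection into sparse representation to adaptively reduce the noise and redundancy of high-dimensional data, and uses $\ell_{2,1}$-norm optimization to select representative data objects that form a specific dictionary; it thereby effectively evaluates the relationships among high-dimensional data objects in an evolving data stream and adaptively exploits their evolving subspace structure.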
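The results are analyzed by comparing clustering metrics over whole datasets and average metrics per window. Figures 2 and 3 show the clustering accuracy and normalized mutual information of each method on the datasets, and Tables 2 and 3 show the per-window average clustering accuracy and average normalized mutual information. As the figures and tables show, although S2LCStream may fall behind traditional clustering algorithms such as k-means in overall performance, it shows a clear advantage when processing data in chunks. In particular, on the high-dimensional image dataset AR and the traffic-flow dataset PEMS-SF, the average ACC of S2LCStream improves markedly over the other 6 data stream baselines. This indicates that S2LCStream handles high-dimensional data streams better, which is mainly due to two key strategies: projection learning through scalable subspace learning, and the adaptive concept-drift detection mechanism.
The algorithm in this paper involves a balance parameter $\lambda$, which balances the different terms of the objective function; its choice depends on the distribution of the data. For S2LCStream, the accuracy and NMI level off when $\lambda$ lies between 2.0 and 3.9. Figure 4 shows the ACC and NMI results for different values of $\lambda$ on the two public datasets AR and ExYaleB.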
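Figure 5 compares S2LCStream and EmCStream on the 8 datasets in terms of the number of detected concept drifts and clustering performance. On PEMS-SF, AR, MPIE, and NusWide, S2LCStream detects the same number of drifts as EmCStream but achieves clearly higher clustering accuracy, which demonstrates the strong performance of the proposed algorithm on high-dimensional data. On the two real-world elevator datasets, S2LCStream detects fewer drifts than EmCStream, yet its clustering performance is better. Taken together, S2LCStream performs well on multiple high-dimensional datasets: it detects the concept drift that occurs in high-dimensional data streams, and its clustering accuracy far exceeds that of EmCStream, which highlights the effectiveness and practicality of the proposed algorithm for high-dimensional data streams.
Figure 6 shows the execution times of representative data stream clustering methods on the 8 datasets; the time complexities of these methods generally lie at different levels. The complexity of the proposed algorithm is $O(N \times 4 \times (t_1(hm^2 + h^3) + nh^2 + t_2h^3))$, which is relatively high from an analytical standpoint. In terms of measured execution time, the proposed algorithm is comparable to EmCStream, and on large-scale datasets it runs faster than EmCStream. Because the proposed algorithm combines projection and backtracking in a single model, it does require a considerable amount of time. Comparing with the clustering results in Figures 2 and 3, we conclude that although S2LCStream is more time-consuming than other data stream clustering methods, it achieves better clustering performance while detecting and adapting to concept drift adaptively.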
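This paper addresses high-dimensional data stream clustering through scalable subspace learning combined with a concept-drift detection mechanism. On this basis, a new data stream clustering method based on scalable subspace learning (S2LCStream) is proposed. Experiments on 6 public high-dimensional, large-scale datasets and 2 real-world elevator-status datasets show that S2LCStream significantly outperforms existing algorithms such as DenStream and CluStream in clustering quality.
In summary, S2LCStream shows superior clustering performance and adaptability on data streams that are highly dynamic and diverse. By responding to concept drift and using low-rank representation to improve the accuracy and stability of data processing, S2LCStream overcomes the challenges that traditional clustering algorithms face in complex data environments. However, when the data scale becomes very large, the processing capacity of the method still has room for improvement. Future work will study this problem further to support a wider range of application scenarios.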
References
[1] SUÁREZ-CETRULO A L, QUINTANA D, CERVANTES A. A survey on machine learning for recurring concept drifting data streams[J]. Expert Systems with Applications, 2023, 213: 1-17.
[2] GAO Y J, FANG Z Q, XU J C, et al. An efficient and distributed framework for real-time trajectory stream clustering[J]. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(5): 1857-1873.
[3] LIN C C, CHEN C S, CHEN A P. Using intelligent computing and data stream mining for behavioral finance associated with market profile and financial physics[J]. Applied Soft Computing, 2018, 68: 756-764.
[4] 庞兴龙, 朱国胜. 基于半监督学习的网络流量分析研究[J]. 计算机科学, 2022, 49(增刊1): 544-554. (in Chinese)
[5] KASHANI E S, BAGHERI SHOURAKI S, NOROUZI Y. Evolving data stream clustering based on constant false clustering probability[J]. Information Sciences, 2022, 614: 1-18.
[6] 张国毅, 王晓峰, 张旭洲. 基于数据流聚类的动态信号分选框架[J]. 电讯技术, 2011, 51(9): 65-68. (in Chinese)
[7] BEZDEK J C, KELLER J M. Streaming data analysis: clustering or classification?[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 51(1): 91-102.
[8] LI J P, YU H, ZHANG Z Y, et al. Concept drift adaptation by exploiting drift type[J]. ACM Transactions on Knowledge Discovery from Data, 2024, 18(4): 1-22.
[9] SILVA J A, FARIA E R, BARROS R C, et al. Data stream clustering: a survey[J]. ACM Computing Surveys, 2013, 46(1): 1-31.
[10] AGGARWAL C C, HAN J W, WANG J Y, et al. A framework for clustering evolving data streams[C]//The 29th International Conference on Very Large Data Bases. Berlin: Morgan Kaufmann, 2003: 81-92.
[11] CAO F, ESTER M, QIAN W N, et al. Density-based clustering over an evolving data stream with noise[C]//The 2006 SIAM International Conference on Data Mining. Bethesda: Society for Industrial and Applied Mathematics, 2006: 328-339.
[12] XU J, WANG G Y, LI T R, et al. Fat node leading tree for data stream clustering with density peaks[J]. Knowledge-Based Systems, 2017, 120: 99-117.
[13] DE ANDRADE J, HRUSCHKA E R, GAMA J. An evolutionary algorithm for clustering data streams with a variable number of clusters[J]. Expert Systems with Applications, 2017, 67: 228-238.
[14] PUSCHMANN D, BARNAGHI P, TAFAZOLLI R. Adaptive clustering for dynamic IoT data streams[J]. IEEE Internet of Things Journal, 2017, 4(1): 64-74.
[15] WANG C D, LAI J H, HUANG D, et al. SVStream: a support vector-based algorithm for clustering data streams[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(6): 1410-1424.
[16] 陈圆圆, 王志海. 基于聚类分区的多维数据流概念漂移检测方法[J]. 计算机科学, 2022, 49(7): 25-30. (in Chinese)
[17] ZUBAROGLU A, ATALAY V. Online embedding and clustering of evolving data streams[J]. Statistical Analysis and Data Mining: the ASA Data Science Journal, 2023, 16(1): 29-44.
[18] PENG X, TANG H J, ZHANG L, et al. A unified framework for representation-based subspace clustering of out-of-sample and large-scale data[J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(12): 2499-2512.
[19] 刘博, 谢博鋆, 朱杰, 等. 快速可扩展的子空间聚类算法[J]. 模式识别与人工智能, 2016, 29(1): 11-21. (in Chinese)
[20] 朱林, 雷景生, 毕忠勤, 等. 一种基于数据流的软子空间聚类算法[J]. 软件学报, 2013, 24(11): 2610-2627. (in Chinese)
[21] 陈金立, 付善腾, 朱熙铖, 等. 阵元失效下基于核范数和SCAD惩罚的MIMO雷达DOA估计[J]. 电讯技术, 2023, 63(1): 39-46. (in Chinese)
[22] CUTURI M. UCI machine learning repository[EB/OL]. [2024-05-25]. https://doi.org/10.24432/C52G70.
[23] MARTINEZ A, BENAVENTE R. The AR face database[R]. Columbus: Ohio State University, 1998: 318-323.
[24] GEORGHIADES A S, BELHUMEUR P N, KRIEGMAN D J. From few to many: illumination cone models for face recognition under variable lighting and pose[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6): 643-660.
[25] GROSS R, MATTHEWS I, COHN J, et al. Multi-PIE[J]. Image and Vision Computing, 2010, 28(5): 807-813.
[26] CHUA T S, TANG J H, HONG R C, et al. NUS-WIDE: a real-world web image database from National University of Singapore[C]//The 8th ACM International Conference on Image and Video Retrieval. Santorini: ACM, 2009: 1-9.
[27] 陈圆圆. 数据流概念漂移检测及自适应聚类算法研究[D]. 北京: 北京交通大学, 2022. (in Chinese)
[28] CHEN J, WANG Z, YANG S X, et al. Two-stage sparse representation clustering for dynamic data streams[J]. IEEE Transactions on Cybernetics, 2023, 53(10): 6408-6420.
[29] CHEN J, YANG S X, FAHY C, et al. Online sparse representation clustering for evolving data streams[J]. IEEE Transactions on Neural Networks and Learning Systems, 2025, 36(1): 525-539.
2025, Vol. 65, No. 11
doi: 10.20079/j.issn.1001-893x.240618002
  • Received: 2024-06-18
  • First published online: 2026-04-15
  • Published: 2025-11-28
  • Manuscript received: 2024-06-18
  • Revised: 2024-08-05
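Author information
    1 School of Information Engineering, Huzhou University, Huzhou 313000, Zhejiang, China
    2 Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000, Zhejiang, China
    3 Huzhou Key Laboratory of Waters Robotics Technology, Huzhou 313000, Zhejiang, China

Corresponding author: HU Wenjun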