A large-scale social network community detection prototype system based on Spark (基于Spark的大规模社交网络社区发现原型系统)
Science & Technology Review (科技导报) | Research Article | 2018, 36(23): 93-101
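YE Xiaorong1, SHAO Qing2
Author information
    1. Institute of Scientific and Technical Information of China, Beijing 100038, China;
    2. KNET Co., Ltd., Beijing 100190, China
    YE Xiaorong: senior engineer; research interests: computer software and digital libraries; e-mail: yeelfine@sina.com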
Published: 2018-12-13 | DOI: 10.3981/j.issn.1000-7857.2018.23.012
To mine user information in large-scale social networks more effectively and to deepen the understanding of relationships between users, a Spark-based community detection prototype system for large-scale social networks was designed and developed. The system uses ActiveMQ to crawl large volumes of user data, cleans that data with the naive Bayes algorithm provided by Spark's MLlib, and computes user rankings with the PageRank algorithm provided by Spark's GraphX together with the Z-Score algorithm provided by MLlib. Finally, the label propagation algorithm (LPA) is applied and optimized to quickly group users with similar features and close ties into the same community, laying a foundation for further analysis and use of community user data.
Keywords: Spark  /  GraphX  /  MLlib  /  community detection
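The ranking and grouping steps named in the abstract can be illustrated with a small sketch. It is plain Python rather than Spark code, and it is not the authors' implementation: the function names `pagerank`, `z_scores` and `label_propagation` are hypothetical, the toy edge list is invented for illustration, and standardizing PageRank scores with a z-score is only one plausible reading of how the two are combined. The abstract does not describe the authors' LPA optimization, so the sketch uses unoptimized synchronous label propagation with a smallest-label tie-break.

```python
from collections import Counter
from statistics import mean, pstdev


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list (illustrative only)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no out-links spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def z_scores(scores):
    """Standardize scores to zero mean and unit (population) standard deviation."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against identical scores
    return {k: (v - mu) / sigma for k, v in scores.items()}


def label_propagation(edges, max_rounds=20):
    """Synchronous label propagation on the undirected view of the graph."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    labels = {node: node for node in neighbours}  # every node starts alone
    for _ in range(max_rounds):
        updated = {}
        for node, nbrs in neighbours.items():
            counts = Counter(labels[n] for n in nbrs)
            top = max(counts.values())
            # Adopt the most frequent neighbour label; ties go to the smallest.
            updated[node] = min(l for l, c in counts.items() if c == top)
        if updated == labels:  # no label changed, so the result is stable
            break
        labels = updated
    return labels


# Toy graph (an assumption): two triangles of users joined by one edge.
EDGES = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (3, 4)]

if __name__ == "__main__":
    print(z_scores(pagerank(EDGES)))
    print(label_propagation(EDGES))
```

On this toy graph the two triangles end up with different labels, which is the grouping behavior the abstract describes. On a real cluster the same steps would be delegated to the library routines the abstract names rather than hand-written loops.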
YE Xiaorong, SHAO Qing. A large scale social networking community detection prototype system based on Spark[J]. Science & Technology Review, 2018, 36(23): 93-101. DOI: 10.3981/j.issn.1000-7857.2018.23.012
  • Received: 2018-10-09
  • Revised: 2018-11-20
  • Published: 2018-12-13
  • Published online: 2018-12-18