Article(id=1251893515534418500, tenantId=1146029695717560320, journalId=1251234473337991274, issueId=1251893504037831074, articleNumber=null, orderNo=null, doi=10.3969/j.issn.1003-3114.2025.05.014, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=1746460800000, receivedDateStr=2025-05-06, revisedDate=null, revisedDateStr=null, acceptedDate=null, acceptedDateStr=null, onlineDate=1776404273159, onlineDateStr=2026-04-17, pubDate=1758124800000, pubDateStr=2025-09-18, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1776404273159, onlineIssueDateStr=2026-04-17, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1776404273159, creator=13701087609, updateTime=1776404273159, updator=13701087609, issue=Issue{id=1251893504037831074, tenantId=1146029695717560320, journalId=1251234473337991274, year='2025', volume='51', issue='5', pageStart='877', pageEnd='1134', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=1, specialIssue=null, createTime=1776404270419, creator=13701087609, updateTime=1776404832543, updator=13701087609, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1251895861849043019, tenantId=1146029695717560320, journalId=1251234473337991274, issueId=1251893504037831074, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1251895861849043020, tenantId=1146029695717560320, journalId=1251234473337991274, issueId=1251893504037831074, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=1016, endPage=1024, ext={EN=ArticleExt(id=1251893516352307816, articleId=1251893515534418500, tenantId=1146029695717560320, 
journalId=1251234473337991274, language=EN, title=Multi-task Based Approach for Semantic Transfer of Images, columnId=1251893508886446519, journalTitle=Radio Communications Technology, columnName=Special Topic:Frontiers in Intelligent Communication, Storage, and Information Processing Technologies, runingTitle=null, highlight=null, articleAbstract=

In recent years, Transformer-based visual models (e.g., Swin Transformer) have shown good prospects in visual tasks. However, these methods usually focus on reducing signal distortion between the original and reconstructed data while ignoring perceptual quality. Because the conventional Mean Square Error (MSE) loss fails to reflect perceptual and semantic quality effectively, we propose a weighted loss function that combines MSE with Learned Perceptual Image Patch Similarity (LPIPS) and, on that basis, construct a Swin Transformer-based semantic communication framework, called Swin Transformer with LPIPS-based Joint Source-Channel Coding (STL-JSCC), which significantly enhances image reconstruction quality and semantic consistency. For performance evaluation, two semantic-aware metrics are introduced: Images Semantic Deviation (ISD) and Images Semantic Similarity (ISS). Together, these metrics form a joint perceptual-semantic evaluation system that overcomes the limitations of traditional evaluation methods. Experimental results show that the proposed STL-JSCC outperforms the other models on all metrics, verifying the potential and advantages of the proposed method in improving image reconstruction quality and semantic extraction capability.

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Zhongdong WU, Bingkun GAN, Pengbo WANG, Jingcong GOU, Shangsi DING), CN=ArticleExt(id=1251893526427026339, articleId=1251893515534418500, tenantId=1146029695717560320, journalId=1251234473337991274, language=CN, title=基于多任务的图像语义传输方法, columnId=1251893509079384505, journalTitle=无线电通信技术, columnName=专题:智能通信、存储与信息处理技术前沿, runingTitle=null, highlight=null, articleAbstract=

近年来,基于Transformer的视觉模型,如Swin Transformer,在视觉任务中展现出良好的前景,然而这些方法通常侧重于减少原始数据与重建数据之间的信号失真,而忽略感知质量。针对传统均方误差(Mean Square Error,MSE)损失难以反映图像感知与语义质量的不足,设计了MSE与学习感知图像块相似度(Learned Perceptual Image Patch Similarity,LPIPS)的加权组合损失函数,从而构建基于Swin Transformer的语义通信框架,称为融合感知损失的联合信源信道编码(Swin Transformer with LPIPS-based Joint Source-Channel Coding,STL-JSCC)方法,显著提升了图像重建质量与语义还原能力。在性能评估方面,设计了图像语义偏差值(Images Semantic Deviation,ISD)与语义相似度(Images Semantic Similarity,ISS)2项指标,构建联合感知-语义评估体系,突破传统评价方法局限。实验结果表明,提出的STL-JSCC在各项指标上均优于其他模型,验证了所提方法在提升图像重建质量和语义提取能力上所具有的显著潜力和优势。

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=MdmfEDVAlzABHa1gbjI1cw==, magXml=jSNvuMMACatHVx5HMNS+XQ==, pdfUrl=null, pdf=CkuyYM3dIV1vjy1zcQyrkQ==, pdfFileSize=6080136, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=c3kRmczCN9F2NgTUN/nlpA==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=ByS0KKIsn8O6BOfVfw1/YA==, mapNumber=null, authorCompany=null, fund=null, authors=

伍忠东 男,(1968—),硕士,教授,硕士生导师。主要研究方向:深度学习、智能无线通信等。

甘炳坤 男,(2000—),硕士研究生。主要研究方向:语义通信等。

王鹏波 男,(2000—),硕士研究生。主要研究方向:深度学习、信号去噪及识别。

苟敬聪 男,(1998—),硕士研究生。主要研究方向:深度学习、信号去噪及分类。

丁尚思 男,(1999—),硕士研究生。主要研究方向:目标检测等。

, authorsList=伍忠东, 甘炳坤, 王鹏波, 苟敬聪, 丁尚思)}, authors=[Author(id=1251895515969958573, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251895516062233269, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895515969958573, language=EN, stringName=Zhongdong WU, firstName=Zhongdong, middleName=null, lastName=WU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251895516146119352, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895515969958573, language=CN, stringName=伍忠东, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070, bio={"content":"

伍忠东 男,(1968—),硕士,教授,硕士生导师。主要研究方向:深度学习、智能无线通信等。

"}, bioImg=null, bioContent=

伍忠东 男,(1968—),硕士,教授,硕士生导师。主要研究方向:深度学习、智能无线通信等。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])]), Author(id=1251895516217422525, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251895516288725699, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516217422525, language=EN, stringName=Bingkun GAN, firstName=Bingkun, middleName=null, lastName=GAN, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251895516343251656, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516217422525, language=CN, stringName=甘炳坤, firstName=null, middleName=null, lastName=null, 
prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070, bio={"content":"

甘炳坤 男,(2000—),硕士研究生。主要研究方向:语义通信等。

"}, bioImg=null, bioContent=

甘炳坤 男,(2000—),硕士研究生。主要研究方向:语义通信等。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])]), Author(id=1251895516435526349, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251895516511023826, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516435526349, language=EN, stringName=Pengbo WANG, firstName=Pengbo, middleName=null, lastName=WANG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251895516603298518, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516435526349, language=CN, stringName=王鹏波, firstName=null, middleName=null, lastName=null, 
prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070, bio={"content":"

王鹏波 男,(2000—),硕士研究生。主要研究方向:深度学习、信号去噪及识别。

"}, bioImg=null, bioContent=

王鹏波 男,(2000—),硕士研究生。主要研究方向:深度学习、信号去噪及识别。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])]), Author(id=1251895516699767515, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, orderNo=3, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251895516800430820, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516699767515, language=EN, stringName=Jingcong GOU, firstName=Jingcong, middleName=null, lastName=GOU, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251895516909482732, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516699767515, language=CN, stringName=苟敬聪, firstName=null, middleName=null, lastName=null, 
prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070, bio={"content":"

苟敬聪 男,(1998—),硕士研究生。主要研究方向:深度学习、信号去噪及分类。

"}, bioImg=null, bioContent=

苟敬聪 男,(1998—),硕士研究生。主要研究方向:深度学习、信号去噪及分类。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])]), Author(id=1251895516976591601, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, orderNo=4, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1251895517152752378, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516976591601, language=EN, stringName=Shangsi DING, firstName=Shangsi, middleName=null, lastName=DING, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1251895517245027071, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, authorId=1251895516976591601, language=CN, stringName=丁尚思, firstName=null, middleName=null, lastName=null, 
prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=null, address=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070, bio={"content":"

丁尚思 男,(1999—),硕士研究生。主要研究方向:目标检测等。

"}, bioImg=null, bioContent=

丁尚思 男,(1999—),硕士研究生。主要研究方向:目标检测等。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])])], keywords=[Keyword(id=1251895517421187851, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, orderNo=1, keyword=Swin Transformer), Keyword(id=1251895517509268239, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, orderNo=2, keyword=semantic communication), Keyword(id=1251895517588960020, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, orderNo=3, keyword=LPIPS), Keyword(id=1251895517664457500, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, orderNo=4, keyword=semantic evaluation), Keyword(id=1251895517744149282, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, orderNo=1, keyword=Swin Transformer), Keyword(id=1251895517861589801, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, orderNo=2, keyword=语义通信), Keyword(id=1251895517966447410, tenantId=1146029695717560320, 
journalId=1251234473337991274, articleId=1251893515534418500, language=CN, orderNo=3, keyword=学习感知图像块相似度), Keyword(id=1251895518088082234, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, orderNo=4, keyword=语义评估)], refs=[Reference(id=1251895525100957791, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2020, volume=19, issue=5, pageStart=3133, pageEnd=3143, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=YE H, LIANG L, LI G Y, journalName=IEEE Transactions on Wireless Communications, refType=null, unstructuredReference=YE H,LIANG L,LI G Y,et al. Deep Learning-based End-to-End Wireless Communication Systems with Conditional GANs as Unknown Channels[J]. IEEE Transactions on Wireless Communications,2020,19(5):3133-3143., articleTitle=Deep Learning-based End-to-End Wireless Communication Systems with Conditional GANs as Unknown Channels, refAbstract=null), Reference(id=1251895525193232489, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2024, volume=50, issue=3, pageStart=519, pageEnd=527, url=null, language=null, rfNumber=[2], rfOrder=1, authorNames=陈建侨, 马楠, 许晓东, journalName=无线电通信技术, refType=null, unstructuredReference=陈建侨,马楠,许晓东,.面向语义通信的信道知识库构建与信道处理研究综述[J].无线电通信技术, 2024,50(3):519-527., articleTitle=面向语义通信的信道知识库构建与信道处理研究综述, refAbstract=null), Reference(id=1251895525289701485, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=1948, volume=27, issue=3, pageStart=379, pageEnd=423, url=null, language=null, rfNumber=[3], rfOrder=2, authorNames=SHANNON C E, journalName=The Bell System Technical Journal, refType=null, unstructuredReference=SHANNON C E. A Mathematical Theory of Communication[J]. 
The Bell System Technical Journal, 1948, 27(3):379-423., articleTitle=A Mathematical Theory of Communication, refAbstract=null), Reference(id=1251895525377781875, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2010, volume=27, issue=6, pageStart=104, pageEnd=113, url=null, language=null, rfNumber=[4], rfOrder=3, authorNames=FRESIA M, PEREZ-CRUZ F, POOR H V, journalName=IEEE Signal Processing Magazine, refType=null, unstructuredReference=FRESIA M, PEREZ-CRUZ F, POOR H V, et al. Joint Source and Channel Coding[J]. IEEE Signal Processing Magazine,2010,27(6):104-113., articleTitle=Joint Source and Channel Coding, refAbstract=null), Reference(id=1251895525503611004, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2019, volume=5, issue=3, pageStart=567, pageEnd=579, url=null, language=null, rfNumber=[5], rfOrder=4, authorNames=BOURTSOULATZE E, KURKA D B, GUNDUZ D, journalName=IEEE Transactions on Cognitive Communications and Networking, refType=null, unstructuredReference=BOURTSOULATZE E, KURKA D B, GUNDUZ D. Deep Joint Source-channel Coding for Wireless Image Transmission[J]. IEEE Transactions on Cognitive Communications and Networking,2019,5(3):567-579., articleTitle=Deep Joint Source-channel Coding for Wireless Image Transmission, refAbstract=null), Reference(id=1251895525604274305, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=3, issue=4, pageStart=720, pageEnd=731, url=null, language=null, rfNumber=[6], rfOrder=5, authorNames=TUNG T Y, KURKA D B, JANKOWSKI M, journalName=IEEE Journal on Selected Areas in Information Theory, refType=null, unstructuredReference=TUNG T Y, KURKA D B, JANKOWSKI M,et al. DeepJSCC-Q: Constellation Constrained Deep Joint Source Channel Coding[J]. 
IEEE Journal on Selected Areas in Information Theory,2022,3(4):720-731., articleTitle=DeepJSCC-Q: Constellation Constrained Deep Joint Source Channel Coding, refAbstract=null), Reference(id=1251895525776240776, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=40, issue=9, pageStart=2570, pageEnd=2583, url=null, language=null, rfNumber=[7], rfOrder=6, authorNames=TUNG T Y, GUNDUZ D, journalName=IEEE Journal on Selected Areas in Communications, refType=null, unstructuredReference=TUNG T Y, GUNDUZ D. DeepWiVe: Deep-learning-aided Wireless Video Transmission[J]. IEEE Journal on Selected Areas in Communications,2022,40(9):2570-2583., articleTitle=DeepWiVe: Deep-learning-aided Wireless Video Transmission, refAbstract=null), Reference(id=1251895525881098382, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=8, issue=2, pageStart=584, pageEnd=599, url=null, language=null, rfNumber=[8], rfOrder=7, authorNames=YANG M Y, BIAN C H, KIM H S, journalName=IEEE Transactions on Cognitive Communications and Networking, refType=null, unstructuredReference=YANG M Y,BIAN C H,KIM H S. OFDM-guided Deep Joint Source Channel Coding for Wireless Multipath Fading Channels[J]. IEEE Transactions on Cognitive Communications and Networking,2022,8(2):584-599., articleTitle=OFDM-guided Deep Joint Source Channel Coding for Wireless Multipath Fading Channels, refAbstract=null), Reference(id=1251895525990150296, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2021, volume=20, issue=12, pageStart=8081, pageEnd=8095, url=null, language=null, rfNumber=[9], rfOrder=8, authorNames=KURKA D B, GUNDUZ D, journalName=IEEE Transactions on Wireless Communications, refType=null, unstructuredReference=KURKA D B,GUNDUZ D. 
Bandwidth-agile Image Transmission with Deep Joint Source-channel Coding[J]. IEEE Transactions on Wireless Communications,2021,20(12):8081-8095., articleTitle=Bandwidth-agile Image Transmission with Deep Joint Source-channel Coding, refAbstract=null), Reference(id=1251895526090813598, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=32, issue=4, pageStart=2315, pageEnd=2328, url=null, language=null, rfNumber=[10], rfOrder=9, authorNames=XU J L, AI B, CHEN W, journalName=IEEE Transactions on Circuits and Systems for Video Technology, refType=null, unstructuredReference=XU J L,AI B,CHEN W,et al. Wireless Image Transmission Using Deep Source Channel Coding with Attention Modules[J]. IEEE Transactions on Circuits and Systems for Video Technology,2022,32(4):2315-2328., articleTitle=Wireless Image Transmission Using Deep Source Channel Coding with Attention Modules, refAbstract=null), Reference(id=1251895526199865512, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=8, issue=null, pageStart=60, pageEnd=73, url=null, language=null, rfNumber=[11], rfOrder=10, authorNames=ZHANG P, XU W J, GAO H, journalName=Engineering, refType=null, unstructuredReference=ZHANG P,XU W J,GAO H,et al. Toward Wisdom-evolutionary and Primitive-concise 6G:A New Paradigm of Semantic Communication Networks[J]. 
Engineering,2022, 8:60-73., articleTitle=Toward Wisdom-evolutionary and Primitive-concise 6G:A New Paradigm of Semantic Communication Networks, refAbstract=null), Reference(id=1251895526275362988, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2025-04-29, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[12], rfOrder=11, authorNames=YANG M Y, KIM H S, journalName=null, refType=null, unstructuredReference=YANG M Y,KIM H S. Deep Joint Source-channel Coding for Wireless Image Transmission with Adaptive Rate Control[EB/OL]. (2021-10-09)[2025-04-29]. https://arxiv.org/abs/2110.04456., articleTitle=Deep Joint Source-channel Coding for Wireless Image Transmission with Adaptive Rate Control, refAbstract=null), Reference(id=1251895526376026290, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2023, volume=22, issue=8, pageStart=5486, pageEnd=5501, url=null, language=null, rfNumber=[13], rfOrder=12, authorNames=ZHANG W Y, ZHANG H J, MA H, journalName=IEEE Transactions on Wireless Communications, refType=null, unstructuredReference=ZHANG W Y,ZHANG H J,MA H,et al. Predictive and Adaptive Deep Coding for Wireless Image Transmission in Semantic Communication[J]. 
IEEE Transactions on Wireless Communications,2023,22(8):5486-5501., articleTitle=Predictive and Adaptive Deep Coding for Wireless Image Transmission in Semantic Communication, refAbstract=null), Reference(id=1251895526493466811, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=45, issue=1, pageStart=87, pageEnd=110, url=null, language=null, rfNumber=[14], rfOrder=13, authorNames=HAN K, WANG Y H, CHEN H T, journalName=IEEE Transactions on Pattern Analysis and Machine Intelligence, refType=null, unstructuredReference=HAN K,WANG Y H,CHEN H T,et al. A Survey on Vision Transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(1):87-110., articleTitle=A Survey on Vision Transformer, refAbstract=null), Reference(id=1251895526577352897, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2023, volume=11, issue=null, pageStart=71528, pageEnd=71541, url=null, language=null, rfNumber=[15], rfOrder=14, authorNames=YOO H J, DAI L L, KIM S K, journalName=IEEE Access, refType=null, unstructuredReference=YOO H J,DAI L L,KIM S K,et al. On the Role of ViT and CNN in Semantic Communications:Analysis and Prototype Validation[J]. 
IEEE Access,2023,11:71528-71541., articleTitle=On the Role of ViT and CNN in Semantic Communications:Analysis and Prototype Validation, refAbstract=null), Reference(id=1251895526665433285, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2023, volume=37, issue=5, pageStart=223, pageEnd=229, url=null, language=null, rfNumber=[16], rfOrder=15, authorNames=刘铁, 段勇, journalName=电子测量与仪器学报, refType=null, unstructuredReference=刘铁,段勇.融合CNN和Transformer的机器人室内场景识别[J].电子测量与仪器学报,2023,37(5):223-229., articleTitle=融合CNN和Transformer的机器人室内场景识别, refAbstract=null), Reference(id=1251895526740930763, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2023, volume=null, issue=null, pageStart=1, pageEnd=5, url=null, language=null, rfNumber=[17], rfOrder=16, authorNames=YANG K, WANG S X, DAI J C, journalName=null, refType=null, unstructuredReference=YANG K,WANG S X,DAI J C,et al. WITT:A Wireless Image Transmission Transformer for Semantic Communications[C]//ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island:ICASSP,2023:1-5., articleTitle=WITT:A Wireless Image Transmission Transformer for Semantic Communications, refAbstract=null), Reference(id=1251895526833205457, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2022, volume=19, issue=null, pageStart=1, pageEnd=5, url=null, language=null, rfNumber=[18], rfOrder=17, authorNames=LIU X Y, WU Y, LIANG W K, journalName=IEEE Geoscience and Remote Sensing Letters, refType=null, unstructuredReference=LIU X Y,WU Y,LIANG W K,et al. High Resolution SAR Image Classification Using Global-local Network Structure Based on Vision Transformer and CNN[J]. 
IEEE Geoscience and Remote Sensing Letters,2022,19:1-5., articleTitle=High Resolution SAR Image Classification Using Global-local Network Structure Based on Vision Transformer and CNN, refAbstract=null), Reference(id=1251895526950645978, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=675, pageEnd=685, url=null, language=null, rfNumber=[19], rfOrder=18, authorNames=BLAU Y, MICHAELI T, journalName=null, refType=null, unstructuredReference=BLAU Y,MICHAELI T. Rethinking Lossy Compression:The Rate-distortion-perception Tradeoff[C]//International Conference on Machine Learning. Long Beach: PMLR, 2019:675-685., articleTitle=Rethinking Lossy Compression:The Rate-distortion-perception Tradeoff, refAbstract=null), Reference(id=1251895527059697887, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2017, volume=null, issue=null, pageStart=4681, pageEnd=4690, url=null, language=null, rfNumber=[20], rfOrder=19, authorNames=LEDIG C, THEIS L, HUSZÁR F, journalName=null, refType=null, unstructuredReference=LEDIG C,THEIS L,HUSZÁR F,et al. Photo-realistic Single Image Super-resolution Using a Generative Adversarial Network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu:IEEE,2017:4681-4690., articleTitle=Photo-realistic Single Image Super-resolution Using a Generative Adversarial Network, refAbstract=null), Reference(id=1251895527156166886, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=586, pageEnd=595, url=null, language=null, rfNumber=[21], rfOrder=20, authorNames=ZHANG R, ISOLA P, EFROS A A, journalName=null, refType=null, unstructuredReference=ZHANG R,ISOLA P,EFROS A A, et al. 
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City:IEEE, 2018:586-595., articleTitle=The Unreasonable Effectiveness of Deep Features as a Perceptual Metric, refAbstract=null), Reference(id=1251895527256830186, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2021, volume=null, issue=null, pageStart=9992, pageEnd=10002, url=null, language=null, rfNumber=[22], rfOrder=21, authorNames=LIU Z, LIN Y T, CAO Y, journalName=null, refType=null, unstructuredReference=LIU Z,LIN Y T,CAO Y,et al. Swin Transformer:Hierarchical Vision Transformer Using Shifted Windows[C]//2021 IEEE/CVF International Conference on Computer Vision(ICCV). Montreal:IEEE,2021:9992-10002., articleTitle=Swin Transformer:Hierarchical Vision Transformer Using Shifted Windows, refAbstract=null), Reference(id=1251895527349104879, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2021, volume=129, issue=4, pageStart=1258, pageEnd=1281, url=null, language=null, rfNumber=[23], rfOrder=22, authorNames=DING K Y, MA K D, WANG S Q, journalName=International Journal of Computer Vision, refType=null, unstructuredReference=DING K Y,MA K D,WANG S Q,et al. Comparison of Full-reference Image Quality Models for Optimization of Image Processing Systems[J]. 
International Journal of Computer Vision,2021,129(4):1258-1281., articleTitle=Comparison of Full-reference Image Quality Models for Optimization of Image Processing Systems, refAbstract=null), Reference(id=1251895527449768182, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2003, volume=null, issue=null, pageStart=1398, pageEnd=1402, url=null, language=null, rfNumber=[24], rfOrder=23, authorNames=ZHOU W, SIMONCELLI E P, BOVIK A C, journalName=null, refType=null, unstructuredReference=ZHOU W,SIMONCELLI E P,BOVIK A C. Multiscale Structural Similarity for Image Quality Assessment[C]//The Thrity-Seventh Asilomar Conference on Signals,Systems & Computers. Pacific Grove:CIEEE,2003:1398-1402., articleTitle=Multiscale Structural Similarity for Image Quality Assessment, refAbstract=null), Reference(id=1251895527571403002, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2018, volume=null, issue=null, pageStart=1, pageEnd=2, url=null, language=null, rfNumber=[25], rfOrder=24, authorNames=ZHANG Z J, journalName=null, refType=null, unstructuredReference=ZHANG Z J. Improved Adam Optimizer for Deep Neural Networks[C]//Proceedings of 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS). Banff:IEEE,2018:1-2., articleTitle=Improved Adam Optimizer for Deep Neural Networks, refAbstract=null), Reference(id=1251895527672066303, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, doi=null, pmid=null, pmcid=null, year=2020, volume=1, issue=2, pageStart=79, pageEnd=null, url=null, language=null, rfNumber=[26], rfOrder=25, authorNames=THECKEDATH D, SEDAMKAR R R, journalName=SN Computer Science, refType=null, unstructuredReference=THECKEDATH D, SEDAMKAR R R. Detecting Affect States Using VGG16, ResNet50 and SE-ResNet50 Networks[J]. 
SN Computer Science,2020,1(2):79., articleTitle=Detecting Affect States Using VGG16, ResNet50 and SE-ResNet50 Networks, refAbstract=null)], funds=null, companyList=[AuthorCompany(id=1251895514367734439, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, xref=null, ext=[AuthorCompanyExt(id=1251895514376123048, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China), AuthorCompanyExt(id=1251895514384511657, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, companyId=1251895514367734439, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=兰州交通大学 电子与信息工程学院,甘肃 兰州 730070)])], figs=[ArticleFig(id=1251895518335546184, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 1, caption=Taditional processing flow, figureFileSmall=cIVmB7B0hrk27hG4eRLl+A==, figureFileBig=c3kRmczCN9F2NgTUN/nlpA==, tableContent=null), ArticleFig(id=1251895518415237967, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图1, caption=传统处理流程, figureFileSmall=cIVmB7B0hrk27hG4eRLl+A==, figureFileBig=c3kRmczCN9F2NgTUN/nlpA==, tableContent=null), ArticleFig(id=1251895518520095574, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 
2, caption=JSCC, figureFileSmall=0rxE4Dciu+X8NAjbYT6ovQ==, figureFileBig=21t61Tp6EmIDppbOSikTEw==, tableContent=null), ArticleFig(id=1251895518612370269, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图2, caption=JSCC, figureFileSmall=0rxE4Dciu+X8NAjbYT6ovQ==, figureFileBig=21t61Tp6EmIDppbOSikTEw==, tableContent=null), ArticleFig(id=1251895518754976615, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 3, caption=Encoder structure, figureFileSmall=7mHz8GxCWCLvJFjnr1u0MA==, figureFileBig=6kb+OIa35ZW5TiFsRBQhJA==, tableContent=null), ArticleFig(id=1251895518897582958, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图3, caption=编码器结构, figureFileSmall=7mHz8GxCWCLvJFjnr1u0MA==, figureFileBig=6kb+OIa35ZW5TiFsRBQhJA==, tableContent=null), ArticleFig(id=1251895520466252661, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 4, caption=Decodder structure, figureFileSmall=tEVXpTUw9fn/qL28VNJehg==, figureFileBig=furm6TR9S9iCtw82pGVt4g==, tableContent=null), ArticleFig(id=1251895520562721658, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图4, caption=解码器结构, figureFileSmall=tEVXpTUw9fn/qL28VNJehg==, figureFileBig=furm6TR9S9iCtw82pGVt4g==, tableContent=null), ArticleFig(id=1251895520705327999, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 
5, caption=Swin Transformer block, figureFileSmall=lQDj9iveJVKz8tqCpTX8OQ==, figureFileBig=klcGJHj6k74kYkMUUfgPfw==, tableContent=null), ArticleFig(id=1251895520818574215, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图5, caption=Swin Transformer模块, figureFileSmall=lQDj9iveJVKz8tqCpTX8OQ==, figureFileBig=klcGJHj6k74kYkMUUfgPfw==, tableContent=null), ArticleFig(id=1251895520969569164, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 6, caption=Plot of PSNR vs. SNR for different channels on CIFAR10, figureFileSmall=LH65N6LwiKsSCsEv2ADHyw==, figureFileBig=YsPiXAdumFyk/sg7lbMjKw==, tableContent=null), ArticleFig(id=1251895521070232464, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图6, caption=CIFAR10上不同信道下的PSNR与SNR关系, figureFileSmall=LH65N6LwiKsSCsEv2ADHyw==, figureFileBig=YsPiXAdumFyk/sg7lbMjKw==, tableContent=null), ArticleFig(id=1251895521162507157, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 7, caption=Plot of PSNR vs. SNR for different channels on BIRDS-400, figureFileSmall=qjrnlkgwaVX8WPi9rUm1Lw==, figureFileBig=m4QSilSopFauI7x++rjGBw==, tableContent=null), ArticleFig(id=1251895521250587548, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图7, caption=BIRDS-400上不同信道下的PSNR与SNR关系, figureFileSmall=qjrnlkgwaVX8WPi9rUm1Lw==, figureFileBig=m4QSilSopFauI7x++rjGBw==, tableContent=null), ArticleFig(id=1251895521338667940, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 8, caption=Plot of MS-SSIM vs. 
SNR for different channels on CIFAR10, figureFileSmall=n2B81NhRO/AJ9DS6tSASow==, figureFileBig=QdrS5y2hDolIoK42z1Cejg==, tableContent=null), ArticleFig(id=1251895521430942634, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图8, caption=CIFAR10上不同信道下的MS-SSIM与SNR关系, figureFileSmall=n2B81NhRO/AJ9DS6tSASow==, figureFileBig=QdrS5y2hDolIoK42z1Cejg==, tableContent=null), ArticleFig(id=1251895521527411633, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 9, caption=Plot of MS-SSIM vs. SNR for different channels on BIRDS-400, figureFileSmall=TOfhUeL/+vopBRy9wz5ZaQ==, figureFileBig=yM5Anl5jiC9854ealRyf4w==, tableContent=null), ArticleFig(id=1251895521615492024, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图9, caption=BIRDS-400上不同信道下的MS-SSIM与SNR关系, figureFileSmall=TOfhUeL/+vopBRy9wz5ZaQ==, figureFileBig=yM5Anl5jiC9854ealRyf4w==, tableContent=null), ArticleFig(id=1251895521732932547, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 10, caption=Plot of VISD vs. SNR for different channels on CIFAR10, figureFileSmall=/jzubqOtpFdq/q9SCrl39w==, figureFileBig=s6tLME2Ul2KLCwe9dRZcUQ==, tableContent=null), ArticleFig(id=1251895521821012938, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图10, caption=在CIFAR10上不同信道下的VISD与SNR关系, figureFileSmall=/jzubqOtpFdq/q9SCrl39w==, figureFileBig=s6tLME2Ul2KLCwe9dRZcUQ==, tableContent=null), ArticleFig(id=1251895521976202200, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 11, caption=Plot of VISD vs. 
SNR for different channels on BIRDS-400, figureFileSmall=rHhD1cOqcZGOehLTKPpEEQ==, figureFileBig=ozqzhixuTrq8hB7yxsiicg==, tableContent=null), ArticleFig(id=1251895522139780071, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图11, caption=在BIRDS-400上不同信道下的VISD与SNR关系, figureFileSmall=rHhD1cOqcZGOehLTKPpEEQ==, figureFileBig=ozqzhixuTrq8hB7yxsiicg==, tableContent=null), ArticleFig(id=1251895522273997811, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 12, caption=Plot of VISS vs. SNR for different channels on CIFAR10, figureFileSmall=l37D7M3xIPggRoiQCrDsVQ==, figureFileBig=TXhtisfZQPh3HFSXTk1ctA==, tableContent=null), ArticleFig(id=1251895522353689596, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图12, caption=在CIFAR10不同信道下的VISS与SNR关系, figureFileSmall=l37D7M3xIPggRoiQCrDsVQ==, figureFileBig=TXhtisfZQPh3HFSXTk1ctA==, tableContent=null), ArticleFig(id=1251895522450157571, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Fig. 13, caption=Plot of VISS vs. SNR for different channels on BIRDS-400, figureFileSmall=vDJMUcJGPAWornnPmCoU4g==, figureFileBig=vueS3P/grz/FB631+MJHqQ==, tableContent=null), ArticleFig(id=1251895522559209481, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=图13, caption=在BIRDS-400上不同信道下的VISS与SNR关系, figureFileSmall=vDJMUcJGPAWornnPmCoU4g==, figureFileBig=vueS3P/grz/FB631+MJHqQ==, tableContent=null), ArticleFig(id=1251895522664067090, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Tab. 1, caption=

Experimental results for different communication models on CIFAR10

, figureFileSmall=null, figureFileBig=null, tableContent=
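Model | Channel | PSNR/dB | MS-SSIM/dB | V_ISD | V_ISS
DeepJSCC | AWGN | 28.13 | 10.88 | -0.37 | 0.77
DeepJSCC | Rayleigh | 24.99 | 8.92 | -0.30 | 0.64
WITT | AWGN | 42.49 | 29.37 | -1.02 | 2.56
WITT | Rayleigh | 34.76 | 21.70 | -0.68 | 1.74
STL-JSCC | AWGN | 44.17 | 31.56 | -1.26 | 2.96
STL-JSCC | Rayleigh | 35.52 | 22.60 | -0.74 | 1.88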
), ArticleFig(id=1251895522768924700, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=表1, caption=

在CIFAR10上不同通信模型的实验结果

, figureFileSmall=null, figureFileBig=null, tableContent=
), ArticleFig(id=1251895522903142436, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Tab. 2, caption=

Experimental results for different communication models on BIRDS-400

, figureFileSmall=null, figureFileBig=null, tableContent=
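Model | Channel | PSNR/dB | MS-SSIM/dB | V_ISD | V_ISS
DeepJSCC | AWGN | 31.10 | 17.05 | -0.86 | 1.30
DeepJSCC | Rayleigh | 30.19 | 15.86 | -0.82 | 1.24
WITT | AWGN | 38.45 | 23.60 | -1.15 | 2.20
WITT | Rayleigh | 32.56 | 18.19 | -0.89 | 1.54
STL-JSCC | AWGN | 40.33 | 25.16 | -1.18 | 2.29
STL-JSCC | Rayleigh | 33.04 | 18.86 | -0.91 | 1.56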
), ArticleFig(id=1251895522987028522, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=表2, caption=

在BIRDS-400上不同通信模型的实验结果

, figureFileSmall=null, figureFileBig=null, tableContent=
), ArticleFig(id=1251895523075108915, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Tab. 3, caption=

Results of ablation experiments

, figureFileSmall=null, figureFileBig=null, tableContent=
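Dataset | MSE weight | LPIPS weight | PSNR/dB | MS-SSIM/dB | V_ISD | V_ISS
CIFAR10 | 1.0 | 0 | 42.49 | 29.37 | -1.02 | 2.56
CIFAR10 | 0.3 | 0.7 | 35.01 | 20.31 | -0.62 | 1.43
CIFAR10 | 0.5 | 0.5 | 44.17 | 31.56 | -1.26 | 2.96
BIRDS-400 | 1.0 | 0 | 38.45 | 23.60 | -1.15 | 2.20
BIRDS-400 | 0.3 | 0.7 | 37.80 | 22.69 | -1.08 | 1.82
BIRDS-400 | 0.5 | 0.5 | 40.33 | 25.16 | -1.18 | 2.29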
), ArticleFig(id=1251895523200938044, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=表3, caption=

消融实验结果

, figureFileSmall=null, figureFileBig=null, tableContent=
), ArticleFig(id=1251895523289018437, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=EN, label=Tab. 4, caption=

Analysis results of complexity and real-time performance

, figureFileSmall=null, figureFileBig=null, tableContent=
Image size/pixel | Depth | FLOPs/G | Parameters/M | Inference time/ms
32×32 | 2 | 0.77 | 13.75 | 12.60
224×224 | 3 | 69.71 | 71.66 | 28.31
), ArticleFig(id=1251895524891242573, tenantId=1146029695717560320, journalId=1251234473337991274, articleId=1251893515534418500, language=CN, label=表4, caption=

复杂度与实时性分析结果

, figureFileSmall=null, figureFileBig=null, tableContent=
)], attaches=null, journal=Journal(id=1251231494887223395, delFlag=0, nameCn=无线电通信技术, nameEn=Radio Communications Technology, nameHistory1=null, nameHistory2=null, issn=1003-3114, eissn=, cn=13-1099/TN, coden=null, periodic=1, language=CN, oaType=1, ccby=null, superviseOffice=null, ownerOffice=null, pubOffice=null, editorOffice=null, officeType=null, aims=null, clcCode=null, officeProv=null, officeCity=null, officeAddr=null, officeZip=null, officeEmail=, officePhone=, editDirector=null, officeDirector=null, officeDirectorPhone=null, officeStaffNum=null, officeEmpNum=null, coverPicUrl=veWCdfK9mJVXm/uFgI4wQA==, journalPrice=null, startedYear=null, abbrevIsoEn=Radio Communications Technology, journalRemark=null, publicationField=null, createdTime=1776246435141, updatedTime=1776397604574, createdBy=18614031015, updatedBy=13701087609, firstLetterCn=R, firstLetterEn=R, subjectCode=Engineering, subjectName=工程, subjectCodeEn=Engineering, subjectNameEn=null, picCn=veWCdfK9mJVXm/uFgI4wQA==, picEn=OSQVHuARoHUd1TQ4ONLQrQ==, jcr=null, cjcr=null, exts=[JournalExt(id=1251865545604285354, language=CN, name=无线电通信技术, nameHistory1=null, nameHistory2=null, managedBy=, sponsoredBy=, publishedBy=, editorOffice=, officeProv=null, officeCity=null, officeAddr=, officeZip=, editDirector=, officeDirector=null, officePhone=null, coverPicUrl=null, journalRemark=, submitArticleUrl=null, websiteUrl=, createdTime=1776397604609, updatedTime=1776397604609, createdBy=13701087609, updatedBy=13701087609, submissionGuidelinesUrl=, submissionAuthorUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=1, submissionEditorUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=3, submissionReviewUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=2, submissionCeEditorUrl=, submissionAeEditorUrl=, option={"copyright":""}), JournalExt(id=1251865545646228395, language=EN, name=Radio Communications Technology, nameHistory1=null, nameHistory2=null, managedBy=, sponsoredBy=, publishedBy=, editorOffice=, officeProv=null, officeCity=null, 
officeAddr=, officeZip=, editDirector=, officeDirector=null, officePhone=null, coverPicUrl=null, journalRemark=, submitArticleUrl=null, websiteUrl=, createdTime=1776397604619, updatedTime=1776397604619, createdBy=13701087609, updatedBy=13701087609, submissionGuidelinesUrl=, submissionAuthorUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=1, submissionEditorUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=3, submissionReviewUrl=https://wxdt.cbpt.cnki.net/index.aspx?t=2, submissionCeEditorUrl=, submissionAeEditorUrl=, option={"copyright":""})], databaseList=null, tenantJournalId=1251234473337991274, websiteList=[Website(id=1251257283515203650, webName=null, webTitle=null, webDomain=null, webCopyrigh=null, webIpcNo=null, seoTitle=null, seoKeywords=null, seoDescription=null, tenantJournalId=null, journalId=1251234473337991274, journalNameCn=null, journalNameEn=null, grayFlag=null, tenantId=1146029695717560320, platformId=null, journalGroupId=null, journalGroupNameCn=null, journalGroupNameEn=null, type=1, domain=https://castjournals.cast.org.cn/joweb/wxdtxjs/CN, language=CN, createTime=1776252583627, createBy=18614031015, updateTime=1776253691546, updateBy=18614031015, name=无线电通信技术-中文, tplId=1146099689490845704, title=无线电通信技术, delFlag=0, indexPage=/home, props=[WebsiteProps(id=1251262047678313076, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=articleTextType, value=kx, createTime=1776253719491, updateTime=1776253719491, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047653147249, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=banner, value=null, createTime=1776253719485, updateTime=1776253719485, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047707673207, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=grayFlag, value=0, createTime=1776253719498, updateTime=1776253719498, 
creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047644758640, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=logo, value=https://castjournals.cast.org.cn/joweb/wxdtxjs/CN/file/pic?fileId=sk5LMh+QbAm+98l18HjovQ==, createTime=1776253719483, updateTime=1776253719483, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047720256121, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=minRunFlag, value=0, createTime=1776253719501, updateTime=1776253719501, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047669924467, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=picServerUrl, value=https://castjournals.cast.org.cn/joweb/wxdtxjs/CN/file/pic, createTime=1776253719489, updateTime=1776253719489, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047716061816, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=silenceFlag, value=0, createTime=1776253719500, updateTime=1776253719500, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047661535858, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=staticResourcePath, value=https://castjournals.cast.org.cn/joweb/cast_kjdb_cn_619/, createTime=1776253719487, updateTime=1776253719487, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047682507381, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=themeColor, value=null, createTime=1776253719492, updateTime=1776253719492, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262047690895990, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283515203650, code=themeStyle, value=null, createTime=1776253719494, updateTime=1776253719494, 
creator=18614031015, updator=18614031015)]), Website(id=1251257283607478339, webName=null, webTitle=null, webDomain=null, webCopyrigh=null, webIpcNo=null, seoTitle=null, seoKeywords=null, seoDescription=null, tenantJournalId=null, journalId=1251234473337991274, journalNameCn=null, journalNameEn=null, grayFlag=null, tenantId=1146029695717560320, platformId=null, journalGroupId=null, journalGroupNameCn=null, journalGroupNameEn=null, type=1, domain=https://castjournals.cast.org.cn/joweb/wxdtxjs/EN, language=EN, createTime=1776252583648, createBy=18614031015, updateTime=1776253687916, updateBy=18614031015, name=无线电通信技术-英文, tplId=1146101810881728533, title=Radio Communications Technology, delFlag=0, indexPage=/home, props=[WebsiteProps(id=1251262071707484468, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=articleTextType, value=kx, createTime=1776253725220, updateTime=1776253725220, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071690707249, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=banner, value=null, createTime=1776253725216, updateTime=1776253725216, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071724261687, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=grayFlag, value=0, createTime=1776253725224, updateTime=1776253725224, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071682318640, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=logo, value=https://castjournals.cast.org.cn/joweb/wxdtxjs/EN/file/pic?fileId=sk5LMh+QbAm+98l18HjovQ==, createTime=1776253725214, updateTime=1776253725214, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071732650297, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=minRunFlag, value=0, 
createTime=1776253725226, updateTime=1776253725226, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071703290163, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=picServerUrl, value=https://castjournals.cast.org.cn/joweb/wxdtxjs/EN/file/pic, createTime=1776253725219, updateTime=1776253725219, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071728455992, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=silenceFlag, value=0, createTime=1776253725225, updateTime=1776253725225, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071694901554, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=staticResourcePath, value=https://castjournals.cast.org.cn/joweb/cast_kjdb_en_623/, createTime=1776253725217, updateTime=1776253725217, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071711678773, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=themeColor, value=null, createTime=1776253725221, updateTime=1776253725221, creator=18614031015, updator=18614031015), WebsiteProps(id=1251262071720067382, tenantId=1146029695717560320, journalId=null, journalGroupId=null, siteId=1251257283607478339, code=themeStyle, value=null, createTime=1776253725223, updateTime=1776253725223, creator=18614031015, updator=18614031015)])], journalTitle=无线电通信技术, weixinUrl=null, journalUrl=https://wxdt.cbpt.cnki.net/, iacademicId=null, status=1, seqNo=null, journalTitleEn=Radio Communications Technology, journalPhotoCn=veWCdfK9mJVXm/uFgI4wQA==, journalPhotoEn=OSQVHuARoHUd1TQ4ONLQrQ==, journalFirstLetter=R, journalRecommend=null, journalNew=null, journalCollection=null, jcrJf=null, cjcrJf=null, jcrJfStr=null, cjcrJfStr=null, submissionFirstDecision=null, sciSubjectClassification=null, casSubjectClassification=null, citeScore=null, 
totalCitationFrequency=null, icpCode=null, psCode=null, advertisingLicenseCode=null, copyrightInformation=null, country=null, option=, provinceCode=null, provinceName=null, collectFlag=false), detailUrlCn=https://castjournals.cast.org.cn/joweb/wxdtxjs/CN/10.3969/j.issn.1003-3114.2025.05.014, detailUrlEn=https://castjournals.cast.org.cn/joweb/wxdtxjs/EN/10.3969/j.issn.1003-3114.2025.05.014, pdfUrlCn=https://castjournals.cast.org.cn/joweb/wxdtxjs/CN/PDF/10.3969/j.issn.1003-3114.2025.05.014, pdfUrlEn=https://castjournals.cast.org.cn/joweb/wxdtxjs/EN/PDF/10.3969/j.issn.1003-3114.2025.05.014, aliStartDate=null, aliEndDate=null, collectionFlag=false, citedCount=null, citedUrl=null, reference=null)
Multi-task Based Approach for Semantic Transfer of Images
Zhongdong WU, Bingkun GAN, Pengbo WANG, Jingcong GOU, Shangsi DING
Radio Communications Technology | Special Topic: Frontiers in Intelligent Communication, Storage, and Information Processing Technologies, 2025, 51(5): 1016-1024
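Author information
  • School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
  • Zhongdong WU, male, born 1968, M.S., professor, master's supervisor. Main research interests: deep learning and intelligent wireless communication.

    Bingkun GAN, male, born 2000, master's student. Main research interest: semantic communication.

    Pengbo WANG, male, born 2000, master's student. Main research interests: deep learning, signal denoising and recognition.

    Jingcong GOU, male, born 1998, master's student. Main research interests: deep learning, signal denoising and classification.

    Shangsi DING, male, born 1999, master's student. Main research interest: object detection.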

Published: 2025-09-18 DOI: 10.3969/j.issn.1003-3114.2025.05.014

In recent years, Transformer-based vision models such as the Swin Transformer have shown good prospects in vision tasks. However, these methods usually focus on reducing the signal distortion between the original and reconstructed data and ignore perceptual quality. Because the conventional Mean Square Error (MSE) loss does not reflect perceptual and semantic quality well, we design a weighted loss function that combines MSE with Learned Perceptual Image Patch Similarity (LPIPS) and build on it a Swin Transformer-based semantic communication framework, called Swin Transformer with LPIPS-based Joint Source-Channel Coding (STL-JSCC), which significantly improves image reconstruction quality and semantic consistency. For performance evaluation, two semantic-aware metrics are introduced: Images Semantic Deviation (ISD) and Images Semantic Similarity (ISS). Together they form a joint perceptual-semantic evaluation system that goes beyond the limits of traditional evaluation methods. Experimental results show that the proposed STL-JSCC outperforms the other models on all metrics, supporting the potential of the method for improving image reconstruction quality and semantic extraction capability.

Swin Transformer  /  semantic communication  /  LPIPS  /  semantic evaluation
Zhongdong WU, Bingkun GAN, Pengbo WANG, Jingcong GOU, Shangsi DING. Multi-task Based Approach for Semantic Transfer of Images[J]. Radio Communications Technology, 2025, 51(5): 1016-1024. DOI: 10.3969/j.issn.1003-3114.2025.05.014
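In recent years, the convergence of artificial intelligence and communication technology has given rise to a new communication paradigm: semantic communication. As an end-to-end transmission system based on deep learning [1], semantic communication aims not only to preserve the original content of the data but also to convey its semantic information effectively [2]. Unlike traditional communication systems, which separate source coding from channel coding on the basis of Shannon's separation theorem [3], semantic communication achieves end-to-end optimization through Joint Source-Channel Coding (JSCC) and emphasizes preserving the semantic consistency of information under complex channel conditions, thereby improving transmission efficiency and reconstruction quality [4].
In image transmission, DeepJSCC, a deep-learning-based JSCC method optimized end to end, has become an active research direction in semantic communication. It jointly learns image encoding and channel mapping with neural networks, which maintains semantic fidelity while markedly improving the robustness of the communication system in complex environments [5-11]. For example, Bourtsoulatze et al. [5] pioneered end-to-end image transmission with DeepJSCC, and subsequent work has continued to improve channel adaptation [12] and joint optimization over the signal-to-noise ratio (SNR) [13]. To address the limited local receptive field of Convolutional Neural Networks (CNNs), the Vision Transformer (ViT) [14] has shown distinctive advantages in vision tasks: Yoo et al. [15] were the first to introduce the ViT architecture into wireless image transmission (SemViT), with strong results in semantic communication. Later refinements of the ViT network brought substantial gains in image processing tasks [16]. Subsequently, Yang et al. [17] built the Wireless Image Transmission Transformer (WITT) on the Swin Transformer. Compared with conventional convolutional networks, the Transformer architecture captures global semantic features effectively and significantly improves transmission performance in complex scenes.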
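Although the methods above have made notable progress and can deliver good semantic communication performance, their optimization objectives still have several limitations.
① Codec. Although the Swin Transformer-based encoder has good local feature extraction capability, it has certain limitations in modeling global image dependencies [18].
② Model training. Existing work generally uses traditional distortion measures such as Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and the Structural Similarity Index (SSIM) as optimization objectives, but these metrics measure only pixel-level fidelity and correlate weakly with the perceptual quality of the human visual system [19]. Over-optimizing MSE can even distort the semantic information of decoded images [20].
③ Evaluation metrics. Current semantic communication systems often rely on either traditional evaluation metrics or task-specific metrics. The former cannot reflect semantic fidelity, and the latter are hard to generalize because tasks differ, so the evaluation framework remains incomplete.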
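To address these problems, this paper builds a communication architecture for semantic communication with joint perception-distortion optimization. At the signal representation level, a weighted loss function combining MSE and Learned Perceptual Image Patch Similarity (LPIPS) [21] is designed; by adaptively weighting pixel accuracy against perceptual similarity, it jointly optimizes low-level texture features and high-level semantic features and effectively improves semantic image reconstruction. At the system evaluation level, a joint perceptual-semantic multi-task evaluation scheme is proposed, in which Images Semantic Deviation (ISD) quantifies the fidelity of semantic information transfer and Images Semantic Similarity (ISS) evaluates task-related semantic consistency. Combined with quality assessment based on traditional metrics, this partly unifies the "faithful reconstruction" and "task adaptation" evaluation paradigms for semantic communication systems, allowing a more comprehensive and accurate measure of how image semantic communication systems perform across tasks.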
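This paper uses joint source-channel coding to build an end-to-end image semantic communication system with three core modules: an encoder, a channel, and a decoder. Traditional methods usually adopt a separated architecture of source coding followed by channel coding, as shown in Fig. 1. The transmitter first performs feature extraction and semantic encoding on the input image, then applies modulation through an independent channel coding module, and finally sends the encoded information over the channel. The receiver performs channel decoding and then semantic decoding, with the semantic decoder reconstructing the image from the received semantic information. As shown in Fig. 2, the method used here combines semantic encoding and channel encoding inside the encoder and decoder modules so that the two work together efficiently.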
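In the STL-JSCC network architecture for wireless image transmission, the joint semantic-channel encoder is shown in Fig. 3. The encoder applies multi-stage feature compression. A Patch Embedding (PE) module first divides the input RGB image $x \in \mathbb{R}^{H \times W \times 3}$ into non-overlapping patches $P_1$ on an $(H/2) \times (W/2)$ grid. A Patch Merging (PM) module then performs concatenation and linear downsampling, and cascaded Swin Transformer modules transform the features while maintaining feature resolution. Encoding proceeds in several stages, each integrating a PM module with its own number of Swin Transformer modules, embedding dimension, and number of attention heads. As the stages advance, the feature resolution decreases exponentially, and images of different resolutions require different numbers of stages; in general, higher-resolution images need more stages. The encoding process is as follows:
where $\mathbb{R}^{H \times W \times 3}$ is the space of transmitted images, $E_\theta$ is the trainable encoder, the input image is represented by the vector $x \in \mathbb{R}^{H \times W \times 3}$, SNR is the channel signal-to-noise ratio, and $R = K/(H \times W \times 3)$ is the bandwidth compression ratio.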
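As a quick numeric illustration of the bandwidth compression ratio $R = K/(H \times W \times 3)$ defined above, the following minimal sketch evaluates it for one image size. It assumes that $K$ counts the transmitted channel symbols; the function name and the example value $K = 512$ are hypothetical and are not taken from the paper.

```python
def bandwidth_ratio(k: int, height: int, width: int, channels: int = 3) -> float:
    """Return R = K / (H * W * C) for an image of size H x W with C channels.

    k is assumed to be the number of transmitted channel symbols.
    """
    if min(k, height, width, channels) <= 0:
        raise ValueError("all arguments must be positive")
    return k / (height * width * channels)


# Hypothetical example: a 32x32 RGB image mapped to K = 512 symbols.
ratio = bandwidth_ratio(512, 32, 32)
print(f"R = {ratio:.4f}")
```

A smaller ratio means fewer channel uses per source value, so the encoder has to compress the image more aggressively.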
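The semantic features output by the encoder are fed into the channel ModNet, where they pass through fully connected layers and are successively fused with an m-dimensional feature vector corresponding to the SNR. After a Sigmoid nonlinearity, the result is combined with the input features through a residual connection to give the modulated output. The modulated signal is power-normalized, split into I/Q components, and transmitted over the wireless channel, where it is corrupted by noise.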
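The power normalization and I/Q split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the feature vector is taken to be a flat list of real values, consecutive pairs are assumed to form one complex symbol, and the target average symbol power of 1 is an assumed convention. Both function names are hypothetical.

```python
import math


def power_normalize(features, symbol_power=1.0):
    """Scale real features so the complex symbols built from consecutive
    pairs have an average power of symbol_power (assumed convention)."""
    if len(features) == 0 or len(features) % 2:
        raise ValueError("need a non-empty, even-length feature vector")
    energy = sum(v * v for v in features)
    if energy == 0:
        raise ValueError("cannot normalize an all-zero vector")
    num_symbols = len(features) // 2
    scale = math.sqrt(symbol_power * num_symbols / energy)
    return [v * scale for v in features]


def to_iq_symbols(features):
    """Pair consecutive real values into complex symbols I + jQ."""
    return [complex(features[i], features[i + 1])
            for i in range(0, len(features), 2)]


symbols = to_iq_symbols(power_normalize([3.0, 4.0, 0.0, 0.0]))
average_power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
print(symbols, average_power)
```

Normalizing before transmission fixes the transmit power, so the SNR is controlled entirely by the noise variance of the channel.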
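This paper focuses on two typical channel models: the Additive White Gaussian Noise (AWGN) channel and the Rayleigh fading channel. Transmission of the signal over a noisy channel can be modeled as:
where $h$ is the channel gain, which follows a Rayleigh distribution, and $n$ is AWGN. For a complex channel, the real and imaginary parts of $n$ are independent and normally distributed.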
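The channel model described above, in which the received signal is the transmitted symbol scaled by a gain $h$ plus additive noise $n$, can be simulated with a short sketch. The assumptions here are not from the paper: the gain is drawn independently per symbol from a unit-variance complex Gaussian (so its magnitude is Rayleigh distributed), $h = 1$ for the AWGN case, and the noise variance is set from the SNR in dB relative to an assumed unit signal power. The function name `transmit` is hypothetical.

```python
import math
import random


def transmit(symbols, snr_db, channel="awgn", signal_power=1.0, rng=None):
    """Pass complex symbols through an AWGN or Rayleigh fading channel."""
    if rng is None:
        rng = random.Random()
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # std per real/imaginary component
    received = []
    for s in symbols:
        if channel == "awgn":
            h = 1.0
        elif channel == "rayleigh":
            # |h| is Rayleigh distributed when h is complex Gaussian.
            h = complex(rng.gauss(0, math.sqrt(0.5)),
                        rng.gauss(0, math.sqrt(0.5)))
        else:
            raise ValueError(f"unknown channel: {channel}")
        n = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        received.append(h * s + n)
    return received


tx = [complex(1, 0), complex(0, -1)]
print(transmit(tx, snr_db=10, channel="rayleigh", rng=random.Random(0)))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing models at the same SNR.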
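The decoder $D_\psi$ is structurally symmetric to the encoder $E_\theta$, as shown in Fig. 4. It reconstructs the image through the linear upsampling of Patch Reverse Merging (PRM) modules and inverse feature transformation.
The output reconstructed image is expressed as:
Training minimizes the end-to-end MSE, jointly constraining the network parameters $\theta$ and $\psi$ of the encoder and decoder to balance compression efficiency and reconstruction quality:
where MSE(·) denotes the distortion measure between the original image and the reconstructed image.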
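Traditional image semantic communication models usually rely on MSE alone to update the whole network, but this has a clear problem: image detail is lost during upsampling, which distorts the semantic information of the reconstructed image. To reconstruct an image as close to the original as possible, a weighted combination of the LPIPS loss [21] and MSE is used as the objective function of the image semantic communication model, namely:
where $\gamma_1$ and $\gamma_2$ are the weights of the MSE and LPIPS losses. This weighted loss captures the semantic features of the image more effectively, improving the overall quality and semantic consistency of the reconstructed image.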
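The weighted objective described above can be written, in an assumed form, as $L = \gamma_1\,\mathrm{MSE}(x, x') + \gamma_2\,\mathrm{LPIPS}(x, x')$. The sketch below shows only the weighting logic on flat lists of pixel values. The perceptual term is passed in as a callable, and the stand-in used in the example is a plain mean absolute difference, which is not LPIPS; a real system would plug in a learned LPIPS network. The default weights of 0.5 and 0.5 mirror the best-performing row of the ablation table (Tab. 3).

```python
def mse(x, x_hat):
    """Mean squared error between two equal-length sequences."""
    if len(x) != len(x_hat) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def weighted_loss(x, x_hat, perceptual_fn, gamma_mse=0.5, gamma_lpips=0.5):
    """Return gamma_mse * MSE + gamma_lpips * perceptual_fn(x, x_hat)."""
    return gamma_mse * mse(x, x_hat) + gamma_lpips * perceptual_fn(x, x_hat)


def mean_abs_diff(x, x_hat):
    """Hypothetical stand-in for a perceptual distance (NOT LPIPS)."""
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)


loss = weighted_loss([0.0, 0.0], [2.0, 0.0], mean_abs_diff)
print(loss)
```

Setting `gamma_lpips=0` recovers the MSE-only objective, which corresponds to the first row of each dataset in the ablation table.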
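As shown in Fig. 5, the Swin Transformer module uses windowed multi-head self-attention (W-MSA) and shifted-window multi-head self-attention (SW-MSA) in alternating pairs [22]. W-MSA divides the image into disjoint local windows and computes self-attention within each window, which reduces computational complexity. SW-MSA shifts the windows so that information flows across window boundaries, compensating for W-MSA's restriction to local windows and letting the model capture longer-range dependencies.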
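The difference between the two window layouts can be illustrated by grouping token coordinates. The sketch below is a simplified illustration, assuming a cyclic shift of the token grid as in the Swin Transformer paper [22]; it only shows which tokens share a window, and omits the attention computation and the masking that Swin applies to tokens that wrap around the border. The function name is hypothetical.

```python
def window_partition(height, width, window, shift=0):
    """Group token coordinates (row, col) into non-overlapping windows.

    With shift > 0 the grid is cyclically rolled before partitioning,
    which is how shifted windows are commonly implemented.
    """
    windows = {}
    for r in range(height):
        for c in range(width):
            # Position of token (r, c) after rolling the grid by `shift`.
            rr = (r - shift) % height
            cc = (c - shift) % width
            key = (rr // window, cc // window)
            windows.setdefault(key, []).append((r, c))
    return [sorted(tokens) for tokens in windows.values()]


regular = window_partition(4, 4, window=2)           # W-MSA layout
shifted = window_partition(4, 4, window=2, shift=1)  # SW-MSA layout
```

On a 4×4 grid with 2×2 windows, the tokens (1,1), (1,2), (2,1), and (2,2) sit in four different regular windows but share one shifted window, which is how information crosses window boundaries.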
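Each Swin Transformer module also contains residual connections and a feed-forward multilayer perceptron (MLP), which strengthen the model's feature representation. Two consecutive Swin Transformer modules are computed as follows: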
where the $y_k$ symbols denote the output features of the W-MSA, SW-MSA, and MLP modules, respectively, and LN denotes layer normalization.
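For reference, the computation of two consecutive blocks is usually written as below in the Swin Transformer paper [22]. The symbols $z^{l}$ and $\hat{z}^{l}$ are that paper's notation and are an assumption here; they play the role of the block outputs that this section denotes with $y_k$.

```latex
\begin{aligned}
\hat{z}^{l}   &= \text{W-MSA}\big(\mathrm{LN}(z^{l-1})\big) + z^{l-1}, \\
z^{l}         &= \mathrm{MLP}\big(\mathrm{LN}(\hat{z}^{l})\big) + \hat{z}^{l}, \\
\hat{z}^{l+1} &= \text{SW-MSA}\big(\mathrm{LN}(z^{l})\big) + z^{l}, \\
z^{l+1}       &= \mathrm{MLP}\big(\mathrm{LN}(\hat{z}^{l+1})\big) + \hat{z}^{l+1}.
\end{aligned}
```

Each line adds a residual connection around a normalized sub-layer, with the attention sub-layer alternating between the regular and shifted window layouts.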
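For the JSCC algorithm and the other baselines, PSNR and MS-SSIM are used to evaluate image reconstruction quality objectively [23]. On top of these, the ISD and ISS metrics evaluate the original and reconstructed images at the semantic level, giving a more complete evaluation framework for assessing communication performance.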
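PSNR quantifies the difference between the original and reconstructed images through their mean squared error; the higher the PSNR, the better the reconstruction quality: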
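MS-SSIM is a multi-scale structural similarity method that compares images at several scales to obtain a comprehensive structural assessment of image quality [24]. Unlike PSNR, MS-SSIM accounts for structural information and better reflects how the human eye perceives image quality: a larger value means higher structural similarity and therefore higher perceived quality. For easier comparison, it is converted to decibels (dB):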
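Both conversions are short enough to write out. The sketch below uses the standard PSNR definition and, for MS-SSIM, the widely used mapping of -10·log10(1 - MS-SSIM); whether the paper uses exactly this mapping is an assumption. The peak value of 255 assumes 8-bit pixels, and the function names are hypothetical.

```python
import math


def psnr(x, x_hat, max_value=255.0):
    """PSNR in dB from the MSE of two equal-length pixel sequences."""
    err = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def ms_ssim_to_db(ms_ssim):
    """Map an MS-SSIM value in [0, 1) to dB as -10 * log10(1 - MS-SSIM)."""
    return -10.0 * math.log10(1.0 - ms_ssim)


print(psnr([100, 120, 140], [101, 119, 142]))
print(ms_ssim_to_db(0.99))
```

Under this mapping every extra 10 dB corresponds to a tenfold reduction in 1 - MS-SSIM, which spreads out values that would otherwise all sit close to 1.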
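$V_{\mathrm{ISD}}$ measures how accurately the reconstructed image is classified relative to the original image. An image classifier classifies both the original and reconstructed images, and the accuracy is computed systematically from the results. Assuming the dataset has $N$ classes, it is computed as follows: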
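where $P_i(x)$ and $P_i(x')$ are the true class of the original image $x$ and the predicted class of the reconstructed image $x'$, respectively, and $V_{\mathrm{ICA}}$ is the relative classification accuracy of the reconstructed images. The smaller $V_{\mathrm{ISD}}$ is, the higher the classification accuracy and the more accurately the method transmits semantic information; conversely, a large value means that semantic information is not transmitted accurately.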
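The accuracy step behind this metric can be sketched as a label comparison. This is only a partial illustration: it computes a relative classification accuracy as the fraction of reconstructed images whose predicted class matches the class of the original, which is one plausible reading of $V_{\mathrm{ICA}}$. The mapping from that accuracy to $V_{\mathrm{ISD}}$ is not reproduced here, and the function name is hypothetical.

```python
def relative_classification_accuracy(reference_labels, predicted_labels):
    """Fraction of reconstructed images whose predicted class matches
    the class of the corresponding original image."""
    if len(reference_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(1 for a, b in zip(reference_labels, predicted_labels) if a == b)
    return matches / len(reference_labels)


# Classes of four originals versus classifier output on their reconstructions.
accuracy = relative_classification_accuracy([0, 1, 2, 3], [0, 1, 2, 0])
```

In this toy case three of the four reconstructions keep their class, so the accuracy is 0.75.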
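$V_{\mathrm{ISS}}$ uses a feature extraction task to evaluate, from the viewpoint of semantic features, how semantically consistent the reconstructed image is with the original. Features are extracted from both images, and the cosine distance between the two feature vectors is computed as an objective measure of their similarity in semantic space: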
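where $\cos(\cdot)$ denotes the cosine distance computation, and $f(x)$ and $f(x')$ are the feature vectors extracted from the original image $x$ and the reconstructed image $x'$. A high $V_{\mathrm{ISS}}$ indicates high feature similarity, that is, closer semantics; a low value means that the reconstructed image fails to restore the semantic features of the original.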
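The feature comparison can be sketched with plain vectors. The code below assumes that the cosine distance is one minus the cosine similarity and that the reported score is the negated base-10 logarithm of that distance, which is one reading of the later remark that the logarithm of the cosine distance is reported as a positive value. The function names and the `eps` guard are assumptions of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """1 - cosine similarity; 0 for vectors pointing the same way."""
    return 1.0 - cosine_similarity(u, v)


def positive_log_distance(u, v, eps=1e-12):
    """-log10 of the cosine distance, so that closer features score higher.

    eps avoids log(0) when the two feature vectors are parallel.
    """
    return -math.log10(max(cosine_distance(u, v), eps))


close = positive_log_distance([1.0, 0.0], [1.0, 0.1])
far = positive_log_distance([1.0, 0.0], [1.0, 1.0])
```

With this convention a larger score means more similar features, which matches the statement that a high $V_{\mathrm{ISS}}$ indicates closer semantics.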
Experimental environment: PyTorch 2.0.1 with CUDA 11.8, using Python 3.10.
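Datasets: two public image datasets are used for training and evaluation. The first is CIFAR-10, whose images are 3×32×32-pixel RGB images in 10 classes; each class has 6,000 images, of which 5,000 are used for training and 1,000 for testing. During training, the data are flipped to expand the set tenfold and avoid overfitting.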
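To verify performance on high-resolution images, the BIRDS-400 dataset is also used. It contains 400 bird classes at a resolution of 224×224 pixels, with fine-grained distinctions between classes. The standard training/test split is used to evaluate the model under complex semantics and high resolution.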
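DeepJSCC [5] and WITT [17] serve as baseline models. Training uses the Adam optimizer [25] with a learning rate of 0.0001 and a batch size of 128; the SNR ranges from 0 to 25 dB in steps of 5 dB.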
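The training setup can be written down as a small configuration. The values come from the description above; the dictionary layout and key names are hypothetical and are not taken from the authors' code.

```python
# Hyperparameters as stated in the text; key names are illustrative only.
train_config = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 128,
    # 0 to 25 dB inclusive, in 5 dB steps
    "snr_db_grid": list(range(0, 26, 5)),
    "baselines": ["DeepJSCC", "WITT"],
}

for snr_db in train_config["snr_db_grid"]:
    print(f"evaluate at SNR = {snr_db} dB")
```

The grid contains six operating points, which are the SNR values on the horizontal axes of the result figures.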
PSNR and MS-SSIM are used as metrics to assess the communication quality of the models; the results are shown in Figs. 6-9.
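Figs. 6 and 7 show that the PSNR of every model rises as the SNR increases, indicating that a higher SNR improves reconstruction quality. STL-JSCC keeps the highest PSNR over the whole SNR range, and its advantage is more pronounced at high SNR, indicating higher communication quality. In addition, STL-JSCC performs markedly better on the AWGN channel than on the Rayleigh channel, with a faster rise in PSNR, indicating stronger robustness.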
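As shown in Figs. 8 and 9, the MS-SSIM of STL-JSCC at high SNR is clearly higher than that of WITT, indicating a stronger ability to restore structural detail. At low SNR the two perform similarly, which suggests that both are reasonably robust on poor channels. On the AWGN channel at high SNR in particular, STL-JSCC reaches a high MS-SSIM, showing that it achieves high-quality reconstruction under stable channel conditions.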
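To evaluate the semantic consistency between the reconstructed and original images, a transfer learning strategy is adopted: a ResNet50 [26] with pretrained weights is trained on CIFAR-10 to serve as a common image classifier, and its feature layers are frozen and used as a feature extractor, which forms the basis of the semantic consistency evaluation.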
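As shown in Tables 1 and 2, at SNR = 25 dB the proposed method outperforms the comparison models on both image reconstruction quality and semantic consistency metrics, on both datasets and on both the AWGN and Rayleigh channels.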
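Specifically, on CIFAR-10 over the AWGN channel, the results are PSNR = 44.17 dB, MS-SSIM = 31.56 dB, $V_{\mathrm{ISD}}$ = -1.26, and $V_{\mathrm{ISS}}$ = 2.96, showing that the gain in reconstruction quality appears at the semantic level as a lower ISD and a higher semantic similarity.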
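On the high-resolution BIRDS-400 dataset the overall gain is smaller (PSNR = 40.33 dB, MS-SSIM = 25.16 dB, $V_{\mathrm{ISD}}$ = -1.18, $V_{\mathrm{ISS}}$ = 2.29), but ISD still falls and ISS still rises, consistent with the CIFAR-10 results. This indicates that the proposed semantic metrics remain consistent and stable across resolutions and image complexities and support a more comprehensive evaluation of the model.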
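To verify the accuracy of semantic transmission, semantic consistency is measured through an image classification task. Figs. 10 and 11 show the performance of the different communication models under different channel conditions. The ISD of every model generally falls as the SNR increases. STL-JSCC preserves semantics better than WITT at every SNR and is markedly better than DeepJSCC.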
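On both the AWGN and Rayleigh channels, STL-JSCC outperforms the baselines on the image classification task, and its ISD stays low throughout, indicating that the model retains more of the original semantic information during reconstruction. This result confirms the superiority and stability of STL-JSCC in preserving image semantics.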
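To further verify that the model restores image semantics better, the semantic features of the original and reconstructed images are compared through a feature extraction task, as shown in Figs. 12 and 13, which compare feature similarity across models, channels, and SNRs. For ease of comparison, the logarithm of the cosine distance is computed and reported as a positive value. Semantic similarity generally rises with SNR for all models; the rise is most pronounced for STL-JSCC, which clearly outperforms the baselines, whose gains are comparatively small. For SNR ≥ 15 dB, STL-JSCC achieves the highest semantic similarity on both channels. Notably, on the Rayleigh channel the gap between STL-JSCC and WITT narrows, and overall semantic similarity is clearly lower than on the AWGN channel, reflecting the effect of channel fading on semantic transmission. The trends of all models are stable across conditions, further confirming the robustness and semantic preservation of STL-JSCC under complex conditions.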
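To verify the necessity of the weighted loss and the validity and soundness of the semantic metrics, ablation experiments with different weight allocations were run at SNR = 25 dB on the AWGN channel; the results are shown in Table 3.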
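The results show that the traditional MSE-based training objective focuses on minimizing pixel-level error and struggles to preserve semantic consistency. After the perceptual loss is introduced, PSNR improves only slightly, but $V_{\mathrm{ISS}}$ and MS-SSIM improve markedly, indicating that the combined loss improves semantic consistency and perceived quality. Allocating the weights of MSE and the perceptual loss sensibly is key to the performance of a semantic communication system: a pure MSE loss optimizes the traditional metrics but preserves semantics poorly, whereas relying too heavily on the perceptual loss degrades reconstruction quality. With balanced weights, STL-JSCC achieves the best results on both the traditional metrics (PSNR, MS-SSIM) and the semantic metrics ($V_{\mathrm{ISD}}$, $V_{\mathrm{ISS}}$), confirming that the weighted loss is sound and necessary.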
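These ablation experiments show that the proposed perceptual loss design has a clear advantage in improving image restoration quality and semantic fidelity. In addition, the LPIPS loss improves semantic and perceptual quality more in the low-resolution (32×32 pixel) setting, while its marginal gain is smaller in the high-resolution (224×224 pixel) setting. A possible reason is that high-resolution images contain more structural detail and the model already reconstructs them well, so the advantage of the perceptual loss is weaker.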
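To evaluate the computational complexity and real-time performance of the proposed Swin Transformer-based STL-JSCC model at different resolutions, low-resolution (32×32 pixel) and high-resolution (224×224 pixel) inputs were modeled and analyzed separately. Because the model uses a resolution-adaptive structure, the actual depth and channel counts of the encoder and decoder change with the input scale, so complexity and latency differ markedly; the results are shown in Table 4. For low-resolution (32×32 pixel) input, the model requires about 0.77 G floating-point operations (FLOPs) in total, has about 13.75 M parameters, and takes 12.60 ms on average to process one image. For high-resolution (224×224 pixel) input, where the model is deeper, FLOPs rise to 69.71 G, the parameter count is 71.66 M, and the average inference time is 28.31 ms.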
These results show that the STL-JSCC model is computationally efficient and has good real-time performance at both input resolutions.
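The reported figures can be turned into throughput and scaling ratios with a few lines of arithmetic. The sketch below assumes that images are processed one at a time, so throughput is simply the inverse of the per-image latency; the variable and function names are hypothetical.

```python
def images_per_second(latency_ms):
    """Throughput implied by a per-image latency, one image at a time."""
    return 1000.0 / latency_ms


low_res_fps = images_per_second(12.60)    # 32 x 32 input
high_res_fps = images_per_second(28.31)   # 224 x 224 input
flops_ratio = 69.71 / 0.77                # growth in FLOPs, low to high resolution
latency_ratio = 28.31 / 12.60             # growth in latency, low to high resolution

print(round(low_res_fps, 1), round(high_res_fps, 1))
print(round(flops_ratio, 1), round(latency_ratio, 2))
```

On these numbers the model handles roughly 79 and 35 images per second, and a roughly 90-fold increase in FLOPs comes with a little more than a 2-fold increase in latency.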
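Because the traditional MSE loss on which conventional Transformer modules rely considers only pixel-level error and struggles to capture semantic information, this paper designed STL-JSCC, a communication architecture with joint perception-distortion optimization. Its weighted loss combining MSE and LPIPS balances low-level pixel consistency with high-level semantic structural consistency and markedly improves the realism and semantic fidelity of reconstructed images. To measure image semantic transmission quality more fully, an evaluation method combining ISD and ISS was proposed, forming a joint perception-semantic evaluation framework that goes beyond traditional metrics, which consider visual quality only. Experiments show that on both AWGN and Rayleigh channels STL-JSCC outperforms existing schemes on PSNR, MS-SSIM, $V_{\mathrm{ISD}}$, and $V_{\mathrm{ISS}}$, confirming the advantages of the proposed method in semantic fidelity and transmission robustness.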
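Future work will aim to improve transmission quality at low SNR and will explore effective model compression techniques that reduce the computational cost at the transmitter and receiver, lowering latency while improving communication performance so that semantic communication systems can run in real time.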
References
[1] YE H, LIANG L, LI G Y, et al. Deep Learning-based End-to-End Wireless Communication Systems with Conditional GANs as Unknown Channels[J]. IEEE Transactions on Wireless Communications, 2020, 19(5): 3133-3143.
[2] CHEN Jianqiao, MA Nan, XU Xiaodong, et al. A Survey of Channel Knowledge Base Construction and Channel Processing for Semantic Communication[J]. Radio Communications Technology, 2024, 50(3): 519-527. (in Chinese)
[3] SHANNON C E. A Mathematical Theory of Communication[J]. The Bell System Technical Journal, 1948, 27(3): 379-423.
[4] FRESIA M, PEREZ-CRUZ F, POOR H V, et al. Joint Source and Channel Coding[J]. IEEE Signal Processing Magazine, 2010, 27(6): 104-113.
[5] BOURTSOULATZE E, KURKA D B, GUNDUZ D. Deep Joint Source-channel Coding for Wireless Image Transmission[J]. IEEE Transactions on Cognitive Communications and Networking, 2019, 5(3): 567-579.
[6] TUNG T Y, KURKA D B, JANKOWSKI M, et al. DeepJSCC-Q: Constellation Constrained Deep Joint Source Channel Coding[J]. IEEE Journal on Selected Areas in Information Theory, 2022, 3(4): 720-731.
[7] TUNG T Y, GUNDUZ D. DeepWiVe: Deep-learning-aided Wireless Video Transmission[J]. IEEE Journal on Selected Areas in Communications, 2022, 40(9): 2570-2583.
[8] YANG M Y, BIAN C H, KIM H S. OFDM-guided Deep Joint Source Channel Coding for Wireless Multipath Fading Channels[J]. IEEE Transactions on Cognitive Communications and Networking, 2022, 8(2): 584-599.
[9] KURKA D B, GUNDUZ D. Bandwidth-agile Image Transmission with Deep Joint Source-channel Coding[J]. IEEE Transactions on Wireless Communications, 2021, 20(12): 8081-8095.
[10] XU J L, AI B, CHEN W, et al. Wireless Image Transmission Using Deep Source Channel Coding with Attention Modules[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(4): 2315-2328.
[11] ZHANG P, XU W J, GAO H, et al. Toward Wisdom-evolutionary and Primitive-concise 6G: A New Paradigm of Semantic Communication Networks[J]. Engineering, 2022, 8: 60-73.
[12] YANG M Y, KIM H S. Deep Joint Source-channel Coding for Wireless Image Transmission with Adaptive Rate Control[EB/OL]. (2021-10-09)[2025-04-29]. https://arxiv.org/abs/2110.04456.
[13] ZHANG W Y, ZHANG H J, MA H, et al. Predictive and Adaptive Deep Coding for Wireless Image Transmission in Semantic Communication[J]. IEEE Transactions on Wireless Communications, 2023, 22(8): 5486-5501.
[14] HAN K, WANG Y H, CHEN H T, et al. A Survey on Vision Transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(1): 87-110.
[15] YOO H J, DAI L L, KIM S K, et al. On the Role of ViT and CNN in Semantic Communications: Analysis and Prototype Validation[J]. IEEE Access, 2023, 11: 71528-71541.
[16] LIU Tie, DUAN Yong. Robot Indoor Scene Recognition Fusing CNN and Transformer[J]. Journal of Electronic Measurement and Instrumentation, 2023, 37(5): 223-229. (in Chinese)
[17] YANG K, WANG S X, DAI J C, et al. WITT: A Wireless Image Transmission Transformer for Semantic Communications[C]//ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island: IEEE, 2023: 1-5.
[18] LIU X Y, WU Y, LIANG W K, et al. High Resolution SAR Image Classification Using Global-local Network Structure Based on Vision Transformer and CNN[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[19] BLAU Y, MICHAELI T. Rethinking Lossy Compression: The Rate-distortion-perception Tradeoff[C]//International Conference on Machine Learning. Long Beach: PMLR, 2019: 675-685.
[20] LEDIG C, THEIS L, HUSZÁR F, et al. Photo-realistic Single Image Super-resolution Using a Generative Adversarial Network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 4681-4690.
[21] ZHANG R, ISOLA P, EFROS A A, et al. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 586-595.
[22] LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 9992-10002.
[23] DING K Y, MA K D, WANG S Q, et al. Comparison of Full-reference Image Quality Models for Optimization of Image Processing Systems[J]. International Journal of Computer Vision, 2021, 129(4): 1258-1281.
[24] WANG Z, SIMONCELLI E P, BOVIK A C. Multiscale Structural Similarity for Image Quality Assessment[C]//The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers. Pacific Grove: IEEE, 2003: 1398-1402.
[25] ZHANG Z J. Improved Adam Optimizer for Deep Neural Networks[C]//Proceedings of 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS). Banff: IEEE, 2018: 1-2.
[26] THECKEDATH D, SEDAMKAR R R. Detecting Affect States Using VGG16, ResNet50 and SE-ResNet50 Networks[J]. SN Computer Science, 2020, 1(2): 79.
2025, Vol. 51, No. 5
doi: 10.3969/j.issn.1003-3114.2025.05.014
Received: 2025-05-06
First online: 2026-04-17
Published: 2025-09-18
Author affiliation: School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China