Article(id=1253994382005297722, tenantId=1146029695717560320, journalId=1251234571887362144, issueId=1253994374069682477, articleNumber=null, orderNo=null, doi=10.20249/j.cnki.2096-5974.2025.05.004, pmid=null, cstr=null, oa=null, hot=null, price=null, onlineType=0, articleFormat=0, articleType=null, articleTypeStr=null, receivedDate=null, receivedDateStr=null, revisedDate=null, revisedDateStr=null, acceptedDate=null, acceptedDateStr=null, onlineDate=1776905158758, onlineDateStr=2026-04-23, pubDate=null, pubDateStr=null, doiRegisterDate=null, doiRegisterDateStr=null, onlineIssueDate=1776905158758, onlineIssueDateStr=2026-04-23, onlineJustAcceptDate=null, onlineJustAcceptDateStr=null, onlineFirstDate=null, onlineFirstDateStr=null, sourceXml=null, magXml=null, createTime=1776905158758, creator=13041195026, updateTime=1776905158758, updator=13041195026, issue=Issue{id=1253994374069682477, tenantId=1146029695717560320, journalId=1251234571887362144, year='2025', volume='8', issue='5', pageStart='1', pageEnd='114', issueExtLink='null', onlineDate='null', pubDate='null', beforeIssueId=null, nextIssueId=null, price=null, status=1, issueComplete=1, articleOrder=1, issueType=1, specialIssue=null, createTime=1776905156856, creator=13041195026, updateTime=1777355348378, updator=13041195026, preIssue=null, nextIssue=null, ext={EN=IssueExt(id=1255882614297248384, tenantId=1146029695717560320, journalId=1251234571887362144, issueId=1253994374069682477, language=EN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=), CN=IssueExt(id=1255882614301442689, tenantId=1146029695717560320, journalId=1251234571887362144, issueId=1253994374069682477, language=CN, specialIssueTitle=, coverIllustrator=null, specialIssueEditor=, specialIssueAbout=)}, issueFiles=null}, startPage=34, endPage=43, ext={EN=ArticleExt(id=1253994382366007869, articleId=1253994382005297722, tenantId=1146029695717560320, journalId=1251234571887362144, language=EN, title=Online Imaging 
Scheduling Method of Multiple Satellites for Ground Moving Target Observation, columnId=1253994380264669540, journalTitle=Flight Control & Detection, columnName=Navigation, Guidance and Control Technology, runingTitle=null, highlight=null, articleAbstract=

To address the cooperative planning problem for multiple satellites observing ground moving targets whose information is not directly observable, this paper studies an online imaging scheduling method based on an improved deep reinforcement learning algorithm. First, an imaging strip partitioning algorithm is proposed based on the satellite imaging coverage calculation method. Second, using the visible time windows, the strips, and the satellite orbit information, a multi-satellite cooperative observation mission planning model is established as a partially observable Markov decision process (POMDP). Then, an online proximal policy optimization algorithm with action masking and advantage normalization is proposed, which improves the convergence rate when solving the scheduling problem in which strips only partially cover the task area. Finally, three sets of simulations verify the correctness and superiority of the proposed algorithm for solving this problem online.

, correspAuthors=null, authorNote=null, correspAuthorsNote=null, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=null, magXml=null, pdfUrl=null, pdf=null, pdfFileSize=null, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=null, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=null, mapNumber=null, authorCompany=null, fund=null, authors=null, authorsList=Yunwen XIONG, Yi LI, Caisheng WEI), CN=ArticleExt(id=1253994406164488961, articleId=1253994382005297722, tenantId=1146029695717560320, journalId=1251234571887362144, language=CN, title=面向地面移动目标观测的多星成像在线调度方法, columnId=1253994380436636006, journalTitle=飞控与探测, columnName=导航制导与控制技术, runingTitle=null, highlight=null, articleAbstract=

针对地面移动目标信息不可观的多星成像协同规划问题,开展了基于改进深度强化学习的在线成像调度方法研究。首先,基于卫星成像覆盖计算方法设计了一种成像条带划分算法;其次,基于可见时间窗口、条带、卫星轨道信息,建立了基于部分可观马尔可夫决策过程(Partially Observable Markov Decision Process,POMDP)的多星协同观测任务规划模型;然后,提出了一种融合动作掩码与优势度归一化机制的在线近端策略优化强化学习算法,提升了算法对求解部分条带覆盖任务区域调度问题的收敛速率;最后,通过3组仿真验证了所提出算法对在线求解该问题的正确性与优越性。

, correspAuthors=null, authorNote=null, correspAuthorsNote=
通信作者简介:魏才盛,男,博士,教授,博士生导师。
, copyrightStatement=null, copyrightOwner=null, extLink=null, articleAbsUrl=null, sourceXml=xkTDnWQH+W75nqLbl9gLrA==, magXml=+tYbrHWcyPEaVUubu5Ir3g==, pdfUrl=null, pdf=p3nAuzNeTrTPvJIdgSsIcA==, pdfFileSize=2153316, pdfExtLink=null, richHtmlUrl=null, mobilePdfUrl=null, reviewReport=null, pdfFirstPage=null, abstractGraph=3EieYwMR6ucwdb+pthicLg==, abstractGraphContent=null, abstractVideo=null, citation=null, cebUrl=null, magXmlContent=RINQ4RpiXS42nk9q8BN9ZA==, mapNumber=null, authorCompany=null, fund=null, authors=

熊韫文,男,博士生。

, authorsList=熊韫文, 李毅, 魏才盛)}, authors=[Author(id=1253994406953018122, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, orderNo=0, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1253994407137567500, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994406953018122, language=EN, stringName=Yunwen XIONG, firstName=Yunwen, middleName=null, lastName=XIONG, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1School of Automation, Central South University, Changsha 410083, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1253994408731403021, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994406953018122, language=CN, stringName=熊韫文, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1中南大学 自动化学院·长沙·410083, bio={"content":"

熊韫文,男,博士生。

"}, bioImg=null, bioContent=

熊韫文,男,博士生。

, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1253994406651028227, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, xref=1, ext=[AuthorCompanyExt(id=1253994406680388356, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406651028227, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Automation, Central South University, Changsha 410083), AuthorCompanyExt(id=1253994406739108613, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406651028227, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1中南大学 自动化学院·长沙·410083)])]), Author(id=1253994408882397967, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, orderNo=1, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1253994409033392913, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994408882397967, language=EN, stringName=Yi LI, firstName=Yi, middleName=null, lastName=LI, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=2, address=2China Satellite Network Application Co., Ltd., Beijing 100001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1253994409138250514, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994408882397967, language=CN, stringName=李毅, firstName=null, middleName=null, lastName=null, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, 
xref=2, address=2中国星网网络应用研究院有限公司·北京·100001, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1253994406831383302, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, xref=2, ext=[AuthorCompanyExt(id=1253994406839771911, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406831383302, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2China Satellite Network Application Co., Ltd., Beijing 100001), AuthorCompanyExt(id=1253994406864937736, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406831383302, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=2中国星网网络应用研究院有限公司·北京·100001)])]), Author(id=1253994409251496724, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, orderNo=2, firstName=null, middleName=null, lastName=null, nameCn=null, orcid=null, stid=null, country=null, authorPic=null, dead=0, email=null, emailSecond=null, emailThird=null, correspondingAuthor=0, authorType=1, ext={EN=AuthorExt(id=1253994409339577110, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994409251496724, language=EN, stringName=Caisheng WEI, firstName=Caisheng, middleName=null, lastName=WEI, prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1School of Automation, Central South University, Changsha 410083, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null), CN=AuthorExt(id=1253994409436046103, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, authorId=1253994409251496724, language=CN, stringName=魏才盛, firstName=null, middleName=null, lastName=null, 
prefix=null, suffix=null, authorComment=null, nameInitials=null, affiliation=null, department=null, xref=1, address=1中南大学 自动化学院·长沙·410083, bio=null, bioImg=null, bioContent=null, aboutCorrespAuthor=null)}, companyList=[AuthorCompany(id=1253994406651028227, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, xref=1, ext=[AuthorCompanyExt(id=1253994406680388356, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406651028227, language=EN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1School of Automation, Central South University, Changsha 410083), AuthorCompanyExt(id=1253994406739108613, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, companyId=1253994406651028227, language=CN, country=null, province=null, city=null, postcode=null, companyName=null, departmentName=null, remark=1中南大学 自动化学院·长沙·410083)])])], keywords=[Keyword(id=1253994409553486616, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=EN, orderNo=1, keyword=moving target), Keyword(id=1253994409658344217, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=EN, orderNo=2, keyword=imaging satellite scheduling), Keyword(id=1253994409729647386, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=EN, orderNo=3, keyword=partially observable Markov decision process(POMDP)), Keyword(id=1253994409817727771, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=EN, orderNo=4, keyword=deep reinforcement learning), Keyword(id=1253994409884836636, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=CN, orderNo=1, keyword=移动目标), Keyword(id=1253994409998082845, 
tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=CN, orderNo=2, keyword=成像卫星调度), Keyword(id=1253994410077774622, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=CN, orderNo=3, keyword=部分可观马尔可夫决策过程(POMDP)), Keyword(id=1253994410170049311, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, language=CN, orderNo=4, keyword=深度强化学习)], refs=[Reference(id=1253994414683120444, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2001, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[1], rfOrder=0, authorNames=总装备部, journalName=卫星应用现状与发展, refType=null, unstructuredReference=总装备部.卫星应用现状与发展[M].北京:中国科学技术出版社,2001., articleTitle=null, refAbstract=null), Reference(id=1253994414775395133, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2001, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[1], rfOrder=1, authorNames=null, journalName=The General Equipment DepartmentCurrent status and development of satellite applications, refType=null, unstructuredReference=The General Equipment DepartmentCurrent status and development of satellite applications[M].Beijing:Science and Technology of China Press,2001 (in Chinese)., articleTitle=null, refAbstract=null), Reference(id=1253994414876058430, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2003, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[2], rfOrder=2, authorNames=王永刚, 刘玉文, journalName=军事卫星及应用概论, refType=null, unstructuredReference=王永刚,刘玉文.军事卫星及应用概论[M].北京:国防工业出版社,2003., articleTitle=null, refAbstract=null), 
Reference(id=1253994414934778687, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2003, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[2], rfOrder=3, authorNames=WANG Y G, LIU Y W, journalName=Introduction to military sat ellites and applications, refType=null, unstructuredReference=WANG Y G,LIU Y W.Introduction to military sat ellites and applications[M].Beijing:National Defense Industry Press,2003(in Chinese)., articleTitle=null, refAbstract=null), Reference(id=1253994415014470464, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=1994, volume=94, issue=null, pageStart=307, pageEnd=312, url=null, language=null, rfNumber=[3], rfOrder=4, authorNames=VERFAILLIE G, SCHIEX T, journalName=null, refType=null, unstructuredReference=VERFAILLIE G,SCHIEX T.Solution reuse in dynamic constraint satisfaction problems[C]//Proceedings of the Twelfth AAAI National Conference on Artificial Intelligence,Seattle:AAAI.1994,94:307-312., articleTitle=Solution reuse in dynamic constraint satisfaction problems, refAbstract=null), Reference(id=1253994415098356545, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2005, volume=56, issue=8, pageStart=962, pageEnd=968, url=null, language=null, rfNumber=[4], rfOrder=5, authorNames=CORDEAU J F, LAPORTE G, journalName=Journal of the Operational Research Society, refType=null, unstructuredReference=CORDEAU J F,LAPORTE G.Maximizing the value of an Earth observation satellite orbit[J].Journal of the Operational Research Society,2005,56(8):962-968., articleTitle=Maximizing the value of an Earth observation satellite orbit, refAbstract=null), Reference(id=1253994415165465410, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, 
doi=null, pmid=null, pmcid=null, year=2010, volume=31, issue=2, pageStart=457, pageEnd=465, url=null, language=null, rfNumber=[5], rfOrder=6, authorNames=冉承新, 王慧林, 熊纲要, journalName=宇航学报, refType=null, unstructuredReference=冉承新,王慧林,熊纲要,.基于改进遗传算法的移动目标成像侦测任务规划问题研究[J].宇航学报, 2010,31(2):457-465., articleTitle=基于改进遗传算法的移动目标成像侦测任务规划问题研究, refAbstract=null), Reference(id=1253994415245157187, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2010, volume=31, issue=2, pageStart=457, pageEnd=465, url=null, language=null, rfNumber=[5], rfOrder=7, authorNames=RAN C X, WANG H L, XIONG G Y, journalName=Journal of Astronautics, refType=null, unstructuredReference=RAN C X,WANG H L,XIONG G Y,et al.Research on mission-planing of ocean moving targets imaging reconnaissance based on improved genetic algorithm[J]. Journal of Astronautics,2010,31(2):457-465(in Chinese)., articleTitle=Research on mission-planing of ocean moving targets imaging reconnaissance based on improved genetic algorithm, refAbstract=null), Reference(id=1253994415312266052, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2018, volume=39, issue=11, pageStart=1266, pageEnd=1274, url=null, language=null, rfNumber=[6], rfOrder=8, authorNames=王海蛟, 贺欢, 杨震, journalName=宇航学报, refType=null, unstructuredReference=王海蛟,贺欢,杨震.敏捷成像卫星调度的改进量子遗传算法[J].宇航学报,2018,39(11):1266-1274., articleTitle=敏捷成像卫星调度的改进量子遗传算法, refAbstract=null), Reference(id=1253994415387763525, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2018, volume=39, issue=11, pageStart=1266, pageEnd=1274, url=null, language=null, rfNumber=[6], rfOrder=9, authorNames=WANG H J, HE H, YANG Z, journalName=Journal of Astronautics, refType=null, unstructuredReference=WANG H J,HE H,YANG Z.Scheduling of agile satellites based on an improved 
quantum genetic algorithm[J].Journal of Astronautics,2018,39(11):1266-1274(in Chinese)., articleTitle=Scheduling of agile satellites based on an improved quantum genetic algorithm, refAbstract=null), Reference(id=1253994415463260998, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2023, volume=44, issue=11, pageStart=1693, pageEnd=1705, url=null, language=null, rfNumber=[7], rfOrder=10, authorNames=陈雄姿, 谢松, 蔡熙, journalName=宇航学报, refType=null, unstructuredReference=陈雄姿,谢松,蔡熙,.敏捷卫星动中成像自主任务规划算法[J].宇航学报,2023,44(11):1693-1705., articleTitle=敏捷卫星动中成像自主任务规划算法, refAbstract=null), Reference(id=1253994415542952775, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2023, volume=44, issue=11, pageStart=1693, pageEnd=1705, url=null, language=null, rfNumber=[7], rfOrder=11, authorNames=CHEN X Z, XIE S, CAI X, journalName=Journal of Astronautics, refType=null, unstructuredReference=CHEN X Z,XIE S,CAI X,et al.Algorithms of autonomous mission planning for agile satellite active pushbroom imaging[J].Journal of Astronautics,2023,44(11):1693-1705(in Chinese)., articleTitle=Algorithms of autonomous mission planning for agile satellite active pushbroom imaging, refAbstract=null), Reference(id=1253994415647810376, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2012, volume=33, issue=12, pageStart=1806, pageEnd=1814, url=null, language=null, rfNumber=[8], rfOrder=12, authorNames=王建江, 邱涤珊, 贺川, journalName=宇航学报, refType=null, unstructuredReference=王建江,邱涤珊,贺川,.考虑目标间不同转换方式的成像卫星调度[J].宇航学报,2012,33(12):1806-1814., articleTitle=考虑目标间不同转换方式的成像卫星调度, refAbstract=null), Reference(id=1253994415719113545, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2012, volume=33, issue=12, 
pageStart=1806, pageEnd=1814, url=null, language=null, rfNumber=[8], rfOrder=13, authorNames=WANG J J, QIU D S, HE C, journalName=Journal of Astronautics, refType=null, unstructuredReference=WANG J J,QIU D S,HE C,et al.Scheduling of imaging satellite with different transition modes between adjacent targets[J].Journal of Astronautics, 2012,33(12):1806-1814(in Chinese)., articleTitle=Scheduling of imaging satellite with different transition modes between adjacent targets, refAbstract=null), Reference(id=1253994415811388234, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=47, issue=11, pageStart=2521, pageEnd=2537, url=null, language=null, rfNumber=[9], rfOrder=14, authorNames=李凯文, 张涛, 王锐, journalName=自动化学报, refType=null, unstructuredReference=李凯文,张涛,王锐,.基于深度强化学习的组合优化研究进展[J].自动化学报,2021,47(11):2521-2537., articleTitle=基于深度强化学习的组合优化研究进展, refAbstract=null), Reference(id=1253994415882691403, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=47, issue=11, pageStart=2521, pageEnd=2537, url=null, language=null, rfNumber=[9], rfOrder=15, authorNames=LI K W, ZHANG T, WANG R, journalName=Acta Automatica Sinica, refType=null, unstructuredReference=LI K W,ZHANG T,WANG R,et al.Research reviews of combinatorial optimization methods based on deep reinforcement learning[J].Acta Automatica Sinica,2021,47(11):2521-2537(in Chinese)., articleTitle=Research reviews of combinatorial optimization methods based on deep reinforcement learning, refAbstract=null), Reference(id=1253994415970771788, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2020, volume=52, issue=3, pageStart=1463, pageEnd=1474, url=null, language=null, rfNumber=[10], rfOrder=16, authorNames=HE Y, XING L, CHEN Y, journalName=IEEE Transactions on 
Systems,Man,and Cybernetics:Systems, refType=null, unstructuredReference=HE Y,XING L,CHEN Y,et al.A generic Markov decision process model and reinforcement learning method for scheduling agile earth observation satellites[J].IEEE Transactions on Systems,Man,and Cybernetics:Systems,2020,52(3):1463-1474., articleTitle=A generic Markov decision process model and reinforcement learning method for scheduling agile earth observation satellites, refAbstract=null), Reference(id=1253994416058852173, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2019, volume=null, issue=null, pageStart=126, pageEnd=132, url=null, language=null, rfNumber=[11], rfOrder=17, authorNames=CHEN M, CHEN Y, CHEN Y, journalName=null, refType=null, unstructuredReference=CHEN M,CHEN Y,CHEN Y,et al.Deep reinforcement learning for agile satellite scheduling problem[C]//2019 IEEE Symposium Series on Computational Intelligence (SSCI).Xiamen:IEEE,2019:126-132., articleTitle=Deep reinforcement learning for agile satellite scheduling problem, refAbstract=null), Reference(id=1253994417635910478, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=13, issue=12, pageStart=2377, pageEnd=null, url=null, language=null, rfNumber=[12], rfOrder=18, authorNames=HUANG Y, MU Z, WU S, journalName=Remote Sensing, refType=null, unstructuredReference=HUANG Y,MU Z,WU S,et al.Revising the observation satellite scheduling problem based on deep reinforcement learning[J].Remote Sensing,2021,13(12):2377., articleTitle=Revising the observation satellite scheduling problem based on deep reinforcement learning, refAbstract=null), Reference(id=1253994417715602255, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=110, issue=null, pageStart=107607, pageEnd=null, url=null, 
language=null, rfNumber=[13], rfOrder=19, authorNames=WEI L, CHEN Y, CHEN M, journalName=Applied Soft Computing, refType=null, unstructuredReference=WEI L,CHEN Y,CHEN M,et al.Deep reinforcement learning and parameter transfer based approach for the multi-objective agile earth observation satellite scheduling problem[J].Applied Soft Computing, 2021,110:107607., articleTitle=Deep reinforcement learning and parameter transfer based approach for the multi-objective agile earth observation satellite scheduling problem, refAbstract=null), Reference(id=1253994417786905424, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=55, issue=2, pageStart=395, pageEnd=401, url=null, language=null, rfNumber=[14], rfOrder=20, authorNames=马一凡, 赵凡宇, 王鑫, journalName=浙江大学学报(工学版), refType=null, unstructuredReference=马一凡,赵凡宇,王鑫,.基于改进指针网络的卫星对地观测任务规划方法[J].浙江大学学报(工学版), 2021,55(2):395-401., articleTitle=基于改进指针网络的卫星对地观测任务规划方法, refAbstract=null), Reference(id=1253994417866597201, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2021, volume=55, issue=2, pageStart=395, pageEnd=401, url=null, language=null, rfNumber=[14], rfOrder=21, authorNames=MA Y F, ZHAO F Y, WANG X, journalName=Journal of Zhejiang University (Engineering Science), refType=null, unstructuredReference=MA Y F,ZHAO F Y,WANG X,et al.Satellite earth observation task planning method based on improved pointer networks[J].Journal of Zhejiang University (Engineering Science),2021,55(2):395-401(in Chinese)., articleTitle=Satellite earth observation task planning method based on improved pointer networks, refAbstract=null), Reference(id=1253994417946288978, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=1998, volume=101, issue=1-2, pageStart=99, pageEnd=134, url=null, language=null, 
rfNumber=[15], rfOrder=22, authorNames=KAELBLING L P, LITTMAN M L, CASSANDRA A R, journalName=Artificial Intelligence, refType=null, unstructuredReference=KAELBLING L P,LITTMAN M L,CASSANDRA A R.Planning and acting in partially observable stochastic domains[J].Artificial Intelligence,1998,101(1-2):99-134., articleTitle=Planning and acting in partially observable stochastic domains, refAbstract=null), Reference(id=1253994418009203539, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2009, volume=24, issue=7, pageStart=1007, pageEnd=1012, url=null, language=null, rfNumber=[16], rfOrder=23, authorNames=慈元卓, 贺仁杰, 徐一帆, journalName=控制与决策, refType=null, unstructuredReference=慈元卓,贺仁杰,徐一帆,.卫星搜索移动目标问题中的目标运动预测方法研究[J].控制与决策,2009, 24(7):1007-1012., articleTitle=卫星搜索移动目标问题中的目标运动预测方法研究, refAbstract=null), Reference(id=1253994418093089620, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2009, volume=24, issue=7, pageStart=1007, pageEnd=1012, url=null, language=null, rfNumber=[16], rfOrder=24, authorNames=CI Y Z, HE R J, XU Y F, journalName=Control and Decision, refType=null, unstructuredReference=CI Y Z,HE R J,XU Y F,et al.Method of target motion prediction for moving target search by satellite[J].Control and Decision,2009,24(7):1007-1012(in Chinese)., articleTitle=Method of target motion prediction for moving target search by satellite, refAbstract=null), Reference(id=1253994418176975701, tenantId=1146029695717560320, journalId=1251234571887362144, articleId=1253994382005297722, doi=null, pmid=null, pmcid=null, year=2017, volume=null, issue=null, pageStart=null, pageEnd=null, url=null, language=null, rfNumber=[17], rfOrder=25, authorNames=SCHULMAN J, WOLSKI F, DHARIWAL P, journalName=arXiv preprint arXiv:1707.06347, refType=null, unstructuredReference=SCHULMAN J,WOLSKI F,DHARIWAL P,et al. 
Fig.1 Diagram of satellite ground coverage
Fig.2 Diagram of imaging strip vertex calculation
Fig.3 Diagram of strip division
Fig.4 Coverage of different satellite bands
Fig.5 Flow chart of improved PPO algorithm
Fig.6 Diagram of training average reward
Fig.7 Diagram of the optimal observation sequence
Fig.8 Diagram of training average reward under different probability conditions
Fig.9 Diagram of online strategy training average reward
Fig.10 Diagram of online decision average reward

Tab.1 Table of visible time windows

| No. | Start time | End time | Satellite |
|---|---|---|---|
| 1 | 2022/12/12 07:22:45 | 2022/12/12 07:23:31 | Sat1 |
| 2 | 2022/12/13 01:44:22 | 2022/12/13 01:45:08 | Sat10 |
| 3 | 2022/12/13 03:13:20 | 2022/12/13 03:14:07 | Sat11 |
| 4 | 2022/12/12 05:03:58 | 2022/12/12 05:04:45 | Sat12 |
| 5 | 2022/12/12 06:32:56 | 2022/12/12 06:33:43 | Sat13 |
| 6 | 2022/12/12 23:30:38 | 2022/12/12 23:31:23 | Sat13 |
| 7 | 2022/12/12 08:01:54 | 2022/12/12 08:02:41 | Sat14 |
| 8 | 2022/12/13 00:59:36 | 2022/12/13 01:00:23 | Sat14 |
| 9 | 2022/12/13 02:28:34 | 2022/12/13 02:29:21 | Sat15 |
| 10 | 2022/12/12 04:19:11 | 2022/12/12 04:19:57 | Sat16 |
| 11 | 2022/12/13 03:57:58 | 2022/12/13 03:58:18 | Sat16 |
| 12 | 2022/12/12 08:51:40 | 2022/12/12 08:52:28 | Sat2 |
| 13 | 2022/12/12 10:20:42 | 2022/12/12 10:21:09 | Sat3 |
| 14 | 2022/12/12 21:17:26 | 2022/12/12 21:18:08 | Sat7 |
| 15 | 2022/12/12 22:46:25 | 2022/12/12 22:47:11 | Sat8 |
| 16 | 2022/12/13 00:15:23 | 2022/12/13 00:16:10 | Sat9 |

Tab.2 Imaging load parameters setting table

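| Parameter | Value |
|---|---|
| Side-view angle range | [-24°, 24°] |
| Side-view angle step | |
| Field-of-view angle | 12° |
| Detection probability | 0.9 |
| False-alarm probability | 0.1 |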

Tab.3 Network structure and hyperparameter setting

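| Parameter | Value |
|---|---|
| Neurons per layer, actor network | [6, 64, 8, 9] |
| Neurons per layer, critic network | [6, 64, 8, 1] |
| Output-layer activation function | Linear |
| Activation function of the other layers | Tanh |
| Actor network learning rate | 0.0003 |
| Critic network learning rate | 0.001 |
| Discount factor | 0.99 |
| Maximum number of iteration steps | 3000 |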

Tab.4 Decision results

| No. | Start time | End time | Satellite | Prior probability | Strip: covered grids |
|---|---|---|---|---|---|
| 1 | 2022/12/12 04:19:11 | 2022/12/12 04:19:57 | Sat16 | [0.8,0.1,0.1,0,0,0,0,0,0] | 6: [2,3] |
| 2 | 2022/12/12 05:03:58 | 2022/12/12 05:04:45 | Sat12 | [0.97,0.02,0.01,0,0,0,0,0,0] | 2: [1,4,5,6] |
| 3 | 2022/12/12 06:32:56 | 2022/12/12 06:33:43 | Sat13 | [0.01,0.97,0.02,0,0,0,0,0,0] | 1: [1,4,5,6,9] |
| 4 | 2022/12/12 07:22:45 | 2022/12/12 07:23:31 | Sat1 | [0,0.98,0.02,0,0,0,0,0,0] | 7: [1,2] |
| 5 | 2022/12/12 08:01:54 | 2022/12/12 08:02:41 | Sat14 | [0,0.01,0.96,0.03,0,0,0,0,0] | 1: [4,7,8,9] |
| 6 | 2022/12/12 08:51:40 | 2022/12/12 08:52:28 | Sat2 | [0,0.02,0.98,0,0,0,0,0,0] | 9: [3,4,5,6,7] |
| 7 | 2022/12/12 10:20:42 | 2022/12/12 10:21:09 | Sat3 | [0,0.01,0.11,0.86,0,0,0,0,0] | 9: [1,2] |
| 8 | 2022/12/12 21:17:26 | 2022/12/12 21:18:08 | Sat7 | [0,0,0,0.45,0.25,0.06,0.23,0,0] | 9: [3] |
| 9 | 2022/12/12 22:46:25 | 2022/12/12 22:47:11 | Sat8 | [0,0,0,0,0,0.04,0.17,0.42,0.37] | 9: [1,2,3,6] |
| 10 | 2022/12/12 23:30:38 | 2022/12/12 23:31:23 | Sat13 | [0,0,0,0,0,0,0.05,0.31,0.64] | 1: [9] |
| 11 | 2022/12/13 00:15:23 | 2022/12/13 00:16:10 | Sat3 | [0,0,0,0,0,0,0.12,0.71,0.16] | 9: [4,5,9] |
| 12 | 2022/12/13 00:59:36 | 2022/12/13 01:00:23 | Sat14 | [0,0,0,0,0,0,0.04,0.70,0.26] | 1: [5,6,7,8] |
| 13 | 2022/12/13 01:44:22 | 2022/12/13 01:45:08 | Sat10 | [0,0,0,0,0,0,0.05,0.90,0.05] | 9: [7,8] |
| 14 | 2022/12/13 02:28:34 | 2022/12/13 02:29:21 | Sat15 | [0,0,0,0,0,0.04,0.04,0.67,0.25] | 3: [8,9] |
| 15 | 2022/12/13 03:13:20 | 2022/12/13 03:14:07 | Sat11 | [0,0,0,0,0,0.03,0.28,0.13,0.56] | 7: [4,5,9] |
| 16 | 2022/12/13 03:57:58 | 2022/12/13 03:58:18 | Sat16 | [0,0,0,0,0,0.06,0.56,0.25,0.12] | 4: [8,9] |
Online Imaging Scheduling Method of Multiple Satellites for Ground Moving Target Observation
Yunwen XIONG1, Yi LI2, Caisheng WEI1
Flight Control & Detection | Navigation, Guidance and Control Technology, 2025, 8(5): 34-43
Affiliations
  • 1School of Automation, Central South University, Changsha 410083
  • 2China Satellite Network Application Co., Ltd., Beijing 100001
  • Yunwen XIONG, male, Ph.D. candidate.

Corresponding author: Caisheng WEI, male, Ph.D., professor, doctoral supervisor.
doi: 10.20249/j.cnki.2096-5974.2025.05.004

To address collaborative multi-satellite imaging planning when the position of a ground moving target is unobservable, this paper studies an online imaging scheduling method based on improved deep reinforcement learning. First, an imaging strip partitioning algorithm is designed on the basis of the satellite imaging coverage calculation. Second, using the visible time windows, strips, and satellite orbit information, a multi-satellite cooperative observation mission planning model is established as a partially observable Markov decision process (POMDP). Then, an online proximal policy optimization (PPO) reinforcement learning algorithm that combines an action mask with advantage normalization is proposed, which improves the convergence rate when solving scheduling problems in which only some strips cover the task area. Finally, three sets of simulations verify the correctness and superiority of the proposed algorithm for solving the problem online.

moving target  /  imaging satellite scheduling  /  partially observable Markov decision process (POMDP)  /  deep reinforcement learning
Yunwen XIONG, Yi LI, Caisheng WEI. Online Imaging Scheduling Method of Multiple Satellites for Ground Moving Target Observation[J]. Flight Control & Detection, 2025, 8(5): 34-43. DOI: 10.20249/j.cnki.2096-5974.2025.05.004
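With the rapid development of space remote sensing, imaging satellites have attracted wide attention for the large coverage of their imaging payloads, their long working time, and their freedom from airspace restrictions, and they have become an important means of meteorological disaster monitoring, environmental protection, and homeland security [1]. Imaging satellites fall into two main classes, optical and radar. Optical imaging satellites offer high imaging resolution but are sensitive to illumination and weather. Radar imaging satellites, by contrast, can carry out observation tasks under poor weather and illumination and therefore provide all-weather Earth observation [2], although their imaging resolution and stereoscopic viewing quality are comparatively poor. Coordinated scheduling of on-orbit optical, radar, and other types of imaging satellites is therefore key to improving the benefit of Earth observation.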
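Existing algorithms for cooperative imaging satellite scheduling can be grouped into exact search algorithms and approximate algorithms. For example, Ref. [3] used exhaustive search to design a mission planning scheme suited to a single satellite and a small number of observations. Ref. [4] used an iterative algorithm to group and rank observation requests and then applied complete search to obtain the best plan. However, exact search must exhaust the search space and cannot quickly find the optimal solution of large-scale satellite imaging planning problems. To overcome this limitation, approximate algorithms have received wide attention for their search efficiency. For imaging reconnaissance mission planning of ocean moving targets, Ref. [5] dynamically constructed the target's potential area and a motion prediction model from prior position information and designed an improved genetic algorithm based on simulated annealing, which effectively increased the solution speed. Ref. [6] addressed the encoding problem of genetic algorithms in agile satellite imaging scheduling by proposing a hybrid binary and real-number encoding and combining a quantum optimization mechanism with the genetic algorithm, which effectively improved search efficiency. Ref. [7] designed an optimal scheduling algorithm for agile satellite missions based on branch and bound with two pruning rules and derived an attitude planning algorithm for imaging during maneuvers, bringing imaging quality to the optimum while maximizing the number of observed targets. Ref. [8] considered the transition modes between adjacent targets in imaging satellite missions, built a mission planning model based on a dynamic topology graph under transition time, storage, and energy constraints, and proposed an improved dynamic path search algorithm, which improved the accuracy of the planning results.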
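In recent years, deep reinforcement learning has been widely applied to combinatorial optimization. Its strong generalization and fast solution speed offer new ideas and methods for imaging satellite mission planning [9]. Deep reinforcement learning can be described as an agent that, while interacting with the environment, learns through continual trial and exploration a policy that maximizes the cumulative return. On this basis, Ref. [10] proposed a general reinforcement learning solution to the agile satellite scheduling problem: it formulated a finite Markov decision process with a continuous state space and a discrete action space and used a deep Q-network to build an online value function from experience data, and the trained Q-network can effectively handle scheduling data from unseen agile satellites. For agile satellite imaging mission planning, Ref. [11] introduced a recurrent neural network and an attention mechanism into the network design, which raised the reward obtained by the policy to some extent. For satellite task scheduling with large numbers of tasks, Ref. [12] proposed a graph-theoretic minimum clique partition algorithm to cluster tasks in preprocessing and then used the deep deterministic policy gradient algorithm to solve a time-continuous satellite task scheduling problem. For multi-objective scheduling of agile satellites, Ref. [13] used a weighting method to decompose the multi-objective problem into subproblems, built a Markov decision process for each, and trained a solution to each subproblem with deep reinforcement learning. For satellite observation mission planning with complex constraints, a large solution space, and input task sequences of variable length, Ref. [14] improved the pointer network with a multi-head attention mechanism and trained it with an actor-critic deep reinforcement learning algorithm. However, for large-scale, complex imaging scheduling problems, existing reinforcement learning methods cannot find solutions that satisfy the physical constraints within an acceptable time, and they therefore struggle to meet the requirements of online, real-time multi-satellite scheduling.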
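Based on the above analysis, this paper studies an online imaging scheduling method based on improved deep reinforcement learning for multi-satellite imaging of a moving target under imaging strip partitioning and uncertain ground position. First, an imaging strip partitioning method is designed and the problem is formulated as a partially observable Markov decision process. Second, the action selection and update procedures of the deep reinforcement learning algorithm are improved. Finally, cases with different initial probabilities and transition probabilities verify the correctness and superiority of the proposed algorithm.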
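Consider a moving target maneuvering within the task area. First, all visible time windows $w_1, w_2, \ldots, w_n$ of the satellites over the task area within the mission time are computed. Second, the strip partitioning method gives all strips for each satellite pass and the grid numbers contained in each strip. Then, the multi-satellite imaging scheduling problem is modeled in the POMDP framework. Finally, at each time window the designed algorithm decides which strip the satellite should select so as to maximize the probability-based reward.
Based on the requirements and characteristics of moving-target imaging mission planning, the main constraints and simplifying assumptions are as follows:
1) In each observation, a satellite can select only one strip and may not switch to another strip.
2) Each satellite carries only one imaging payload.
3) During the mission time, the moving target maneuvers within the grids of the task area.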
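The satellite ground coverage angle $d$ and the satellite-to-ground central angle $\alpha$ can each be computed from the following formulas,
where $R_e$ is the Earth's radius and $h$ is the instantaneous altitude of the satellite.
Fig. 1 shows the geometric relationship between the coverage angle and the satellite imaging angle. In the right triangle $O_eZQ$, $\angle QO_eZ = \sigma$, which gives
where $\sigma$ is the observation angle. In the right triangle $O_eZS$, we have
Since the satellite imaging angle $\alpha_\sigma$ is known, the coverage angle $d_\sigma$ can be computed from Eqs. (3), (4), and (5).
The imaging angles corresponding to points $A$ and $B$ can be computed from the satellite side-view angle $\psi_t$ at any time $t$ and the satellite field-of-view angle $\phi$, that is,
As shown in Fig. 2, points $A$ and $B$ lie in the region to the upper left of the satellite orbit. Given the orbital inclination $i$ of satellite $S$ and the longitude and latitude $(\lambda_{S_t}, \varphi_{S_t})$ of the sub-satellite point at time $t$, the longitudes and latitudes of $A$ and $B$ can be computed from the following formulas.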
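Because the imaging swath of a satellite payload is limited and cannot fully cover a large task area, the satellite imaging area must be partitioned into strips. Based on the coverage formulas above, this paper designs an imaging strip partitioning algorithm.
Preprocessing: compute the start time $t_s$ and end time $t_e$ of the window in which the satellite passes over the task area, the longitude and latitude of the sub-satellite point at the start time and at the end time, and the orbital altitudes $h_s$ and $h_e$ at the start and end times.
Step 1: From the side-view angle range $[\psi_{\min}, \psi_{\max}]$ and the side-view angle step $\Delta\psi$, compute the discrete side-view angle sequence $[\psi_{\min} : \Delta\psi : \psi_{\max}]$.
Step 2: From the side-view angle sequence and the field-of-view angle, compute the angles of the two edges of each strip. For side-view angle $\psi_i$, the edges of the corresponding strip are at $\psi_i - \phi/2$ and $\psi_i + \phi/2$. Repeat until all strips have been processed.
Step 3: Substitute the two edge angles of a strip, the sub-satellite point longitude and latitude, and the orbital altitude into Eqs. (6) to (12) to obtain the longitude and latitude of the four vertices of the current strip. Repeat until all strips have been processed.
The algorithm above yields all imaging strips of a satellite passing over the task area in the time window $[t_s, t_e]$, denoted $A_s = \{b_1, b_2, \ldots\}$, as shown in Fig. 3.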
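Steps 1 and 2 of the strip partitioning algorithm can be sketched in a few lines of Python. This is only an illustration: the function name `strip_edge_angles` is hypothetical, and the 6° step used in the example is an assumed value, since the side-view angle step is not shown in the source table. Step 3 is omitted because it depends on Eqs. (6) to (12).

```python
def strip_edge_angles(psi_min, psi_max, d_psi, fov):
    """Return (centre, lower edge, upper edge) angles in degrees for each strip.

    Step 1: discretize the side-view angle range with step d_psi.
    Step 2: each strip spans centre - fov/2 to centre + fov/2.
    """
    if d_psi <= 0:
        raise ValueError("d_psi must be positive")
    n = int(round((psi_max - psi_min) / d_psi)) + 1
    strips = []
    for i in range(n):
        psi = psi_min + i * d_psi
        strips.append((psi, psi - fov / 2.0, psi + fov / 2.0))
    return strips


# Range and field of view follow Tab.2; the 6 degree step is an assumption.
strips = strip_edge_angles(-24.0, 24.0, 6.0, 12.0)
for centre, lower, upper in strips:
    print(f"centre {centre:+.1f} deg, edges [{lower:+.1f}, {upper:+.1f}] deg")
```

With these assumed values the range yields nine strips, and neighbouring strips overlap because the step is smaller than the field of view.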
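The Markov decision process (MDP) framework describes the interaction between an agent and its environment. In this work, however, the position of the moving target is not observable, so a POMDP is used to model the problem. A POMDP is the six-tuple $\{S, A, T, R, \Omega, O\}$: $S$, $A$, and $R$ are defined as in an MDP and denote the state space, the action space, and the reward; $T$ is the set of state transition functions, i.e., the conditional probability $T(s' \mid s, a)$ that the agent moves to state $s'$ after choosing action $a$ in state $s$; $\Omega$ is the finite set of partial observations available to the agent; and $O$ is the observation function, i.e., the conditional probability of receiving observation $o$ after choosing action $a$ and moving to state $s'$ [15]. The POMDP formulation of the moving-target imaging mission planning problem is given below.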
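(1) Conditional transition probability
In the multi-satellite imaging scheduling problem, the agent makes a decision at each visible time window $w_1, w_2, \ldots, w_n$. Transitions between states are deterministic and do not depend on the agent's current state or chosen action. The finite set $T$ of visible time windows in which the satellites pass over the task area during the mission time can therefore be expressed as $T = \{w_1, w_2, \ldots, w_n\}$.
(2) Action space
In a satellite network, each satellite covers the task area differently when passing over it: each pass has several strips, and the grids covered by each strip also differ. The strip set of a satellite passing over the task area in a visible time window is $A_s$.
When strip $b_i$ is selected, the set of grids it covers is defined as $\{g_1, g_2, \ldots\}$. The strips of all satellites form the action space $A$ of the problem, that is,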
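(3) Belief state
The belief state is the probability distribution over all states in the state space. In this problem, the task area of the moving target is divided into grids, and Bayesian probability is used to quantify the probability that the target is in each grid. The belief state is therefore designed as the discrete probability distribution of the moving target over the grids. The prior probability describes the distribution produced after the target has maneuvered.
The posterior probability is defined as the probability that the target is in each grid after it has been observed.
The conversion between the prior and posterior probabilities follows Ref. [16] and is not repeated here for reasons of space.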
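A common way to realize this kind of grid belief is a predict step followed by a Bayes update. The sketch below is an assumption, not the formulas of Ref. [16]: it propagates the belief with a row-stochastic transition matrix and updates it for the case where the imaged grids show no detection, ignoring false alarms. The names `predict` and `update_no_detection` and the 0-based grid indices are hypothetical.

```python
def predict(belief, transition):
    """Motion step: prior_j = sum_i belief_i * P[i][j] (P is row-stochastic)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update_no_detection(prior, covered, p_d):
    """Bayes update after the covered grids were imaged and nothing was found.

    Covered grids are down-weighted by the miss probability (1 - p_d);
    false alarms are ignored in this simplified sketch.
    """
    weighted = [q * (1.0 - p_d) if k in covered else q for k, q in enumerate(prior)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("belief vanished; check prior and p_d")
    return [w / total for w in weighted]


# Nine grids, initial belief from the simulation setup, identity motion model.
identity = [[1.0 if i == j else 0.0 for j in range(9)] for i in range(9)]
prior = predict([0.8, 0.1, 0.1, 0, 0, 0, 0, 0, 0], identity)
posterior = update_no_detection(prior, covered={1, 2}, p_d=0.9)
print([round(p, 4) for p in posterior])
```

Imaging grids 2 and 3 (indices 1 and 2) without a detection shifts almost all of the probability mass to grid 1, which matches the intuition that an empty look makes the unobserved grids more likely.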
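(4) State space
The state consists of the belief state above together with the satellite coverage information, which includes the longitude and latitude $\lambda_{sat_i}$, $\phi_{sat_i}$ of the satellite passing over the task area and the orbital inclination $\theta_{sat_i}$ of the satellite, so the state can be expressed as
Each visible time window corresponds to one state, and all states together form the state space of the problem.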
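(5) Reward function
The reward is defined as the sum of the observation gains of all grids in the grid set of strip $b_i$, multiplied by the observation duration of the satellite, that is,
Here, the grid set is the set of grids contained in the strip, the prior probability of grid $k$ is taken at time $t_n$, $p_d$ is the detection probability, and $w_s$ and $w_e$ are the start and end times of the satellite's visible time window.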
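As a worked illustration of this reward, the sketch below assumes that the observation gain of a grid is the detection probability times its prior probability and that the duration is measured in seconds. Both are assumptions, and the function name `strip_reward` is hypothetical. The numbers come from the first row of Tab.4.

```python
from datetime import datetime


def strip_reward(prior, covered_grids, p_d, w_start, w_end):
    """Sum of per-grid gains (assumed p_d * prior) times the window duration in seconds."""
    duration = (w_end - w_start).total_seconds()
    gain = sum(p_d * prior[g - 1] for g in covered_grids)  # grids are numbered from 1
    return gain * duration


prior = [0.8, 0.1, 0.1, 0, 0, 0, 0, 0, 0]
w_start = datetime(2022, 12, 12, 4, 19, 11)
w_end = datetime(2022, 12, 12, 4, 19, 57)

# Strip 6 of Sat16 covers grids 2 and 3: 0.9 * (0.1 + 0.1) * 46 s
reward = strip_reward(prior, [2, 3], 0.9, w_start, w_end)
print(round(reward, 2))
```

Under these assumptions the reward is about 8.28. A strip that covers no grid with prior mass earns zero, which is the situation the action mask discussed later is meant to exclude.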
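In the moving-target imaging mission planning problem, each satellite covers the task area differently when passing over it, so the coverage of its strips also differs. As shown in Fig. 4, with the strips numbered from top to bottom, only strips 8, 9, and 10 cover the task area in Fig. 4(a), and only strips 1, 2, and 3 cover it in Fig. 4(b). For different states, therefore, some actions yield no reward. The traditional remedy is reward shaping: when the agent selects an invalid action or an action outside the boundary, it receives a poor reward. Such a "soft constraint", however, cannot guarantee that the agent satisfies the constraint throughout training, and so cannot ensure that the policy finally learned is the optimal one that satisfies the constraint.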
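To solve this problem, an action mask mechanism is added to the PPO algorithm so that invalid actions are blocked directly at the output of the actor network. In PPO, the actor network $\pi_\theta$ uses the softmax function to convert the unnormalized probabilities of all actions, written here as $l_1, \ldots, l_n$, into the normalized action probabilities $\pi_\theta(a_1 \mid s), \ldots, \pi_\theta(a_n \mid s)$ [17], that is,
$$[\pi_\theta(a_1 \mid s), \ldots, \pi_\theta(a_n \mid s)] = \mathrm{softmax}([l_1, \ldots, l_n])$$
where the softmax function is
$$\mathrm{softmax}(l_i) = \frac{\exp(l_i)}{\sum_{j=1}^{n} \exp(l_j)}$$
As $l_i \to -\infty$,
$$\mathrm{softmax}(l_i) \to 0$$
Therefore, for an invalid action in state $s$, its unnormalized probability can be set to a very large negative number, so that its probability approaches 0 after the softmax mapping. To ensure that the loss function can still be differentiated once the action mask is added, the mask function must itself be differentiable. The mask function is
$$\mathrm{mask}(l_i) = \begin{cases} l_i, & a_i \text{ is valid in state } s \\ M, & \text{otherwise} \end{cases}$$
where $M$ is a large negative number. The resulting probability is
$$\pi'_\theta(\cdot \mid s) = \mathrm{softmax}(\mathrm{mask}([l_1, \ldots, l_n]))$$
The mask function is differentiable, so $\pi'_\theta$ is also differentiable and $\partial \pi'_\theta(\cdot \mid s)/\partial \theta$ exists.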
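The effect of the mask on the action distribution can be checked with a small pure-Python sketch. It is illustrative only: the constant `M = -1e8`, the function name `masked_softmax`, and the example logits are assumptions, and a real implementation would apply the same idea to the actor network's output tensor.

```python
import math

M = -1e8  # large negative constant standing in for M; the value is an assumption


def masked_softmax(logits, valid):
    """Replace the logits of invalid actions by M, then apply a stable softmax."""
    masked = [l if ok else M for l, ok in zip(logits, valid)]
    top = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Four actions, of which only the first two are valid in this state.
probs = masked_softmax([1.0, 2.0, 0.5, 3.0], [True, True, False, False])
print([round(p, 4) for p in probs])
```

The invalid actions receive probability zero even though one of them had the largest raw logit, and the valid actions are renormalized among themselves. If every action were masked, this sketch would fall back to a uniform distribution, so callers should make sure at least one action is valid.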
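Once the action mask is added, the output layer of the actor network needs no activation function: it outputs the unnormalized probabilities of all actions directly, and substituting them into Eqs. (21) and (22) gives the action probabilities. Moreover, because the strip coverage of each satellite differs, every state has its own mask function. The mask of each state must therefore be stored in the experience pool so that, during updates, the policy being trained can recover the correct action probabilities from the stored states. The experience pool of the improved algorithm thus consists of the seven-tuple $\{s_t, a_t, \ln p(s_t \mid a_t), r_t, v(s_t), done, mask\}$, where $done$ indicates whether the state is terminal and is used directly in training. The framework of the improved PPO algorithm is shown in Fig. 5.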
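In addition, to improve convergence efficiency, the advantages computed from the rewards during the update are normalized to remove the effect of their mean and standard deviation, that is,
$$\hat{A} = \frac{A - \mathrm{mean}(A)}{\mathrm{std}(A) + 10^{-5}}$$
where mean is the mean function and std is the standard deviation function; $10^{-5}$ is added to prevent division by zero.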
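The normalization can be illustrated with the Python standard library. This is a minimal sketch: the function name `normalize_advantages` is hypothetical, and the population standard deviation is assumed because the text does not say whether the sample or population form is used.

```python
from statistics import fmean, pstdev


def normalize_advantages(adv, eps=1e-5):
    """Shift a batch of advantages to zero mean and scale it by (std + eps)."""
    mu = fmean(adv)
    sigma = pstdev(adv)  # population standard deviation (assumed)
    return [(a - mu) / (sigma + eps) for a in adv]


batch = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_advantages(batch)
constant = normalize_advantages([5.0, 5.0, 5.0])  # std is 0, eps avoids division by zero
print([round(a, 3) for a in normalized], constant)
```

The second call shows why the small constant matters: a batch with identical advantages has zero standard deviation, and the added term keeps the result finite.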
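To verify the correctness and superiority of the proposed method for satellite imaging scheduling, the satellite visible time windows, the imaging payload parameters, and the parameters of the improved PPO algorithm are specified first. Different PPO variants are then compared in simulation under the same parameter settings, and simulation results are given for cases with different initial position probabilities and target transition probabilities.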
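(1) Mission time, task area, and grid settings
In the simulation, the mission time is set to UTC [2022/12/12 04:00:00, 2022/12/13 04:00:00], the task area is the rectangle with corners {(109°, 30°), (110.5°, 28.5°)}, and the task area is divided into 9 grids with a grid size of 0.5°.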
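The grid layout can be made concrete with a small lookup function. The corner coordinates and the 0.5° grid size come from the setup above. The numbering order (row by row, starting from the north-west corner, grids 1 to 9) and the name `grid_index` are assumptions, since the source does not show how the grids are numbered.

```python
def grid_index(lon, lat, lon_min=109.0, lat_max=30.0, size=0.5, n_cols=3, n_rows=3):
    """Return the 1-based grid number containing (lon, lat), or None if outside.

    Assumed numbering: row-major, starting from the north-west corner.
    """
    col = int((lon - lon_min) // size)
    row = int((lat_max - lat) // size)
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        return None
    return row * n_cols + col + 1


print(grid_index(109.2, 29.8))    # north-west grid
print(grid_index(109.75, 29.25))  # centre grid
print(grid_index(111.0, 29.0))    # outside the task area
```

Any other numbering convention only changes the final line of the function, and the rest of the scheduling model is unaffected as long as strips and beliefs use the same convention.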
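(2) Visible time windows
Using part of the designed satellite network and the STK software, the visible time windows listed in Table 1 are obtained.
(3) Imaging payload parameter settings
The side-view angle range, side-view angle step, field-of-view angle, and other parameters of the satellite payload are set as listed in Table 2.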
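(4) Initial probability and transition probability of the moving target
The task area is divided into 9 grids, and the initial probability of the moving target is set to
$Q_{init} = [0.8, 0.1, 0.1, 0, 0, 0, 0, 0, 0]$ (24)
The moving target is updated every 2 h, and the transition probability is set as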
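(5) Network structure and hyperparameter settings
The network structures and hyperparameters of the actor and critic networks of the improved PPO algorithm are listed in Table 3. The two hidden layers of both networks have 64 and 8 neurons, the output layers use the Linear activation function, and all other layers use the Tanh activation function. The learning rate is 0.0003 for the actor network and 0.001 for the critic network, and the discount factor is 0.99. For this case, the maximum number of iteration steps is set to 3000.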
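The average reward per episode after training is shown in Fig. 6, where each curve is the mean reward over 3 training runs and the shaded band is the range of the average reward over those runs. The red curve is the algorithm proposed in this paper (PPO+mask+advNor), i.e., the improved PPO algorithm combining the action mask and advantage normalization; the green curve is standard PPO with advantage normalization (PPO+advNor); and the blue curve is standard PPO (PPO). The proposed algorithm converges within 3000 steps, which indicates that the action mask effectively constrains invalid actions and lets the agent quickly learn the optimal policy that satisfies the constraints. PPO with advantage normalization also converges better than standard PPO: before step 500 the two algorithms obtain similar average rewards, but after step 500 PPO with advantage normalization shows a slow convergence trend.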
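The trained policy is then tested. The selected strips are listed in Table 4, and the result is visualized in Fig. 7. The results show that the trained policy selects the best strip for observing the moving target and maximizes the reward function. Early in the mission horizon, the agent tends to select strips containing grids 1, 2, 3, and 4; in the middle of the horizon, it tends to select strips containing grids 5 and 6; and late in the horizon, it selects strips containing grids 7, 8, and 9, which broadly agrees with the path implied by the transition probabilities. The optimal observation sequence is shown in Fig. 7, where segments of different colors represent observations by different satellite passes, and the set of segments of one color represents the grids contained in the strip selected for that pass. Note that at 2022/12/12 10:20:42 and 2022/12/12 21:17:26, the passing satellites Sat3 and Sat7 each have only one strip covering the task area, so these two satellites can only select strip 9.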
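The moving-target imaging mission planning problem is also simulated under different initial probabilities and transition probabilities. The average rewards are shown in Fig. 8. The blue curve is the average training reward when the probabilities are relatively concentrated; in this case the target is relatively easy to find, so the average reward is the largest. The green curve is the average training reward when the probabilities are relatively dispersed; in this case the target is hard to find, so the average reward is smaller. The red curve corresponds to a moderate probability setting. The policies under all three probability conditions converge within 3000 steps, and the designed algorithm correctly solves the moving-target imaging mission planning problem.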
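To solve the imaging mission planning problem online under different initial position probabilities and target transition probabilities, the first and second layers of the actor and critic networks are set to 400 and 200 neurons respectively, and the initial position probability and transition probability are randomly initialized at the start of each training episode. The average reward after 15000 training episodes is shown in Fig. 9.
The trained policy is tested online 50 times, with the initial probability and transition probability randomly initialized in each test. The average reward is shown in Fig. 10. In every online planning run, the policy selects the correct observation strips and achieves the best average reward.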
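For the multi-satellite cooperative observation scheduling problem with imaging strip partitioning and an unobservable moving-target position, an imaging mission planning method based on an improved PPO algorithm is proposed. Numerical simulations of the imaging scheduling problem under different initial probabilities and transition probabilities show that the improved PPO algorithm converges within 3000 steps, whereas standard PPO shows no convergence trend within 3000 steps. This indicates that the action mask effectively prunes invalid actions and lets the agent quickly learn the optimal policy that satisfies the constraints. The improved PPO algorithm can therefore correctly solve the moving-target imaging mission planning problem online.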
References
[1] The General Equipment Department. Current status and development of satellite applications[M]. Beijing: Science and Technology of China Press, 2001 (in Chinese).
[2] WANG Y G, LIU Y W. Introduction to military satellites and applications[M]. Beijing: National Defense Industry Press, 2003 (in Chinese).
[3] VERFAILLIE G, SCHIEX T. Solution reuse in dynamic constraint satisfaction problems[C]//Proceedings of the Twelfth AAAI National Conference on Artificial Intelligence. Seattle: AAAI, 1994, 94: 307-312.
[4] CORDEAU J F, LAPORTE G. Maximizing the value of an Earth observation satellite orbit[J]. Journal of the Operational Research Society, 2005, 56(8): 962-968.
[5] RAN C X, WANG H L, XIONG G Y, et al. Research on mission-planing of ocean moving targets imaging reconnaissance based on improved genetic algorithm[J]. Journal of Astronautics, 2010, 31(2): 457-465 (in Chinese).
[6] WANG H J, HE H, YANG Z. Scheduling of agile satellites based on an improved quantum genetic algorithm[J]. Journal of Astronautics, 2018, 39(11): 1266-1274 (in Chinese).
[7] CHEN X Z, XIE S, CAI X, et al. Algorithms of autonomous mission planning for agile satellite active pushbroom imaging[J]. Journal of Astronautics, 2023, 44(11): 1693-1705 (in Chinese).
[8] WANG J J, QIU D S, HE C, et al. Scheduling of imaging satellite with different transition modes between adjacent targets[J]. Journal of Astronautics, 2012, 33(12): 1806-1814 (in Chinese).
[9] LI K W, ZHANG T, WANG R, et al. Research reviews of combinatorial optimization methods based on deep reinforcement learning[J]. Acta Automatica Sinica, 2021, 47(11): 2521-2537 (in Chinese).
[10] HE Y, XING L, CHEN Y, et al. A generic Markov decision process model and reinforcement learning method for scheduling agile earth observation satellites[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, 52(3): 1463-1474.
[11] CHEN M, CHEN Y, CHEN Y, et al. Deep reinforcement learning for agile satellite scheduling problem[C]//2019 IEEE Symposium Series on Computational Intelligence (SSCI). Xiamen: IEEE, 2019: 126-132.
[12] HUANG Y, MU Z, WU S, et al. Revising the observation satellite scheduling problem based on deep reinforcement learning[J]. Remote Sensing, 2021, 13(12): 2377.
[13] WEI L, CHEN Y, CHEN M, et al. Deep reinforcement learning and parameter transfer based approach for the multi-objective agile earth observation satellite scheduling problem[J]. Applied Soft Computing, 2021, 110: 107607.
[14] MA Y F, ZHAO F Y, WANG X, et al. Satellite earth observation task planning method based on improved pointer networks[J]. Journal of Zhejiang University (Engineering Science), 2021, 55(2): 395-401 (in Chinese).
[15] KAELBLING L P, LITTMAN M L, CASSANDRA A R. Planning and acting in partially observable stochastic domains[J]. Artificial Intelligence, 1998, 101(1-2): 99-134.
[16] CI Y Z, HE R J, XU Y F, et al. Method of target motion prediction for moving target search by satellite[J]. Control and Decision, 2009, 24(7): 1007-1012 (in Chinese).
[17] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017. Available at: https://arxiv.org/abs/1707.06347.
  • First published online: 2026-04-23