Against the backdrop of the dual carbon goals, the regional integrated energy system (RIES) can convert between heterogeneous energy sources thanks to its multi-energy coupling, providing new technical support for the energy-saving and efficient operation of modern energy systems. Because heterogeneous energy sources flow differently in transmission pipelines, existing research usually adopts convex relaxation or linearization to model and solve the RIES over multiple time scales, and relies on high-precision source-load forecasts and mathematical equipment models to improve the reliability of scheduling decisions. However, the increasingly complex internal energy coupling structure of the RIES makes refined mathematical modeling and solution more difficult, which challenges real-time scheduling decisions and the safe, optimal operation of the RIES. Therefore, this paper proposes an improved distributed bi-layer proximal policy optimization (DBLPPO) deep reinforcement learning scheduling model. The model achieves multi-time-scale optimal management of the energy networks in the RIES and avoids the optimization difficulties that non-convex, nonlinear model structures cause in scheduling solutions.
First, the output, storage, and conversion of energy within the RIES are formulated as a Markov decision process in a high-dimensional space. Second, based on the improved distributed proximal policy optimization (DPPO) algorithm, this process is described as a sequential decision problem, and an internal bi-layer proximal policy optimization (PPO) control model is constructed. The local network adopts a "coupling first, then decoupling" approach to make multi-time-scale optimization decisions for the cold-heat system and the power system. In the early stage of the long time scale, the inner and outer models are solved in coupled form, so that the RIES cold-heat system and power system operate in a coordinated, optimal manner. In the remaining short time scales, the inner and outer models are solved in decoupled form and carry out short-term flexible regulation of the power system. The inner and outer models interact with each other and converge, with fluctuations, toward maximum reward, eventually achieving multi-time-scale optimal scheduling of the RIES cold-heat system and power system.
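The "coupling first, then decoupling" control flow can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `run_bilayer_episode`, `slow_policy`, `fast_policy`, and `step_env`, the dictionary-based states and actions, and the toy cost are all hypothetical, and the two policies are plain functions rather than trained PPO networks. The sketch also does not assume which of the paper's inner and outer models corresponds to which policy; it only shows that the cold-heat decision is refreshed once per long interval while the power decision is refreshed at every short interval.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Action = Dict[str, float]


def run_bilayer_episode(
    n_long: int,
    n_short: int,
    slow_policy: Callable[[State], Action],
    fast_policy: Callable[[State, Action], Action],
    step_env: Callable[[State, Action, Action], Tuple[State, float]],
    state: State,
) -> Tuple[float, List[str]]:
    """Roll out one episode; return the total reward and a per-step mode log."""
    total_reward = 0.0
    modes: List[str] = []
    slow_action: Action = {}
    for _ in range(n_long):
        for k in range(n_short):
            if k == 0:
                # First short step of a long interval: coupled solve.
                # The cold-heat decision is refreshed together with power.
                slow_action = slow_policy(state)
                modes.append("coupled")
            else:
                # Remaining short steps: decoupled solve. The cold-heat
                # decision stays frozen and only power is re-dispatched.
                modes.append("decoupled")
            fast_action = fast_policy(state, slow_action)
            state, reward = step_env(state, slow_action, fast_action)
            total_reward += reward
    return total_reward, modes


# Toy stand-ins (hypothetical): both policies simply track the current load.
def slow_policy(s: State) -> Action:
    return {"heat_setpoint": s["heat_load"]}


def fast_policy(s: State, a_slow: Action) -> Action:
    return {"power": s["power_load"]}


def step_env(s: State, a_slow: Action, a_fast: Action) -> Tuple[State, float]:
    # Reward is the negative tracking error; the loads are held constant.
    cost = abs(a_slow["heat_setpoint"] - s["heat_load"])
    cost += abs(a_fast["power"] - s["power_load"])
    return s, -cost


total, modes = run_bilayer_episode(
    n_long=2,
    n_short=4,
    slow_policy=slow_policy,
    fast_policy=fast_policy,
    step_env=step_env,
    state={"heat_load": 1.0, "power_load": 2.0},
)
print(modes)
```

With two long intervals of four short steps each, the mode log contains one coupled step followed by three decoupled steps per long interval. In a trained system, each policy call would be replaced by a forward pass of the corresponding PPO actor.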
This paper conducts simulation experiments with a cold-heat-electric RIES as the scheduling scenario, and compares the scheduling results of the DBLPPO scheduling model with those of single-time-scale scheduling models (PPO and DPPO). The results show that the DBLPPO scheduling model can flexibly regulate the system's adjustable resources in the short time scale, meet the short-time-scale power fluctuation requirements of the electricity, heat, and cold loads, and achieve the lowest comprehensive operating cost, which is 24.47% lower than that of the DPPO scheduling model and 28.54% lower than that of the PPO scheduling model. In addition, simulation experiments are conducted with the DBLPPO scheduling model and the bi-layer PPO scheduling model in the same scenario. The results show that the distributed structure of the DBLPPO scheduling model retains a significant advantage in training efficiency: its training time is 10.01% shorter than that of the bi-layer PPO scheduling model.
The case analysis verifies that the proposed scheduling model can achieve coordinated optimal management of the energy networks in the RIES at different time scales and accelerate the decision-making of the multi-time-scale scheduling model. By virtue of the fast adaptability of the deep reinforcement learning algorithm, it efficiently solves stochastic optimization problems in complex RIES scenarios and improves the economic benefits of system operation. Future work will improve the model to enhance the environment perception ability of the inner model, so that its decisions are always optimal over the long time scale.
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |