Research on depth estimation and portrait segmentation based on diffusion models

Zongbo DONG, Yifan WANG, Lijun WANG^*, Huchuan LU

Science & Technology Review | 2025, 43(22): 98-107

Affiliations

School of Future Technology, Dalian University of Technology, Dalian 116024, China

Published: 2025-11-28 | DOI: 10.3981/j.issn.1000-7857.2025.05.00059

Abstract

Diffusion models have demonstrated remarkable capabilities in generative tasks, but their application to visual perception tasks such as depth estimation and portrait segmentation remains underexplored. This paper proposes Diffusion Perception, a unified diffusion-based framework for high-quality depth estimation and portrait segmentation. By reformulating traditional perception tasks as conditional generation problems, the framework uses the denoising behavior of latent diffusion models (LDMs) to refine predictions in latent space. The design comprises three core processing stages: multimodal feature encoding, noise input prediction, and text-controlled feature extraction and reconstruction, which together move diffusion models from a generative paradigm to a visual perception paradigm. On our custom depth estimation dataset, the proposed method achieves 93.98% Relative Accuracy (RR), 99.61% Plane Estimation Accuracy (Plane), and 93.61% Scene Consistency (Consistence), outperforming existing state-of-the-art depth estimation methods. In portrait segmentation, it achieves an Intersection over Union (IoU) of 96.98% and a mean IoU (mIoU) of 91.98%, surpassing existing segmentation algorithms. The study offers new insight into applying diffusion models to visual perception: their generative paradigm naturally handles prediction uncertainty and is well suited to robust perception in dynamic environments.
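For readers unfamiliar with the segmentation metrics quoted in the abstract, the following is a minimal sketch of how IoU and mIoU are commonly computed for label masks. It assumes the standard definitions (intersection divided by union, averaged over classes for mIoU); the abstract does not state the exact evaluation protocol, so the function names `iou` and `mean_iou`, the two-class setup, and the toy masks are illustrative assumptions rather than the authors' code.

```python
import numpy as np


def iou(pred, target):
    """Intersection over Union of two boolean masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(pred, target, num_classes=2):
    """Mean of per-class IoU over classes present in either label map."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    scores = []
    for c in range(num_classes):
        if not ((pred == c).any() or (target == c).any()):
            continue  # skip classes absent from both maps
        scores.append(iou(pred == c, target == c))
    return float(np.mean(scores))


# Toy 2x4 label maps: 1 = portrait (foreground), 0 = background.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
target = np.array([[1, 1, 1, 0],
                   [1, 0, 0, 0]])

foreground_iou = iou(pred == 1, target == 1)
miou = mean_iou(pred, target)
print(foreground_iou, miou)
```

In this toy case the foreground masks share 3 pixels out of a union of 5, and the same holds for the background, so both scores are 0.6.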

Key words

diffusion models / depth estimation / portrait segmentation / fully convolutional networks / deep learning

Cite this Article

Zongbo DONG, Yifan WANG, Lijun WANG, Huchuan LU. Research on depth estimation and portrait segmentation based on diffusion models[J]. Science & Technology Review, 2025, 43(22): 98-107. DOI: 10.3981/j.issn.1000-7857.2025.05.00059

Article Info

Received: 2025-05-12
Online: 2025-12-29
Published: 2025-11-28

Revised: 2025-11-03

https://castjournals.cast.org.cn/joweb/kjdb/EN/10.3981/j.issn.1000-7857.2025.05.00059

Copyright © 2025 China Association for Science and Technology. All rights reserved. For all open access content, the relevant licensing terms apply.
Sponsored by the Office of the Leading Group for Cybersecurity and Informatization of CAST, and supported by Science and Technology Review Publishing House