Cross Modal Learning Method of Speech Face via Single Stream Network
Fang-hao ZHONG, Fan-liang BU*, Hao-ming QIN
Science Technology and Engineering | 2025, 25(11): 4638-4646
Papers·Automation and Computational Technology
Affiliations
  • School of Information Network Security, People's Public Security University of China, Beijing 100038, China
Published: 2025-04-18 doi: 10.12404/j.issn.1671-1815.2403487

Existing methods for audio-visual cross-modal association learning often adopt a dual-stream network structure, but they still face challenges in reducing computational complexity, making the model lightweight, and fusing features efficiently. To improve model performance and the efficiency of cross-modal learning, a single-stream network-based approach to audio-visual cross-modal learning is proposed. First, preprocessed data from both modalities are fed into a single-stream feature extraction network, where a class-information-based loss function is used to learn and extract feature vectors from both modalities. Next, attention-based feature fusion is applied to the extracted feature vectors. Finally, cosine similarity is combined with a cross-entropy loss to learn the association between the two modalities, completing the cross-modal association learning task. Experimental results show that the proposed method performs well on audio-visual cross-modal verification, matching, and retrieval tasks while keeping the network structure lightweight and flexible.
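The final step described above, scoring candidates by cosine similarity and training with cross-entropy, can be illustrated with a minimal sketch. The abstract does not give the exact loss formulation, so everything here is an assumption made for illustration: the function names, the `scale` factor applied to the similarities, and the random toy embeddings are hypothetical, and the single-stream extractor and attention-based fusion are omitted.

```python
import math
import random


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def matching_cross_entropy(voice, faces, target, scale=10.0):
    """Cross-entropy over scaled cosine similarities (assumed formulation).

    voice:  one voice embedding
    faces:  list of candidate face embeddings
    target: index of the face belonging to the same identity
    scale:  hypothetical temperature applied to the similarities
    """
    logits = [scale * cosine_similarity(voice, f) for f in faces]
    # Numerically stable log-sum-exp for the softmax denominator.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


# Toy data: four random "face" embeddings and a voice embedding that is a
# noisy copy of face 2, standing in for a well-aligned voice-face pair.
random.seed(0)
faces = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]
voice = [x + 0.1 * random.gauss(0, 1) for x in faces[2]]

loss_correct = matching_cross_entropy(voice, faces, target=2)
loss_wrong = matching_cross_entropy(voice, faces, target=0)
```

In this toy setup the loss is lower when the target is the face that the voice embedding was derived from, which is the behaviour such a loss is meant to reward during training.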

association learning  /  voice-face cross-modal  /  single-stream network  /  feature fusion
Fang-hao ZHONG, Fan-liang BU, Hao-ming QIN. Cross Modal Learning Method of Speech Face via Single Stream Network[J]. Science Technology and Engineering, 2025, 25(11): 4638-4646. DOI: 10.12404/j.issn.1671-1815.2403487
Article Info
  • Received: 2024-05-11
  • Revised: 2024-08-01
  • Published: 2025-04-18
  • Online: 2025-07-09
https://castjournals.cast.org.cn/joweb/kxjsygc/EN/10.12404/j.issn.1671-1815.2403487