An Emotion Recognition Model Based on Deep Feature Interaction and Hierarchical Multimodal Fusion

Keywords: multimodal emotion recognition; hierarchical fusion; multi-scale fusion; feature fusion
CLC number: TP391 Document code: A Article ID: 1001-3695(2025)07-008-1978-08
doi: 10.19734/j.issn.1001-3695.2024.11.0487
Abstract: Multimodal emotion recognition has recently become an important research direction in affective computing, aiming to more accurately recognize and understand human emotional states by integrating various modalities such as speech and text. However, existing methods lack the processing of inter-modal correlations during feature extraction and overlook multi-scale emotional cues during feature fusion. To address these issues, this study proposed a deep feature interaction and hierarchical multimodal fusion emotion recognition model (DFIHMF). In the feature extraction stage, the model enhanced interactions between different modalities and extracted multi-scale information by introducing local knowledge tokens (LKT) and cross-modal interaction tokens (CIT). In the feature fusion stage, the model integrated complex multimodal features and multi-scale emotional cues using a hierarchical fusion strategy. Experimental results on the MOSI and MOSEI datasets show that the model achieves accuracy rates of 45.6% and 53.5%, respectively, on the ACC7 evaluation metric, demonstrating that the proposed method outperforms existing technologies in multimodal emotion recognition tasks.
Key words: multimodal emotion recognition; hierarchical fusion; multi-scale fusion; feature fusion
0 Introduction
Emotion recognition is a core task in natural language processing (NLP). Its goal is to analyze and process input text in order to estimate the subject's emotional state.