Chinese multimodal irony detection method based on image knowledge enhancement
LI Yueying, CAO Hui, ZHANG Jisai, XIA Xiaotian (Key Laboratory of Linguistic and Cultural Computing, Ministry of Education, Institute of Chinese Ethnic Information Technology, Northwest Minzu University, Lanzhou 730030, China)
Abstract: With the rapid development of social media and online content, irony has become common in online communication and information dissemination. However, traditional text analysis methods often fail to capture the meaning of irony accurately, and relying solely on textual information is limiting and unstable. In this paper, a Chinese multimodal irony dataset is constructed. The dataset contains 5964 annotated samples covering two modalities, text and image. Images play an important role in multimodal irony detection tasks. To fully explore the hidden information in images, the image captioning model ViT-GPT-image-captioning is used to generate a textual description of each image for image knowledge enhancement, thereby improving the understanding of the image. Moreover, a multimodal attention network model, CMANet, which integrates modal information for irony detection, is proposed to address the insufficient information correlation between modalities and the lack of data in multimodal data fusion. Experimental verification was performed on the dataset. The results show that, compared with the baseline model, the proposed CMANet model improves the F1-score by 1.49% and the accuracy by 1.89%.
Keywords: multimodal; irony detection; attention mechanism; cross-modal; deep learning; network fusion
0 Introduction
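Irony is a special technique of emotional expression that makes the speaker's true intent difficult to grasp directly, achieving a tactful and implicit form of expression.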