
基于信息互補(bǔ)與交叉注意力的跨模態(tài)檢索方法




關(guān)鍵詞:信息互補(bǔ);交叉注意力;圖卷積網(wǎng)絡(luò);跨模態(tài)檢索

中圖分類號(hào):TP391 文獻(xiàn)標(biāo)志碼:A 文章編號(hào):1001-3695(2025)07-015-2032-07

doi:10.19734/j.issn.1001-3695.2025.01.0003

Abstract: With the rapid growth of multimodal data on the Internet, cross-modal retrieval technology has attracted widespread attention. However, some multimodal data often lack semantic information, which prevents models from accurately extracting the inherent semantic features. Additionally, some multimodal data contain redundant information unrelated to semantics, which interferes with the model's extraction of key information. To address this, this paper proposed a cross-modal retrieval method based on information complementarity and cross-attention (ICCA). The method used a GCN to model the relationships between multi-labels and data, supplementing the missing semantic information in multimodal data and the missing sample detail information in multi-labels. Moreover, a cross-attention submodule used multi-label information to filter out redundant, semantic-irrelevant data. To achieve better matching of semantically similar images and texts in the common representation space, this paper proposed a semantic matching loss. This loss integrated multi-label embeddings into the image-text matching process, further enhancing the semantic quality of the common representation. Experimental results on three widely used datasets, NUS-WIDE, MIRFlickr-25K, and MS-COCO, demonstrate that ICCA achieves mAP values of 0.808, 0.859, and 0.837, respectively, significantly outperforming existing methods.
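To make the components named in the abstract more concrete, the following is a minimal sketch in PyTorch of (1) a graph-convolution step over a label graph, (2) a cross-attention block in which multi-label embeddings act as queries over image or text tokens to retain label-relevant content, and (3) a triplet-style semantic matching loss that also pulls representations toward their label embedding. All layer sizes, pooling choices, and the exact loss form are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelGCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W) over the label graph."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, label_feats, adj_norm):
        # label_feats: (num_labels, in_dim); adj_norm: normalized adjacency (num_labels, num_labels)
        return F.relu(adj_norm @ self.weight(label_feats))


class LabelGuidedCrossAttention(nn.Module):
    """Label embeddings query modality tokens so label-irrelevant content is down-weighted."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, label_emb, modality_tokens):
        # label_emb: (batch, num_labels, dim) used as queries
        # modality_tokens: (batch, num_tokens, dim), e.g. image regions or text words
        attended, _ = self.attn(label_emb, modality_tokens, modality_tokens)
        # Pool the label-conditioned features into one common representation vector.
        return attended.mean(dim=1)  # (batch, dim)


def semantic_matching_loss(img_repr, txt_repr, label_repr, margin=0.2):
    """Illustrative loss: matched image/text pairs should be closer than mismatched
    ones, and both should stay close to their pooled label embedding."""
    img = F.normalize(img_repr, dim=-1)
    txt = F.normalize(txt_repr, dim=-1)
    lab = F.normalize(label_repr, dim=-1)
    pos_it = (img * txt).sum(-1)                     # matched image-text similarity
    pos_il = (img * lab).sum(-1)                     # image-label similarity
    pos_tl = (txt * lab).sum(-1)                     # text-label similarity
    neg_it = (img * txt.roll(1, dims=0)).sum(-1)     # mismatched pairs from the same batch
    return (F.relu(margin + neg_it - pos_it).mean()
            + F.relu(margin - pos_il).mean()
            + F.relu(margin - pos_tl).mean())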

Key words: information complementarity; cross-attention; graph convolutional network (GCN); cross-modal retrieval

0 Introduction

近年來,隨著互聯(lián)網(wǎng)技術(shù)的飛速發(fā)展,視頻、圖像、文本等多媒體數(shù)據(jù)呈現(xiàn)出急劇增長的趨勢。(剩余18541字)
