Unsupervised Video Summarization Model Based on the Multi-head Concentrated Attention Mechanism

CLC number: TP391    Document code: A
Unsupervised Video Summarization Model Based on Multi-head Concentration Mechanism
LI Yujie a,b, JIA Honn, LING Li a, ZHOU Wenkai a, JIANG Zheng a, DING Shuxue a,b, TAN Benying a,b (a. School of Artificial Intelligence; b. Key Laboratory of Artificial Intelligence Algorithm Engineering of Guangxi Universities, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China)
Abstract: To address the limitations of existing video summarization methods in establishing long-range frame dependencies and in parallelized training, a novel unsupervised video summarization model based on the multi-head concentrated attention mechanism (MH-CASUM) was proposed. The multi-head attention mechanism was integrated into the concentrated attention model, the length regularization loss function was improved, and the loss threshold for model parameter selection was optimized. The uniqueness and diversity of video frames were leveraged to enrich the summary information, so that the video summarization task was accomplished more efficiently. The performance of the MH-CASUM model was validated through evaluation experiments on the SumMe and TVSum datasets using the F1 score, the Kendall correlation coefficient, and the Spearman correlation coefficient. The results show that the introduction of the multi-head attention mechanism and the improved method for the loss threshold in model parameter selection significantly enhance the video summarization performance of the MH-CASUM model. Compared to the previously best-performing unsupervised video summarization model CASUM, the F1 score of MH-CASUM on the TVSum dataset is increased by 0.98%, which demonstrates its superiority and competitiveness in the video summarization task.
Keywords: video summarization; attention mechanism; multi-head concentrated attention; unsupervised approach
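
The three evaluation measures named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' evaluation code: the function names, the binary frame-selection input for F1, and the choice of the tau-a variant of Kendall's coefficient are assumptions made here for illustration only, since the abstract does not state which variants or protocol were used.

```python
from math import sqrt

def f1_score(pred, truth):
    """F1 between two binary frame-selection vectors (1 = frame in summary)."""
    if len(pred) != len(truth):
        raise ValueError("vectors must have equal length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    if overlap == 0:
        return 0.0
    precision = overlap / sum(1 for p in pred if p)
    recall = overlap / sum(1 for t in truth if t)
    return 2 * precision * recall / (precision + recall)

def _ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def _pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return _pearson(_ranks(x), _ranks(y))

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs.
    Tied pairs contribute zero (an assumption of this sketch)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Hypothetical predicted importance scores vs. hypothetical human scores.
predicted = [0.1, 0.4, 0.35, 0.8]
annotated = [1, 3, 2, 4]
print(spearman_rho(predicted, annotated), kendall_tau(predicted, annotated))
```

F1 compares a generated summary with a reference summary as sets of selected frames, while the two rank correlations compare predicted frame-importance scores with human-annotated scores directly, without first converting them into a summary.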
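With the rapid development of the Internet and information technology, the widespread application of multimedia technology has brought great convenience to people's lives; at the same time, the "information explosion" of video has also caused considerable inconvenience [1].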