
A Multi-Prototype Driven Graph Neural Network Method for Speaker Diarization


Multi-prototype driven graph neural network for speaker diarization

Abstract: Recently, the utilization of graph neural networks for session-level modeling has demonstrated its efficacy for speaker diarization. However, most existing variants rely solely on local structure information and ignore the importance of global speaker information, so they cannot fully compensate for the lack of speaker information in the speaker diarization task. This paper proposes a multi-prototype driven graph neural network (MPGNN) for representation learning, which effectively combines local and global speaker information within each session and simultaneously remaps x-vectors to a new embedding space that is more suitable for clustering. Specifically, the design of prototype learning with a dynamic and adaptive approach is a critical component, through which more accurate global speaker information can be captured. Experimental results show that the proposed MPGNN approach significantly outperforms the baseline systems, achieving diarization error rates (DER) of 3.33%, 3.52%, 5.66%, and 6.52% on the AMI_SDM and CALLHOME datasets, respectively.

Keywords: speaker diarization; graph neural network; local structure information; global speaker information; multi-prototype learning

0 Introduction

The goal of speaker diarization (SD) is to answer the question of "who spoke when": given a long audio signal in which multiple speakers interact, it must simultaneously identify the speakers and locate their speech in time.
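The abstract reports results as diarization error rate (DER), which is commonly defined as the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time. The following is a minimal frame-level sketch of that definition, not the scoring procedure used in the paper. It assumes one speaker label per frame, `None` for non-speech, and hypothesis labels that have already been mapped to the reference speakers; the function name `frame_level_der` is hypothetical.

```python
from typing import Optional, Sequence


def frame_level_der(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> float:
    """Frame-level DER sketch.

    Assumptions (not taken from the paper): one speaker per frame,
    None marks non-speech, and hypothesis labels are already mapped
    to reference speaker names.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have equal length")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1              # frame counts toward reference speech time
            if hyp is None:
                missed += 1          # speech in reference, none in hypothesis
            elif hyp != ref:
                confusion += 1       # speech attributed to the wrong speaker
        elif hyp is not None:
            false_alarm += 1         # hypothesis speech where reference is silent

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Toy example: 8 reference speech frames, with one confusion,
# one false alarm, and one miss.
ref = ["A", "A", "B", "B", None, None, "A", "A", "B", "B"]
hyp = ["A", "A", "B", "A", None, "B", "A", None, "B", "B"]
print(frame_level_der(ref, hyp))  # 0.375
```

Standard scoring tools additionally search for the best mapping between hypothesis and reference speakers and may apply a forgiveness collar around segment boundaries; both steps are omitted here for brevity.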
