Audio Forgery Detection Method Based on Feature Fusion


CLC number: TN912.3 Document code: A Article ID: 1001-3695(2025)07-025-2109-07

doi: 10.19734/j.issn.1001-3695.2024.11.0460

Abstract: Advancements in artificial intelligence have made distinguishing synthesized speech from genuine speech increasingly challenging, complicating audio deepfake detection. Existing methods often exhibit low accuracy, poor generalization, and weak robustness. This study proposed MFF-STViT, a method integrating three audio features with vocoder artifact features through a novel feature fusion module to enhance representation. The fused features were processed using an improved Transformer model, STViT, to reduce redundancy and improve detection performance. On the ASVspoof 2019 LA test set, the method reduced the equal error rate (EER) by 71.38% on average. On the ASVspoof 2021 LA dataset, it achieved average reductions of 44.41% in EER and 18.11% in the minimum tandem detection cost function (min-tDCF). For the ASVspoof 2021 DF dataset, the average EER decreased by 57.81%, with reductions exceeding 80% in specific partitions. These findings demonstrate the effectiveness of MFF-STViT in improving accuracy, generalization, and robustness.

Keywords: audio deepfake detection; deep learning; feature fusion; vocoder artifacts
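The abstract reports its results as EER values and as percentage reductions in EER. As a rough illustration of how such figures are typically obtained, the sketch below computes an EER from two lists of detector scores and a relative reduction against a baseline. The function names, the score convention (higher means more likely bona fide), and the sample numbers are assumptions made for illustration; they are not taken from the paper, and the abstract does not state whether its percentages are relative reductions.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (EER, threshold), assuming higher scores mean 'bona fide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct score that was observed.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0), float(thresholds[idx])


def relative_reduction(baseline, proposed):
    """Relative reduction in percent of a metric against a baseline value."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical scores, chosen only to exercise the functions.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
print(f"Relative reduction: {relative_reduction(4.0, 1.0):.2f}%")
```

Under this reading, a baseline EER of 4% and a proposed EER of 1% would count as a 75% reduction, even though the absolute difference is only three percentage points.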

0 Introduction

In recent years, automatic speaker verification (ASV) systems have been widely used in voicemail, telephone banking, call centers, biometric authentication, forensic applications, and other fields, owing to their simple acquisition, high specificity, and low cost [1].


Application Research of Computers

2025, Issue 07
