GPU-Based Parallelization of the Winograd Convolution Algorithm


CLC number: TP183  Document code: A  Article ID: 1001-3695(2025)08-026-2446-06

doi:10.19734/j.issn.1001-3695.2024.11.0502

GPU-based parallelization of Winograd convolution algorithm

Wang Xin, Zhen Xueru (Key Laboratory …)

Abstract: This paper proposes an innovative GPU-based parallel Winograd convolution algorithm to address the excessive computational load of modern convolutional neural networks. The algorithm uses load-balanced task mapping, optimizes the data-loading strategy to hide latency, and incorporates a dynamic padding method to fully exploit the synergy between the Winograd convolution algorithm and the GPU architecture. Experimental results show that, on multiple convolutional layers of the classic convolutional neural network model ResNet, the proposed algorithm outperforms the standard Winograd convolution algorithm in the NVIDIA cuDNN 8.3.0 library. It achieves a speedup of up to 2.46× on the Turing-architecture RTX 2080 Ti GPU while maintaining high computational accuracy. Compared with the standard GPU-based Winograd convolution algorithm, the proposed algorithm significantly improves the efficiency of convolutional computation.

Key words: Winograd algorithm; parallel computing; CUDA; convolutional neural network

0 Introduction

Convolutional neural networks (CNNs), a core technology in deep learning (DL), have been widely applied in many fields, such as image classification [1] and object segmentation [2].
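
For orientation, the minimal-filtering idea behind Winograd convolution can be illustrated with the textbook one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter using four multiplications instead of six. The sketch below is not the GPU algorithm proposed in this paper (its task mapping, data-loading strategy, and dynamic padding are not reproduced here); the function names `direct_f23` and `winograd_f23` are hypothetical and chosen only for illustration.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter g slid over 4 inputs d (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): the same two outputs with 4 element-wise multiplications."""
    # Filter transform; in a CNN this can be computed once per filter and reused.
    u = [
        g[0],
        (g[0] + g[1] + g[2]) / 2,
        (g[0] - g[1] + g[2]) / 2,
        g[2],
    ]
    # Input transform; uses only additions and subtractions.
    v = [
        d[0] - d[2],
        d[1] + d[2],
        d[2] - d[1],
        d[1] - d[3],
    ]
    # Element-wise products: the only multiplications that involve the input data.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform; again only additions and subtractions.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 2.0]
    print(direct_f23(d, g), winograd_f23(d, g))
```

The two-dimensional variants used for 3×3 convolution layers nest this construction along both axes, which is where the reduction in multiplications becomes significant.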
