Parallelization of the Winograd Convolution Algorithm on GPU

CLC number: TP183  Document code: A  Article ID: 1001-3695(2025)08-026-2446-06

doi: 10.19734/j.issn.1001-3695.2024.11.0502

GPU-based parallelization of Winograd convolution algorithm

Wang Xin, Zhen Xueru (Key Laboratory ...)

Abstract: This paper proposes an innovative GPU-based Winograd parallel convolution algorithm to address the problem of excessive computational load in modern convolutional neural networks. The algorithm uses load-balanced task mapping, optimizes the data loading strategy to hide latency, and incorporates a dynamic padding method to fully exploit the synergy between the Winograd convolution algorithm and the GPU architecture. Experimental results show that, on multiple convolutional layers of the classic convolutional neural network model ResNet, the proposed algorithm outperforms the standard Winograd convolution algorithm in the NVIDIA cuDNN 8.3.0 library. It achieves a speedup of up to 2.46 on the Turing-architecture RTX 2080Ti GPU while maintaining high computational accuracy. Compared with the standard GPU-based Winograd convolution algorithm, the proposed algorithm significantly improves the efficiency of convolutional computation.

Key words: Winograd algorithm; parallel computing; CUDA; convolutional neural network
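
For readers unfamiliar with the Winograd convolution named in the abstract, the following is a minimal sketch of the textbook one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not the GPU kernel proposed in this paper, and the function names `winograd_f23` and `direct_f23` are hypothetical, chosen only for illustration.

```python
def winograd_f23(d, g):
    """Compute two outputs of a 3-tap correlation over four inputs
    using the Winograd F(2,3) minimal filtering identity.

    d: four input samples, g: three filter taps.
    Illustrative only; not the paper's GPU implementation.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g

    # Filter transform (can be precomputed once per filter).
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2

    # Input transform (additions and subtractions only).
    v0 = d0 - d2
    v1 = d1 + d2
    v2 = d2 - d1
    v3 = d1 - d3

    # Element-wise stage: the only four multiplications.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3

    # Output transform.
    return [m0 + m1 + m2, m1 - m2 - m3]


def direct_f23(d, g):
    """Reference result: direct sliding-window correlation (six multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 2.0]
    print(winograd_f23(d, g), direct_f23(d, g))
```

Both functions return the same two values for the sample input. The two-dimensional variants used for 3×3 convolution layers nest this same identity along rows and columns.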

0 Introduction

Convolutional neural networks (CNNs), a core technology in deep learning (DL), have been widely applied in many fields such as image classification [1] and object segmentation [2].

Application Research of Computers, 2025, Issue 08
