基于強化學習的人道主義應急物資分配優化研究

Research on the Optimization of Humanitarian Emergency Material Allocation Based on Reinforcement Learning

ZHANG Jianjun, YANG Yundan, ZHOU Yizhuo

(School of Economics and Management, Tongji University, Shanghai 200092, China)

Abstract: The efficient allocation of limited humanitarian aid supplies after a major emergency is a critical research topic: the aim is to meet the material needs of affected areas while reducing the suffering of disaster victims. This paper formulates the problem as a Mixed Integer Nonlinear Programming (MINLP) model whose solution is a multi-period dynamic allocation strategy. Reinforcement Learning (RL), one of the two mainstream methods for current strategy exploration, is particularly suitable for dynamic resource allocation because it scales well and adapts to external dynamics through interaction with the environment and feedback signals. We employ the Dueling DQN algorithm to solve for the optimal policy, overcoming the overestimation of Q-values that has been a drawback of previous RL applications to humanitarian aid distribution and estimating the action-value function for affected regions more accurately. The paper also introduces a novel stochastic demand assumption, which makes the model more realistic by better reflecting actual conditions in disaster scenarios. The effectiveness of the proposed method is demonstrated with a numerical example based on the Ya'an earthquake, making this the first study to substantiate RL-based optimization of emergency resource allocation with real data sources. Comparative analysis shows that the Dueling DQN algorithm reduces total cost by approximately 5% compared with the traditional DQN method, indicating a more effective reduction in the suffering of affected populations. This aligns with China's "people-oriented" rescue principle and has significant theoretical and practical implications for humanitarian emergency response.

Key words: deep reinforcement learning; humanitarian; emergency supplies distribution
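For readers unfamiliar with the dueling architecture named in the abstract, the following minimal sketch shows its standard aggregation step, in which a state value V(s) and per-action advantages A(s, a) are combined as Q(s, a) = V(s) + A(s, a) - mean(A). It is an illustration under stated assumptions, not the authors' implementation: the function name, the three candidate actions, and all numbers are hypothetical, and the values are assumed to represent negative costs so that a larger Q is better.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using mean-advantage centering.

    Subtracting the mean advantage keeps V and A identifiable: adding a
    constant to every advantage leaves the resulting Q-values unchanged.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical example: one state with three candidate allocation actions.
# Values are assumed to be negative costs, so larger means better.
q = dueling_q_values(state_value=-10.0, advantages=[1.0, 0.0, -1.0])

# Greedy action selection: pick the index of the largest Q-value.
best_action = max(range(len(q)), key=q.__getitem__)
```

In a full Dueling DQN, the state value and the advantages would be the outputs of two heads of one neural network; only the final combination step is shown here.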

0 Introduction

After a major emergency, saving lives and alleviating the suffering of affected people are the primary goals of disaster relief.


Shanghai Management Science, 2025, Issue 2
