Risk-sensitive hierarchical reinforcement learning decision-making for autonomous vehicles
HU Zhilong1, PEI Xiaofei*1,2, ZHOU Honglong1, WEI Weiran2 (1. Hubei Key Laboratory of Advanced Technology of Automotive Components, Wuhan University of Technology, Wuhan 430070, China; 2. Hubei Collaborative Innovation Center of Automotive Components Technology, Wuhan University of Technology, Wuhan, China)
Abstract: To make the behavior decisions of autonomous vehicles fully account for the inherent uncertainty of the traffic environment, this paper introduces quantile regression and Conditional Value at Risk (CVaR) into the traditional RainbowDQN algorithm. The resulting method takes low-probability risks into consideration and balances risks against benefits, so that it can make safer and more humane driving decisions. A behavioral decision model was established based on the Markov framework, and the reward function and action space were designed by comprehensively considering safety, efficiency, and comfort. A planning and control model was built, and two scenarios, highway merging and exiting and an intersection, were constructed on the Open Natural Driving Intelligent Vehicle Simulation Test Environment (OnSite) platform. The OnSite evaluation tool was used to simulate and compare four algorithms: RainbowDQN-CVaR, RainbowDQN-QR, RainbowDQN, and DSAC-T. The results show that, in the complex highway merging and exiting scenario and the intersection scenario respectively, the proposed RainbowDQN-CVaR algorithm scores 55.3% and 47% higher than the traditional RainbowDQN algorithm, 17.7% and 34.3% higher than the RainbowDQN-QR algorithm, and 2.8% and 62.7% higher than the DSAC-T algorithm. These results verify the effectiveness of the RainbowDQN-CVaR behavior decision model: it makes safer and more reasonable decisions in more complex traffic environments, giving the autonomous vehicle higher driving safety and efficiency.
Key words: autonomous driving; reinforcement learning; behavioral decision-making; quantile regression; conditional value at risk (CVaR)
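To illustrate how quantile regression and CVaR can interact in action selection, the following is a minimal sketch, not the authors' implementation. It assumes that a quantile-regression critic outputs return estimates at equally spaced quantile levels for each action, and that CVaR at level alpha is approximated by averaging the lowest alpha-fraction of those estimates. The function names, the quantile values, and the choice of alpha are all hypothetical.

```python
import math


def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR_alpha as the mean of the lowest alpha-fraction of quantiles.

    Assumption: `quantiles` are return estimates at equally spaced quantile
    levels, as a quantile-regression critic would typically produce.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)
    # Number of lower-tail quantiles kept; at least one.
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def select_action(quantiles_per_action, alpha):
    """Return the index of the action with the highest CVaR_alpha, plus all scores."""
    scores = [cvar_from_quantiles(q, alpha) for q in quantiles_per_action]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical quantile estimates of the return for two candidate actions.
quantiles_per_action = [
    [-10.0, 5.0, 6.0, 7.0],  # action 0: higher mean, but a rare large loss
    [1.0, 1.0, 1.5, 1.5],    # action 1: lower mean, no large loss
]

# alpha = 1.0 averages all quantiles, i.e. the risk-neutral expected return.
risk_neutral_action, _ = select_action(quantiles_per_action, alpha=1.0)
# alpha = 0.25 looks only at the worst quarter of outcomes.
risk_averse_action, _ = select_action(quantiles_per_action, alpha=0.25)
```

In this toy example the risk-neutral criterion prefers action 0 because of its higher mean, whereas the CVaR criterion with a small alpha prefers action 1 because it avoids the low-probability large loss. This is the kind of trade-off between risks and benefits that the abstract describes.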
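Autonomous vehicles operating in complex open-road environments still suffer from problems such as overly conservative driving behavior and a high rate of manual takeover. These problems stem mainly from the limited intelligence of decision-making and planning, which in turn introduces a series of safety risks, such as the collision risk of the ego vehicle and safety risks to the overall traffic flow.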