Open Access Paper
14 February 2024 Active integration algorithm of vehicle road cooperative signal based on deep reinforcement learning
Proceedings Volume 13018, International Conference on Smart Transportation and City Engineering (STCE 2023); 130183F (2024) https://doi.org/10.1117/12.3024608
Event: International Conference on Smart Transportation and City Engineering (STCE 2023), 2023, Chongqing, China
Abstract
When vehicle road cooperative signals are integrated, road traffic status is often analyzed with low accuracy, so traffic management based on the integrated signals performs poorly. This paper therefore proposes an active integration algorithm for vehicle road cooperative signals based on deep reinforcement learning. To address the convergence shortcomings of reinforcement learning, we introduce deep reinforcement learning by adding a target network to the original reinforcement learning network and, in combination with the road traffic state transition matrix, calculate the traffic state of the road environment through iterative updates. In the active integration stage, a boosting algorithm uses the differences in the road traffic status analysis results to construct corresponding differences between integrators, and integration is carried out on the basis of differentiated road traffic status signal weights. In the tests, the designed algorithm achieved an average travel time of 185 s, an average travel speed of 5.97 m/s, and an average queue length of 6.52 vehicles, better than both comparison algorithms on all three indicators.

1. INTRODUCTION

Road transportation is one of the most important modes of transport [1]. For both personal travel and urban economic development, it is therefore of great practical significance that vehicles operate efficiently on the infrastructure [2]. With rapid population and economic growth, however, the number of vehicles has increased markedly, and the most direct consequence is that the existing infrastructure can no longer carry the current traffic volume [3].

Solutions to increasingly serious urban road congestion fall into two categories. The first raises the traffic capacity of the road by expanding basic road facilities [4] to meet the demand of the growing traffic volume. This requires a large amount of time and resources, and in an urban traffic network whose planning is already complete, expansion work also reduces the traffic efficiency of vehicles on normally operating roads to varying degrees [5]. The second takes the existing infrastructure as its core and uses path planning, intelligent transportation and related technologies to manage and control traffic flow [6]. Compared with infrastructure expansion, this approach costs significantly less to implement and makes full use of the traffic control equipment already on the road.

In the urban road network [7], the signalized intersection is one of the most important nodes and also one of the places where congestion most often occurs [8]. Queue length is commonly taken as the benchmark when analyzing the degree of congestion at signalized intersections. It reflects the actual situation at the intersection directly, plays an important role in evaluating the intersection's operating efficiency [9], and is an extremely important parameter for optimizing intersection signal control. Handling congestion in practice requires both an accurate analysis of the queuing status of each flow direction at the intersection [10] and signal control strategies that are set adaptively from the current queue length. Active integration and processing of vehicle road cooperative signals is therefore necessary: efficient integrated control of traffic signals is the key to alleviating congestion and improving the traffic efficiency of the urban road network.

On this basis, this paper proposes an active integration algorithm for vehicle road cooperative signals based on deep reinforcement learning and, through comparative testing, verifies its practical value for road traffic management.

2. DESIGN OF ACTIVE INTEGRATION ALGORITHM FOR VEHICLE ROAD COOPERATIVE SIGNAL

2.1 Road traffic status analysis based on deep reinforcement learning

Effective integration of vehicle road cooperative signals requires an objective analysis of road traffic conditions. This paper therefore builds on the advantages of the reinforcement learning algorithm while compensating for its shortcomings in road traffic state analysis. In traditional reinforcement learning, the same neural network is usually used both to select actions and to calculate or estimate the target value. To avoid this, the paper introduces a deep reinforcement learning mechanism and adds a new target network to the original reinforcement learning network. Let the original reinforcement learning network be Q(s, a; θ) and the newly added target network be Q(s, a; θ⁻), where θ and θ⁻ are the parameters of the two networks. The original network selects the action a′ with the maximum value function according to the current traffic state s_t of the road environment; the newly added target network then evaluates this selection for the next state s_{t+1} and calculates the corresponding Q value. The target value output by deep reinforcement learning can be expressed as

00124_PSISDG13018_130183F_page_2_1.jpg

Here, 00124_PSISDG13018_130183F_page_2_2.jpg denotes the target value output by deep reinforcement learning; the expression also involves r, the road traffic state transition matrix, and the attenuation factor.

On this basis, the error function for a′ is determined from the relationship between the output Q(s, a; θ) of the original reinforcement learning network and the target value 00124_PSISDG13018_130183F_page_2_3.jpg given by the newly added target network. With this error function, gradient descent is used to update the network parameters, and the current network parameters are synchronized to the target network at a fixed interval of iteration steps. The error function can be expressed as

00124_PSISDG13018_130183F_page_2_4.jpg

Among them, L(θ) represents the error function, E represents a reward function.
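For reference, a scheme in which one network selects the action and a second network evaluates it matches the standard Double DQN target and loss. A sketch under that assumption is given here; the symbols y_t, r_t and γ, and the reading of E as an expectation over samples drawn from the replay pool D, are assumptions rather than notation confirmed by the text, and the role of the road traffic state transition matrix is not represented.

```latex
y_t = r_t + \gamma \, Q\bigl(s_{t+1},\ \arg\max_{a'} Q(s_{t+1}, a'; \theta);\ \theta^{-}\bigr)

L(\theta) = \mathbb{E}_{(s_t, a_t, r_t, s_{t+1}) \sim D}
            \Bigl[ \bigl( y_t - Q(s_t, a_t; \theta) \bigr)^{2} \Bigr]
```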

Using the above method together with the experience replay strategy, the analysis process of road traffic status can be expressed in the form shown in Table 1.

Table 1. Analysis Process of Road Traffic Status

| Step | Action |
| --- | --- |
| 1 | Initialize the discount factor γ, learning rate α, number of training rounds M, and steps per round T. Initialize the Q-Network, the Target Network, and the network parameters θ and θ⁻ |
| 2 | For episode = 1, M do |
| 3 | Initialize the simulation environment and the initial state s_t |
| 4 | For t = 1, T do |
| 5 | Following the ε-greedy strategy, select a random action a_t from the action space with probability ε |
| 6 | Otherwise, the Q-Network selects a_t = argmax_a Q(s_t, a; θ) |
| 7 | Execute action a_t; the agent receives the reward r_t and the new state s_{t+1} from the environment |
| 8 | Store the experience sample e = <s_t, a_t, r_t, s_{t+1}> in the experience replay pool D |
| 9 | Sample a mini-batch of experience samples from the experience replay pool D |
| 10 | Update the Q-Network parameters θ by gradient descent on the loss function |
| 11 | Every C steps, update the Target Network parameters: θ⁻ ← θ |
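The steps of Table 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the two neural networks are replaced by lookup tables so that the example runs with the standard library only, and the environment (`env_step`, a five-level queue with two signal actions) is a hypothetical toy. The names `train_double_q`, `env_step` and all default values are made up for the example.

```python
import random


def train_double_q(num_states=5, num_actions=2, M=200, T=40, gamma=0.9,
                   alpha=0.1, epsilon=0.1, C=20, batch_size=8, seed=0):
    """Tabular stand-in for the Q-Network / Target Network loop of Table 1."""
    rng = random.Random(seed)
    q = [[0.0] * num_actions for _ in range(num_states)]  # online table (theta)
    q_target = [row[:] for row in q]                      # target table (theta^-)
    buffer = []                                           # experience replay pool D
    step = 0

    def env_step(state, action):
        # Hypothetical dynamics: the state is a queue level; action 1 drains
        # the queue by one level, action 0 lets it grow by one level.
        drift = -1 if action == 1 else 1
        next_state = min(num_states - 1, max(0, state + drift))
        return -float(next_state), next_state  # reward penalises long queues

    for _ in range(M):                          # step 2: training rounds
        s = rng.randrange(num_states)           # step 3: initial state
        for _ in range(T):                      # step 4: steps per round
            if rng.random() < epsilon:          # step 5: explore
                a = rng.randrange(num_actions)
            else:                               # step 6: greedy action
                a = max(range(num_actions), key=lambda x: q[s][x])
            r, s_next = env_step(s, a)          # step 7: act and observe
            buffer.append((s, a, r, s_next))    # step 8: store experience
            batch = rng.sample(buffer, min(batch_size, len(buffer)))  # step 9
            for bs, ba, br, bs_next in batch:   # step 10: update online table
                # Online table selects the action, target table evaluates it.
                best = max(range(num_actions), key=lambda x: q[bs_next][x])
                y = br + gamma * q_target[bs_next][best]
                q[bs][ba] += alpha * (y - q[bs][ba])
            step += 1
            if step % C == 0:                   # step 11: sync target table
                q_target = [row[:] for row in q]
            s = s_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(train_double_q()):
        print(state, [round(v, 2) for v in values])
```

In the toy environment, draining the queue is always the better choice, so the learned table should prefer action 1 in every state. A real deployment would replace the tables with neural networks and the toy dynamics with a traffic simulator.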

According to the above mode, the analysis of road traffic status is realized, which provides a basis for the subsequent active integration of vehicle road coordinated signals.

2.2 Active integration of vehicle road coordinated signal

Building on the road traffic state analysis of Section 2.1, this paper introduces the boosting algorithm for the active integration of vehicle road cooperative signals. The differences in the road traffic state analysis results are used to construct corresponding differences between integrators. The core idea of the designed algorithm is to continuously update the weight of each item in the road traffic status data: for the i-th road traffic status signal data item, the weight is denoted D_t(i), and each weak integrator is trained on differently weighted road traffic status data. At the start of the iteration, the weights of all road traffic status signal data follow a uniform distribution. After each iteration, every weight D_t(i) is updated. For data that were integrated incorrectly in the previous round, the algorithm increases the weight so that the integrator pays more attention to the more difficult signals; for data that were integrated correctly, it reduces the weight. The integrator is designed so that its output is as consistent as possible with the actual value, that is, its goal is to minimize the error, which can be expressed as

00124_PSISDG13018_130183F_page_3_2.jpg

Here, ɛ is the error between the output of the integrator and the actual value, h(s_{t+1}) is the integration result that the boosting algorithm produces for the road traffic status signal data s_{t+1}, and y_i is the actual road traffic status.

Based on the data results shown in Formula (3), the data results output by all boosting integrators are combined by using the linear weighting method. It can be expressed as

00124_PSISDG13018_130183F_page_3_3.jpg

Here, k represents the final vehicle road coordinated signal data obtained by summing the integrator outputs, and w_i represents the weight parameter of the i-th integrator.
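The weight update and the linear combination described above can be illustrated with a short sketch. The text names only "the boosting algorithm", so the specific AdaBoost-style rule used here (exponential re-weighting with coefficient alpha) is an assumption, and the function names `update_weights` and `combine` are hypothetical.

```python
import math


def update_weights(weights, predictions, labels):
    """One AdaBoost-style round: raise D_t(i) for wrongly integrated items,
    lower it for correctly integrated ones, then renormalise."""
    # Weighted error of the current weak integrator.
    error = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    error = min(max(error, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
    alpha = 0.5 * math.log((1.0 - error) / error)
    raw = [w * math.exp(alpha if p != y else -alpha)
           for w, p, y in zip(weights, predictions, labels)]
    total = sum(raw)
    return [w / total for w in raw], alpha, error


def combine(integrator_outputs, integrator_weights):
    """Linear weighting of the outputs of all integrators."""
    return sum(w * h for w, h in zip(integrator_weights, integrator_outputs))


if __name__ == "__main__":
    d0 = [0.25, 0.25, 0.25, 0.25]            # uniform initial weights
    d1, alpha, err = update_weights(d0, [1, 1, 0, 0], [1, 1, 0, 1])
    print(d1, alpha, err)
    print(combine([1.0, 0.0, 1.0], [0.5, 0.2, 0.3]))
```

With four equally weighted items and one wrong output, the wrongly integrated item ends the round with half of the total weight, which is the "pay more attention to difficult signals" behaviour described in the text.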

In this way, the active integration of vehicle road cooperative signals is realized, providing a reliable data basis for road management work.

3. TEST ANALYSIS

3.1 Test Environment Settings

To analyze the practical effect of the active integration algorithm designed in this paper, a comparative test was carried out at a four-phase bus-priority intersection. The basic conditions of the test road environment are listed in Table 2.

Table 2. Basic Conditions of Test Road Environment

| Serial Number | Index | Value |
| --- | --- | --- |
| 1 | Intersection cycle duration (s) | 95.0 |
| 2 | East-west straight phase duration (s) | 32.0 |
| 3 | East-west left turn phase duration (s) | 20.0 |
| 4 | North-south straight phase duration (s) | 25.0 |
| 5 | North-south left turn phase duration (s) | 18.0 |

The test road environment has east-west bus lanes. To meet the needs of intersection signal control in the vehicle road cooperative environment, a signal duration countdown is used so that no yellow time needs to be set for each phase. On this basis, the traffic flow of the test road environment is analyzed from the saturation flow rate of each intersection approach, which is obtained mainly by converting the vehicle volume observed during the green phase. The corresponding data are shown in Table 3.

Table 3. Saturation Flow Rate of Test Road

| Serial Number | Index | Value |
| --- | --- | --- |
| 1 | East-west straight inlet saturation flow rate (pcu/h) | 4691 |
| 2 | Left turn inlet saturation flow rate (pcu/h) | 2585 |
| 3 | North-south straight inlet saturation flow rate (pcu/h) | 2887 |
| 4 | Right turn inlet saturation flow rate (pcu/h) | 2585 |

Given the road environment described in Table 2 and Table 3, and because real traffic flow is unevenly distributed with peak and low-flow periods, the traffic flow scenario shown in Figure 1 was set from the actual traffic flow information of one continuous week.

Figure 1. Traffic flow setting of test scenario

00124_PSISDG13018_130183F_page_4_1.jpg

With this test environment, three algorithms were used to schedule and manage traffic: the vehicle road collaborative signal active integration algorithm designed in this paper, one based on DSRC technology, and one based on traffic guidance technology. Application performance was evaluated objectively by comparing road operation under each algorithm.

3.2 Test results and analysis

When analyzing the specific application effect of vehicle road cooperative signal active integration algorithm in road traffic dispatching management, this paper sets average travel time, average travel speed and average queue length as evaluation indicators, and their specific calculation methods can be expressed as

$$\bar{t} = \frac{1}{N}\sum_{i=1}^{N} t_i$$

$$\bar{v} = \frac{1}{N}\sum_{i=1}^{N} v_i$$

$$\bar{l} = \frac{1}{T}\sum_{j=1}^{T} l_j$$

Here, $\bar{t}$ represents the average travel time of vehicles on the test section, $\bar{v}$ represents the average travel speed of vehicles on the test section, and $\bar{l}$ represents the average queue length of vehicles on the test section. $t_i$ is the travel time of vehicle i on the test section, $v_i$ is the travel speed of vehicle i on the test section, and $l_j$ is the queue length of vehicles on the test section at time j. N is the total number of vehicles passing the test section, and T is the total number of times at which the queue on the test section is recorded.
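As an illustration, the three indicators can be computed as plain arithmetic means. This sketch assumes that each indicator is a simple average (per vehicle for travel time and speed, per observation time for queue length); the function names and the sample numbers are made up for the example and are not test data from the paper.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


def evaluate(travel_times_s, speeds_mps, queue_lengths_veh):
    """Return the three evaluation indicators as a dictionary.

    travel_times_s    -- travel time of each vehicle on the test section (s)
    speeds_mps        -- travel speed of each vehicle on the test section (m/s)
    queue_lengths_veh -- queue length recorded at each observation time (vehicles)
    """
    return {
        "avg_travel_time_s": mean(travel_times_s),
        "avg_speed_mps": mean(speeds_mps),
        "avg_queue_veh": mean(queue_lengths_veh),
    }


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    print(evaluate([180.0, 190.0], [6.0, 5.0], [6, 7, 8]))
```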

Combined with the above index parameters, the corresponding road traffic dispatching management effects under different vehicle road coordinated signal active integration algorithms are counted, and the data results are shown in Table 4.

Table 4. Statistical Table of Test Results

| Test method | Average travel time (s) | Average travel speed (m/s) | Average queue length (vehicles) |
| --- | --- | --- | --- |
| Active integration algorithm for vehicle road collaborative signals based on DSRC technology | 452 | 2.13 | 16.3 |
| Active integration algorithm for vehicle road collaborative signals based on traffic guidance technology | 246 | 3.45 | 12.44 |
| Active integration algorithm for vehicle road collaborative signals designed in this paper | 185 | 5.97 | 6.52 |

The results in Table 4 show clear differences in the operation of the test road environment under the three algorithms. Under the DSRC-based algorithm, the average travel time exceeded 450 s and the average travel speed was only 2.13 m/s, a low level; the average queue length reached 16.3 vehicles, indicating serious congestion. The algorithm based on traffic guidance technology improved on the DSRC-based algorithm on every indicator, but its average queue length still exceeded 10 vehicles, so congestion remained at a high level. Under the algorithm designed in this paper, the average travel time was only 185 s, which is 267 s and 61 s lower than the two control groups respectively; the average travel speed reached 5.97 m/s, which is 3.84 m/s and 2.52 m/s higher; and the average queue length was only 6.52 vehicles, which is 9.78 vehicles and 5.92 vehicles lower.
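The differences quoted against the two control groups follow directly from Table 4. The short check below recomputes them; the values are copied from Table 4, while the dictionary keys and the helper name `gaps_to_proposed` are only illustrative.

```python
# Values copied from Table 4: (travel time in s, speed in m/s, queue in vehicles).
RESULTS = {
    "dsrc": (452, 2.13, 16.3),
    "traffic_guidance": (246, 3.45, 12.44),
    "proposed": (185, 5.97, 6.52),
}


def gaps_to_proposed(results, baseline):
    """Absolute gap between a control group and the proposed algorithm."""
    base_time, base_speed, base_queue = results[baseline]
    time, speed, queue = results["proposed"]
    return {
        "travel_time_reduction_s": round(base_time - time, 2),
        "speed_increase_mps": round(speed - base_speed, 2),
        "queue_reduction_veh": round(base_queue - queue, 2),
    }


if __name__ == "__main__":
    for name in ("dsrc", "traffic_guidance"):
        print(name, gaps_to_proposed(RESULTS, name))
```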

Based on the above test results, it can be concluded that the active integration algorithm of vehicle road cooperative signals designed in this paper based on deep reinforcement learning can provide reliable assistance for road traffic management, and has good practical application value for improving road traffic efficiency and alleviating traffic congestion.

4. CONCLUSION

To alleviate traffic congestion as far as possible and improve the traffic efficiency of the road, roads must be scheduled and controlled reasonably in light of the actual traffic status. The premise for achieving this is effective integrated management of vehicle road coordinated signals. This paper proposes an active integration algorithm for vehicle road cooperative signals based on deep reinforcement learning, which analyzes the characteristics of vehicle road cooperative signals by means of deep reinforcement learning and integrates them effectively. In road traffic management, the results of active signal integration can provide reliable guidance for improving the traffic efficiency of the road. We hope that this study provides some reference value for the development of related road management work.

REFERENCES

[1] ALES FILIP, “Synergies between road and rail transport in the development of safe self-driving vehicles[J],” International Journal of Transport Development and Integration, 6(3), 313-325 (2022).

[2] ANXIN YANG, ZIGUANG LU, “Electromagnetic torque and reactive torque control of induction motor drives to improve vehicle variable flux operation and torque response[J],” Journal of Power Electronics: A publication of the Korean Institute of Power Electronics, 22(10), 1699-1712 (2022).

[3] S.K. SADANGI, “Optimal Capacity Utilization in Rail Transportation of Coal: Operational and Financial Implication for Coal India Limited[J],” The Indian Mining & Engineering Journal, 60(7), 7-16 (2021).

[4] JEROEN VERSTRAETE, CHRIS M. J. TAMPÈRE, “Consistent iterative algorithm for stochastic dynamic traffic assignment with a stable route set[J],” Computer-Aided Civil and Infrastructure Engineering, 37(10), 1223-1242 (2022).

[5] SHIN, YUYOL, YOON, YOONJIN, “Incorporating Dynamicity of Transportation Network With Multi-Weight Traffic Graph Convolutional Network for Traffic Forecasting[J],” IEEE Transactions on Intelligent Transportation Systems, 23(3), 2082-2092 (2022).

[6] QIANG LU, HOJIN JUNG, KYOUNG-DAE KIM, “Optimization-Based Approach for Resilient Connected and Autonomous Intersection Crossing Traffic Control Under V2X Communication[J],” IEEE Transactions on Intelligent Vehicles, 7(2), 354-367 (2022).

[7] CHOUDHARY, PARUL, DWIVEDI, RAKESH K., SINGH, UMANG, “An analysis of real-time traffic congestion optimisation through VTL in VANETs[J],” 68(1), 12-26 (2022).

[8] ABOLFATHI, SINA, SAFFARZADEH, MAHMOOD, MIRZAHOSSEIN, HAMID, et al., “Signalized intersection delay analysis using transit signal priority (TSP) and dedicated short-range communication (DSRC) system for bus rapid transit (BRT)[J],” Innovative Infrastructure Solutions, 7(5), 293.1-293.19 (2022).

[9] FAEZI, S. F., DOLATABADI, M. MALEKI, “Saturation Flow Rate of Urban At-Grade Signalized Intersection Under Different Climatic Conditions (Case Study: Sattari-Mokhberi Intersection)[J],” Iranian Journal of Science and Technology, Transaction of Civil Engineering, 46(1), 603-614 (2022).

[10] ANANYA PAUL, SULATA MITRA, “Deep reinforcement learning based cooperative control of traffic signal for multi-intersection network in intelligent transportation system using edge computing[J],” Transactions on Emerging Telecommunications Technologies, 33(11), 4588-1 to 4588-25 (2022).
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Shupeng Huang, Liang Sun, and Zhi Cai "Active integration algorithm of vehicle road cooperative signal based on deep reinforcement learning", Proc. SPIE 13018, International Conference on Smart Transportation and City Engineering (STCE 2023), 130183F (14 February 2024); https://doi.org/10.1117/12.3024608
KEYWORDS
Roads

Signal processing

Analytical research

Design

Detection and tracking algorithms

Transportation

Algorithm development
