decision tree built each time may not be the same. In
general, a random forest randomly generates hundreds
to thousands of decision trees and then takes the
result that recurs most often among the trees as the
final result (Lihui Li, 2017); in regression, the
predictions of the trees are averaged instead.
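The aggregation step described above can be sketched as follows. This is a minimal illustration rather than the procedure used in this paper: the function names, the seed and the toy tree outputs are hypothetical, and the trees themselves are not built here. It shows why the trees differ (each is fitted to a different bootstrap sample) and how their outputs are combined, by majority vote for class labels and by averaging for numeric predictions.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; each tree is fitted to such a sample."""
    return rng.choices(rows, k=len(rows))


def aggregate_votes(tree_predictions):
    """Return the label predicted by the largest number of trees (majority vote)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_mean(tree_predictions):
    """Return the average of the per-tree numeric predictions (regression)."""
    return mean(tree_predictions)


# Hypothetical training rows and hypothetical outputs of five trees for one sample.
rows = [1, 2, 3, 4, 5, 6]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
sample = bootstrap_sample(rows, rng)

class_votes = ["high", "low", "high", "high", "low"]
count_predictions = [3.0, 5.0, 4.0, 4.0, 4.0]

print(sample)
print(aggregate_votes(class_votes))       # high
print(aggregate_mean(count_predictions))  # 4.0
```

Because the accident count is a numeric response, the averaging rule is the one that applies to a regression forest.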
In traffic accident forecasting, the factors that
influence the number of traffic accidents form the
feature vector used as the model input, and the number
of traffic accidents corresponding to each feature
vector is taken as the forecasting target. The
forecasting model is obtained by fitting the training
samples.
Below we set up an RF regression model of road
traffic accidents. After repeated model screening, the
model with the highest goodness of fit includes six
influencing variables: RURAL0, NOLANE0,
AVGTRUCK, PSR, AADT and SLENGEH. Among the
influencing variables, the IncNodePurity values of
PSR, RURAL0 and AVGTRUCK are small. The
influence of these variables on the number of traffic
accidents is larger. The figure below shows that the
residuals in the RF regression model tend to stabilize
as the number of decision trees increases.
Figure 4: Simulation results
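As a rough illustration of what IncNodePurity measures, the sketch below computes the decrease in the residual sum of squares (RSS) produced by a single split. It assumes the usual definition of IncNodePurity for regression forests, namely the total RSS decrease from splits on a variable, averaged over all trees. The function names and the toy data are hypothetical and are not taken from the data set analysed in this paper.

```python
def rss(values):
    """Residual sum of squares of the values around their own mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def purity_increase(x, y, threshold):
    """Decrease in RSS obtained by splitting a node at x <= threshold."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return rss(y) - (rss(left) + rss(right))


# Hypothetical node with six road sections: accident counts and two 0/1 indicators.
accidents = [1, 2, 3, 7, 8, 9]
indicator_a = [0, 0, 0, 1, 1, 1]  # separates low counts from high counts
indicator_b = [0, 1, 0, 1, 0, 1]  # barely related to the counts

print(purity_increase(indicator_a, accidents, 0.5))  # 54.0
print(purity_increase(indicator_b, accidents, 0.5))  # 6.0
```

In a full forest this quantity would be accumulated over every split that uses a given variable and then averaged across the trees.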
The model results give SSR = 1113.6. This SSR is
smaller than that of the Poisson regression model but
larger than that of the NB regression model.
Therefore, the goodness of fit of the RF regression
model is higher than that of the Poisson regression
model but lower than that of the NB regression
model. On these data, the machine learning prediction
model has relatively low accuracy in predicting the
number of traffic accidents.
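The comparison above rests on the sum of squared residuals (SSR) of each model's fitted values. A minimal sketch of that calculation follows. The observed counts, the fitted values and the model labels are hypothetical placeholders and do not reproduce the results reported in this paper.

```python
def ssr(observed, predicted):
    """Sum of squared residuals between observed and fitted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical accident counts and hypothetical fitted values from two models.
observed = [0, 2, 1, 4, 3]
fitted = {
    "model_a": [0.5, 1.5, 1.5, 3.0, 3.5],
    "model_b": [1.0, 1.0, 1.0, 2.0, 2.0],
}

scores = {name: ssr(observed, values) for name, values in fitted.items()}
ranking = sorted(scores, key=scores.get)  # smallest SSR first

print(scores)   # {'model_a': 2.0, 'model_b': 7.0}
print(ranking)  # ['model_a', 'model_b']
```

A smaller SSR means the fitted values lie closer to the observed counts. SSR measured on the training samples does not penalize model complexity, so it is best read alongside a criterion such as AIC.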
5 CONCLUSION AND OUTLOOK
This paper tries to find a model that is closer to
actual traffic conditions by carrying out Poisson
regression, NB regression, ZINB regression and RF
regression on road traffic accidents. Assuming that
the number of traffic accidents follows different
distributions, we select the strongly influencing
factors in each regression model to build a model
closer to the actual situation.
The simulation results show that, with the existing
traffic data, the Poisson regression model has the
poorest fit, followed by the RF regression model. The
AIC difference between the NB regression and ZINB
regression models is small, and the ZINB regression
model has the best goodness of fit. All models
eventually include the two variables RURAL0 and
AVGTRUCK, and the impact of RURAL0 on the
number of road accidents in all three models is
greater than that of the other factors. Urban roads are
more prone to traffic accidents than rural roads, and
roads with a large proportion of trucks have more
traffic accidents. We therefore conclude that the
management of urban road traffic should be
strengthened, with specific measures based on traffic
surveys of the characteristics of specific road
sections. Traffic management should be strengthened
in areas where traffic accidents are frequent.
Controlling the number of trucks within a reasonable
range can help reduce traffic accidents.
The factors considered in this paper may not be
comprehensive. Because data collection was limited,
the quality of the traffic accident data also affects the
accuracy of the regression results. In the future, I
hope to study this area further and to analyze the
factors related to traffic accidents more
comprehensively and from various perspectives, so
that a better-fitting traffic accident model can be
obtained and traffic accidents can be predicted more
accurately.