WIT Press

Evaluating Stochastic Train Process Time Distribution Models On The Basis Of Empirical Detection Data


Free (open access)



J. Yuan, R. M. P. Goverde & I. A. Hansen


This paper evaluates several commonly applied probability distribution models for stochastic train process times using empirical data recorded at a Dutch railway station, The Hague Holland Spoor. An initial estimate of the model parameters is obtained with the Maximum Likelihood Estimator (MLE). An iterative procedure then follows, in which large delays are omitted one by one and the distribution parameters are re-estimated each time using the MLE method. The parameter estimate is improved by minimizing the Kolmogorov-Smirnov (K-S) statistic, where the empirical distribution is still based on the complete data set. Finally, a local search is performed in the neighbourhood of the improved model parameters to further optimize the estimate. To evaluate the distribution models, we compare the K-S statistic among the fitted distributions with optimized parameters using the one-sample K-S goodness-of-fit test at a commonly adopted significance level of α = 0.05. The log-normal distribution is found to be generally the best approximate model among the candidate distributions for the arrival times of trains, both at the platform and at the approach signal of the station. The Weibull distribution is generally the best approximate model for non-negative arrival delays, departure delays and the free dwell times of late-arriving trains. The shape parameter of the fitted Weibull distribution is generally smaller than 1.0 in the first two cases, whereas it is always larger than 1.0 in the last case. These evaluation results for train process times can be used to predict the propagation of train delays accurately and to support timetable design and rescheduling, particularly when empirical data are lacking.
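The fitting procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the log-normal case (whose MLE has a closed form), assumes strictly positive observations (arrival times would need a shift first), and uses hypothetical function names, an arbitrary step size for the local search, and the asymptotic approximation for the K-S critical value. That critical value is strictly valid only when the parameters are not estimated from the same data, so the acceptance check should be read as indicative.

```python
import math
import random


def lognormal_mle(sample):
    """Closed-form MLE of (mu, sigma) for a log-normal sample (values > 0)."""
    logs = [math.log(x) for x in sample]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, math.sqrt(var)


def lognormal_cdf(x, mu, sigma):
    """Log-normal cumulative distribution function."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """One-sample K-S statistic: largest gap between empirical and model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def fit_with_trimming(sample, max_omitted):
    """Omit the largest values one by one, re-estimate by MLE each time, and
    keep the parameters with the smallest K-S statistic. The statistic is
    always computed against the complete sample. Keep max_omitted small
    enough that the trimmed sample still has spread (sigma > 0)."""
    xs = sorted(sample)
    best = None
    for k in range(max_omitted + 1):
        mu, sigma = lognormal_mle(xs[: len(xs) - k])
        d = ks_statistic(xs, lambda x: lognormal_cdf(x, mu, sigma))
        if best is None or d < best[0]:
            best = (d, mu, sigma, k)
    return best  # (K-S statistic, mu, sigma, number of omitted values)


def local_search(sample, mu, sigma, step=0.05, rounds=50):
    """Simple coordinate search around (mu, sigma); the step size and the
    number of rounds are illustrative assumptions."""
    best_d = ks_statistic(sample, lambda x: lognormal_cdf(x, mu, sigma))
    for _ in range(rounds):
        improved = False
        for dmu, dsig in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            m, s = mu + dmu, sigma + dsig
            if s <= 0:
                continue
            d = ks_statistic(sample, lambda x, m=m, s=s: lognormal_cdf(x, m, s))
            if d < best_d:
                best_d, mu, sigma, improved = d, m, s, True
        if not improved:
            step /= 2.0  # shrink the neighbourhood when no move helps
    return best_d, mu, sigma


def ks_critical_value(n, alpha=0.05):
    """Asymptotic critical value of the one-sample K-S test."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


if __name__ == "__main__":
    random.seed(1)
    # Synthetic data for illustration only: log-normal values plus a few large delays.
    data = [random.lognormvariate(4.0, 0.5) for _ in range(200)] + [900.0, 1200.0, 1500.0]
    d_trim, mu, sigma, k = fit_with_trimming(data, max_omitted=10)
    d_opt, mu, sigma = local_search(data, mu, sigma)
    print("omitted:", k, "K-S:", d_opt, "critical:", ks_critical_value(len(data)))
```

A fitted model would be accepted at α = 0.05 when its K-S statistic falls below the critical value. The same loop could be reused for a Weibull candidate by swapping in a numerical MLE and the Weibull CDF.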


Keywords: train delays, running and dwell times, track occupancy times, statistical distribution, the K-S test.