Analysis of commercial truck drivers’ potentially dangerous driving behaviors based on 11-month digital tachograph data and multilevel modeling approach

https://doi.org/10.1016/j.aap.2019.105256

Highlights

  • An innovative way of extracting the information hidden in daily driving data.

  • 40% of the truckers tend to drive in a substantially dangerous way.

  • The explained variance proportion varies distinctly across different types of truckers.

  • Dangerous truckers are less likely to overspeed on holidays or rainy days.

Abstract

This study analyzed the potentially dangerous driving behaviors of commercial truck drivers from both macro and micro perspectives. The analysis was based on digital tachograph data collected over an 11-month period and comprising 4373 trips made by 70 truck drivers. First, different types of truck drivers were identified at the macro level using principal component analysis (PCA) and density-based spatial clustering of applications with noise (DBSCAN). Then, a multilevel model was built to extract the variation properties of speeding behavior at the micro level. Results showed that 40% of the truck drivers tended to drive in a substantially dangerous way and that the explained variance proportion of potentially extremely dangerous truck drivers (79.76%) was distinctly higher than that of other types of truck drivers (14.70% to 34.17%). This paper presents a systematic approach to extracting and examining information from a big data source of digital tachograph data. The derived findings make valuable contributions to the development of safety education programs, regulations, and proactive road safety countermeasures and management.

Introduction

Traffic safety is a critical issue in the transportation field. Traffic safety conditions are determined by drivers, vehicles, and the driving environment. Previous research revealed that over 90% of traffic accidents were associated with unsafe driving behaviors (e.g., Petridou and Moustaki, 2001; Ellison et al., 2015; Atombo et al., 2016). Thus, an enhanced understanding of unsafe driving behaviors could provide meaningful contributions to road safety research.

Driving behavior plays an important role in driving risk analysis. However, it is difficult to measure risk in real-life situations (Eboli et al., 2017). Driving simulators are often used to investigate driving behaviors in various experimental environments (Pankok and Kaber, 2018). Some vehicle instrument technologies such as the Naturalistic Driving Study (NDS) (Guo and Hankey, 2009) and the DriveCam system (Hickman et al., 2010) have been applied to monitor driving behaviors and kinematic signatures on a large scale. The NDS programs such as the Second Strategic Highway Research Program (SHRP2), the 100-Car NDS, and Europe’s UDRIVE have provided valuable insights into accident causation and driving behaviors (Guo, 2019). Most existing analyses of dangerous driving behavior have relied on crash data or self-reported questionnaire surveys (Lord and Mannering, 2010; Ellison et al., 2015; Mannering and Bhat, 2014; Huang et al., 2017). To fully explore driving behavior in traffic accidents, it is important to understand driving styles. Driving styles, or habits, are defined as the way that individuals choose to drive and are analyzed over periods of years (Constantinescu et al., 2010). Traditional attributes of driving style include choices of driving speed, acceleration, deceleration or braking, threshold for overtaking, headway, and propensity to commit traffic violations (Murphey et al., 2009; Wu et al., 2016).

Among all types of vehicles, trucks are the largest contributor to traffic accidents, injuries, and fatalities owing to their high proportion among the roadway population, as well as their size, weight, and other unique characteristics (Zhu and Srinivasan, 2011; He et al., 2019). Compared to other vehicle types, the larger size and higher center of gravity of trucks result in longer braking distances and more severe consequences when involved in accidents. Moreover, truck crashes have higher economic impacts because of the damage to high-value cargo and travel delays caused by traffic accidents. Thus, research identifying influential factors in truck accidents would facilitate the development of countermeasures that could reduce the number and severity of accidents involving trucks.

Prior research has demonstrated that driving environment has a substantial influence on driver safety awareness (e.g., Kaber et al., 2012; Faure et al., 2016; Yan et al., 2017). Driving is a complex task that involves maintaining appropriate steering and speeds while accurately perceiving, identifying, and anticipating road elements such as road type or transit route, and other dynamic conditions including traffic flow, car following situation, and weather. Given these parameters, it is crucial to advance our understanding of how driving environment affects driving behavior.

The rapid rise and prevalence of mobile technologies have enabled the collection of massive amounts of passively generated data, i.e., big data. Effective analysis of big data provides new opportunities to advance our understanding of critical transportation problems such as road safety (Chen et al., 2016; Bao et al., 2019). Unlike small-scale data obtained via questionnaires or surveys, most big data are initially generated for other purposes but have high potential value for research applications. Currently, such massive second-by-second data have yet to be fully utilized in road safety research.

From a micro-level perspective, most road safety data are hierarchically organized (Dupont et al., 2013). This makes multilevel models, also called hierarchical linear models, an effective and efficient tool for capturing the heterogeneity within such complicated data structures. Billot et al. (2009) investigated the influence of rain on driving behavior using a multilevel model. Huang and Abdel-Aty (2010) proposed a 5×ST-level (S: spatial; T: temporal) hierarchy to represent the general framework of multilevel data structures in traffic safety. This is a conceptual framework for assessing structured safety data. They summarized several case studies in which Bayesian hierarchical models improved model fitting and predictive performance over traditional models. However, all of the data used were crash data, either frequency or severity. The authors also noted concerns about the applicability and transferability of these multilevel, or hierarchical, models.
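As a point of reference for the multilevel idea discussed above, a generic two-level random-intercept model can be written as follows. This is a textbook sketch, not the specification used in this study: the symbols (observation i nested in group j, one covariate x, variance components σ_u² and σ_e²) are assumptions introduced for illustration, and a model with several dimensions of unobserved heterogeneity would carry additional random terms.

```latex
\begin{align}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}, \\
  u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_e^2), \\
  \rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\end{align}
```

Here ρ is the intraclass correlation, i.e., the share of the unexplained variance that lies between groups rather than within them.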

In summary, the following research gaps were identified in existing research. First, because crash data are traditionally the primary or only data source, proactive approaches to effective traffic safety measures have been ignored. Second, the influence of unobserved heterogeneities omitted by multilevel or hierarchical models on traffic safety remains unknown. Third, in Japan, installing tachograph recorders has been mandatory for trucks since 1967. Prior to then, only analogue-type recorders were available. Since the 2010s, the use of digital tachograph recorders has increased and effectively replaced analogue-type recorders. However, the accumulated big data have seldom been applied to road safety research in Japan.

To fill the abovementioned research gaps, the objectives of this study are twofold. The first objective is to identify different types of truck drivers in terms of driving performance at the macro level. The second objective is to extract the variation properties of speeding behavior across different types of truck drivers by building a multilevel model at the micro level, where the effects of multi-dimensional unobserved heterogeneities are reflected. This study used digital tachograph data collected from a major Japanese freight transport company over an 11-month period in 2014. Driving performance indicators included speeding, driving duration, and jerky driving indicators, which were measured using the tachograph data and further exploited to indirectly capture potentially dangerous driving behaviors. This study differs from existing studies by simultaneously considering the following aspects:

  • (1) Adopting real-world commercial truck-driving data and capturing potentially dangerous driving behaviors based on multiple indicators;

  • (2) Using a large-scale dataset that recorded truck drivers’ driving speed-related data at an interval of 0.5 s in tandem with GPS data collected at an interval of 60 s;

  • (3) Building a multilevel model that reflects the influences of multi-dimensional unobserved heterogeneities;

  • (4) Focusing on using the above large-scale driving data with detailed temporal and spatial information in the specific context of Japan.
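To make the speed-based indicators mentioned above more concrete, the following minimal sketch derives a speeding share and a jerk value from speed samples spaced 0.5 s apart. The indicator definitions, function names, speed limit, and sample values are all assumptions made for illustration; the exact indicators used in the study are not given in this excerpt.

```python
def speeding_share(speeds_kmh, limit_kmh):
    """Fraction of samples whose speed exceeds the limit (hypothetical indicator)."""
    if not speeds_kmh:
        return 0.0
    return sum(1 for v in speeds_kmh if v > limit_kmh) / len(speeds_kmh)


def finite_difference(values, dt):
    """Difference quotient between consecutive samples spaced dt seconds apart."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def max_abs_jerk(speeds_kmh, dt=0.5):
    """Largest absolute jerk in m/s^3, from speed samples taken every dt seconds."""
    speeds_ms = [v / 3.6 for v in speeds_kmh]   # km/h -> m/s
    accel = finite_difference(speeds_ms, dt)    # m/s^2
    jerk = finite_difference(accel, dt)         # m/s^3
    return max((abs(j) for j in jerk), default=0.0)


# Invented trip fragment: six speed samples at 0.5 s spacing (not study data).
trip = [72.0, 75.6, 79.2, 82.8, 82.8, 79.2]
print(speeding_share(trip, limit_kmh=80.0))
print(max_abs_jerk(trip))
```

Finite differences amplify sensor noise, so a real pipeline would likely smooth the speed signal before differentiating it twice.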

Section snippets

Literature review

Many existing traffic safety studies emphasize the use of crash data to create accident indices based on metrics such as crash and near-crash rates. However, such measurements neglect the importance of proactive safety countermeasures. Data are traditionally obtained from police crash reports, which include information such as date and time, age and gender of vehicle occupants, weather, road conditions, type of accident, and safety belt usage (Mannering and Bhat, 2014).

Methodology

Numerous indicators represent various aspects of dangerous driving behaviors. To simplify the interpretation of different dangerous driving behaviors among truck drivers, this study first conducts a Principal Component Analysis (PCA). The PCA is used to derive major independent components representing the various dangerous driving behaviors. Then, a cluster analysis is executed based on the aforementioned truck driver classification components, because it is expected that truckers with
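The clustering step can be illustrated with a minimal, pure-Python version of DBSCAN, the density-based method named in the abstract. This is a sketch under stated assumptions, not the authors' implementation: the parameter values (`eps`, `min_pts`) and the three-component driver scores below are invented, and a production analysis would normally rely on an established library.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


# Invented scores of seven drivers on three components (not study data).
drivers = [
    (0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (0.0, 0.0, 0.1),
    (2.0, 2.1, 2.0), (2.1, 2.0, 2.0), (2.0, 2.0, 2.1),
    (5.0, -3.0, 0.0),
]
labels = dbscan(drivers, eps=0.5, min_pts=3)
print(labels)
```

Unlike k-means, this approach does not need the number of clusters in advance, and it can leave isolated drivers unassigned as noise.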

Data

The digital tachograph data used in this study were obtained from a major Japanese logistics company, which agreed to provide us with a year of truck-driving data collected in the Chugoku region, including the five prefectures of Hiroshima, Okayama, Yamaguchi, Shimane, and Tottori. Because this study was part of a project with a major expressway company in this region, all trucks that used expressways were targeted and, as a result, daily driving data for a total of 70 truck drivers were collected from

PCA results

The variance contribution rates and cumulative contribution rates are listed in Table 3. The first three components explain approximately 72% of the total variance, which is enough to warrant further analysis. A rotation is needed to better explain the original variables. Rotated results are detailed in the last three columns, in which the three main components with eigenvalues greater than 1.0 are presented.
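The relationship between eigenvalues, contribution rates, and the eigenvalue-greater-than-1.0 retention rule can be sketched as follows. The eigenvalues below are invented for illustration and are not the values in Table 3; they are merely chosen so that three components exceed 1.0 and together account for roughly the same share of variance as reported above.

```python
def contribution_rates(eigenvalues):
    """Return (rates, cumulative_rates) for a list of PCA eigenvalues."""
    total = sum(eigenvalues)
    rates = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def retained_components(eigenvalues, threshold=1.0):
    """Number of components whose eigenvalue exceeds the threshold."""
    return sum(1 for ev in eigenvalues if ev > threshold)


# Invented eigenvalues for eight standardized indicators (they sum to 8).
eigenvalues = [2.9, 1.7, 1.2, 0.8, 0.6, 0.4, 0.3, 0.1]
rates, cumulative = contribution_rates(eigenvalues)
kept = retained_components(eigenvalues)
print(kept, round(cumulative[kept - 1], 3))
```

With standardized variables the eigenvalues sum to the number of variables, so an eigenvalue above 1.0 means the component explains more variance than any single original indicator.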

Hereafter, the above three rotated (principal) components are used to replace the original

Discussion

Traffic accidents are relatively rare events, so surrogates were needed to compensate for the insufficient number of real-world incidents required to execute an accurate risk assessment. Existing studies have mainly focused on accident indices such as crash rate or near-crash rate. However, a more proactive approach is required to identify and mitigate potential risks. A comprehensive analysis that considers truck drivers, vehicles, and driving environment, as well as temporal and spatial

Conclusion

The paper proposes an innovative approach to extracting useful information about commercial truck driver behavior from widely available big data sources. This study used eleven months of digital tachograph data that comprised 4373 trips made by 70 truck drivers in Japan. Results suggest that 40% of truck drivers exhibit substantially dangerous driving tendencies. The explanatory variables introduced in this study accurately expressed the influence of unobserved conditions and phenomena for

Declaration of Competing Interest

None.

Acknowledgements

This research was funded by the Grants-in-Aid for Scientific Research (A), Japan Society for the Promotion of Science (No. 15H02271). The authors would also like to thank the China Scholarship Council for its financial support of this research and the three anonymous reviewers for their valuable suggestions.

References (66)

  • V. Faure et al.

    The effects of driving environment complexity and dual tasking on drivers’ mental workload and eye blink behavior

    Transp. Res. Part F Traffic Psychol. Behav.

    (2016)
  • A.J. Filtness et al.

    Sleep-related crash characteristics: implications for applying a fatigue definition to crash reports

    Accid. Anal. Prev.

    (2017)
  • D.P. Garcia et al.

    Assessment of plant biomass for pellet production using multivariate statistics (PCA and HCA)

    Renew. Energy

    (2019)
  • H.M. Hassan et al.

    Investigation of drivers’ behavior towards speeds using crash data and self-reported questionnaire

    Accid. Anal. Prev.

    (2017)
  • Q. Hou et al.

    Investigating factors of crash frequency with random effects and random parameters models: new insights from Chinese freeway study

    Accid. Anal. Prev.

    (2018)
  • S. Hu et al.

    Temporal modeling of highway crash counts for senior and non-senior drivers

    Accid. Anal. Prev.

    (2013)
  • H. Huang et al.

    Multilevel data and Bayesian analysis in traffic safety

    Accid. Anal. Prev.

    (2010)
  • H. Huang et al.

    Severity of driver injury and vehicle damage in traffic crashes at intersections: a Bayesian hierarchical analysis

    Accid. Anal. Prev.

    (2008)
  • H. Huang et al.

    A multivariate spatial model of crash frequency by transportation modes for urban intersections

    Anal. Methods Accid. Res.

    (2017)
  • S. Islam et al.

    Comprehensive analysis of single- and multi-vehicle large truck at-fault crashes on rural and urban roadways in Alabama

    Accid. Anal. Prev.

    (2014)
  • D. Kaber et al.

    Effects of hazard exposure and roadway complexity on young and older driver situation awareness and performance

    Transp. Res. Part F

    (2012)
  • C.E.J. Key et al.

    A study investigating the comparative situation awareness of older and younger drivers when driving a route with extended periods of cognitive taxation

    Transp. Res. Part F

    (2017)
  • Z. Li et al.

    Using geographically weighted Poisson regression for county-level crash modeling in California

    Saf. Sci.

    (2013)
  • D. Lord et al.

    The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives

    Transp. Res. Part A

    (2010)
  • F. Malin et al.

    Accident risk of road and weather conditions on different road types

    Accid. Anal. Prev.

    (2019)
  • F. Mannering et al.

    Analytic methods in accident research: methodological frontier and future directions

    Anal. Methods Accid. Res.

    (2014)
  • A.D. McDonald et al.

    A contextual and temporal algorithm for driver drowsiness detection

    Accid. Anal. Prev.

    (2018)
  • A. Mueller et al.

    Driving in fog: the effects of driving experience and visibility on speed compensation and hazard avoidance

    Accid. Anal. Prev.

    (2012)
  • F. Naznin et al.

    Application of a random effects negative binomial model to examine tram-involved crash frequency on route sections in Melbourne, Australia

    Accid. Anal. Prev.

    (2016)
  • C. Pankok et al.

    The effect of navigation display clutter on performance and attention allocation in presentation- and simulator-based driving experiments

    Appl. Ergon.

    (2018)
  • S.S. Pantangi et al.

    A preliminary investigation of the effectiveness of high visibility enforcement programs using naturalistic driving study data: a grouped random parameters approach

    Anal. Methods Accid. Res.

    (2019)
  • Y. Peng et al.

    Assessing the impact of reduced visibility on traffic crash risk using microscopic data and surrogate safety measures

    Transp. Res. Part C

    (2017)
  • I. Radun et al.

    Driver fatigue and the law from the perspective of police officers and prosecutors

    Transp. Res. Part F

    (2013)