Publication . Article . 2022

Natural language processing for aviation safety: extracting knowledge from publicly-available loss of separation reports

Irene Buselli; Luca Oneto; Carlo Dambra; Christian Eduardo Verdonk Gallego; Miguel García Martínez; Anthony Smoker; Nnenna Ike; +2 Authors
Open Access

Background: The air traffic management (ATM) system has historically coped with a global increase in traffic demand, which has ultimately led to increased operational complexity. When dealing with the impact of this increasing complexity on system safety, it is crucial to automatically analyse losses of separation (LoSs) using tools able to extract meaningful and actionable information from safety reports. Current research in this field mainly exploits natural language processing (NLP) to categorise the reports, with the limitations that the considered categories need to be manually annotated by experts and that general taxonomies are seldom exploited.

Methods: To address the current gaps, the authors propose to perform exploratory data analysis on safety reports, combining state-of-the-art techniques such as topic modelling and clustering, and then to develop an algorithm able to extract the Toolkit for ATM Occurrence Investigation (TOKAI) taxonomy factors from the free-text safety reports based on syntactic analysis. TOKAI is an investigation tool developed by EUROCONTROL, and its taxonomy is intended to become a standard and harmonised approach to future investigations.

Results: Leveraging the LoS events reported in the public databases of the Comisión de Estudio y Análisis de Notificaciones de Incidentes de Tránsito Aéreo and the United Kingdom Airprox Board, the authors show how their proposal is able to automatically extract meaningful and actionable information from safety reports, as well as to classify their content according to the TOKAI taxonomy. The quality of the approach is also indirectly validated by checking the connection between the identified factors and the main contributor of the incidents.

Conclusions: The authors' results are a promising first step toward the full automation of a general analysis of LoS reports, supported by results on real-world data coming from two different sources. In the future, the authors' proposal could be extended to other taxonomies or tailored to identify factors to be included in the safety taxonomies.

Subjects by Vocabulary

Microsoft Academic Graph classification: System safety, Natural language processing, Field (computer science), Exploit, Artificial intelligence, Computer science, Cluster analysis, Exploratory data analysis, Air traffic management, Topic model, Taxonomy (general)


ATM, Safety, Resilience, Natural Language Processing, Losses of Separation, Safety Reports, TOKAI, Research Article, Articles, General Language Studies and Linguistics, Transport Systems and Logistics

Related Organizations
Funded by
saFety And Resilience guidelines for aviatiOn
  • Funder: European Commission (EC)
  • Project Code: 892542
  • Funding stream: H2020 | SESAR-RIA
Related to Research communities
Digital Humanities and Cultural Heritage