Data Perspectives ICARUS

Today, an unprecedented volume, diversity and richness of aviation data can be acquired, generated, stored and managed across the aviation data value chain. With aircraft highly instrumented, an average flight is claimed to produce between 500 and 1,000 gigabytes of data, while other estimates anticipate that the global fleet could generate up to 98,000,000 terabytes of data by 2026.

Despite the vast quantity of data, across myriad parameters, that never stops flowing along the aircraft, passenger, luggage and cargo journeys described by the IATA NEXTT initiative, most aviation stakeholders are generally at a disadvantage relative to airlines in terms of data gathering and sharing, especially since the eternal questions of “who owns the aircraft” and “who owns the passenger” remain open.

Initiatives towards Aviation Data Mining and Sharing

Over the past years, various ambitious initiatives have emerged to mine and exchange data and information within closed ecosystems of aviation stakeholders at the same level. Examples include GADM by IATA (mainly intended for use between airlines), SKYbrary by EUROCONTROL (between Air Traffic Control authorities and airlines), A-CDM (between airports) and Skywise (between Airbus and airlines), among others. Yet there is still little data diffusion and sharing across the plethora of aviation-related stakeholders.

Organisations (e.g. airlines, airports, extra-aviation services, environmental and health agencies) have only fragmented aviation data at their disposal and still operate in silos. They typically build on their own historical data and, in exceptional situations, try to establish peer-to-peer collaborations to use near real-time data from other sources in an ad hoc manner. It should be noted that airports, for their part, are open to data sharing with external partners, with 23% offering a plan today and 39% expecting to by the end of 2021, according to the SITA Air Transport IT Trends Insights 2018.

Insights from the ICARUS Data Collection Activities

These findings have also been broadly confirmed through the ICARUS data collection activities, which showed that:

The aviation stakeholders mainly focus on collecting primary aviation datasets classified under Data Tier 1, without paying equal attention to what value they may also extract from Data Tiers 2 and 3.

In ICARUS, 28 datasets classified in Data Tier 1, 8 datasets in Data Tier 2 and 3 datasets in Data Tier 3 are readily available from the 4 ICARUS demonstrators and the data provider (OAG), while over 45 datasets at Data Tier 1, 12 datasets at Data Tier 2 and 3 datasets at Data Tier 3 are considered essential by the 4 ICARUS demonstrators to execute their scenarios. In all cases, the datasets are acquired (or are expected to be acquired) through bilateral agreements with the respective data providers and are not intended for public use. They are available in large volumes in machine-readable formats (CSV or JSON), as evidenced in their detailed profiling (covering their General Info, their detailed Features, their Availability, their Rights and their Assessment).

Open aviation data are typically summarized in reports and are not published in a raw format.

Core aviation stakeholders, with a few notable exceptions (such as the EUROCONTROL Pan-European ANS Performance Data Portal or the codelists maintained and published by IATA and ACI), typically do not publish aviation data in a raw format. This is in line with the fact that aviation data are typically not available for free, but are offered at a certain cost to confirmed organizations within the aviation ecosystem.

Evidence of aviation data in open data portals is rather scarce:

The EU Open Data Portal, the European Data Portal, Eurostat and OECD Data collectively contained fewer than 600 aviation datasets (out of almost 840,000 datasets they featured) at the time the ICARUS consortium performed its extensive search for open data sources related to the aviation data value chain. The number of aviation-related datasets (in comparison to public, environment or health data) is practically negligible in such general-purpose open data portals, while the data assets that are relevant to the ICARUS demonstrators’ needs are restricted to open weather data, open population data and open air quality data.

Key Take-Aways

Overall, the ICARUS data collection activities have aimed at capturing the data visibility (in terms of data available and data needed) for the 4 demonstrators (AIA, PACE/TXT, ISI, CELLOCK) and the aviation data provider (OAG) in order to properly profile the aviation-related data assets that are owned by the ICARUS consortium or are available in open data repositories (from 5 aviation-specific data sources, over 10 open data sources and 2 related transport projects’ repositories).

The purpose of these activities is ultimately to: (a) populate the ICARUS platform with a large number of data assets that will attract the interest of aviation stakeholders, and (b) follow the ICARUS demonstrators’ implementation progress (taking into account that, as they experiment with data analytics in the ICARUS platform, new questions may arise that need to be supported by additional data assets). More details are available in the ICARUS Deliverables D1.1 (Section 4) and D1.3 (Section 3).

Blog post prepared by Suite5.