A technological deep dive into the design and deployment of the EO4EU Knowledge Graph for Earth Observation

The EO4EU project goes beyond simply providing access to earth observation data: it aims to change how that data is structured, accessed, and utilised by applying Knowledge Graph (KG) technology. This article walks through the methodological steps and technological underpinnings of the EO4EU Knowledge Graph and shows how the initiative supports a better understanding of the environment.

EO4EU addresses the challenge of fragmented earth observation data by creating a standardised and accessible metadata ecosystem. The core of this system is a semantically rich KG, built through a multi-step process of data ingestion, transformation, and integration. The KG acts as a central hub that connects diverse data sources and lets users discover and use information more efficiently and intuitively.

The deployment of the EO4EU KG involves six well-defined methodological steps:
i. Data Acquisition and Integration: The project identifies and selects a diverse range of earth observation data sources, including satellite missions, ground-based sensors, and environmental monitoring networks. These sources encompass Copernicus Services (ADS, CDS, CLMS, Marine, Sentinel) and third-party platforms (ADAM, Istat.it, INSPIRE, CMCC, FAO, ECMWF, NOAA).
ii. Data Profiling: A comprehensive examination of the data is conducted to understand its characteristics, including data types, file formats, and available variables. This analysis ensures the data's suitability for integration into the KG.
iii. Data Preprocessing and Standardization: The data is examined to identify inconsistencies, variations, and discrepancies. Standardization is then applied to ensure uniformity across the KG; this includes normalizing dates, converting measurements to common units, and resolving discrepancies in naming conventions.
iv. Schema Mapping and Ontology Alignment: Data attributes are aligned with the KG's schema or ontology, establishing uniformity in attribute names, types, and semantics. This process involves mapping data attributes to their corresponding entities and attributes within the graph structure.
v. Data Aggregation: Methods are employed for extracting processed information from datasets, tailored to the KG's requirements. This involves standardization to guarantee the uniformity and reliability of the data.
vi. Parser Development: Custom parsers are developed to handle the unique data sources and formats encountered in earth observation data. These parsers, including HTML parsers, extract essential information from web pages and other files, enabling access to data that might otherwise be inaccessible.
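
Steps iii, iv, and vi can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of the EO4EU codebase: the sample HTML page, the source field names, the FIELD_MAP table, the target attribute names, and the helper functions are all hypothetical. It parses metadata from a web page, normalizes a date and a measurement, and maps source attribute names onto a single schema.

```python
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical mapping from source-specific field names to one KG schema.
FIELD_MAP = {
    "title": "name",
    "dataset_title": "name",
    "start": "temporal_start",
    "begin_date": "temporal_start",
    "spatial_res_km": "spatial_resolution_m",
}

# Date layouts the sources are assumed to use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from a web page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if "name" in attr and "content" in attr:
                self.fields[attr["name"]] = attr["content"]


def normalize_date(value):
    """Return an ISO 8601 date string, trying each assumed layout in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {value!r}")


def to_kg_record(raw):
    """Rename attributes to the assumed schema and standardize their values."""
    record = {}
    for key, value in raw.items():
        target = FIELD_MAP.get(key.strip().lower())
        if target is None:
            continue  # no counterpart in the assumed schema
        if target == "temporal_start":
            value = normalize_date(value)
        elif target == "spatial_resolution_m":
            value = float(value) * 1000.0  # kilometres to metres
        record[target] = value
    return record


# Made-up page standing in for a data provider's catalogue entry.
page = (
    '<html><head>'
    '<meta name="Dataset_Title" content="Sea surface temperature">'
    '<meta name="begin_date" content="01/02/2020">'
    '<meta name="spatial_res_km" content="0.25">'
    '</head></html>'
)
parser = MetaTagParser()
parser.feed(page)
record = to_kg_record(parser.fields)
print(record)
```

Keeping the name mapping in a plain table separates the schema alignment from the parsing logic, so supporting a new source in this sketch would mostly mean adding entries to the table.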

Technological Underpinnings: The Engine Behind the Knowledge Graph


The EO4EU KG relies on a robust technological infrastructure:
● Graph Database: A graph database solution is used to store and manage the KG, enabling efficient querying and traversal of relationships between data entities.
● Semantic Search Engine: A semantic search engine allows users to perform free-text queries and discover relevant data based on the meaning and context of their search terms.
● Vector Database: A vector database stores vector representations of data elements, enabling efficient similarity calculations for semantic search.
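
The similarity calculation behind vector-based semantic search can be sketched in a few lines of Python. This is a toy illustration, not the EO4EU implementation: the dataset names and the three-dimensional vectors are hand-made stand-ins for the output of an embedding model, and cosine similarity is assumed as the metric because the article does not name one.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical stored vectors, one per dataset description.
DATASET_VECTORS = {
    "sea_surface_temperature": [0.9, 0.1, 0.0],
    "land_cover": [0.1, 0.9, 0.1],
    "air_quality": [0.0, 0.2, 0.9],
}


def search(query_vector, top_k=2):
    """Return the top_k dataset names ranked by similarity to the query."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in DATASET_VECTORS.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Stand-in for the embedding of a free-text query such as "ocean temperature".
query = [0.8, 0.2, 0.1]
results = search(query)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A vector database performs the same ranking at scale, typically with an index that avoids comparing the query against every stored vector.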


In addition to this infrastructure, the project has developed a suite of APIs through which users interact with the KG and access its earth observation data and insights. These APIs report the status of the service, initialize the service, execute user queries, retrieve data and metadata, break datasets down into products and features, and generate Python code snippets corresponding to API calls.

EO4EU Novelcore article