Publications by IRD scientists

Arslan M., Desconnets Jean-Christophe, Mougenot I. (2022). Environmental and life sciences observations in knowledge graphs using NLP techniques to support multidisciplinary studies. Procedia Computer Science, 201, 543-550. International Conference on Emerging Data and Industry 4.0. (EDI40), 5., Porto (PORT), 2022/03/22-25. ISSN 1877-0509.

Document title
Environmental and life sciences observations in knowledge graphs using NLP techniques to support multidisciplinary studies
Year of publication
2022
Document type
Article
Authors
Arslan M., Desconnets Jean-Christophe, Mougenot I.
Source
Procedia Computer Science, 2022, 201, 543-550 ISSN 1877-0509
Conference
International Conference on Emerging Data and Industry 4.0. (EDI40), 5., Porto (PORT), 2022/03/22-25
Understanding environmental observations is a continuing challenge for environmental and life science investigations. Environmental data are complex because they involve their own features, methods, properties, systems, and spatio-temporal dimensions. Time granularity remains approximately the same across environmental contexts, but geographic entities and the other entities mentioned above are defined using domain vocabularies that are specific to each discipline. Discovering, accessing, and analyzing relevant environmental observations is time-consuming for life science researchers because each discipline has its own data formats, vocabularies, and metadata standards. These differences introduce structural and semantic heterogeneities, which create a barrier to reusing datasets generated by other disciplines. Existing dataset discovery platforms describe datasets with domain-specific metadata, which limits their use. To overcome this knowledge barrier, this work reports a proof-of-concept implementation of a knowledge graph centered on an oceanography use case and built using NLP techniques (named entity recognition (NER) followed by text preprocessing). The constructed knowledge graph is a collection of subgraphs, each representing the metadata of one dataset. It uses geospatial and open semantic data standards to provide enhanced metadata descriptions of datasets and so enable multidisciplinary research.
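The idea of a knowledge graph made of per-dataset subgraphs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gazetteer, the predicate names (loosely inspired by sensor-observation vocabularies), the dataset identifiers, and the choice to normalize text before matching are all assumptions made for illustration, and a simple dictionary lookup stands in for a trained NER model.

```python
import re

# Hypothetical gazetteer: surface form -> (entity type, vocabulary term).
# A real system would use a trained NER model and published vocabularies.
GAZETTEER = {
    "sea surface temperature": ("ObservedProperty", "sea_surface_temperature"),
    "salinity": ("ObservedProperty", "sea_water_salinity"),
    "ctd": ("Sensor", "CTD"),
    "indian ocean": ("Place", "Indian_Ocean"),
}

# Hypothetical predicate chosen for each entity type.
PREDICATES = {
    "ObservedProperty": "observedProperty",
    "Sensor": "madeBySensor",
    "Place": "spatialCoverage",
}


def preprocess(text):
    """Lowercase the text and collapse non-alphanumeric runs to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def recognize_entities(text):
    """Dictionary-based entity lookup: sorted (type, term) pairs found in text."""
    padded = f" {preprocess(text)} "
    return sorted(
        (etype, term)
        for surface, (etype, term) in GAZETTEER.items()
        if f" {surface} " in padded  # whole-word match only
    )


def dataset_subgraph(dataset_id, description):
    """Build one subgraph (a set of triples) from one dataset description."""
    return {
        (dataset_id, PREDICATES[etype], term)
        for etype, term in recognize_entities(description)
    }


def build_graph(records):
    """Union the per-dataset subgraphs into a single knowledge graph."""
    graph = set()
    for dataset_id, description in records.items():
        graph |= dataset_subgraph(dataset_id, description)
    return graph
```

Calling `build_graph` on a mapping from dataset identifiers to free-text descriptions returns a set of (subject, predicate, object) triples. Because each dataset contributes its own subgraph, a researcher from another discipline could query by observed property or place without knowing the source discipline's metadata format.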
Classification scheme
Computer science [122]
Location
IRD collection [F B010092365]
IRD identifier
fdi:010092365