Publications by IRD scientists

Raoufi E., Happi Happi Bill Gates, Larmande Pierre, Scharffe F., Todorov K. (2026). Analysis of the performance of representation learning methods for entity alignment: benchmark versus real-world data. Semantic Web, 17 (1), p. 09217134251389825 [24 p.]. ISSN 1570-0844.

Document title
Analysis of the performance of representation learning methods for entity alignment: benchmark versus real-world data
Publication year
2026
Document type
Article indexed in the Web of Science WOS:001629059600001
Authors
Raoufi E., Happi Happi Bill Gates, Larmande Pierre, Scharffe F., Todorov K.
Source
Semantic Web, 2026, 17 (1), p. 09217134251389825 [24 p.] ISSN 1570-0844
Representation learning for entity alignment (EA) aims to identify entities in different knowledge graphs (KGs) that refer to the same real-world object by comparing their embedding similarity. Although many EA models perform well on synthetic benchmark datasets, this performance does not always transfer to real-world, incomplete, and domain-specific data. A systematic comparison between synthetic benchmarks and original heterogeneous datasets is still limited. Many EA models also restrict the alignment search space to validation entities, limiting coverage of real KG content. Within this setting, our results show that embedding-based EA models continue to face generalization challenges in realistic large-scale KG search spaces. We evaluate several competitive EA models (commonly tested on benchmarks such as DBP15K) on multiple real-world heterogeneous datasets. The experiments reveal a performance decrease when moving beyond synthetic benchmarks, indicating that current models do not fully capture the characteristics of real data. We also analyze semantic similarity and profiling features of the datasets to help explain these differences. This study outlines practical limitations of embedding-based EA methods and provides insights for developing approaches that better handle the variability and complexity found in real-world KG alignment tasks.
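The effect of restricting the alignment search space, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' evaluation code. The embeddings, entity names, and the `hits_at_1` helper are hypothetical, and cosine similarity with a Hits@1 score is assumed here as one common way to compare embeddings. The "restricted" candidate set is assumed to contain only the target entities that appear in the gold pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hits_at_1(source_emb, target_emb, gold_pairs, candidates):
    """Fraction of gold pairs whose most similar candidate is the gold target."""
    hits = 0
    for src, gold_tgt in gold_pairs:
        best = max(candidates, key=lambda t: cosine(source_emb[src], target_emb[t]))
        hits += best == gold_tgt
    return hits / len(gold_pairs)


# Toy, made-up 2-D embeddings (illustrative only, not from the paper).
source_emb = {"a1": (1.0, 0.0), "a2": (0.0, 1.0)}
target_emb = {
    "b1": (0.9, 0.1),   # gold match for a1
    "b2": (0.2, 0.9),   # gold match for a2
    "b3": (0.05, 1.0),  # extra KG entity that is not part of any gold pair
}
gold_pairs = [("a1", "b1"), ("a2", "b2")]

# Restricted search space: only targets that occur in the gold pairs.
restricted = hits_at_1(source_emb, target_emb, gold_pairs, ["b1", "b2"])
# Full search space: every entity of the target KG is a candidate.
full = hits_at_1(source_emb, target_emb, gold_pairs, list(target_emb))

print(restricted, full)
```

In this toy setup the distractor `b3` lies closer to `a2` than the gold match `b2`, so the score drops once the full target KG is searched. The example only shows the mechanism and says nothing about the size of the decrease reported in the article.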
Classification scheme
Basic sciences / Analysis and research techniques [020]; Computer science [122]
Location
IRD collection [F B010095821]
IRD identifier
fdi:010095821