Publications by IRD scientists

Orozco-Arias S., Lopez-Murillo L. H., Candamil-Cortes M. S., Arias M., Jaimes P. A., Paschoal A. R., Tabares-Soto R., Isaza G., Guyot Romain. (2023). Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes. Briefings in Bioinformatics, p. [10 p.]. ISSN 1467-5463.

Document title
Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes
Year of publication
2023
Document type
Article indexed in the Web of Science WOS:000896757000001
Authors
Orozco-Arias S., Lopez-Murillo L. H., Candamil-Cortes M. S., Arias M., Jaimes P. A., Paschoal A. R., Tabares-Soto R., Isaza G., Guyot Romain
Source
Briefings in Bioinformatics, 2023, p. [10 p.] ISSN 1467-5463
LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involve a lot of manual work and do not reach satisfactory levels in terms of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
Classification scheme
Basic sciences / Analytical and research techniques [020]; Plant sciences [076]
Location
IRD collection [F B010086744]
IRD identifier
fdi:010086744