%0 Journal Article
%9 ACL: Articles in peer-reviewed journals indexed by AERES
%A Gschloessl, B.
%A Dorkeld, F.
%A Berges, H.
%A Beydon, G.
%A Bouchez, O.
%A Branco, M.
%A Bretaudeau, A.
%A Burban, C.
%A Dubois, E.
%A Gauthier, Philippe
%A Lhuillier, E.
%A Nichols, J.
%A Nidelet, S.
%A Rocha, S.
%A Saune, L.
%A Streiff, R.
%A Gautier, M.
%A Kerdelhue, C.
%T Draft genome and reference transcriptomic resources for the urticating pine defoliator Thaumetopoea pityocampa (Lepidoptera: Notodontidae)
%D 2018
%L fdi:010073027
%G ENG
%J Molecular Ecology Resources
%@ 1755-098X
%K BAC library ; de novo assembly ; gene prediction ; genome ; Lepidoptera ; transcriptome
%M ISI:000432662400018
%N 3
%P 602-619
%R 10.1111/1755-0998.12756
%U https://www.documentation.ird.fr/hor/fdi:010073027
%> https://www.documentation.ird.fr/intranet/publi/2018/06/010073027.pdf
%V 18
%W Horizon (IRD)
%X The pine processionary moth Thaumetopoea pityocampa (Lepidoptera: Notodontidae) is the main pine defoliator in the Mediterranean region. Its urticating larvae cause severe human and animal health concerns in the invaded areas. This species shows high phenotypic variability for various traits, such as phenology, fecundity and tolerance to extreme temperatures. This study presents the construction and analysis of extensive genomic and transcriptomic resources, which are a prerequisite for understanding the genetic architecture underlying these traits. Using a well-studied population from Portugal with peculiar phenological characteristics, the karyotype was first determined and a first draft genome of 537 Mb total length was assembled into 68,292 scaffolds (N50 = 164 kb). From this genome assembly, 29,415 coding genes were predicted. To circumvent some limitations for fine-scale physical mapping of genomic regions of interest, a 3X coverage BAC library was also developed. In particular, 11 BACs from this library were individually sequenced to assess the assembly quality. Additionally, de novo transcriptomic resources were generated from various developmental stages sequenced with HiSeq and MiSeq Illumina technologies. The reads were de novo assembled into 62,376 and 63,175 transcripts, respectively. Then, a robust subset of the genome-predicted coding genes, the de novo transcriptome assemblies and previously published 454/Sanger data were clustered to obtain a high-quality and comprehensive reference transcriptome consisting of 29,701 bona fide unigenes. These sequences covered 99% of the CEGMA and 88% of the BUSCO highly conserved eukaryotic genes and 84% of the BUSCO arthropod gene set. Moreover, 90% of these transcripts could be localized on the draft genome. The described information is available via a genome annotation portal (
%$ 080 ; 076 ; 020