Publications by IRD scientists

Xu L., Berti-Equille Laure, Cuesta-Infante A., Veeramachaneni K. (2023). In situ augmentation for defending against adversarial attacks on text classifiers. In : Tanveer M. (ed.), Agarwal S. (ed.), Ozawa S. (ed.), Ekbal A. (ed.), Jatowt A. (ed.). Neural Information Processing : 29th International Conference, ICONIP 2022, Virtual Event, November 22-26, 2022, Proceedings, Part III. Cham : Springer, 485-496. (Lecture Notes in Computer Science ; 13625). ICONIP : International Conference on Neural Information Processing, 29., [Online], 2022/11/22-26. ISBN 978-3-031-30110-0.

Document title
In situ augmentation for defending against adversarial attacks on text classifiers
Publication year
2023
Document type
Book chapter
Authors
Xu L., Berti-Equille Laure, Cuesta-Infante A., Veeramachaneni K.
In
Tanveer M. (ed.), Agarwal S. (ed.), Ozawa S. (ed.), Ekbal A. (ed.), Jatowt A. (ed.) Neural Information Processing : 29th International Conference, ICONIP 2022, Virtual Event, November 22-26, 2022, Proceedings, Part III
Source
Cham : Springer, 2023, 485-496 (Lecture Notes in Computer Science ; 13625). ISBN 978-3-031-30110-0
Conference
ICONIP : International Conference on Neural Information Processing, 29., [Online], 2022/11/22-26
In text classification, recent research shows that adversarial attack methods can generate sentences that dramatically decrease the classification accuracy of state-of-the-art neural text classifiers. However, very few defense methods have been proposed against these high-quality generated adversarial sentences. In this paper, we propose LMAg (Language-Model-based Augmentation using Gradient Guidance), an in situ data augmentation method that serves as a defense mechanism effective in two representative defense setups. Specifically, LMAg transforms the input text at test time: it uses the norm of the gradient to estimate the importance of each word to the classifier's prediction, then replaces the most important words with alternatives proposed by a masked language model. LMAg acts as an additional protection layer on top of the classifier that counteracts the perturbations made by adversarial attack methods, and can thus protect the classifier from adversarial attacks without additional training. Experimental results show that LMAg improves the after-attack accuracy of a BERT text classifier by 51.5% and 17.3% in the two setups, respectively.
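The abstract's mechanism (rank words by gradient norm, then let a masked language model propose substitutes) can be sketched in a few lines. The following is a minimal illustration only, assuming Hugging Face transformers; the checkpoint names, the fixed top-k budget, and the function lmag_transform are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of the test-time transformation the abstract describes.
# All names here are illustrative assumptions, not the paper's code.
import torch
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# In practice the classifier would be a fine-tuned checkpoint; the head of
# a bare "bert-base-uncased" classifier is randomly initialized.
clf = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def lmag_transform(text: str, top_k: int = 3) -> str:
    """Replace the words most influential on the classifier's prediction
    with alternatives proposed by a masked language model."""
    enc = tok(text, return_tensors="pt")
    # Gradient of the predicted-class logit w.r.t. the input embeddings.
    embeds = clf.get_input_embeddings()(enc["input_ids"]).detach()
    embeds.requires_grad_(True)
    logits = clf(inputs_embeds=embeds,
                 attention_mask=enc["attention_mask"]).logits
    logits[0, logits.argmax()].backward()
    # Word importance = L2 norm of the gradient at each token position.
    importance = embeds.grad.norm(dim=-1).squeeze(0)
    importance[0] = importance[-1] = 0.0      # never replace [CLS]/[SEP]
    ids = enc["input_ids"].clone()
    k = min(top_k, importance.numel() - 2)
    for pos in importance.topk(k).indices.tolist():
        masked = ids.clone()
        masked[0, pos] = tok.mask_token_id
        with torch.no_grad():
            mlm_logits = mlm(input_ids=masked,
                             attention_mask=enc["attention_mask"]).logits
        # Take the MLM's most likely fill-in (it may restore the original).
        ids[0, pos] = mlm_logits[0, pos].argmax()
    return tok.decode(ids[0], skip_special_tokens=True)

print(lmag_transform("the movie was terrrible and boring"))
```

Masking one position at a time keeps each proposed substitute conditioned on the rest of the sentence, which is what lets the language model "undo" an attacker's word-level perturbations without any retraining of the classifier.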
Classification
Computer science [122]
Location
IRD collection [F B010090432]
IRD identifier
fdi:010090432
Contact
  • Contact details:
    Mission Science Ouverte (MSO)
    IRD - Délégation régionale Île-de-France & Ouest
    Campus Condorcet - Hôtel à projets
    8 cours des Humanités - 93322 Aubervilliers Cedex