Publications des scientifiques de l'IRD

Majstorovic J., Giffard-Roisin Sophie, Poli P. (2022). Interpreting convolutional neural network decision for earthquake detection with feature map visualization, backward optimization and layer-wise relevance propagation methods. Geophysical Journal International, 232 (2), 923-939. ISSN 0956-540X.

Titre du document
Interpreting convolutional neural network decision for earthquake detection with feature map visualization, backward optimization and layer-wise relevance propagation methods
Année de publication
2022
Type de document
Article référencé dans le Web of Science WOS:000866173400007
Auteurs
Majstorovic J., Giffard-Roisin Sophie, Poli P.
Source
Geophysical Journal International, 2022, 232 (2), 923-939 ISSN 0956-540X
In the recent years, the seismological community has adopted deep learning (DL) models for many diverse tasks such as discrimination and classification of seismic events, identification of P- and S-phase wave arrivals or earthquake early warning systems. Numerous models recently developed are showing high accuracy values, and it has been attested for several tasks that DL models perform better than the classical seismological state-of-art models. However, their performances strongly depend on the DL architecture, the training hyperparameters, and the training data sets. Moreover, due to their complex nature, we are unable to understand how the model is learning and therefore how it is making a prediction. Thus, DL models are usually referred to as a 'black-box'. In this study, we propose to apply three complementary techniques to address the interpretability of a convolutional neural network (CNN) model for the earthquake detection. The implemented techniques are: feature map visualization, backward optimization and layer-wise relevance propagation. Since our model reaches a good accuracy performance (97%), we can suppose that the CNN detector model extracts relevant characteristics from the data, however a question remains: can we identify these characteristics? The proposed techniques help to answer the following questions: How is an earthquake processed by a CNN model? What is the optimal earthquake signal according to a CNN? Which parts of the earthquake signal are more relevant for the model to correctly classify an earthquake sample? The answer to these questions help understand why the model works and where it might fail, and whether the model is designed well for the predefined task. The CNN used in this study had been trained for single-station detection, where an input sample is a 25 s three-component waveform. The model outputs a binary target: earthquake (positive) or noise (negative) class. The training database contains a balanced number of samples from both classes. Our results shows that the CNN model correctly learned to recognize where is the earthquake within the sample window, even though the position of the earthquake in the window is not explicitly given during the training. Moreover, we give insights on how a neural network builds its decision process: while some aspects can be linked to clear physical characteristics, such as the frequency content and the P and S waves, we also see how different a DL detection is compared to a visual expertise or an STA/LTA detection. On top of improving our model designs, we also think that understanding how such models work, how they perceive an earthquake, can be useful for the comprehension of events that are not fully understood yet such as tremors or low frequency earthquakes.
Plan de classement
Sciences fondamentales / Techniques d'analyse et de recherche [020] ; Géologie et formations superficielles [064] ; Géophysique interne [066]
Description Géographique
ITALIE
Localisation
Fonds IRD [F B010086352]
Identifiant IRD
fdi:010086352
Contact