Don’t Learn What You Already Know: Scheme-Aware Modeling for Profiling Side-Channel Analysis against Masking
Over the past few years, deep-learning-based attacks have emerged as a de facto standard, thanks to their ability to break implementations of cryptographic primitives without pre-processing, even against widely used counter-measures such as hiding and masking. However, the recent works of Bronchain and Standaert at Tches 2020 questioned the soundness of such tools if used in an uninformed setting to evaluate implementations protected with higher-order masking. On the opposite, worst-case evaluations may be seen as possibly far from what a real-world adversary could do, thereby leading to too conservative security bounds. In this paper, we propose a new threat model that we name scheme-aware benefiting from a trade-off between uninformed and worst-case models. Our scheme-aware model is closer to a real-world adversary, in the sense that it does not need to have access to the random nonces used by masking during the profiling phase like in a worst-case model, while it does not need to learn the masking scheme as implicitly done by an uninformed adversary. We show how to combine the power of deep learning with the prior knowledge of scheme-aware modeling. As a result, we show on simulations and experiments on public datasets how it sometimes allows to reduce by an order of magnitude the profiling complexity, i.e., the number of profiling traces needed to satisfyingly train a model, compared to a fully uninformed adversary.
A Comprehensive Study of Deep Learning for Side-Channel Analysis 📺
Recently, several studies have been published on the application of deep learning to enhance Side-Channel Attacks (SCA). These seminal works have practically validated the soundness of the approach, especially against implementations protected by masking or by jittering. Concurrently, important open issues have emerged. Among them, the relevance of machine (and thereby deep) learning based SCA has been questioned in several papers based on the lack of relation between the accuracy, a typical performance metric used in machine learning, and common SCA metrics like the Guessing entropy or the key-discrimination success rate. Also, the impact of the classical side-channel counter-measures on the efficiency of deep learning has been questioned, in particular by the semi-conductor industry. Both questions enlighten the importance of studying the theoretical soundness of deep learning in the context of side-channel and of developing means to quantify its efficiency, especially with respect to the optimality bounds published so far in the literature for side-channel leakage exploitation. The first main contribution of this paper directly concerns the latter point. It is indeed proved that minimizing the Negative Log Likelihood (NLL for short) loss function during the training of deep neural networks is actually asymptotically equivalent to maximizing the Perceived Information introduced by Renauld et al. at EUROCRYPT 2011 as a lower bound of the Mutual Information between the leakage and the target secret. Hence, such a training can be considered as an efficient and effective estimation of the PI, and thereby of the MI (known to be complex to accurately estimate in the context of secure implementations). As a second direct consequence of our main contribution, it is argued that, in a side-channel exploitation context, choosing the NLL loss function to drive the training is sound from an information theory point of view. As a third contribution, classical counter-measures like Boolean masking or execution flow shuffling, initially dedicated to classical SCA, are proved to stay sound against deep Learning based attacks.