Effective and Efficient Masking with Low Noise using Small-Mersenne-Prime Ciphers
Embedded devices used in security applications are natural targets for physical attacks. Thus, enhancing their side-channel resistance is an important research challenge. A standard solution for this purpose is the use of Boolean masking schemes, as they are well adapted to current block ciphers with efficient bitslice representations. Boolean masking guarantees that the security of an implementation grows exponentially in the number of shares under the assumption that leakages are sufficiently noisy (and independent). Unfortunately, it has been shown that this noise assumption is hardly met on low-end devices. In this paper, we therefore investigate techniques to mask cryptographic algorithms in such a way that their resistance can survive an almost complete lack of noise. Building on seed theoretical results of Dziembowski et al., we put forward that arithmetic encodings in prime fields can reach this goal. We first exhibit the gains that such encodings lead to thanks to a simulated information theoretic analysis of their leakage (with up to six shares). We then provide figures showing that on platforms where optimized arithmetic adders and multipliers are readily available (i.e., most MCUs and FPGAs), performing masked operations in small to medium Mersenne-prime fields as opposed to binary extension fields will not lead to notable implementation overheads. We compile these observations into a new AES-like block cipher, called AES-prime, which is well-suited to illustrate the remarkable advantages of masking in prime fields. We also confirm the practical relevance of our findings by evaluating concrete software (ARM Cortex-M3) and hardware (Xilinx Spartan-6) implementations. Our experimental results show that security gains over Boolean masking (and, more generally, binary encodings) can reach orders of magnitude despite the same amount of information being leaked per share.
Prime-Field Masking in Hardware and its Soundness against Low-Noise SCA Attacks
A recent study suggests that arithmetic masking in prime fields leads to stronger security guarantees against passive physical adversaries than Boolean masking. Indeed, it is a common observation that the desired security amplification of Boolean masking collapses when the noise level in the measurements is too low. Arithmetic encodings in prime fields can help to maintain an exponential increase of the attack complexity in the number of shares even in such a challenging context. In this work, we contribute to this emerging topic in two main directions. First, we propose novel masked hardware gadgets for secure squaring in prime fields (since squaring is non-linear in non-binary fields) which prove to be significantly more resource-friendly than corresponding masked multiplications. We then formally show their local and compositional security for arbitrary orders. Second, we attempt to >experimentally evaluate the performance vs. security tradeoff of prime-field masking. In order to enable a first comparative case study in this regard, we exemplarily consider masked implementations of the AES as well as the recently proposed AESprime. AES-prime is a block cipher partially resembling the standard AES, but based on arithmetic operations modulo a small Mersenne prime. We present cost and performance figures for masked AES and AES-prime implementations, and experimentally evaluate their susceptibility to low-noise side-channel attacks. We consider both the dynamic and the static power consumption for our low-noise analyses and emulate strong adversaries. Static power attacks are indeed known as a threat for side-channel countermeasures that require a certain noise level to be effective because of the adversary’s ability to reduce the noise through intra-trace averaging. Our results show consistently that for the noise levels in our practical experiments, the masked prime-field implementations provide much higher security for the same number of shares. This compensates for the overheads prime computations lead to and remains true even if / despite leaking each share with a similar Signal-to-Noise Ratio (SNR) as their binary equivalents. We hope our results open the way towards new cipher designs tailored to best exploit the advantages of prime-field masking.
Prouﬀ & Rivain’s Formal Security Proof of Masking, Revisited: Tight Bounds in the Noisy Leakage Model
Masking is a counter-measure that can be incorporated to software and hardware implementations of block ciphers to provably se- cure them against side-channel attacks. The security of masking can be proven in diﬀerent types of threat models. In this paper, we are interested in directly proving the security in the most realistic threat model, the so-called noisy leakage adversary, that captures well how real-world side- channel adversaries operate. Direct proofs in this leakage model have been established by Prouﬀ & Rivain at Eurocrypt 2013, Dziembowski et al. at Eurocrypt 2015, and Prest et al. at Crypto 2019. Both proofs are complementary to each other, in the sense that the weaknesses of one proof are ﬁxed in at least one of the others, and conversely. These weak- nesses concerned in particular the strong requirements on the noise level and the security parameter to get meaningful security bounds, and some requirements on the type of adversary covered by the proof — i.e., cho- sen or random plaintexts. This suggested that the drawbacks of each security bound could actually be proof artifacts. In this paper, we solve these issues, by revisiting Prouﬀ & Rivain’s approach.
Information Bounds and Convergence Rates for Side-Channel Security Evaluators
Current side-channel evaluation methodologies exhibit a gap between inefficient tools offering strong theoretical guarantees and efficient tools only offering heuristic (sometimes case-specific) guarantees. Profiled attacks based on the empirical leakage distribution correspond to the first category. Bronchain et al. showed at Crypto 2019 that they allow bounding the worst-case security level of an implementation, but the bounds become loose as the leakage dimensionality increases. Template attacks and machine learning models are examples of the second category. In view of the increasing popularity of such parametric tools in the literature, a natural question is whether the information they can extract can be bounded.In this paper, we first show that a metric conjectured to be useful for this purpose, the hypothetical information, does not offer such a general bound. It only does when the assumptions exploited by a parametric model match the true leakage distribution. We therefore introduce a new metric, the training information, that provides the guarantees that were conjectured for the hypothetical information for practically-relevant models. We next initiate a study of the convergence rates of profiled side-channel distinguishers which clarifies, to the best of our knowledge for the first time, the parameters that influence the complexity of a profiling. On the one hand, the latter has practical consequences for evaluators as it can guide them in choosing the appropriate modeling tool depending on the implementation (e.g., protected or not) and contexts (e.g., granting them access to the countermeasures’ randomness or not). It also allows anticipating the amount of measurements needed to guarantee a sufficient model quality. On the other hand, our results connect and exhibit differences between side-channel analysis and statistical learning theory.
Don’t Learn What You Already Know: Scheme-Aware Modeling for Profiling Side-Channel Analysis against Masking
Over the past few years, deep-learning-based attacks have emerged as a de facto standard, thanks to their ability to break implementations of cryptographic primitives without pre-processing, even against widely used counter-measures such as hiding and masking. However, the recent works of Bronchain and Standaert at Tches 2020 questioned the soundness of such tools if used in an uninformed setting to evaluate implementations protected with higher-order masking. On the opposite, worst-case evaluations may be seen as possibly far from what a real-world adversary could do, thereby leading to too conservative security bounds. In this paper, we propose a new threat model that we name scheme-aware benefiting from a trade-off between uninformed and worst-case models. Our scheme-aware model is closer to a real-world adversary, in the sense that it does not need to have access to the random nonces used by masking during the profiling phase like in a worst-case model, while it does not need to learn the masking scheme as implicitly done by an uninformed adversary. We show how to combine the power of deep learning with the prior knowledge of scheme-aware modeling. As a result, we show on simulations and experiments on public datasets how it sometimes allows to reduce by an order of magnitude the profiling complexity, i.e., the number of profiling traces needed to satisfyingly train a model, compared to a fully uninformed adversary.
A Comprehensive Study of Deep Learning for Side-Channel Analysis 📺
Recently, several studies have been published on the application of deep learning to enhance Side-Channel Attacks (SCA). These seminal works have practically validated the soundness of the approach, especially against implementations protected by masking or by jittering. Concurrently, important open issues have emerged. Among them, the relevance of machine (and thereby deep) learning based SCA has been questioned in several papers based on the lack of relation between the accuracy, a typical performance metric used in machine learning, and common SCA metrics like the Guessing entropy or the key-discrimination success rate. Also, the impact of the classical side-channel counter-measures on the efficiency of deep learning has been questioned, in particular by the semi-conductor industry. Both questions enlighten the importance of studying the theoretical soundness of deep learning in the context of side-channel and of developing means to quantify its efficiency, especially with respect to the optimality bounds published so far in the literature for side-channel leakage exploitation. The first main contribution of this paper directly concerns the latter point. It is indeed proved that minimizing the Negative Log Likelihood (NLL for short) loss function during the training of deep neural networks is actually asymptotically equivalent to maximizing the Perceived Information introduced by Renauld et al. at EUROCRYPT 2011 as a lower bound of the Mutual Information between the leakage and the target secret. Hence, such a training can be considered as an efficient and effective estimation of the PI, and thereby of the MI (known to be complex to accurately estimate in the context of secure implementations). As a second direct consequence of our main contribution, it is argued that, in a side-channel exploitation context, choosing the NLL loss function to drive the training is sound from an information theory point of view. As a third contribution, classical counter-measures like Boolean masking or execution flow shuffling, initially dedicated to classical SCA, are proved to stay sound against deep Learning based attacks.