Gaëtan Cassiers

CryptoDB

Publications and invited talks

Year

Venue

Title

2025

CIC

On Loopy Belief Propagation for SASCAs

Rishub Nagpal Gaëtan Cassiers Robert Primas Christian Knoll Franz Pernkopf Stefan Mangard

Profiled power analysis is one of the most powerful forms of passive side-channel attacks. Over the last two decades, many works have analyzed their impact on cryptographic implementations as well as corresponding countermeasure techniques. To date, the most advanced variants of profiled power analysis are based on Soft-analytical Side-Channel Attacks (SASCA). After the initial profiling phase, a SASCA adversary creates a probabilistic graphical model, called a factor graph, of the target implementation and encodes the results of the previous step as prior information. Then, an inference algorithm such as loopy Belief Propagation (BP) can be used to recover the distribution of a target variable in the graph, i.e., sensitive data/keys. Designers of cryptographic implementations aim to reduce information leakage as much as possible and assess how much leakage can be allowed without compromising security requirements. Despite the existence of many works on profiled power analysis, it is still notoriously difficult to state under which conditions a cryptographic implementation provides sufficient protection against a profiling attacker with certain capabilities. In particular, it is unknown when a BP-based attack is optimal or whether tuning some heuristics in that algorithm may significantly strengthen the attack. This knowledge gap led us to investigate the effectiveness of BP for SASCAs by studying the failure modes of BP in the context of SASCA, and systematically analyzing the behavior of BP on practically-relevant factor graphs. We use exact inference to gauge the quality of the approximation provided by BP. Through this assessment, we show that there exists a significant disparity between BP and exact inference in terms of guessing entropy when performing SASCAs on several classes of factor graphs. We further review and analyze various BP improvement heuristics from the literature.

2024

CIC

Randomness Generation for Secure Hardware Masking – Unrolled Trivium to the Rescue

Gaëtan Cassiers Loïc Masure Charles Momin Thorben Moos Amir Moradi François-Xavier Standaert

Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency vs. security tradeoff of masked implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform most competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a higher rate per cycle for the same or lower cost as Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. 
Circumventing these problems, and enabling an independent analysis of randomness generation and masking, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating $n$ fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as $20n$ to $30n$ ASIC gate equivalents (GE) or $3n$ to $4n$ FPGA look-up tables (LUTs), where $n$ is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable. This provides further motivation to potentially move low randomness usage from a primary to a secondary design goal in hardware masking research.
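The remark that LFSRs "generate only linear outputs" can be illustrated with a minimal sketch. The register width, tap positions and seeds below are illustrative assumptions and are not taken from the paper. The sketch only shows that every output bit is an XOR-linear function of the seed, which is the structural property that allows randomness from the same LFSR to cancel inside a masked circuit.

```python
def lfsr_bits(seed, taps, width, count):
    """Return `count` output bits of a Fibonacci LFSR.

    `seed` is the initial `width`-bit state, `taps` lists the state bit
    positions XORed together to form the feedback bit. All parameters
    are hypothetical and chosen for illustration only.
    """
    state = seed
    out = []
    for _ in range(count):
        out.append(state & 1)            # output the lowest state bit
        feedback = 0
        for t in taps:                   # feedback = XOR of tapped bits
            feedback ^= (state >> t) & 1
        # shift right by one and insert the feedback bit at the top
        state = (state >> 1) | (feedback << (width - 1))
    return out


def xor_streams(a, b):
    """Bitwise XOR of two equally long bit lists."""
    return [x ^ y for x, y in zip(a, b)]


# Illustrative parameters (assumptions, not from the paper).
WIDTH = 16
TAPS = (0, 2, 3, 5)
SEED_A = 0xACE1
SEED_B = 0x1234

stream_a = lfsr_bits(SEED_A, TAPS, WIDTH, 64)
stream_b = lfsr_bits(SEED_B, TAPS, WIDTH, 64)
stream_ab = lfsr_bits(SEED_A ^ SEED_B, TAPS, WIDTH, 64)

# Linearity over GF(2): the stream of the XORed seeds equals the XOR of
# the two streams, so outputs carry no non-linear mixing of the state.
linear = stream_ab == xor_streams(stream_a, stream_b)
```

Because the relation holds for any seeds and any tap set, it is a property of the LFSR structure itself rather than of the parameters chosen here.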

2024

TCHES

Compress: Generate Small and Fast Masked Pipelined Circuits

Gaëtan Cassiers Barbara Gigerl Stefan Mangard Charles Momin Rishub Nagpal

Masking is an effective countermeasure against side-channel attacks. It replaces every logic gate in a computation by a gadget that performs the operation over secret sharings of the circuit’s variables. When masking is implemented in hardware, care should be taken to protect against leakage from glitches, which could otherwise undermine the security of masking. This is generally done by adding registers, which stop the propagation of glitches, but introduce additional latency and area cost. In masked pipelined circuits, a high latency further increases the area overheads of masking, due to the need for additional registers that synchronize signals between pipeline stages. In this work, we propose a technique to minimize the number of such pipeline registers, which relies on optimizing the scheduling of the computations across the pipeline stages. We release an implementation of this technique as an open-source tool, Compress. Further, we introduce other optimizations to deduplicate logic between gadgets, perform an optimal selection of masked gadgets, and introduce new gadgets with smaller area. Overall, our optimizations lead to circuits that improve on the state of the art in area and achieve state-of-the-art latency. For example, a masked AES based on an S-box generated by Compress reduces latency by 19% and area by 27% over a state-of-the-art implementation, or, for the same latency, reduces area by 45%.
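As a rough illustration of what "a gadget that performs the operation over secret sharings" means, the following sketch implements Boolean sharing with a masked XOR and a masked AND in the style of the classical Ishai-Sahai-Wagner multiplication. It is a software model under simplifying assumptions: it does not model glitches, registers or pipelining, and it is not one of the gadgets generated by Compress. All function names are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor


def share(bit, n_shares):
    """Split `bit` into `n_shares` Boolean shares whose XOR equals `bit`."""
    shares = [secrets.randbits(1) for _ in range(n_shares - 1)]
    shares.append(bit ^ reduce(xor, shares, 0))
    return shares


def unshare(shares):
    """Recombine shares by XORing them together."""
    return reduce(xor, shares, 0)


def masked_xor(a, b):
    """Linear gadget: XOR is applied share by share."""
    return [x ^ y for x, y in zip(a, b)]


def masked_and(a, b):
    """ISW-style AND gadget (assumed textbook variant, software model).

    Each cross product a[i] & b[j] is blinded with a fresh random bit r
    that is added to one output share and removed from another.
    """
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(1)
            c[i] ^= r
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c
```

For example, `unshare(masked_and(share(1, 3), share(1, 3)))` recombines to `1`, while no single share of the inputs or the output reveals the underlying bit. The sketch uses n(n-1)/2 fresh random bits per AND with n shares.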

2024

TCHES

Low-Latency Masked Gadgets Robust against Physical Defaults with Application to Ascon

Gaëtan Cassiers François-Xavier Standaert Corentin Verhamme

Low-latency masked hardware implementations are known to be a difficult challenge. On the one hand, the propagation of glitches can falsify their independence assumption (that is required for security) and can only be stopped by registers. This implies that glitch-robust masked AND gates (maintaining a constant number of shares) require at least one cycle. On the other hand, Knichel and Moradi’s only known single-cycle multiplication gadget that ensures (composable) security against glitches for any number of shares requires additional care to maintain security against transition-based leakages. For example, it cannot be integrated in a single-cycle round-based architecture, which is a natural choice for low-latency implementations. In this paper, we therefore describe the first single-cycle masked multiplication gadget that is trivially composable and provides security against transitions and glitches, and prove its security in the robust probing model. We then analyze the interest of this new gadget for the secure implementation of the future lightweight cryptography standard Ascon, which has good potential for low-latency. We show that it directly leads to improvements for uniformly protected implementations (where all computations are masked). We also show that it can be handy for integration in so-called leveled implementations (where only the key derivation and the tag generation are masked, which provides integrity with leakage in encryption and decryption and confidentiality with leakage in encryption only). Most importantly, we show that it is very attractive for implementations that we denote as multi-target, which can alternate between uniformly protected and leveled implementations, without latency overheads and at limited cost. We complete these findings by evaluating different protected implementations of Ascon, clarifying its hardware design space.

2023

TCHES

Prime-Field Masking in Hardware and its Soundness against Low-Noise SCA Attacks

Gaëtan Cassiers Loïc Masure Charles Momin Thorben Moos François-Xavier Standaert

A recent study suggests that arithmetic masking in prime fields leads to stronger security guarantees against passive physical adversaries than Boolean masking. Indeed, it is a common observation that the desired security amplification of Boolean masking collapses when the noise level in the measurements is too low. Arithmetic encodings in prime fields can help to maintain an exponential increase of the attack complexity in the number of shares even in such a challenging context. In this work, we contribute to this emerging topic in two main directions. First, we propose novel masked hardware gadgets for secure squaring in prime fields (since squaring is non-linear in non-binary fields) which prove to be significantly more resource-friendly than corresponding masked multiplications. We then formally show their local and compositional security for arbitrary orders. Second, we attempt to experimentally evaluate the performance vs. security tradeoff of prime-field masking. In order to enable a first comparative case study in this regard, we exemplarily consider masked implementations of the AES as well as the recently proposed AES-prime. AES-prime is a block cipher partially resembling the standard AES, but based on arithmetic operations modulo a small Mersenne prime. We present cost and performance figures for masked AES and AES-prime implementations, and experimentally evaluate their susceptibility to low-noise side-channel attacks. We consider both the dynamic and the static power consumption for our low-noise analyses and emulate strong adversaries. Static power attacks are indeed known as a threat for side-channel countermeasures that require a certain noise level to be effective because of the adversary’s ability to reduce the noise through intra-trace averaging. Our results show consistently that for the noise levels in our practical experiments, the masked prime-field implementations provide much higher security for the same number of shares.
This compensates for the overheads prime computations lead to and remains true even if / despite leaking each share with a similar Signal-to-Noise Ratio (SNR) as their binary equivalents. We hope our results open the way towards new cipher designs tailored to best exploit the advantages of prime-field masking.

2023

CRYPTO

Unifying Freedom and Separation for Tight Probing-Secure Composition

Sonia Belaïd Gaëtan Cassiers Matthieu Rivain Abdul Rahman Taleb

The masking countermeasure is often analyzed in the probing model. Proving the probing security of large circuits at high masking orders is achieved by composing gadgets that satisfy security definitions such as non-interference (NI), strong non-interference (SNI) or free SNI. The region probing model is a variant of the probing model, where the probing capabilities of the adversary scale with the number of regions in a masked circuit. This model is of interest as it allows better reductions to the more realistic noisy leakage model. The efficiency of composable region probing secure masking has been recently improved with the introduction of the input-output separation (IOS) definition. In this paper, we first establish equivalences between the non-interference framework and the IOS formalism. We also generalize the security definitions to multiple-input gadgets and systematically show implications and separations between these notions. Then, we study which gadgets from the literature satisfy these. We give new security proofs for some well-known arbitrary-order gadgets, and also some automated proofs for fixed-order, special-case gadgets. To this end, we introduce a new automated formal verification algorithm that solves the open problem of verifying free SNI, which is not a purely simulation-based definition. Using the relationships between the security notions, we adapt this algorithm to further verify IOS. Finally, we look at composition theorems. In the probing model, we use the link between free SNI and the IOS formalism to generalize and improve the efficiency of the tight private circuit (Asiacrypt 2018) construction, also fixing a flaw in the original proof. In the region probing model, we relax the assumptions for IOS composition (TCHES 2021), which saves many refresh gadgets and hence improves efficiency.

2023

TCHES

Efficient Regression-Based Linear Discriminant Analysis for Side-Channel Security Evaluations: Towards Analytical Attacks against 32-bit Implementations

Gaëtan Cassiers Henri Devillez François-Xavier Standaert Balazs Udvarhelyi

32-bit software implementations are becoming increasingly popular for embedded security applications. As a result, profiling 32-bit target intermediate values is increasingly needed to evaluate their side-channel security. This implies the need for statistical tools that can deal with long traces and a large number of classes. While there are good options to solve these issues separately (e.g., linear regression and linear discriminant analysis), the current state of the art lacks efficient tools to solve them jointly. To the best of our knowledge, the best-known option is to fragment the profiling in smaller parts, which is suboptimal from the information theoretic viewpoint. In this paper, we therefore revisit regression-based linear discriminant analysis, which combines linear regression and linear discriminant analysis, and improve its efficiency so that it can be used for profiling long traces corresponding to 32-bit implementations. Besides introducing the optimizations needed for this purpose, we show how to use regression-based linear discriminant analysis in order to obtain efficient bounds for the perceived information, an information theoretic metric characterizing the security of an implementation against profiled attacks. We also combine this tool with optimizations of soft analytical side-channel attacks that apply to bitslice implementations. We use these results to attack a 32-bit implementation of SAP instantiated with Ascon’s permutation, and show that breaking the initialization of its re-keying in one trace is feasible for determined adversaries.

2023

TCHES

Kavach: Lightweight masking techniques for polynomial arithmetic in lattice-based cryptography

Aikata Aikata Andrea Basso Gaëtan Cassiers Ahmet Can Mert Sujoy Sinha Roy

Lattice-based cryptography has laid the foundation of various modern-day cryptosystems that cater to several applications, including post-quantum cryptography. For structured lattice-based schemes, polynomial arithmetic is a fundamental part. In several instances, the performance optimizations come from implementing compact multipliers due to the small range of the secret polynomial coefficients. However, this optimization does not easily translate to side-channel protected implementations since masking requires secret polynomial coefficients to be distributed over a large range. In this work, we address this problem and propose two novel generalized techniques, one for the number theoretic transform (NTT) based and another for the non-NTT-based polynomial arithmetic. Both these proposals enable masked polynomial multiplication while utilizing and retaining the small secret property. For demonstration, we used the proposed technique and instantiated masked multipliers for schoolbook as well as NTT-based polynomial multiplication. Both of these can utilize the compact multipliers used in the unmasked implementations. The schoolbook multiplication requires an extra polynomial accumulation along with the two polynomial multiplications for a first-order protected implementation. However, this cost is nothing compared to the area saved by utilizing the existing cheap multiplication units. We also extensively test the side-channel resistance of the proposed design through TVLA to guarantee its first-order security.

2023

TCHES

Information Bounds and Convergence Rates for Side-Channel Security Evaluators

Loïc Masure Gaëtan Cassiers Julien Hendrickx François-Xavier Standaert

Current side-channel evaluation methodologies exhibit a gap between inefficient tools offering strong theoretical guarantees and efficient tools only offering heuristic (sometimes case-specific) guarantees. Profiled attacks based on the empirical leakage distribution correspond to the first category. Bronchain et al. showed at Crypto 2019 that they allow bounding the worst-case security level of an implementation, but the bounds become loose as the leakage dimensionality increases. Template attacks and machine learning models are examples of the second category. In view of the increasing popularity of such parametric tools in the literature, a natural question is whether the information they can extract can be bounded. In this paper, we first show that a metric conjectured to be useful for this purpose, the hypothetical information, does not offer such a general bound. It only does when the assumptions exploited by a parametric model match the true leakage distribution. We therefore introduce a new metric, the training information, that provides the guarantees that were conjectured for the hypothetical information for practically-relevant models. We next initiate a study of the convergence rates of profiled side-channel distinguishers which clarifies, to the best of our knowledge for the first time, the parameters that influence the complexity of a profiling. On the one hand, the latter has practical consequences for evaluators as it can guide them in choosing the appropriate modeling tool depending on the implementation (e.g., protected or not) and contexts (e.g., granting them access to the countermeasures’ randomness or not). It also allows anticipating the amount of measurements needed to guarantee a sufficient model quality. On the other hand, our results connect and exhibit differences between side-channel analysis and statistical learning theory.

2023

TCHES

Protecting Dilithium against Leakage: Revisited Sensitivity Analysis and Improved Implementations

Melissa Azouaoui Olivier Bronchain Gaëtan Cassiers Clément Hoffmann Yulia Kuzovkova Joost Renes Tobias Schneider Markus Schönauer François-Xavier Standaert Christine van Vredendaal

CRYSTALS-Dilithium has been selected by the NIST as the new standard for post-quantum digital signatures. In this work, we revisit the side-channel countermeasures of Dilithium in three directions. First, we improve its sensitivity analysis by classifying intermediate computations according to their physical security requirements. Second, we provide improved gadgets dedicated to Dilithium, taking advantage of recent advances in masking conversion algorithms. Third, we combine these contributions and report performance for side-channel protected Dilithium implementations. Our benchmarking results additionally put forward that the randomized version of Dilithium can lead to significantly more efficient implementations (than its deterministic version) when side-channel attacks are a concern.

2023

TCHES

Quantile: Quantifying Information Leakage

Vedad Hadžić Gaëtan Cassiers Robert Primas Stefan Mangard Roderick Bloem

The masking countermeasure is very effective against side-channel attacks such as differential power analysis. However, the design of masked circuits is a challenging problem since one has to ensure security while minimizing performance overheads. The security of masking is often studied in the t-probing model, and multiple formal verification tools can verify this notion. However, these tools generally cannot verify large masked computations due to computational complexity. We introduce a new verification tool named Quantile, which performs randomized simulations of the masked circuit in order to bound the mutual information between the leakage and the secret variables. Our approach ensures good scalability with the circuit size and results in proven statistical security bounds. Further, our bounds are quantitative and, therefore, more nuanced than t-probing security claims: by bounding the amount of information contained in the lower-order leakage, Quantile can evaluate the security provided by masked circuits even when they are not 1-probing secure, i.e., when they are classically considered as insecure. As an example, we apply Quantile to masked circuits of Prince and AES, where randomness is aggressively reused.

2022

TCHES

Triplex: an Efficient and One-Pass Leakage-Resistant Mode of Operation

Yaobin Shen Thomas Peters François-Xavier Standaert Gaëtan Cassiers Corentin Verhamme

This paper introduces and analyzes Triplex, a leakage-resistant mode of operation based on Tweakable Block Ciphers (TBCs) with 2n-bit tweaks. Triplex enjoys beyond-birthday ciphertext integrity in the presence of encryption and decryption leakage in a liberal model where all intermediate computations are leaked in full and only two TBC calls operating a long-term secret are protected with implementation-level countermeasures. It provides beyond-birthday confidentiality guarantees without leakage, and standard confidentiality guarantees with leakage for a single-pass mode embedding a re-keying process for the bulk of its computations (i.e., birthday confidentiality with encryption leakage under a bounded leakage assumption). Triplex improves leakage-resistant modes of operation relying on TBCs with n-bit tweaks when instantiated with large-tweak TBCs like Deoxys-TBC (a CAESAR competition laureate) or Skinny (used by the Romulus finalist of the NIST lightweight crypto competition). Its security guarantees are maintained in the multi-user setting.

2022

TCHES

Bitslicing Arithmetic/Boolean Masking Conversions for Fun and Profit: with Application to Lattice-Based KEMs

Olivier Bronchain Gaëtan Cassiers

The performance of higher-order masked implementations of lattice-based key encapsulation mechanisms (KEMs) is currently limited by the costly conversions between arithmetic and Boolean masking. While bitslicing has been shown to strongly speed up masked implementations of symmetric primitives, its use in arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets has never been thoroughly investigated. In this paper, we first show that bitslicing can indeed accelerate existing conversion gadgets. We then optimize these gadgets, exploiting the degrees of freedom offered by bitsliced implementations. As a result, we introduce new arbitrary-order Boolean masked addition, arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets, each in two variants: modulo $2^k$ and modulo $p$ (for any integers $k$ and $p$). Practically, our new gadgets achieve a speedup of up to 25x over the state of the art. Turning to the KEM application, we develop the first open-source embedded (Cortex-M4) implementations of Kyber768 and Saber masked at arbitrary order. The implementations based on the new bitsliced gadgets achieve a speedup of 1.8x for Kyber and 3x for Saber, compared to the implementation based on state-of-the-art gadgets. The bottleneck of the bitslice implementations is the masked Keccak-f[1600] permutation.

2021

TCHES

Provably Secure Hardware Masking in the Transition- and Glitch-Robust Probing Model: Better Safe than Sorry

Gaëtan Cassiers François-Xavier Standaert

There exist many masking schemes to protect implementations of cryptographic operations against side-channel attacks. It is common practice to analyze the security of these schemes in the probing model, or its variant which takes into account physical effects such as glitches and transitions. Although both effects exist in practice and cause leakage, masking schemes implemented in hardware are often only analyzed for security against glitches. In this work, we fill this gap by proving sufficient conditions for the security of hardware masking schemes against transitions, leading to the design of new masking schemes and a proof of security for an existing masking scheme in the presence of transitions. Furthermore, we give similar results in the stronger model where the effects of glitches and transitions are combined.

2021

CRYPTO

Towards Tight Random Probing Security

Gaëtan Cassiers Sebastian Faust Maximilian Orlt François-Xavier Standaert

Proving the security of masked implementations in theoretical models that are relevant to practice and match the best known attacks of the side-channel literature is a notoriously hard problem. The random probing model is a good candidate to contribute to this challenge, due to its ability to capture the continuous nature of physical leakage (contrary to the threshold probing model), while also being convenient to manipulate in proofs and to automate with verification tools. Yet, despite recent progress in the design of masked circuits with good asymptotic security guarantees in this model, existing results still fall short when it comes to analyzing the security of concretely useful circuits under realistic noise levels and with a low number of shares. In this paper, we contribute to this issue by introducing a new composability notion, the Probe Distribution Table (PDT), and a new tool (called STRAPS, for the Sampled Testing of the RAndom Probing Security). Their combination allows us to significantly improve the tightness of existing analyses in the most practical (low noise, low number of shares) region of the design space. We illustrate these improvements by quantifying the random probing security of an AES S-box circuit, masked with the popular multiplication gadget of Ishai, Sahai and Wagner from Crypto 2003, with up to six shares.

2020

TCHES

Efficient and Private Computations with Code-Based Masking

Weijia Wang Pierrick Méaux Gaëtan Cassiers François-Xavier Standaert

Code-based masking is a very general type of masking scheme that covers Boolean masking, inner product masking, direct sum masking, and so on. The merits of the generalization are twofold. Firstly, the higher algebraic complexity of the sharing function decreases the information leakage in “low noise conditions” and may increase the “statistical security order” of an implementation (with linear leakages). Secondly, the underlying error-correction codes can offer improved fault resistance for the encoded variables. Nevertheless, this higher algebraic complexity also implies additional challenges. On the one hand, a generic multiplication algorithm applicable to any linear code is still unknown. On the other hand, masking schemes with higher algebraic complexity usually come with implementation overheads, as for example witnessed by inner-product masking. In this paper, we contribute to these challenges in two directions. Firstly, we propose a generic algorithm that allows us (to the best of our knowledge for the first time) to compute on data shared with linear codes. Secondly, we introduce a new amortization technique that can significantly mitigate the implementation overheads of code-based masking, and illustrate this claim with a case study. Precisely, we show that, although performing every single code-based masked operation is relatively complex, processing multiple secrets in parallel leads to much better performances. This property enables code-based masked implementations of the AES to compete with the state-of-the-art in randomness complexity. Since our masked operations can be instantiated with various linear codes, we hope that these investigations open new avenues for the study of code-based masking schemes, by specializing the codes for improved performances, better side-channel security or improved fault tolerance.

2020

TOSC

Spook: Sponge-Based Leakage-Resistant Authenticated Encryption with a Masked Tweakable Block Cipher

Davide Bellizia Francesco Berti Olivier Bronchain Gaëtan Cassiers Sébastien Duval Chun Guo Gregor Leander Gaëtan Leurent Itamar Levi Charles Momin Olivier Pereira Thomas Peters François-Xavier Standaert Balazs Udvarhelyi Friedrich Wiemer

This paper defines Spook: a sponge-based authenticated encryption with associated data algorithm. It is primarily designed to provide security against side-channel attacks at a low energy cost. For this purpose, Spook is mixing a leakage-resistant mode of operation with bitslice ciphers enabling efficient and low latency implementations. The leakage-resistant mode of operation leverages a re-keying function to prevent differential side-channel analysis, a duplex sponge construction to efficiently process the data, and a tag verification based on a Tweakable Block Cipher (TBC) providing strong data integrity guarantees in the presence of leakages. The underlying bitslice ciphers are optimized for the masking countermeasures against side-channel attacks. Spook is an efficient single-pass algorithm. It ensures state-of-the-art black box security with several prominent features: (i) nonce misuse-resilience, (ii) beyond-birthday security with respect to the TBC block size, and (iii) multiuser security at minimum cost with a public tweak. Besides the specifications and design rationale, we provide first software and hardware implementation results of (unprotected) Spook which confirm the limited overheads that the use of two primitives sharing internal components imply. We also show that the integrity of Spook with leakage, so far analyzed with unbounded leakages for the duplex sponge and a strongly protected TBC modeled as leak-free, can be proven with a much weaker unpredictability assumption for the TBC. We finally discuss external cryptanalysis results and tweaks to improve both the security margins and efficiency of Spook.

2020

CRYPTO

Mode-Level vs. Implementation-Level Physical Security in Symmetric Cryptography: A Practical Guide Through the Leakage-Resistance Jungle

Davide Bellizia Olivier Bronchain Gaëtan Cassiers Vincent Grosso Chun Guo Charles Momin Olivier Pereira Thomas Peters François-Xavier Standaert

Triggered by the increasing deployment of embedded cryptographic devices (e.g., for the IoT), the design of authentication, encryption and authenticated encryption schemes enabling improved security against side-channel attacks has become an important research direction. Over the last decade, a number of modes of operation have been proposed and analyzed under different abstractions. In this paper, we investigate the practical consequences of these findings. For this purpose, we first translate the physical assumptions of leakage-resistance proofs into minimum security requirements for implementers. Thanks to this (heuristic) translation, we observe that (i) security against physical attacks can be viewed as a tradeoff between mode-level and implementation-level protection mechanisms, and (ii) security requirements to guarantee confidentiality and integrity in front of leakage can be concretely different for the different parts of an implementation. We illustrate the first point by analyzing several modes of operation with gradually increased leakage-resistance. We illustrate the second point by exhibiting leveled implementations, where different parts of the investigated schemes have different security requirements against leakage, leading to performance improvements when high physical security is needed. We finally initiate a comparative discussion of the different solutions to instantiate the components of a leakage-resistant authenticated encryption scheme.

2020

ASIACRYPT

Packed Multiplication: How to Amortize the Cost of Side-channel Masking?

Weijia Wang Chun Guo François-Xavier Standaert Yu Yu Gaëtan Cassiers

Higher-order masking countermeasures provide strong provable security against side-channel attacks at the cost of incurring significant overheads, which largely hinder their applicability. Previous works towards remedying this cost mostly concentrated on ``local'' calculations, i.e., optimizing the cost of computation units such as a single AND gate or a field multiplication. This paper explores a complementary ``global'' approach, i.e., considering multiple operations in the masked domain as a batch and reducing randomness and computational cost via amortization. In particular, we focus on the amortization of $\ell$ parallel field multiplications for appropriate integer $\ell > 1$, and design a kit named {\it packed multiplication} for implementing such a batch. For $\ell+d\leq2^m$, when $\ell$ parallel multiplications over $\mathbb{F}_{2^{m}}$ with $d$-th order probing security are implemented, packed multiplication consumes $d^2+2\ell d + \ell$ bilinear multiplications and $2d^2 + d(d+1)/2$ random field variables, outperforming the state-of-the-art results with $O(\ell d^2)$ multiplications and $\ell \left \lfloor d^2/4\right \rfloor + \ell d$ randomness.
To prove $d$-probing security for packed multiplications, we introduce some weaker security notions for multiple-inputs-multiple-outputs gadgets and use them as intermediate steps, which may be of independent interest. As parallel field multiplications exist almost everywhere in symmetric cryptography, lifting optimizations from ``local'' to ``global'' substantially enlarges the space of improvements. To demonstrate, we showcase the method on the AES Subbytes step, GCM and TET (a popular disk encryption mode). Notably, when $d=8$, our implementation of AES Subbytes on the ARM Cortex-M architecture achieves a gain of up to $33\%$ in total speed and saves up to $68\%$ of the random bits compared with the state-of-the-art bitsliced implementation reported at ASIACRYPT~2018.

2019

TCHES

Towards Globally Optimized Masking: From Low Randomness to Low Noise Rate

Gaëtan Cassiers François-Xavier Standaert

We improve the state-of-the-art masking schemes in two important directions. First, we propose a new masked multiplication algorithm that satisfies a recently introduced notion called Probe-Isolating Non-Interference (PINI). It captures a sufficient requirement for designing masked implementations in a trivial way, by combining PINI multiplications and linear operations performed share by share. Our improved algorithm has the best reported randomness complexity for large security orders (while the previous PINI multiplication was best for small orders). Second, we analyze the security of most existing multiplication algorithms in the literature against so-called horizontal attacks, which aim to reduce the noise of the actual leakages measured by an adversary, by combining the information of multiple target intermediate values. For this purpose, we leave the (abstract) probing model and consider a specialization of the (more realistic) noisy leakage / random probing models. Our (still partially heuristic but quantitative) analysis allows confirming the improved security of an algorithm by Battistello et al. from CHES 2016 in this setting. We then use it to propose new improved algorithms, leading to better tradeoffs between randomness complexity and noise rate, and suggesting the possibility of designing efficient masked multiplication algorithms with constant noise rate in $\mathbb{F}_2$.