Lauren De Meyer
Spin Me Right Round Rotational Symmetry for FPGA-Specific AES: Extended Version
The effort in reducing the area of AES implementations has largely been focused on application-specific integrated circuits (ASICs) in which a tower field construction leads to a small design of the AES S-box. In contrast, a naive implementation of the AES S-box has been the status-quo on field-programmable gate arrays (FPGAs). A similar discrepancy holds for masking schemes—a well-known side-channel analysis countermeasure—which are commonly optimized to achieve minimal area in ASICs. In this paper, we demonstrate a representation of the AES S-box exploiting rotational symmetry which leads to a 50% reduction in the area footprint on FPGA devices. We present new AES implementations which improve on the state-of-the-art and explore various trade-offs between area and latency. For instance, at the cost of increasing 4.5 times the latency, one of our design variants requires 25% less look-up tables (LUTs) than the smallest known AES on Xilinx FPGAs by Sasdrich and Güneysu at ASAP 2016. We further explore the protection of such implementations against side-channel attacks. We introduce a generic methodology for masking any n -bit Boolean functions of degree t with protection order d . The methodology is exact for first-order and heuristic for higher orders. Its application to our new construction of the AES S-box allows us to improve previous results and introduce the smallest first-order masked AES implementation on Xilinx FPGAs, to date.
Low AND Depth and Efficient Inverses: a Guide on S-boxes for Low-latency Masking
In this work, we perform an extensive investigation and construct a portfolio of S-boxes suitable for secure lightweight implementations, which aligns well with the ongoing NIST Lightweight Cryptography competition. In particular, we target good functional properties on the one hand and efficient implementations in terms of AND depth and AND gate complexity on the other. Moreover, we also consider the implementation of the inverse S-box and the possibility for it to share resources with the forward S-box. We take our exploration beyond the conventional small (and even) S-box sizes. Our investigation is twofold: (1) we note that implementations of existing S-boxes are not optimized for the criteria which define masking complexity (AND depth and AND gate complexity) and improve a tool published at FSE 2016 by Stoffelen in order to fill this gap. (2) We search for new S-box designs which take these implementation properties into account from the start. We perform a systematic search based on the properties of not only the S-box but also its inverse as well as an exploration of larger S-box sizes using length-doubling structures. The result of our investigation is not only a wide selection of very good S-boxes, but we also provide complete descriptions of their circuits, enabling their integration into future work.
M&M: Masks and Macs against Physical Attacks 📺
Cryptographic implementations on embedded systems need to be protected against physical attacks. Today, this means that apart from incorporating countermeasures against side-channel analysis, implementations must also withstand fault attacks and combined attacks. Recent proposals in this area have shown that there is a big tradeoff between the implementation cost and the strength of the adversary model. In this work, we introduce a new combined countermeasure M&M that combines Masking with information-theoretic MAC tags and infective computation. It works in a stronger adversary model than the existing scheme ParTI, yet is a lot less costly to implement than the provably secure MPC-based scheme CAPA. We demonstrate M&M with a SCA- and DFA-secure implementation of the AES block cipher. We evaluate the side-channel leakage of the second-order secure design with a non-specific t-test and use simulation to validate the fault resistance.
Recovering the CTR_DRBG state in 256 traces
The NIST CTR_DRBG specification prescribes a maximum size on each random number request, limiting the number of encryptions in CTR mode with the same key to 4 096. Jaffe’s attack on AES in CTR mode without knowledge of the nonce from CHES 2007 requires 216 traces, which is safely above this recommendation. In this work, we exhibit an attack that requires only 256 traces, which is well within the NIST limits. We use simulated traces to investigate the success probability as a function of the signal-to-noise ratio. We also demonstrate its success in practice by attacking an AES-CTR implementation on a Cortex-M4 among others and recovering both the key and nonce. Our traces and code are made openly available for reproducibility.
Consolidating Security Notions in Hardware Masking 📺
In this paper, we revisit the security conditions of masked hardware implementations. We describe a new, succinct, information-theoretic condition called d-glitch immunity which is both necessary and sufficient for security in the presence of glitches. We show that this single condition includes, but is not limited to, previous security notions such as those used in higher-order threshold implementations and in abstractions using ideal gates. As opposed to these previously known necessary conditions, our new condition is also sufficient. On the other hand, it excludes avoidable notions such as uniformity. We also treat the notion of (strong) noninterference from an information-theoretic point-of-view in order to unify the different security concepts and pave the way to the verification of composability in the presence of glitches. We conclude the paper by demonstrating how the condition can be used as an efficient and highly generic flaw detection mechanism for a variety of functions and schemes based on different operations.
Classification of Balanced Quadratic Functions
S-boxes, typically the only nonlinear part of a block cipher, are the heart of symmetric cryptographic primitives. They significantly impact the cryptographic strength and the implementation characteristics of an algorithm. Due to their simplicity, quadratic vectorial Boolean functions are preferred when efficient implementations for a variety of applications are of concern. Many characteristics of a function stay invariant under affine equivalence. So far, all 6-bit Boolean functions, 3- and 4-bit permutations have been classified up to affine equivalence. At FSE 2017, Bozoliv et al. presented the first classification of 5-bit quadratic permutations. In this work, we propose an adaptation of their work resulting in a highly efficient algorithm to classify n x m functions for n ≥ m. Our algorithm enables for the first time a complete classification of 6-bit quadratic permutations as well as all balanced quadratic functions for n ≤ 6. These functions can be valuable for new cryptographic algorithm designs with efficient multi-party computation or side-channel analysis resistance as goal. In addition, we provide a second tool for finding decompositions of length two. We demonstrate its use by decomposing existing higher degree S-boxes and constructing new S-boxes with good cryptographic and implementation properties.
CAPA: The Spirit of Beaver Against Physical Attacks 📺
In this paper we introduce two things: On one hand we introduce the Tile-Probe-and-Fault model, a model generalising the wire-probe model of Ishai et al. extending it to cover both more realistic side-channel leakage scenarios on a chip and also to cover fault and combined attacks. Secondly we introduce CAPA: a combined Countermeasure Against Physical Attacks. Our countermeasure is motivated by our model, and aims to provide security against higher-order SCA, multiple-shot FA and combined attacks. The tile-probe-and-fault model leads one to naturally look (by analogy) at actively secure multi-party computation protocols. Indeed, CAPA draws much inspiration from the MPC protocol SPDZ. So as to demonstrate that the model, and the CAPA countermeasure, are not just theoretical constructions, but could also serve to build practical countermeasures, we present initial experiments of proof-of-concept designs using the CAPA methodology. Namely, a hardware implementation of the KATAN and AES block ciphers, as well as a software bitsliced AES S-box implementation. We demonstrate experimentally that the design can resist second-order DPA attacks, even when the attacker is presented with many hundreds of thousands of traces. In addition our proof-of-concept can also detect faults within our model with high probability in accordance to the methodology.
Multiplicative Masking for AES in Hardware
Hardware masked AES designs usually rely on Boolean masking and perform the computation of the S-box using the tower-field decomposition. On the other hand, splitting sensitive variables in a multiplicative way is more amenable for the computation of the AES S-box, as noted by Akkar and Giraud. However, multiplicative masking needs to be implemented carefully not to be vulnerable to first-order DPA with a zero-value power model. Up to now, sound higher-order multiplicative masking schemes have been implemented only in software. In this work, we demonstrate the first hardware implementation of AES using multiplicative masks. The method is tailored to be secure even if the underlying gates are not ideal and glitches occur in the circuit. We detail the design process of first- and second-order secure AES-128 cores, which result in the smallest die area to date among previous state-of-the-art masked AES implementations with comparable randomness cost and latency. The first- and second-order masked implementations improve resp. 29% and 18% over these designs. We deploy our construction on a Spartan-6 FPGA and perform a side-channel evaluation. No leakage is detected with up to 50 million traces for both our first- and second-order implementation. For the latter, this holds both for univariate and bivariate analysis.
Spin Me Right Round Rotational Symmetry for FPGA-Specific AES
The effort in reducing the area of AES implementations has largely been focused on Application-Specific Integrated Circuits (ASICs) in which a tower field construction leads to a small design of the AES S-box. In contrast, a naïve implementation of the AES S-box has been the status-quo on Field-Programmable Gate Arrays (FPGAs). A similar discrepancy holds for masking schemes – a wellknown side-channel analysis countermeasure – which are commonly optimized to achieve minimal area in ASICs.In this paper we demonstrate a representation of the AES S-box exploiting rotational symmetry which leads to a 50% reduction of the area footprint on FPGA devices. We present new AES implementations which improve on the state of the art and explore various trade-offs between area and latency. For instance, at the cost of increasing 4.5 times the latency, one of our design variants requires 25% less look-up tables (LUTs) than the smallest known AES on Xilinx FPGAs by Sasdrich and Güneysu at ASAP 2016. We further explore the protection of such implementations against first-order side-channel analysis attacks. Targeting the small area footprint on FPGAs, we introduce a heuristic-based algorithm to find a masking of a given function with d + 1 shares. Its application to our new construction of the AES S-box allows us to introduce the smallest masked AES implementation on Xilinx FPGAs, to-date.