Low-Latency Hardware Masking with Application to AES 📺
During the past two decades there has been a great deal of research published on masked hardware implementations of AES and other cryptographic primitives. Unfortunately, many hardware masking techniques can lead to increased latency compared to unprotected circuits for algorithms such as AES, due to the high-degree of nonlinear functions in their designs. In this paper, we present a hardware masking technique which does not increase the latency for such algorithms. It is based on the LUT-based Masked Dual-Rail with Pre-charge Logic (LMDPL) technique presented at CHES 2014. First, we show 1-glitch extended strong noninterference of a nonlinear LMDPL gadget under the 1-glitch extended probing model. We then use this knowledge to design an AES implementation which computes a full AES-128 operation in 10 cycles and a full AES-256 operation in 14 cycles. We perform practical side-channel analysis of our implementation using the Test Vector Leakage Assessment (TVLA) methodology and analyze univariate as well as bivariate t-statistics to demonstrate its DPA resistance level.
Low AND Depth and Efficient Inverses: a Guide on S-boxes for Low-latency Masking 📺
In this work, we perform an extensive investigation and construct a portfolio of S-boxes suitable for secure lightweight implementations, which aligns well with the ongoing NIST Lightweight Cryptography competition. In particular, we target good functional properties on the one hand and efficient implementations in terms of AND depth and AND gate complexity on the other. Moreover, we also consider the implementation of the inverse S-box and the possibility for it to share resources with the forward S-box. We take our exploration beyond the conventional small (and even) S-box sizes. Our investigation is twofold: (1) we note that implementations of existing S-boxes are not optimized for the criteria which define masking complexity (AND depth and AND gate complexity) and improve a tool published at FSE 2016 by Stoffelen in order to fill this gap. (2) We search for new S-box designs which take these implementation properties into account from the start. We perform a systematic search based on the properties of not only the S-box but also its inverse as well as an exploration of larger S-box sizes using length-doubling structures. The result of our investigation is not only a wide selection of very good S-boxes, but we also provide complete descriptions of their circuits, enabling their integration into future work.
Consolidating Security Notions in Hardware Masking 📺
In this paper, we revisit the security conditions of masked hardware implementations. We describe a new, succinct, information-theoretic condition called d-glitch immunity which is both necessary and sufficient for security in the presence of glitches. We show that this single condition includes, but is not limited to, previous security notions such as those used in higher-order threshold implementations and in abstractions using ideal gates. As opposed to these previously known necessary conditions, our new condition is also sufficient. On the other hand, it excludes avoidable notions such as uniformity. We also treat the notion of (strong) noninterference from an information-theoretic point-of-view in order to unify the different security concepts and pave the way to the verification of composability in the presence of glitches. We conclude the paper by demonstrating how the condition can be used as an efficient and highly generic flaw detection mechanism for a variety of functions and schemes based on different operations.
Classification of Balanced Quadratic Functions 📺
S-boxes, typically the only nonlinear part of a block cipher, are the heart of symmetric cryptographic primitives. They significantly impact the cryptographic strength and the implementation characteristics of an algorithm. Due to their simplicity, quadratic vectorial Boolean functions are preferred when efficient implementations for a variety of applications are of concern. Many characteristics of a function stay invariant under affine equivalence. So far, all 6-bit Boolean functions, 3- and 4-bit permutations have been classified up to affine equivalence. At FSE 2017, Bozoliv et al. presented the first classification of 5-bit quadratic permutations. In this work, we propose an adaptation of their work resulting in a highly efficient algorithm to classify n x m functions for n ≥ m. Our algorithm enables for the first time a complete classification of 6-bit quadratic permutations as well as all balanced quadratic functions for n ≤ 6. These functions can be valuable for new cryptographic algorithm designs with efficient multi-party computation or side-channel analysis resistance as goal. In addition, we provide a second tool for finding decompositions of length two. We demonstrate its use by decomposing existing higher degree S-boxes and constructing new S-boxes with good cryptographic and implementation properties.
CAPA: The Spirit of Beaver Against Physical Attacks 📺
In this paper we introduce two things: On one hand we introduce the Tile-Probe-and-Fault model, a model generalising the wire-probe model of Ishai et al. extending it to cover both more realistic side-channel leakage scenarios on a chip and also to cover fault and combined attacks. Secondly we introduce CAPA: a combined Countermeasure Against Physical Attacks. Our countermeasure is motivated by our model, and aims to provide security against higher-order SCA, multiple-shot FA and combined attacks. The tile-probe-and-fault model leads one to naturally look (by analogy) at actively secure multi-party computation protocols. Indeed, CAPA draws much inspiration from the MPC protocol SPDZ. So as to demonstrate that the model, and the CAPA countermeasure, are not just theoretical constructions, but could also serve to build practical countermeasures, we present initial experiments of proof-of-concept designs using the CAPA methodology. Namely, a hardware implementation of the KATAN and AES block ciphers, as well as a software bitsliced AES S-box implementation. We demonstrate experimentally that the design can resist second-order DPA attacks, even when the attacker is presented with many hundreds of thousands of traces. In addition our proof-of-concept can also detect faults within our model with high probability in accordance to the methodology.
Rhythmic Keccak: SCA Security and Low Latency in HW 📺
Glitches entail a great issue when securing a cryptographic implementation in hardware. Several masking schemes have been proposed in the literature that provide security even in the presence of glitches. The key property that allows this protection was introduced in threshold implementations as non-completeness. We address crucial points to ensure the right compliance of this property especially for low-latency implementations. Specifically, we first discuss the existence of a flaw in DSD 2017 implementation of Keccak by Gross et al. in violation of the non-completeness property and propose a solution. We perform a side-channel evaluation on the first-order and second-order implementations of the proposed design where no leakage is detected with up to 55 million traces. Then, we present a method to ensure a non-complete scheme of an unrolled implementation applicable to any order of security or algebraic degree of the shared function. By using this method we design a two-rounds unrolled first-order Keccak-
Multiplicative Masking for AES in Hardware
Hardware masked AES designs usually rely on Boolean masking and perform the computation of the S-box using the tower-field decomposition. On the other hand, splitting sensitive variables in a multiplicative way is more amenable for the computation of the AES S-box, as noted by Akkar and Giraud. However, multiplicative masking needs to be implemented carefully not to be vulnerable to first-order DPA with a zero-value power model. Up to now, sound higher-order multiplicative masking schemes have been implemented only in software. In this work, we demonstrate the first hardware implementation of AES using multiplicative masks. The method is tailored to be secure even if the underlying gates are not ideal and glitches occur in the circuit. We detail the design process of first- and second-order secure AES-128 cores, which result in the smallest die area to date among previous state-of-the-art masked AES implementations with comparable randomness cost and latency. The first- and second-order masked implementations improve resp. 29% and 18% over these designs. We deploy our construction on a Spartan-6 FPGA and perform a side-channel evaluation. No leakage is detected with up to 50 million traces for both our first- and second-order implementation. For the latter, this holds both for univariate and bivariate analysis.
A Note on 5-bit Quadratic Permutations' Classification
Classification of vectorial Boolean functions up to affine equivalence is used widely to analyze various cryptographic and implementation properties of symmetric-key algorithms. We show that there exist 75 affine equivalence classes of 5-bit quadratic permutations. Furthermore, we explore important cryptographic properties of these classes, such as linear and differential properties and degrees of their inverses, together with multiplicative complexity and existence of uniform threshold realizations.
- CHES 2020
- CHES 2019
- Elena Andreeva (1)
- Victor Arribas (2)
- Andrey Bogdanov (3)
- Dusan Bozilov (1)
- Thomas De Cnudde (1)
- Lauren De Meyer (5)
- Sébastien Duval (1)
- Benedikt Gierlichs (3)
- Michael Hutter (1)
- Miroslav Knezevic (2)
- Itamar Levi (1)
- Atul Luykx (1)
- Mark E. Marson (1)
- Florian Mendel (2)
- Bart Mennink (1)
- Nicky Mouha (1)
- Ventzislav Nikov (4)
- Svetla Nikova (7)
- George Petrides (1)
- Oscar Reparaz (6)
- Vincent Rijmen (4)
- Haci Ali Sahin (1)
- Pascal Sasdrich (1)
- Nigel P. Smart (1)
- François-Xavier Standaert (1)
- Georg Stütz (1)
- Ingrid Verbauwhede (2)
- Qingju Wang (2)
- Kan Yasuda (1)