Masking Kyber: First- and Higher-Order Implementations 📺
In the final phase of the post-quantum cryptography standardization effort, the focus has been extended to include the side-channel resistance of the candidates. While some schemes have been already extensively analyzed in this regard, there is no such study yet of the finalist Kyber.In this work, we demonstrate the first completely masked implementation of Kyber which is protected against first- and higher-order attacks. To the best of our knowledge, this results in the first higher-order masked implementation of any post-quantum secure key encapsulation mechanism algorithm. This is realized by introducing two new techniques. First, we propose a higher-order algorithm for the one-bit compression operation. This is based on a masked bit-sliced binary-search that can be applied to prime moduli. Second, we propose a technique which enables one to compare uncompressed masked polynomials with compressed public polynomials. This avoids the costly masking of the ciphertext compression while being able to be instantiated at arbitrary orders.We show performance results for first-, second- and third-order protected implementations on the Arm Cortex-M0+ and Cortex-M4F. Notably, our implementation of first-order masked Kyber decapsulation requires 3.1 million cycles on the Cortex-M4F. This is a factor 3.5 overhead compared to the unprotected optimized implementationin pqm4. We experimentally show that the first-order implementation of our new modules on the Cortex-M0+ is hardened against attacks using 100 000 traces and mechanically verify the security in a fine-grained leakage model using the verification tool scVerif.
A Compact and Scalable Hardware/Software Co-design of SIKE 📺
We present efficient and compact hardware/software co-design implementations of the Supersingular Isogeny Key Encapsulation (SIKE) protocol on field-programmable gate arrays (FPGAs). In order to be better equipped for different post-quantum scenarios, our architectures were designed to feature high-flexibility by covering all the currently available parameter sets and with support for primes up to 1016 bits. In particular, any of the current SIKE parameters equivalent to the post-quantum security of AES-128/192/256 and SHA3-256 can be selected and run on-the-fly. This security scalability property, together with the small footprint and efficiency of our architectures, makes them ideal for embedded applications in a post-quantum world. In addition, the proposed implementations exhibit regular, constant-time execution, which provides protection against timing and simple sidechannel attacks. Our results demonstrate that supersingular isogeny-based primitives such as SIDH and SIKE can indeed be deployed for embedded applications featuring competitive performance. For example, our smallest architecture based on a 128-bit MAC unit takes only 3415 slices, 21 BRAMs and 57 DSPs on a Virtex 7 690T and can perform key generation, encapsulation and decapsulation in 14.4, 24.4 and 26.0 milliseconds for SIKEp434 and in 52.3, 86.4 and 93.2 milliseconds for SIKEp751, respectively.
Improved Classical Cryptanalysis of SIKE in Practice 📺
The main contribution of this work is an optimized implementation of the van Oorschot-Wiener (vOW) parallel collision finding algorithm. As is typical for cryptanalysis against conjectured hard problems (e. g. factoring or discrete logarithms), challenges can arise in the implementation that are not captured in the theory, making the performance of the algorithm in practice a crucial element of estimating security. We present a number of novel improvements, both to generic instantiations of the vOW algorithm finding collisions in arbitrary functions, and to its instantiation in the context of the supersingular isogeny key encapsulation (SIKE) protocol, that culminate in an improved classical cryptanalysis of the computational supersingular isogeny (CSSI) problem. In particular, we present a scalable implementation that can be applied to the Round-2 parameter sets of SIKE that can be used to give confidence in their security levels.
Rapidly Verifiable XMSS Signatures 📺
This work presents new speed records for XMSS (RFC 8391) signature verification on embedded devices. For this we make use of a probabilistic method recently proposed by Perin, Zambonin, Martins, Custódio, and Martina (PZMCM) at ISCC 2018, that changes the XMSS signing algorithm to search for rapidly verifiable signatures. We improve the method, ensuring that the added signing cost for the search is independent of the message length. We provide a statistical analysis of the resulting verification speed and support it by experiments. We present a record setting RFC compatible implementation of XMSS verification on the ARM Cortex-M4. At a signing time of about one minute on a general purpose CPU, we create signatures that are verified about 1.44 times faster than traditionally generated signatures. Adding further well-known implementation optimizations to the verification algorithm we reduce verification time by over a factor two from 13.85 million to 6.56 million cycles. In contrast to previous works, we provide a detailed security analysis of the resulting signature scheme under classical and quantum attacks that justifies our selection of parameters. On the way, we fill a gap in the security analysis of XMSS as described in RFC 8391 proving that the modified message hashing in the RFC does indeed mitigate multi-target attacks. This was not shown before and might be of independent interest.
Dual Isogenies and Their Application to Public-Key Compression for Isogeny-Based Cryptography
The isogeny-based protocols SIDH and SIKE have received much attention for being post-quantum key agreement candidates that retain relatively small keys. A recent line of work has proposed and further improved compression of public keys, leading to the inclusion of public-key compression in the SIKE proposal for Round 2 of the NIST Post-Quantum Cryptography Standardization effort. We show how to employ the dual isogeny to significantly increase performance of compression techniques, reducing their overhead from 160–182% to 77–86% for Alice’s key generation and from 98–104% to 59–61% for Bob’s across different SIDH parameter sets. For SIKE, we reduce the overhead of (1) key generation from 140–153% to 61–74%, (2) key encapsulation from 67–90% to 38–57%, and (3) decapsulation from 59–65% to 34–39%. This is mostly achieved by speeding up the pairing computations, which has until now been the main bottleneck, but we also improve (deterministic) basis generation.
CSIDH: An Efficient Post-Quantum Commutative Group Action
We propose an efficient commutative group action suitable for non-interactive key exchange in a post-quantum setting. Our construction follows the layout of the Couveignes–Rostovtsev–Stolbunov cryptosystem, but we apply it to supersingular elliptic curves defined over a large prime field $$\mathbb F_p$$, rather than to ordinary elliptic curves. The Diffie–Hellman scheme resulting from the group action allows for public-key validation at very little cost, runs reasonably fast in practice, and has public keys of only 64 bytes at a conjectured AES-128 security level, matching NIST’s post-quantum security category I.
- Lejla Batina (3)
- Joppe W. Bos (2)
- Wouter Castryck (1)
- Craig Costello (3)
- Marc Gourjon (1)
- Andreas Hülsing (1)
- David Jao (1)
- Tanja Lange (1)
- Patrick Longa (3)
- Chloe Martindale (1)
- Pedro Maat C. Massolino (1)
- Michael Naehrig (3)
- Lorenz Panny (1)
- Tobias Schneider (1)
- Peter Schwabe (1)
- Benjamin Smith (2)
- David Urbanik (1)
- Fernando Virdia (1)
- Christine van Vredendaal (2)