MIRACLE: MIcRo-ArChitectural Leakage Evaluation: A study of micro-architectural power leakage across many devices
In this paper, we describe an extensible experimental infrastructure for evaluating the micro-architectural leakage, based on power consumption, that stems from a physical device. Building on existing literature, we use it to systematically study 14 different devices, which span 4 different instruction set architectures and 4 different vendors. The study allows a characterisation of each device with respect to any leakage effects stemming from sources within the micro-architectural implementation. We use it, for example, to identify and document several novel leakage effects (e.g., due to speculative instruction execution), and scenarios where an assumption about leakage is non-portable between different yet compatible devices.Ours is the widest study of its kind we are aware of, and highlights a range of challenges with respect to 1) the design, implementation, and evaluation of, e.g., masking schemes, 2) construction of accurate leakage models, and 3) selection of suitable devices for experimental research. For example, in relation to 1), we cast further doubt on whether a given device upholds the assumptions required by a given masking scheme; in relation to 2), we conclude that (statistical or formal) device leakage models must include information about the micro-architecture being modelled; in relation to 3), we claim the near mono-culture of devices that dominates existing literature is insufficient to support general claims regarding leakage. This is particularly important in the context of the FIPS 140-3 standard for non-invasive side-channel evaluation.
RISC-V Instruction Set Extensions for Lightweight Symmetric Cryptography
The NIST LightWeight Cryptography (LWC) selection process aims to standardise cryptographic functionality which is suitable for resource-constrained devices. Since the outcome is likely to have significant, long-lived impact, careful evaluation of each submission with respect to metrics explicitly outlined in the call is imperative. Beyond the robustness of submissions against cryptanalytic attack, metrics related to their implementation (e.g., execution latency and memory footprint) form an important example. Aiming to provide evidence allowing richer evaluation with respect to such metrics, this paper presents the design, implementation, and evaluation of one separate Instruction Set Extension (ISE) for each of the 10 LWC final round submissions, namely Ascon, Elephant, GIFT-COFB, Grain-128AEADv2, ISAP, PHOTON-Beetle, Romulus, Sparkle, TinyJAMBU, and Xoodyak; although we base the work on use of RISC-V, we argue that it provides more general insight.
An Instruction Set Extension to Support Software-Based Masking 📺
In both hardware and software, masking can represent an effective means of hardening an implementation against side-channel attack vectors such as Differential Power Analysis (DPA). Focusing on software, however, the use of masking can present various challenges: specifically, it often 1) requires significant effort to translate any theoretical security properties into practice, and, even then, 2) imposes a significant overhead in terms of efficiency. To address both challenges, this paper explores the use of an Instruction Set Extension (ISE) to support masking in software-based implementations of a range of (symmetric) cryptographic kernels including AES: we design, implement, and evaluate such an ISE, using RISC-V as the base ISA. Our ISE-supported first-order masked implementation of AES, for example, is an order of magnitude more efficient than a software-only alternative with respect to both execution latency and memory footprint; this renders it comparable to an unmasked implementation using the same metrics, but also first-order secure.
FENL: an ISE to mitigate analogue micro-architectural leakage 📺
Ge et al. [GYH18] propose the augmented ISA (or aISA), a central tenet of which is the selective exposure of micro-architectural resources via a less opaque abstraction than normal. The aISA proposal is motivated by the need for control over such resources, for example to implement robust countermeasures against microarchitectural attacks. In this paper, we apply an aISA-style approach to challenges stemming from analogue micro-architectural leakage; examples include power-based Hamming weight and distance leakage from relatively fine-grained resources (e.g., pipeline registers), which are not exposed in, and so cannot be reliably controlled via, a normal ISA. Specifically, we design, implement, and evaluate an ISE named FENL: the ISE acts as a fence for leakage, preventing interaction between, and hence leakage from, instructions before and after it in program order. We demonstrate that the implementation and use of FENL has relatively low overhead, and represents an effective tool for systematically localising and reducing leakage.
The design of scalar AES Instruction Set Extensions for RISC-V 📺
Secure, efficient execution of AES is an essential requirement on most computing platforms. Dedicated Instruction Set Extensions (ISEs) are often included for this purpose. RISC-V is a (relatively) new ISA that lacks such a standardized ISE. We survey the state-of-the-art industrial and academic ISEs for AES, implement and evaluate five different ISEs, one of which is novel. We recommend separate ISEs for 32 and 64-bit base architectures, with measured performance improvements for an AES-128 block encryption of 4x and 10x with a hardware cost of 1.1K and 8.2K gates respectively, when compared to a software-only implementation based on use of T-tables. We also explore how the proposed standard bit-manipulation extension to RISC-V can be harnessed for efficient implementation of AES-GCM. Our work supports the ongoing RISC-V cryptography extension standardisation process.
Share-slicing: Friend or Foe? 📺
Masking is a well loved and widely deployed countermeasure against side channel attacks, in particular in software. Under certain assumptions (w.r.t. independence and noise level), masking provably prevents attacks up to a certain security order and leads to a predictable increase in the number of required leakages for successful attacks beyond this order. The noise level in typical processors where software masking is used may not be very high, thus low masking orders are not sufficient for real world security. Higher order masking however comes at a great cost, and therefore a number techniques have been published over the years that make such implementations more efficient via parallelisation in the form of bit or share slicing. We take two highly regarded schemes (ISW and Barthe et al.), and some corresponding open source implementations that make use of share slicing, and discuss their true security on an ARM Cortex-M0 and an ARM Cortex-M3 processor (both from the LPC series). We show that micro-architectural features of the M0 and M3 undermine the independence assumptions made in masking proofs and thus their theoretical guarantees do not translate into practice (even worse it seems unpredictable at which order leaks can be expected). Our results demonstrate how difficult it is to link theoretical security proofs to practical real-world security guarantees.