Siemen Dhooghe

CryptoDB

Siemen Dhooghe

Publications and invited talks

Year

Venue

Title

2025

TCHES

Higher-Order Time Sharing Masking Abstract

Dilip Kumar S. V. Siemen Dhooghe Josep Balasch Benedikt Gierlichs Ingrid Verbauwhede

At CHES 2024, Time Sharing Masking (TSM) was introduced as a novel low-latency masking technique for hardware circuits. TSM offers area and randomness efficiency, as well as glitch-extended PINI security, but it is limited to first-order security. We address this limitation and generalize TSM to higher-order security while maintaining all of TSM’s advantages. Additionally, we propose an area-latency tradeoff. We prove HO-TSM glitch-extended PINI security and successfully evaluate our circuits using formal verification tools. Furthermore, we demonstrate area- and latency-efficient implementations of the AES S-box, which do not exhibit leakage in TVLA on FPGA. Our proposed tradeoff enables a first-order secure implementation of a complete AES-128 encryption core with 92 kGE, 920 random bits per round, and 20 cycles of latency, which does not exhibit leakage in TVLA on FPGA.

2025

TCHES

Code-based Masking: From Fields to Bits Bitsliced Higher-Order Masked SKINNY Abstract

John Gaspoz Siemen Dhooghe

Masking is one of the most prevalent and investigated countermeasures against side-channel analysis. As an alternative to the simple (e.g., additive) encoding function of Boolean masking, a collection of more algebraically complex masking types has emerged. Recently, inner product masking and the more generic codebased masking have proven to enable higher theoretical security properties than Boolean masking. In CARDIS 2017, Poussier et al. connected this “security order amplification” effect to the bit-probing model, demonstrating that for the same shared size, sharings from more complex encoding functions exhibit greater resistance to higher-order attacks. Despite these advantages, masked gadgets designed for code-based implementations face significant overhead compared to Boolean masking. Furthermore, existing code-based masked gadgets are not designed for efficient bitslice representation, which is highly beneficial for software implementations. Thus, current code-based masked gadgets are constrained to operate over words (e.g., elements in F2k ), limiting their applicability to ciphers where the S-box can be efficiently computed via power functions, such as AES. In this paper, we address the aforementioned limitations. We first introduce foundational masked linear and non-linear circuits that operate over bits of code-based sharings, ensuring composability and preserving bit-probing security, specifically achieving t-Probe Isolating Non-Interference (t-PINI). Utilizing these circuits, we construct masked ciphers that operate over bits, preserving the security order amplification effect during computation. Additionally, we present an optimized bitsliced masked assembly implementation of the SKINNY cipher, which outperforms Boolean masking in terms of randomness and gate count. The third-order security of this implementation is formally proven and validated through practical side-channel leakage evaluations on a Cortex-M4 core, confirming its robustness against leakages up to one million traces.

2025

TCHES

Constant-Cycle Hardware Private Circuits Abstract

Daniel Lammers Nicolai Müller Siemen Dhooghe Amir Moradi

The efficient implementation of Boolean masking with minimal overhead in terms of latency has become a critical topic due to the increasing demand for physically secure yet high-performance cryptographic primitives. However, achieving low latency in masked circuits while ensuring that glitches and transitions do not compromise their security remains a significant challenge. State-of-the-art multiplication gadgets, such as the recently introduced HPC4 (CHES 2024), offer composable security against glitches and transitions, as proven under the robust d-probing model. However, these gadgets require at least one clock cycle per computation, resulting in a latency overhead that increases with the algebraic degree. In contrast, LMDPL gadgets (CHES 2014 & CHES 2020) can achieve fixed latency independent of the algebraic degree, effectively addressing this issue. However, they are limited to two shares, and extending them to guarantee composable security at order d with d + 1 shares is considered an open challenge.In this work, we introduce Constant-Cycle Hardware Private Circuits (CCHPC), a novel hardware masking scheme built on the concept of LUT-based Masked Dual-Rail with Pre-charge Logic (LMDPL). Specifically, CCHPC achieves a fixed latency of d clock cycles by masking a Boolean function of arbitrary algebraic degree with d + 1 shares. CCHPC gadgets are secure and trivially composable, as formally proven under the Robust but Relaxed d-probing model (CHES 2024). Using CCHPC gadgets, we design a masked Advanced Encryption Standard (AES) encryption core which can be instantiated for an arbitrary number of d + 1 shares with a total latency of 11 + d clock cycles.

2025

ASIACRYPT

StaMAC: Fault Protection via Stable-MAC Tags Abstract

Dilara Toprakhisar Artemii Ovchinnikov Siemen Dhooghe

Fault attacks pose a significant threat to cryptographic implementations, motivating the development of countermeasures, primarily based on a combination of redundancy and masking techniques. Redundancy, in these countermeasures, is often implemented via duplication or linear codes. However, their inherent structure remains susceptible to strategic fault injections bypassing error checks. To address this, the CAPA countermeasure from CRYPTO 2018 leveraged information-theoretic MAC tags for protection against fault and combined attacks. However, a recent attack has shown that CAPA does not achieve protection against combined attacks and offers only limited fault protection, while also incurring significant hardware overhead. Its successor, M\M, improves efficiency but fails to protect against ineffective faults. In this paper, we introduce StaMAC, a framework for securely integrating MAC tags against both side-channel and fault attacks in a non-combined setting. Building on the security notions from StaTI (TCHES 2024), we propose the notion of \textit{MAC-stability}, which ensures fault propagation in masked and MACed circuits while requiring only a single error check at the end of the computation. We further show that the stability notion from StaTI is arbitrarily composable (whereas it was previously thought to be only serially composable), making it the first arbitrary composable fault security notion which does not require intermediate error checks or corrections. Then, we establish the improved protection of masking combined with MAC tags compared to linear encoding techniques by showing bounds on the advantage considering several fault adversaries: a gate/register faulting adversary, an arbitrary register faulting adversary, and a random register faulting adversary. Then, we show how to transform any probing secure circuit to protect against fault attacks using the proposed MAC-stable gadgets implementing field operations. Finally, we demonstrate StaMAC on an AES implementation, evaluating its security and hardware costs in comparison to MAC-based countermeasures.

2024

TCHES

Time Sharing - A Novel Approach to Low-Latency Masking Abstract

Dilip Kumar S. V. Siemen Dhooghe Josep Balasch Benedikt Gierlichs Ingrid Verbauwhede

We present a novel approach to small area and low-latency first-order masking in hardware. The core idea is to separate the processing of shares in time in order to achieve non-completeness. Resulting circuits are proven first-order glitchextended PINI secure. This means the method can be straightforwardly applied to mask arbitrary functions without constraints which the designer must take care of. Furthermore we show that an implementation can benefit from optimization through EDA tools without sacrificing security. We provide concrete results of several case studies. Our low-latency implementation of a complete PRINCE core shows a 32% area improvement (44% with optimization) over the state-of-the-art. Our PRINCE S-Box passes formal verification with a tool and the complete core on FPGA shows no first-order leakage in TVLA with 100 million traces. Our low-latency implementation of the AES S-Box costs roughly one third (one quarter with optimization) of the area of state-of-the-art implementations. It shows no first-order leakage in TVLA with 250 million traces.

2024

TCHES

Bit t-SNI Secure Multiplication Gadget for Inner Product Masking Abstract

John Gaspoz Siemen Dhooghe

Masking is a sound countermeasure to protect against differential power analysis. Since the work by Balasch et al. in ASIACRYPT 2012, inner product masking has been explored as an alternative to the well known Boolean masking. In CARDIS 2017, Poussier et al. showed that inner product masking achieves higherorder security versus Boolean masking, for the same shared size, in the bit-probing model. Wang et al. in TCHES 2020 verified the inner product masking’s security order amplification in practice and proposed new gadgets for inner product masking. Finally, Wu et al. in TCHES 2022 showed that this security amplification comes from the bit-probing model, but that Wang et al.’s gadgets are not higher-order bitprobing secure reducing the computation’s practical security. The authors concluded their work with the open question of providing an inner product multiplication gadget which maintains the masking’s bit-probing security, and conjectured that such gadget maintains the practical security order amplification of the masking during its computation.In this paper, we answer positively to Wu et al.’s open problems. We are the first to present a multiplication gadget for inner product masking which is proven secure in the bit-level probing model using the t-Strong Non-Interference (SNI) property. Moreover, we provide practical evidence that the gadget indeed maintains the security amplification of its masking. This is done via an evaluation of an assembly implementation of the gadget on an ARM Cortex-M4 core. We used this implementation to take leakage measurements and show no leakage happens for orders below the gadget’s bit-probing security level either for its univariate or multivariate analysis.

2023

TCHES

Threshold Implementations in Software: Micro-architectural Leakages in Algorithms Abstract

John Gaspoz Siemen Dhooghe

This paper provides necessary properties to algorithmically secure firstorder maskings in scalar micro-architectures. The security notions of threshold implementations are adapted following micro-processor leakage effects which are known to the literature. The resulting notions, which are based on the placement of shares, are applied to a two-share randomness-free PRESENT cipher and Keccak-f. The assembly implementations are put on a RISC-V and an ARM Cortex-M4 core. All designs are validated in the glitch and transition extended probing model and their implementations via practical lab analysis.

2023

TCHES

StaTI: Protecting against Fault Attacks Using Stable Threshold Implementations Abstract

Siemen Dhooghe Artemii Ovchinnikov Dilara Toprakhisar

Fault attacks impose a serious threat against the practical implementations of cryptographic algorithms. Statistical Ineffective Fault Attacks (SIFA), exploiting the dependency between the secret data and the fault propagation overcame many of the known countermeasures. Later, several countermeasures have been proposed to tackle this attack using error detection methods. However, the efficiency of the countermeasures, in part governed by the number of error checks, still remains a challenge.In this work, we propose a fault countermeasure, StaTI, based on threshold implementations and linear encoding techniques. The proposed countermeasure protects the implementations of cryptographic algorithms against both side-channel and fault adversaries in a non-combined attack setting. We present a new composable notion, stability, to protect a threshold implementation against a formal gate/register-faulting adversary. Stability ensures fault propagation, making a single error check of the output suffice. To illustrate the stability notion, first, we provide stable encodings of the XOR and AND gates. Then, we present techniques to encode threshold implementations of S-boxes, and provide stable encodings of some quadratic S-boxes together with their security and performance evaluation. Additionally, we propose general encoding techniques to transform a threshold implementation of any function (e.g., non-injective functions) to a stable one. We then provide an encoding technique to use in symmetric primitives which encodes state elements together significantly reducing the encoded state size. Finally, we used StaTI to implement a secure Keccak on FPGA and report on its efficiency.

2022

TCHES

Cryptanalysis of Efficient Masked Ciphers: Applications to Low Latency Abstract

Tim Beyne Siemen Dhooghe Amir Moradi Aein Rezaei Shahmirzadi

This work introduces second-order masked implementation of LED, Midori, Skinny, and Prince ciphers which do not require fresh masks to be updated at every clock cycle. The main idea lies on a combination of the constructions given by Shahmirzadi and Moradi at CHES 2021, and the theory presented by Beyne et al. at Asiacrypt 2020. The presented masked designs only use a minimal number of shares, i.e., three to achieve second-order security, and we make use of a trick to pair a couple of S-boxes to reduce their latency. The theoretical security analyses of our constructions are based on the linear-cryptanalytic properties of the underlying masked primitive as well as SILVER, the leakage verification tool presented at Asiacrypt 2020. To improve this cryptanalytic analysis, we use the noisy probing model which allows for the inclusion of noise in the framework of Beyne et al. We further provide FPGA-based experimental security analysis confirming second-order protection of our masked implementations.

2022

TCHES

Low-Latency and Low-Randomness Second-Order Masked Cubic Functions Abstract

Aein Rezaei Shahmirzadi Siemen Dhooghe Amir Moradi

Masking schemes are the most popular countermeasure to mitigate Side-Channel Analysis (SCA) attacks. Compared to software, their hardware implementations require certain considerations with respect to physical defaults, such as glitches. To counter this extended leakage effect, the technique known as Threshold Implementation (TI) has proven to be a reliable solution. However, its efficiency, namely the number of shares, is tied to the algebraic degree of the target function. As a result, the application of TI may lead to unaffordable implementation costs. This dependency is relaxed by the successor schemes where the minimum number of d + 1 shares suffice for dth-order protection independent of the function’s algebraic degree. By this, although the number of input shares is reduced, the implementation costs are not necessarily low due to their high demand for fresh randomness. It becomes even more challenging when a joint low-latency and low-randomness cost is desired. In this work, we provide a methodology to realize the second-order glitch-extended probing-secure implementation of cubic functions with three shares while allowing to reuse fresh randomness. This enables us to construct low-latency second-order secure implementations of several popular lightweight block ciphers, including Skinny, Midori, and Prince, with a very limited number of fresh masks. Notably, compared to state-of-the-art equivalent implementations, our designs lower the latency in terms of the number of clock cycles while keeping randomness costs low.

2020

TOSC

Design of Symmetric-Key Primitives for Advanced Cryptographic Protocols 📺 Abstract

Abdelrahaman Aly Tomer Ashur Eli Ben-Sasson Siemen Dhooghe Alan Szepieniec

While traditional symmetric algorithms like AES and SHA3 are optimized for efficient hardware and software implementations, a range of emerging applications using advanced cryptographic protocols such as multi-party computation and zero-knowledge proofs require optimization with respect to a different metric: arithmetic complexity. In this paper we study the design of secure cryptographic algorithms optimized to minimize this metric. We begin by identifying the differences in the design space between such arithmetization-oriented ciphers and traditional ones, with particular emphasis on the available tools, efficiency metrics, and relevant cryptanalysis. This discussion highlights a crucial point --- the considerations for designing arithmetization-oriented ciphers are oftentimes different from the considerations arising in the design of software- and hardware-oriented ciphers. The natural next step is to identify sound principles to securely navigate this new terrain, and to materialize these principles into concrete designs. To this end, we present the Marvellous design strategy which provides a generic way to easily instantiate secure and efficient algorithms for this emerging domain. We then show two examples for families following this approach. These families --- Vision and Rescue --- are benchmarked with respect to three use cases: the ZK-STARK proof system, proof systems based on Rank-One Constraint Satisfaction (R1CS), and Multi-Party Computation (MPC). These benchmarks show that our algorithms achieve a highly compact algebraic description, and thus benefit the advanced cryptographic protocols that employ them. Evidence is provided that this is the case also in real-world implementations.

2020

ASIACRYPT

Cryptanalysis of Masked Ciphers: A not so Random Idea 📺 Abstract

Tim Beyne Siemen Dhooghe Zhenda Zhang

A new approach to the security analysis of hardware-oriented masked ciphers against second-order side-channel attacks is developed. By relying on techniques from symmetric-key cryptanalysis, concrete security bounds are obtained in a variant of the probing model that allows the adversary to make only a bounded, but possibly very large, number of measurements. Specifically, it is formally shown how a bounded-query variant of robust probing security can be reduced to the linear cryptanalysis of masked ciphers. As a result, the compositional issues of higher-order threshold implementations can be overcome without relying on fresh randomness. From a practical point of view, the aforementioned approach makes it possible to transfer many of the desirable properties of first-order threshold implementations, such as their low randomness usage, to the second-order setting. For example, a straightforward application to the block cipher LED results in a masking using less than 700 random bits including the initial sharing. In addition, the cryptanalytic approach introduced in this paper provides additional insight into the design of masked ciphers and allows for a quantifiable trade-off between security and performance.