Juelin Zhang

CryptoDB

Juelin Zhang

Publications and invited talks

Year

Venue

Title

2025

EUROCRYPT

Tighter Security Notions for a Modular Approach to Private Circuits Abstract

Bohan Wang Juelin Zhang Yu Yu Weijia Wang

To counteract side-channel attacks, a masking scheme splits each intermediate variable into $n$ shares and transforms each elementary operation (e.g., field addition and multiplication) to the masked correspondence called gadget, such that intrinsic noise in the leakages renders secret recovery infeasible in practice. A simple and efficient security notion is the probing model ensuring that any $n-1$ shares are independently distributed from the secret input. One requirement of the probing model is that the noise in the leakages should increase with the number of shares, largely restricting the side-channel security in the low-noise scenario. Another security notion for masking, called the random probing model, allows each variable to leak with a probability $p$. While this model reflects the physical reality of side channels much better, it brings significant overhead. At Crypto 2018, Ananth et al. proposed a modular approach that can provide random probing security for any security level by expanding small base gadgets with $n$ share recursively, such that the tolerable leakage probability $p$ decreases with $n$ while the security increases exponentially with the recursion depth of expansion. Then, Bela{\"{i}}d et al. provided a formal security definition called Random Probing Expandability~(RPE) and an explicit framework using the modular approach to construct masking schemes at Crypto 2020. In this paper, we investigate how to tighten the RPE definition via allowing the dependent failure probabilities of multiple inputs, which results in a new definition called related RPE. It can be directly used for the expansion of multiplication gates and reduce the complexity of the base multiplication gadget from $\mathcal{O}(n^2\log n)$ proposed at Asiacrypt 2021 to $\mathcal{O}(n^2)$ and maintain the same security level. Furthermore, we describe a method to expand any gates (rather than only multiplication) with the related RPE gadgets. Besides, we denote another new RPE definition called Multiple inputs RPE used for the expansion of multiple-input gates composed with any gates. Utilizing these methods, we reduce the complexity of the 3-share circuit compiler to $\mathcal{O}(|C|\cdot\kappa^{3.2})$, where $|C|$ is the size of the unprotected circuit and the protection failure probability of the global circuit is $2^{-\kappa}$. In comparison, the complexity of the state-of-the-art work, proposed at Eurocrypt 2021, is $\mathcal{O}(|C|\cdot\kappa^{3.9})$ for the same value of $n$. Additionally, we provide the construction of a 5-share circuit compiler with a complexity $\mathcal{O}(|C|\cdot\kappa^{2.8})$.

2024

TCHES

Efficient Table-Based Masking with Pre-processing Abstract

Juelin Zhang Taoyun Wang Yiteng Sun Fanjie Ji Bohan Wang Lu Li Yu Yu Weijia Wang

Masking is one of the most investigated countermeasures against sidechannel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. Particularly, the software implementation of masking largely slows down the cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the< overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantages of generality to any lookup tables. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online (using the precomputed tables) phase takes place to calculate the result. Obviously, practicality of pre-processing paradigm relies heavily on the efficiency of online phase and the size of precomputed tables.In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to the AES-128, and provide an implementation on the ARM Cortex architecture. Particularly, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to the side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because that the true random number generator operates in parallel with the processor’s execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.

2023

TCHES

Efficient Private Circuits with Precomputation Abstract

Weijia Wang Fanjie Ji Juelin Zhang Yu Yu

At CHES 2022, Wang et al. described a new paradigm for masked implementations using private circuits, where most intermediates can be precomputed before the input shares are accessed, significantly accelerating the online execution of masked functions. However, the masking scheme they proposed mainly featured (and was designed for) the cost amortization, leaving its (limited) suitability in the above precomputation-based paradigm just as a bonus. This paper aims to provide an efficient, reliable, easy-to-use, and precomputation-compatible masking scheme. We propose a new masked multiplication over the finite field Fq suitable for the precomputation, and prove its security in the composable notion called Probing-Isolating Non-Inference (PINI). Particularly, the operations (e.g., AND and XOR) in the binary field can be achieved by assigning q = 2, allowing the bitsliced implementation that has been shown to be quite efficient for the software implementations. The new masking scheme is applied to leverage the masking of AES and SKINNY block ciphers on ARM Cortex M architecture. The performance results show that the new scheme contributes to a significant speed-up compared with the state-of-the-art implementations. For SKINNY with block size 64, the speed and RAM requirement can be significantly improved (saving around 45% cycles in the online-computation and 60% RAM space for precomputed values) from AES-128, thanks to its smaller number of AND gates. Besides the security proof by hand, we provide formal verifications for the multiplication and T-test evaluations for the masked implementations of AES and SKINNY. Because of the structure of the new masked multiplication, our formal verification can be performed for security orders up to 16.