Efficient Private Circuits with Precomputation
At CHES 2022, Wang et al. described a new paradigm for masked implementations using private circuits, in which most intermediates can be precomputed before the input shares are accessed, significantly accelerating the online execution of masked functions. However, the masking scheme they proposed was designed primarily for cost amortization, and its (limited) suitability for the precomputation-based paradigm was only a side benefit. This paper aims to provide an efficient, reliable, easy-to-use, and precomputation-compatible masking scheme. We propose a new masked multiplication over the finite field F_q that is suitable for precomputation, and prove its security in the composable notion called Probing-Isolating Non-Inference (PINI). In particular, operations in the binary field (e.g., AND and XOR) are obtained by setting q = 2, which allows bitsliced implementations, an approach that has been shown to be quite efficient in software. We apply the new masking scheme to masked implementations of the AES and SKINNY block ciphers on the ARM Cortex M architecture. The performance results show that the new scheme yields a significant speed-up compared with state-of-the-art implementations. For SKINNY with a 64-bit block size, speed and RAM requirements improve significantly over AES-128 (around 45% fewer cycles in the online computation and 60% less RAM for precomputed values), thanks to its smaller number of AND gates. Besides the security proof by hand, we provide formal verification of the multiplication and T-test evaluations of the masked implementations of AES and SKINNY. Owing to the structure of the new masked multiplication, our formal verification can be performed for security orders up to 16.
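The binary-field case (q = 2) with bitslicing can be illustrated by a small sketch. The code below is not the multiplication gadget proposed in the paper and has no precomputation phase; as a stand-in it uses the classic ISW multiplication (Ishai, Sahai and Wagner), applied word-wise so that each 32-bit word carries 32 independent masked bits. All names (`share`, `unshare`, `masked_xor`, `isw_and`, `WORD_BITS`) are hypothetical, and Python offers no constant-time or leakage guarantees, so this only shows the arithmetic.

```python
import secrets
from functools import reduce
from operator import xor

WORD_BITS = 32                  # assumed bitslice width: 32 parallel bits per word
MASK = (1 << WORD_BITS) - 1


def share(x, d):
    """Split word x into d + 1 Boolean shares whose XOR equals x."""
    rand = [secrets.randbits(WORD_BITS) for _ in range(d)]
    last = reduce(xor, rand, x & MASK)
    return rand + [last]


def unshare(shares):
    """Recombine shares (for testing only; never done on a real device)."""
    return reduce(xor, shares, 0)


def masked_xor(a, b):
    """XOR is linear over F_2, so it is computed share by share."""
    return [ai ^ bi for ai, bi in zip(a, b)]


def isw_and(a, b):
    """Classic ISW masked AND (a stand-in, not the paper's PINI gadget).

    Uses d(d + 1)/2 fresh random words for d + 1 shares.
    """
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(WORD_BITS)
            c[i] ^= r
            # Parenthesization follows ISW: mask the first cross term
            # with r before adding the second one.
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c
```

With `d = 3`, calling `isw_and(share(x, 3), share(y, 3))` returns four shares whose XOR is `x & y` in every bit position, which is what makes the bitsliced layout attractive: one gadget call evaluates 32 AND gates at once.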
Side-Channel Masking with Common Shares
To counter side-channel attacks, a masking scheme randomly encodes key-dependent variables into several shares, and transforms operations into their masked counterparts (called gadgets) operating on shares. This provably achieves the de facto standard notion of probing security. We continue the long line of work seeking to reduce the overhead of masking. Our main contribution is a new masking scheme over finite fields in which shares of different variables have a part in common. This enables the reuse of randomness and variables across different gadgets, and reduces the total cost of the masked implementation. For security order d and circuit size l, the randomness requirement and computational complexity of our scheme are Õ(d^2) and Õ(ld^2) respectively, strictly improving upon the state-of-the-art Õ(d^2) and Õ(ld^3) of Coron et al. at Eurocrypt 2020. A notable feature of our scheme is that it enables a new paradigm in which many intermediates can be precomputed before executing the masked function. The precomputation consumes Õ(ld^2) and produces Õ(ld) variables to be stored in RAM. The cost of the subsequent (online) computation is reduced to Õ(ld), effectively speeding up, e.g., challenge-response authentication protocols. We showcase our method on the AES on the ARM Cortex M architecture and perform a T-test evaluation. Our results show a speed-up during the online phase compared with state-of-the-art implementations, at the cost of acceptable RAM consumption and precomputation time. To prove the security of our scheme, we propose a new security notion that intrinsically supports the reuse of randomness and variables across gadgets and bridges the security of parallel compositions of gadgets to general compositions, which may be of independent interest.
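The encoding of a variable into shares, and the idea of a gadget operating on shares, can be sketched as follows. This is a generic additive masking over a prime field, not the common-share encoding introduced in the paper; the modulus `Q = 257` and the names `encode`, `decode`, `add_gadget` and `scale_gadget` are assumptions chosen for illustration.

```python
import secrets

Q = 257  # assumed prime modulus, so integers mod Q form the field F_q


def encode(x, d, q=Q):
    """Encode x into d + 1 additive shares that sum to x modulo q."""
    shares = [secrets.randbelow(q) for _ in range(d)]
    shares.append((x - sum(shares)) % q)
    return shares


def decode(shares, q=Q):
    """Recombine the shares (for checking correctness only)."""
    return sum(shares) % q


def add_gadget(a, b, q=Q):
    """Masked addition: linear, so each share is processed independently."""
    return [(ai + bi) % q for ai, bi in zip(a, b)]


def scale_gadget(a, k, q=Q):
    """Masked multiplication by a public constant k, also share-wise."""
    return [(k * ai) % q for ai in a]
```

For example, `decode(add_gadget(encode(5, 4), encode(250, 4)))` yields 255 regardless of the random shares drawn. Linear gadgets like these need no fresh randomness; the quadratic costs quoted in the abstract come from nonlinear gadgets such as multiplication.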