## CryptoDB

### Paper: Packed Multiplication: How to Amortize the Cost of Side-channel Masking?

Authors: Weijia Wang Chun Guo François-Xavier Standaert Yu Yu Gaëtan Cassiers DOI: 10.1007/978-3-030-64837-4_28 Search ePrint Search Google Slides Higher-order masking countermeasures provide strong provable security against side-channel attacks at the cost of incurring significant overheads, which largely hinders its applicability. Previous works towards remedying cost mostly concentrated on local'' calculations, i.e., optimizing the cost of computation units such as a single AND gate or a field multiplication. This paper explores a complementary global'' approach, i.e., considering multiple operations in the masked domain as a batch and reducing randomness and computational cost via amortization. In particular, we focus on the amortization of $\ell$ parallel field multiplications for appropriate integer $\ell > 1$, and design a kit named {\it packed multiplication} for implementing such a batch. Higher-order masking countermeasures provide strong provable security against side-channel attacks at the cost of incurring significant overheads, which largely hinders its applicability. Previous works towards remedying cost mostly concentrated on local'' calculations, i.e., optimizing the cost of computation units such as a single AND gate or a field multiplication. This paper explores a complementary global'' approach, i.e., considering multiple operations in the masked domain as a batch and reducing randomness and computational cost via amortization. In particular, we focus on the amortization of $\ell$ parallel field multiplications for appropriate integer $\ell > 1$, and design a kit named {\it packed multiplication} for implementing such a batch. For $\ell+d\leq2^m$, when $\ell$ parallel multiplications over $\mathbb{F}_{2^{m}}$ with $d$-th order probing security are implemented, packed multiplication consumes $d^2+2\ell d + \ell$ bilinear multiplications and $2d^2 + d(d+1)/2$ random field variables, outperforming the state-of-the-art results with $O(\ell d^2)$ multiplications and $\ell \left \lfloor d^2/4\right \rfloor + \ell d$ randomness. To prove $d$-probing security for packed multiplications, we introduce some weaker security notions for multiple-inputs-multiple-outputs gadgets and use them as intermediate steps, which may be of independent interest. As parallel field multiplications exist almost everywhere in symmetric cryptography, lifting optimizations from local'' to global'' substantially enlarges the space of improvements. To demonstrate, we showcase the method on the AES Subbytes step, GCM and TET (a popular disk encryption). Notably, when $d=8$, our implementation of AES Subbytes in ARM Cortex M architecture achieves a gain of up to $33\%$ in total speeds and saves up to $68\%$ random bits than the state-of-the-art bitsliced implementation reported at ASIACRYPT~2018.
##### BibTeX
@article{asiacrypt-2020-30708,
title={Packed Multiplication: How to Amortize the Cost of Side-channel Masking?},
booktitle={Advances in Cryptology - ASIACRYPT 2020},
publisher={Springer},
doi={10.1007/978-3-030-64837-4_28},
author={Weijia Wang and Chun Guo and François-Xavier Standaert and Yu Yu and Gaëtan Cassiers},
year=2020
}