International Association for Cryptologic Research

CryptoDB

Highly Vectorized SIKE for AVX-512

Authors:
Hao Cheng, DCS and SnT, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Georgios Fotiadis, DCS and SnT, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Johann Großschädl, DCS and SnT, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Peter Y. A. Ryan, DCS and SnT, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Download:
DOI: 10.46586/tches.v2022.i2.41-68
URL: https://tches.iacr.org/index.php/TCHES/article/view/9480
Presentation: Slides
Abstract: It is generally accepted that a large-scale quantum computer would be capable of breaking any public-key cryptosystem used today, thereby posing a serious threat to the security of the Internet’s public-key infrastructure. The US National Institute of Standards and Technology (NIST) addresses this threat with an open process for the standardization of quantum-safe key establishment and signature schemes, which is now in the final phase of the evaluation of candidates. SIKE (an abbreviation of Supersingular Isogeny Key Encapsulation) is one of the alternate candidates under evaluation and distinguishes itself from other candidates by its relatively short key lengths and relatively high computing costs. In this paper, we analyze how the latest generation of Intel’s Advanced Vector Extensions (AVX), in particular AVX-512IFMA, can be used to minimize the latency (resp. maximize the throughput) of the SIKE key encapsulation mechanism when executed on Ice Lake CPUs based on the Sunny Cove microarchitecture. We present various techniques to parallelize and speed up the base/extension field arithmetic, point arithmetic, and isogeny computations performed by SIKE. All these parallel processing techniques are combined in AvxSike, a highly optimized implementation of SIKE using Intel AVX-512IFMA instructions. Our experiments indicate that AvxSike instantiated with the SIKEp503 parameter set is approximately 1.5 times faster than the best AVX-512IFMA-based SIKE software in the literature to date. When executed on an Intel Core i3-1005G1 CPU, AvxSike outperforms the x64 assembly implementation of SIKE contained in Microsoft’s SIDHv3.4 library by a factor of about 2.5 for key generation and decapsulation, while encapsulation is even 3.2 times faster.
BibTeX
@article{tches-2022-31983,
  title={Highly Vectorized SIKE for AVX-512},
  journal={IACR Transactions on Cryptographic Hardware and Embedded Systems},
  publisher={Ruhr-Universität Bochum},
  volume={2022},
  number={2},
  pages={41--68},
  url={https://tches.iacr.org/index.php/TCHES/article/view/9480},
  doi={10.46586/tches.v2022.i2.41-68},
  author={Hao Cheng and Georgios Fotiadis and Johann Großschädl and Peter Y. A. Ryan},
  year=2022
}