SIDH on ARM: Faster Modular Multiplications for Faster Post-Quantum Supersingular Isogeny Key Exchange
We present high-speed implementations of the post-quantum supersingular isogeny Diffie-Hellman key exchange (SIDH) and the supersingular isogeny key encapsulation (SIKE) protocols for 32-bit ARMv7-A processors with NEON support. The high performance of our implementations is mainly due to carefully optimized multiprecision and modular arithmetic that finely integrates both ARM and NEON instructions in order to reduce the number of pipeline stalls and memory accesses, and a new Montgomery reduction technique that combines the use of the UMAAL instruction with a variant of the hybrid-scanning approach. In addition, we present efficient implementations of SIDH and SIKE for 64-bit ARMv8-A processors, based on a high-speed Montgomery multiplication that leverages the power of 64-bit instructions. Our experimental results consolidate the practicality of supersingular isogeny-based protocols for many real-world applications. For example, a full key-exchange execution of SIDHp503 is performed in about 176 million cycles on an ARM Cortex-A15 from the ARMv7-A family (i.e., 88 milliseconds @2.0GHz). On an ARM Cortex-A72 from the ARMv8-A family, the same operation can be carried out in about 90 million cycles (i.e., 45 milliseconds @1.992GHz). All our software is protected against timing and cache attacks. The techniques for modular multiplication presented in this work have broad applications to other cryptographic schemes.