International Association for Cryptologic Research

International Association
for Cryptologic Research

IACR News item: 03 November 2025

Ruben Niederhagen, Hoang Nguyen Hien Pham
ePrint Report ePrint Report
This work improves upon the instruction set extension proposed in the paper "Towards ML-KEM and ML-DSA on OpenTitan", in short OTBNTW, for OpenTitan’s big number coprocessor OTBN. OTBNTW introduces a dedicated vector instruction for prime-field Montgomery multiplication, with a high multi-cycle latency and a relatively low utilization of the underlying integer multiplication unit. The design targets post-quantum cryptographic schemes ML-KEM and ML-DSA, which rely on 12-bit and 23-bit prime field arithmetic, respectively. We improve the efficiency of the Montgomery multiplication by fully exploiting existing integer multiplication resources and move modular multiplication from hardware back to software by providing more powerful and versatile integer-multiplication vector instructions. This enables us not only to reduce the overall computational overhead through lazy reduction in software but also to improve performance in other functions beyond finite-field arithmetic. We provide two variants of our instruction set extension, each offering different trade-offs between resource usage and performance. For ML-KEM and ML-DSA, we achieve a speedup of up to 17% in cycle count, with an ASIC area increase of up to 6% and an FPGA resource usage increase of up to 4% more LUT, 20% more CARRY4, 1% more FF, and the same number of DSP compared to OTBNTW. Overall, we significantly reduce the ASIC time-area product, if the designs are clocked at their individual maximum frequency, and at least match that of OTBNTW, if the designs are clocked at the same frequency.
Expand

Additional news items may be found on the IACR news page.