International Association for Cryptologic Research

International Association
for Cryptologic Research

CryptoDB

Kris Gaj

Publications

Year
Venue
Title
2015
EPRINT
2014
EPRINT
2014
EPRINT
2014
CHES
2011
CHES
2010
EPRINT
Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates Using FPGAs
Performance in hardware has been demonstrated to be an important factor in the evaluation of candidates for cryptographic standards. Up to now, no consensus exists on how such an evaluation should be performed in order to make it fair, transparent, practical, and acceptable for the majority of the cryptographic community. In this report, we formulate a proposal for a fair and comprehensive evaluation methodology, and apply it to the comparison of hardware performance of 14 Round~2 SHA-3 candidates. The most important aspects of our methodology include the definition of clear performance metrics, the development of a uniform and practical interface, generation of multiple sets of results for several representative FPGA families from two major vendors, and the application of a simple procedure to convert multiple sets of results into a single ranking. The VHDL codes for 256 and 512-bit variants of all 14 SHA-3 Round 2 candidates and the old standard SHA-2 have been developed and thoroughly verified. These codes have been then used to evaluate the relative performance of all aforementioned algorithms using seven modern families of Field Programmable Gate Arrays (FPGAs) from two major vendors, Xilinx and Altera. All algorithms have been evaluated using four performance measures: the throughput to area ratio, throughput, area, and the execution time for short messages. Based on these results, the 14 Round 2 SHA-3 candidates have been divided into several groups depending on their overall performance in FPGAs.
2010
CHES
2008
PKC
2007
EPRINT
AN OPTIMIZED HARDWARE ARCHITECTURE OF MONTGOMERY MULTIPLICATION ALGORITHM
Montgomery multiplication is one of the fundamental operations used in cryptographic algorithms, such as RSA and Elliptic Curve Cryptosystems. At CHES 1999, Tenca and Koc introduced a now-classical architecture for implementing Montgomery multiplication in hardware. With parameters optimized for minimum latency, this architecture performs a single Montgomery multiplication in approximately 2n clock cycles, where n is the size of operands in bits. In this paper we propose and discuss an optimized hardware architecture performing the same operation in approximately n clock cycles. Our architecture is based on pre-computing partial results using two possible assumptions regarding the most significant bit of the previous word, and is only marginally more demanding in terms of the circuit area. The new radix-2 architecture can be extended for the case of radix-4, while preserving a factor of two speed-up over the corresponding radix-4 design by Tenca, Todorov, and Koc from CHES 2001. Our architecture has been verified by modeling it in Verilog-HDL, implementing it using Xilinx Virtex-II 6000 FPGA, and experimentally testing it using SRC-6 reconfigurable computer.
2006
CHES
2006
EPRINT
FPGA Accelerated Tate Pairing Based Cryptosystems over Binary Fields
Though the implementation of the Tate pairing is commonly believed to be computationally more intensive than other cryptographic operations, such as ECC point multiplication, there has been a substantial progress in speeding up the Tate pairing computations. Because of their inherent parallelism, the existing Tate pairing algorithms are very suitable for hardware implementation aimed at achieving a high operation speed. Supersingular elliptic curves over binary fields are good candidates for hardware implementation due to their simple underlying algorithms and binary arithmetic. In this paper we propose efficient Tate pairing implementations over binary fields $\mathbb F_{2^{239}}$ and $\mathbb F_{2^{283}}$ via FPGA. Though our field sizes are larger than those used in earlier architectures with the same security strength based on cubic elliptic curves or binary hyperelliptic curves, fewer multiplications in the underlying field are required, so that the computational latency for one pairing can be reduced. As a result, our pairing accelerators implemented via FPGA can run 15-to-25 times faster than other FPGA realizations at the same level of security strength, and at the same time achieve lower product of latency by area.
2004
CHES
2003
CHES
2003
EUROCRYPT

Program Committees

CHES 2015
CHES 2014
Asiacrypt 2013
CHES 2013
CHES 2012
CHES 2011
CHES 2010
CHES 2009 (Program chair)
CHES 2007
CHES 2005
CHES 2002
CHES 2001
CHES 2000