## CryptoDB

### Francisco Rodríguez-Henríquez

#### Publications

**Year**

**Venue**

**Title**

2019

JOFC

Koblitz Curves over Quadratic Fields
Abstract

In this work, we retake an old idea that Koblitz presented in his landmark paper (Koblitz, in: Proceedings of CRYPTO 1991. LNCS, vol 576, Springer, Berlin, pp 279–287, 1991 ), where he suggested the possibility of defining anomalous elliptic curves over the base field $${\mathbb {F}}_4$$ F 4 . We present a careful implementation of the base and quadratic field arithmetic required for computing the scalar multiplication operation in such curves. We also introduce two ordinary Koblitz-like elliptic curves defined over $${\mathbb {F}}_4$$ F 4 that are equipped with efficient endomorphisms. To the best of our knowledge, these endomorphisms have not been reported before. In order to achieve a fast reduction procedure, we adopted a redundant trinomial strategy that embeds elements of the field $${\mathbb {F}}_{4^{m}},$$ F 4 m , with m a prime number, into a ring of higher order defined by an almost irreducible trinomial. We also suggest a number of techniques that allow us to take full advantage of the native vector instructions of high-end microprocessors. Our software library achieves the fastest timings reported for the computation of the timing-protected scalar multiplication on Koblitz curves, and competitive timings with respect to the speed records established recently in the computation of the scalar multiplication over binary and prime fields.

2014

EPRINT

2010

EPRINT

High-Speed Software Implementation of the Optimal Ate Pairing over Barreto-Naehrig Curves
Abstract

This paper describes the design of a fast software library for the computation of the optimal ate pairing on a Barreto--Naehrig elliptic curve. Our library is able to compute the optimal ate pairing over a $254$-bit prime field $\mathbb{F}_{p}$, in just $2.63$ million of clock cycles on a single core of an Intel Core i7 $2.8$GHz processor, which implies that the pairing computation takes $0.942$msec. We are able to achieve this performance by a careful implementation of the base field arithmetic through the usage of the customary Montgomery multiplier for prime fields. The prime field is constructed via the Barreto--Naehrig polynomial parametrization of the prime $p$ given as, $p = 36t^4 +36t^3 +24t^2 +6t+1$, with $t = 2^{62} - 2^{54} + 2^{44}$. This selection of $t$ allows us to obtain important savings for both the Miller loop as well as the final exponentiation steps of the optimal ate pairing.

2009

EPRINT

Low Complexity Cubing and Cube Root Computation over $\F_{3^m}$ in Standard Basis
Abstract

We present low complexity formulae for the computation
of cubing and cube root over $\F_{3^m}$ constructed using special classes of trinomials, tetranomials and pentanomials.
We show that for all those special classes of polynomials, cube root operation
has the same area and time complexity as field cubing when implemented in hardware or software platforms.

2009

CHES

2008

EPRINT

A Comparison Between Hardware Accelerators for the Modified Tate Pairing over $\mathbb{F}_{2^m}$ and $\mathbb{F}_{3^m}$
Abstract

In this article we propose a study of the modified Tate pairing in characteristics two and three. Starting from the $\eta_T$ pairing introduced by Barreto {\em et al.} (Des Codes Crypt, 2007), we detail various algorithmic improvements in the case of characteristic two. As far as characteristic three is concerned, we refer to the survey by Beuchat {\em et al.} (ePrint 2007-417). We then show how to get back to the modified Tate pairing at almost no extra cost. Finally, we explore the trade-offs involved in the hardware implementation of this pairing for both characteristics two and three. From our experiments, characteristic three appears to have a slight advantage over characteristic two.

2008

EPRINT

A Pipelined Karatsuba-Ofman Multiplier over GF($3^{97}$) Amenable for Pairing Computation
Abstract

We present a subquadratic ternary field multiplier based on the combination of several variants of the Karatsuba-Ofman scheme
recently published. Since one of the most relevant applications for this kind of multipliers is pairing computation,
where several field multiplications need to be computed at once, we decided to design a $k$-stage pipeline
structure for $k=1,\ldots,4$, where each stage is composed of a 49-trit polynomial multiplier unit. That
architecture can compute an average of $k$ field multiplications every three clock cycles, which implies that our
four-stage pipeline design can perform more than one field multiplication per clock cycle. When implemented in
a Xilinx Virtex V XC5VLX330 FPGA device, this multiplier can compute one field multiplication over \gf($3^{97}$)
in just $11.47$ns.

2007

EPRINT

Reconfigurable Hardware Implementations of Tweakable Enciphering Schemes
Abstract

Tweakable enciphering schemes are length preserving block cipher
modes of operation that provide a strong pseudo-random permutation.
It has been suggested that these schemes can be used as the main
building blocks for achieving in-place disk encryption. In the past
few years there has been an intense research activity towards
constructing secure and efficient tweakable enciphering schemes.
But, actual experimental performance data of these newly proposed
schemes are yet to be reported. Accordingly, in this paper we
present optimized FPGA implementations of five tweakable enciphering
schemes, namely, HCH, HCTR, XCB, EME and TET, using a 128-bit AES
core as the underlying block cipher. We report performance timings
of these modes when using both, pipelined and sequential AES
structures. The universal polynomial hash function included in the
specification of HCH, HCHfp (a variant of HCH), HCTR, XCB and TET,
was implemented using a Karatsuba-Ofman multiplier as the main
building block. We provide detailed analyses of each of the schemes
and their experimental performances achieved in various scenarios.
Our experiments show that a sequential AES core is not an attractive
option for the design of these modes as it leads to rather poor
throughputs. In contrast, by using an encryption/decryption
pipelined AES core we get a throughput of 3.67 Gbps for HCTR and by
using a encryption only pipeline AES core we get a throughput of
5.71 Gbps for EME. The performance results reported in this paper
provide experimental evidence that hardware implementations of
tweakable enciphering schemes can actually match and even outperform
the data rates achieved by state-of-the-technology disk controllers,
thus showing that they might be used for achieving provably secure
in-place hard disk encryption.

2006

EPRINT

Parallel Itoh-Tsujii Multiplicative Inversion Algorithm for a Special Class of Trinomials
Abstract

In this contribution, we derive a novel parallel formulation of the standard Itoh-Tsujii algorithm for multiplicative inverse computation over GF($2^m$). The main building blocks used by our algorithm are: field multiplication, field squaring and field square root operators. It achieves its best performance when using a special class of irreducible trinomials, namely, $P(X) = X^m + X^k + 1$, with $m$ and $k$ odd numbers and when implemented in hardware
platforms. Under these conditions, our experimental results show that
our parallel version of the Itoh-Tsujii algorithm yields a speedup of about 30% when compared with the standard version of it. Implemented in a Virtex 3200E FPGA device, our design is able to compute multiplicative inversion over GF($2^193$) after 20 clock cycles in about $0.94\mu$S.

2006

EPRINT

Low Complexity Bit-Parallel Square Root Computation over GF($2^m$) for all Trinomials
Abstract

In this contribution we introduce a low-complexity bit-parallel algorithm for computing square roots over binary extension fields. Our proposed method can be applied for any type of irreducible polynomials. We derive explicit formulae for the space and time complexities associated to the square root operator when working with binary extension fields generated using irreducible trinomials. We show that for those finite fields, it is possible to compute the square root of an arbitrary field element with equal or better hardware efficiency than the one associated to the field squaring operation. Furthermore, a practical application of the square root operator in the domain of field exponentiation computation is presented. It is shown that by using as building blocks squarers, multipliers and square root blocks, a parallel version of the classical square-and-multiply exponentiation algorithm can be obtained. A hardware implementation of that parallel version may provide a speedup of up to 50\% percent when compared with the traditional version.

#### Program Committees

- CHES 2020
- CHES 2019
- CHES 2009

#### Coauthors

- Gora Adj (1)
- Omran Ahmadi (1)
- Diego F. Aranha (3)
- Jean-Luc Beuchat (4)
- Nicolas Brisebarre (1)
- Daniel Cervantes-Vázquez (1)
- Debrup Chakraborty (1)
- Nidia Cortez-Duarte (1)
- Nareli Cruz-Cortés (1)
- Jérémie Detrey (2)
- Jorge Enrique González Díaz (1)
- Nicolas Estibals (1)
- Armando Faz-Hernández (1)
- Darrel Hankerson (1)
- Julio López (6)
- Cuauhtemoc Mancillas-Lopez (1)
- Alfred Menezes (1)
- Shigeo Mitsunari (2)
- Guillermo Morales-Luna (2)
- Eiji Okamoto (4)
- Thomaz Oliveira (5)
- Luis J. Dominguez Perez (1)
- Ana H. Sánchez-Ramírez (1)
- Nazar A. Saqib (1)
- Jonathan Taverne (1)
- Tadanori Teruya (2)
- Eric Zavattoni (1)