

# Benchmarking of Cryptographic Hardware

## Kris Gaj, George Mason University

### **Lessons from the Past**

# Various groups tend to choose independently the same (or similar) devices and tools

Round 2 of AES contest, 1999-2000: Xilinx Virtex 1000, Xilinx ISE





# Results for ASICs match results for FPGAs very well, and both differ greatly from results for software



Serpent fastest in hardware, slowest in software

# Differences in hardware efficiency of cryptographic algorithms (even of the same type) are very significant

eSTREAM contest, 2007-2008: FPGA, Xilinx Spartan 3



#### Hardware results matter!

#### Round 2 of AES Contest, 2000

#### **Speed in FPGAs**

#### Votes at the AES 3 conference



# **Plans for the Future**

### **Modern Benchmarking**

**Software:** **eBACS** (D. Bernstein, T. Lange)

**FPGAs**

**ASICs**

#### **Our Solution**

#### ATHENa – Automated Tool for Hardware EvaluatioN



A set of Perl scripts aimed at the AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms

Currently under development at George Mason University.

More details about the project at http://cryptography.gmu.edu/athena

#### **Basic Dataflow of ATHENa**





### **ATHENa Major Features (1)**

- synthesis, implementation, and timing analysis in the batch mode
- support for devices and tools of multiple FPGA vendors:







- generation of results for multiple families of FPGAs of a given vendor









- automated choice of a best-matching device within a given family
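The device-selection feature can be pictured as picking the smallest part in a family whose resources cover the design's needs. The following is a minimal sketch under stated assumptions, not ATHENa's actual implementation (which is written in Perl): the `Device` type, the part names, and all capacities are invented placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Device:
    name: str    # hypothetical part name, not a real catalogue entry
    slices: int  # logic capacity
    brams: int   # block-memory capacity


def smallest_fitting_device(
    family: Iterable[Device], need_slices: int, need_brams: int
) -> Optional[Device]:
    """Return the smallest device that fits the design, or None if none does."""
    candidates = [
        d for d in family if d.slices >= need_slices and d.brams >= need_brams
    ]
    # "Smallest" is taken here to mean fewest slices, then fewest BRAMs.
    return min(candidates, key=lambda d: (d.slices, d.brams), default=None)


# Placeholder family; the capacities are invented for illustration only.
FAMILY = [
    Device("dev_small", 1_000, 4),
    Device("dev_medium", 4_000, 16),
    Device("dev_large", 16_000, 64),
]

choice = smallest_fitting_device(FAMILY, need_slices=1_500, need_brams=8)
print(choice.name if choice else "no device fits")  # dev_medium
```

Ordering candidates by slice count is only one possible notion of "best-matching"; price or package could equally serve as the key.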



### **ATHENa Major Features (2)**

- automated verification of the design through simulation in the batch mode



- exhaustive search for optimum options of the tools

- heuristic optimization algorithms aimed at maximizing selected performance measures (e.g., speed, area, speed/area ratio, power, cost, etc.)
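An exhaustive search over tool options amounts to evaluating every combination of option values and keeping the best one. The sketch below illustrates the idea only: the option names, their values, and `fake_evaluate` (a stand-in for a real synthesis and implementation run) are all hypothetical, and ATHENa itself is written in Perl.

```python
import itertools


def exhaustive_search(option_space, evaluate):
    """Try every combination of option values; return (best_options, best_score)."""
    names = list(option_space)
    best_options, best_score = None, float("-inf")
    for values in itertools.product(*(option_space[n] for n in names)):
        options = dict(zip(names, values))
        score = evaluate(options)
        if score > best_score:
            best_options, best_score = options, score
    return best_options, best_score


# Hypothetical option names and values; real tools define their own flags.
OPTION_SPACE = {
    "opt_mode": ["speed", "area"],
    "effort": [1, 2, 3],
}


def fake_evaluate(options):
    """Stand-in for a tool run: returns a made-up maximum clock frequency in MHz."""
    base = 100.0 if options["opt_mode"] == "speed" else 80.0
    return base + 5.0 * options["effort"]


best, score = exhaustive_search(OPTION_SPACE, fake_evaluate)
print(best, score)  # {'opt_mode': 'speed', 'effort': 3} 115.0
```

The number of runs grows as the product of the option counts, which is why heuristic algorithms become attractive once the option space is large.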

#### Multi-Pass Place-and-Route Analysis

GMU SHA-512, Xilinx Virtex 5

100 runs for different placement starting points
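A multi-pass analysis of this kind repeats place-and-route with different starting points and then examines the spread of results. The following is a rough sketch: `fake_place_and_route` is a hypothetical stand-in that returns a made-up minimum clock period, where a real flow would invoke the vendor tool with the given starting point.

```python
import random
import statistics


def fake_place_and_route(seed):
    """Stand-in for one place-and-route run; returns a made-up clock period in ns."""
    rng = random.Random(seed)  # deterministic per starting point
    return 5.0 + rng.uniform(-0.5, 0.5)


def multi_pass(run, seeds):
    """Run once per starting point; return (best_seed, {seed: period_ns})."""
    results = {seed: run(seed) for seed in seeds}
    best_seed = min(results, key=results.get)  # shortest period is best
    return best_seed, results


best_seed, results = multi_pass(fake_place_and_route, range(1, 101))
periods = list(results.values())
print(f"best starting point {best_seed}: {results[best_seed]:.3f} ns")
print(
    f"min {min(periods):.3f} ns, "
    f"median {statistics.median(periods):.3f} ns, "
    f"max {max(periods):.3f} ns"
)
```

Reporting the median and the extremes alongside the best run shows how much of a reported result may be due to a lucky starting point.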



#### Dependence of Results on Requested Clock Frequency



#### My Favorite Hardware Performance Metrics:

Mbit/s for Throughput

ns for Latency

Allows for easy cross-comparison among implementations in software (microprocessors), FPGAs (various vendors), ASICs (various libraries)
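Both metrics follow directly from the clock frequency and the number of clock cycles per block. The sketch below assumes a basic iterative (non-pipelined) core that processes one block at a time; the function names and the example numbers are illustrative, not measured results.

```python
def throughput_mbit_s(block_bits, cycles_per_block, clock_mhz):
    """Throughput in Mbit/s: bits per block times blocks per microsecond."""
    return block_bits * clock_mhz / cycles_per_block


def latency_ns(cycles_per_block, clock_mhz):
    """Latency in ns: cycles per block times the clock period (1000 / MHz)."""
    return cycles_per_block * 1000.0 / clock_mhz


# Made-up example: 128-bit block, 10 cycles per block, 100 MHz clock.
print(throughput_mbit_s(128, 10, 100.0))  # 1280.0
print(latency_ns(10, 100.0))              # 100.0
```

Because neither unit refers to clock cycles or to a particular device, the same numbers can be placed next to software and ASIC results.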

#### How to measure hardware cost in FPGAs?

#### 1. Stand-alone cryptographic core on FPGA



Cost of the smallest FPGA that can fit the core.

Unit: USD (FPGA vendors would need to publish the MSRP, manufacturer's suggested retail price, of their chips; not very likely), or size of the chip in mm<sup>2</sup> (easy to obtain)

#### 2. Part of an FPGA System On-Chip

Vector:

- (CLB slices, BRAMs, MULs, DSP units) for Xilinx
- (LEs, memory bits, PLLs, MULs, DSP units) for Altera
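One consequence of a vector-valued cost is that two implementations are not always comparable: one is clearly no more expensive than another only if it uses no more of every resource. A small sketch for the Xilinx vector follows; the `XilinxCost` type and all resource counts are made up for illustration.

```python
from typing import NamedTuple


class XilinxCost(NamedTuple):
    clb_slices: int
    brams: int
    muls: int
    dsp_units: int


def no_worse(a, b):
    """True if a uses no more of every resource than b."""
    return all(x <= y for x, y in zip(a, b))


# Invented resource counts for three hypothetical implementations.
a = XilinxCost(clb_slices=1200, brams=4, muls=0, dsp_units=0)
b = XilinxCost(clb_slices=1500, brams=4, muls=0, dsp_units=2)
c = XilinxCost(clb_slices=900, brams=8, muls=0, dsp_units=0)

print(no_worse(a, b))                  # True: a is cheaper or equal everywhere
print(no_worse(a, c), no_worse(c, a))  # False False: a and c are incomparable
```

Cases like `a` versus `c` cannot be ranked without an additional, system-specific weighting of the resources.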

#### 3. FPGA prototype of an ASIC implementation

Force the implementation to use only reconfigurable logic (no DSPs or multipliers, distributed memory instead of BRAM), and use **CLB slices** as the metric. [LEs for Altera]

#### **Level of openness**

Labels from the slide diagram:

- Source files
- "No source proofs"
- Testimonies
- Netlists
- Current situation: conference/journal papers
- Results
- FPGA device
- Tool names + versions
- Options of tools
- Constraint files
- Interfaces
- Testbenches
- ATHENa space

#### **No Source Proof**

