*00:17* [Pub][ePrint]
The Q-curve Construction for Endomorphism-Accelerated Elliptic Curves, by Benjamin Smith
We give a detailed account of the use of \(\mathbb{Q}\)-curve reductions to construct elliptic curves over \(\mathbb{F}_{p^2}\) with efficiently computable endomorphisms, which can be used to accelerate elliptic curve-based cryptosystems in the same way as Gallant--Lambert--Vanstone (GLV) and Galbraith--Lin--Scott (GLS) endomorphisms. Like GLS (which is a degenerate case of our construction), this approach offers the advantage over GLV of a much wider range of curves to select from, which makes it possible to find secure group orders when \(p\) is fixed for efficient implementation.

Unlike GLS, we also offer the possibility of constructing twist-secure curves.

We construct several one-parameter families of elliptic curves over \(\mathbb{F}_{p^2}\) equipped with efficient endomorphisms for every \(p > 3\), and exhibit examples of twist-secure curves over \(\mathbb{F}_{p^2}\) for the efficient Mersenne prime \(p = 2^{127}-1\).
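
As an illustration of why a Mersenne prime such as \(p = 2^{127}-1\) is called efficient (this sketch is not taken from the paper, and the function names are hypothetical): because \(2^{127} \equiv 1 \pmod{p}\), reduction modulo \(p\) needs only shifts, masks and additions rather than a general division.

```python
P = (1 << 127) - 1  # the Mersenne prime 2^127 - 1


def reduce_mersenne(x: int) -> int:
    """Reduce a non-negative integer modulo 2^127 - 1 without division."""
    while x >> 127:
        # 2^127 is congruent to 1 mod P, so the high part folds onto the low part.
        x = (x & P) + (x >> 127)
    # After folding, x is in [0, P]; P itself represents 0.
    return 0 if x == P else x


def mulmod(a: int, b: int) -> int:
    """Multiply two residues modulo 2^127 - 1 (illustrative helper)."""
    return reduce_mersenne(a * b)
```

A real implementation would work on fixed-width limbs in constant time; the Python integers here only show the folding identity.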

*00:17* [Pub][ePrint]
Faster Binary-Field Multiplication and Faster Binary-Field MACs, by Daniel J. Bernstein and Tung Chou
This paper shows how to securely authenticate messages using just 29 bit operations per authenticated bit, plus a constant overhead per message. The authenticator is a standard type of "universal" hash function providing information-theoretic security; what is new is computing this type of hash function at very high speed.

At a lower level, this paper shows how to multiply two elements of a field of size \(2^{128}\) using just \(9062 \approx 71 \cdot 128\) bit operations, and how to multiply two elements of a field of size \(2^{256}\) using just \(22164 \approx 87 \cdot 256\) bit operations. This performance relies on a new representation of field elements and new FFT-based multiplication techniques.

This paper's constant-time software uses just 1.89 Core 2 cycles per byte to authenticate very long messages. On a Sandy Bridge it takes 1.43 cycles per byte, without using Intel's PCLMULQDQ polynomial-multiplication hardware. This is much faster than the speed records for constant-time implementations of GHASH without PCLMULQDQ (over 10 cycles/byte), even faster than Intel's best Sandy Bridge implementation of GHASH with PCLMULQDQ (1.79 cycles/byte), and almost as fast as state-of-the-art 128-bit prime-field MACs using Intel's integer-multiplication hardware (around 1 cycle/byte).
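
For readers unfamiliar with binary-field multiplication, the following is a minimal schoolbook reference, not the paper's method: the paper relies on a new representation and FFT-based techniques, whereas this sketch multiplies bit by bit and is not constant-time. It assumes the field is represented as polynomials over GF(2) modulo \(x^{128} + x^7 + x^2 + x + 1\) (the polynomial used by GHASH, here in plain bit order); the abstract does not say which representation the paper uses.

```python
# Assumed reduction polynomial: x^128 + x^7 + x^2 + x + 1.
R = (1 << 128) | 0x87


def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial over GF(2)) product of two integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce_poly(x: int) -> int:
    """Reduce a polynomial of degree < 256 modulo R."""
    for i in range(x.bit_length() - 1, 127, -1):
        if (x >> i) & 1:
            x ^= R << (i - 128)  # cancels bit i, touches only lower bits
    return x


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of the field of size 2^128."""
    return reduce_poly(clmul(a, b))


# Bit-operation counts quoted in the abstract, per output bit:
# 9062 / 128 is about 70.8 and 22164 / 256 is about 86.6.
```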

*00:17* [Pub][ePrint]
Differentially Private Linear Algebra in the Streaming Model, by Jalaj Upadhyay
The focus of this paper is on differential privacy of streaming data using sketch-based algorithms. Previous works, like Dwork et al. (ICS 2010, STOC 2010), explored random-sampling-based streaming algorithms. We work in the well-studied streaming model of computation, where the database is stored in the form of a matrix and a curator can access the database row-wise or column-wise. Dwork et al. (STOC 2010) gave an impossibility result for any non-trivial query on streamed data with respect to user-level privacy. Therefore, in this paper, we restrict our attention to event-level privacy. We provide space-optimal (up to a logarithmic factor) differentially private mechanisms in the streaming model for three basic linear algebraic tasks: matrix multiplication, linear regression, and low rank approximation, while incurring significantly less additive error.

Our approach for matrix multiplication and linear regression has some similarities with Blocki et al. (FOCS 2012) and Upadhyay (ASIACRYPT 2013) on the superficial level, but there are some subtle differences. For example, they perform an affine transformation to convert the private matrix into a set of $\{\sqrt{w/n},1\}^n$ vectors for some appropriate $w$, while we perform an input perturbation that raises the singular values of the private matrix. On a high level, the mechanism for linear regression and matrix multiplication can be seen as a private analogue of the known streaming algorithms. In order to get a streaming algorithm for low rank approximation, we have to reuse the random Gaussian matrix in a specific way. We prove that the resulting distribution also preserves differential privacy.

We do not make any assumptions, like singular value separation, as made in the earlier works of Hardt and Roth (STOC 2013) and Kapralov and Talwar (SODA 2013). Further, we do not assume normalized rows as in the work of Dwork et al. (STOC 2014). All our mechanisms, in the form presented, can also be computed in the distributed setting of Beimel, Nissim, and Omri (CRYPTO 2008).

*00:17* [Pub][ePrint]
Secure modular password authentication for the web using channel bindings, by Mark Manulis and Douglas Stebila and Nick Denham
Secure protocols for password-based user authentication are well-studied in the cryptographic literature but have failed to see widespread adoption on the Internet; most proposals to date require extensive modifications to the Transport Layer Security (TLS) protocol, making deployment challenging. Recently, a few modular designs have been proposed in which a cryptographically secure password-based mutual authentication protocol is run inside a confidential (but not necessarily authenticated) channel such as TLS; the password protocol is bound to the established channel to prevent active attacks. Such protocols are useful in practice for a variety of reasons: security no longer relies on users' ability to validate server certificates, and the protocols can potentially be implemented with no modifications to the secure channel protocol library.

We provide a systematic study of such authentication protocols. Building on recent advances in modelling TLS, we give a formal definition of the intended security goal, which we call password-authenticated and confidential channel establishment (PACCE). We show generically that combining a secure channel protocol, such as TLS, with a password authentication protocol, where the two protocols are bound together using either the transcript of the secure channel's handshake or the server's certificate, results in a secure PACCE protocol. Our prototype based on TLS is available as a cross-platform client-side Firefox browser extension and a server-side web application which can easily be installed on deployed web browsers and servers.

*00:17* [Pub][ePrint]
Augmented Learning with Errors: The Untapped Potential of the Error Term, by Rachid El Bansarkhani and Özgür Dagdelen and Johannes Buchmann
The Learning with Errors (LWE) problem has gained a lot of attention in recent years, leading to a series of new cryptographic applications. Specifically, it states that it is hard to distinguish random linear equations disguised by some small error from truly random ones. Interestingly, cryptographic primitives based on LWE often do not exploit the full potential of the error term beyond its importance for security.

To this end, we introduce a novel assumption close to LWE, namely Augmented Learning with Errors (A-LWE), which allows hiding auxiliary data injected into the error term by a technique that we call message embedding. In particular, it enables existing cryptosystems to strongly increase the message throughput per ciphertext. We show that A-LWE is for certain instantiations at least as hard as the LWE problem. This inherently leads to new cryptographic constructions providing high data load encryption and customized security properties as required, for instance, in economic environments such as stock markets or for financial transactions. The security of those constructions basically stems from the hardness of solving the A-LWE problem.

As an application we introduce (among others) the first lattice-based replayable chosen-ciphertext secure encryption scheme from A-LWE.
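
To make the role of the error term concrete, here is a toy sketch of plain LWE samples (it does not implement the paper's message embedding or A-LWE). All parameters and function names are hypothetical and far too small to be secure, and the error is drawn uniformly from a small interval instead of the discrete Gaussian usually assumed. The point is only that whoever knows the secret can strip the linear part and recover the error exactly, which is the slot the abstract proposes to fill with auxiliary data.

```python
import random


def lwe_samples(secret, count, q, bound, rng):
    """Return (samples, errors) where each sample is (a, <a, s> + e mod q)."""
    samples, errors = [], []
    for _ in range(count):
        a = [rng.randrange(q) for _ in secret]
        e = rng.randint(-bound, bound)  # toy error, not a discrete Gaussian
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
        samples.append((a, b))
        errors.append(e)
    return samples, errors


def recover_error(secret, a, b, q):
    """With the secret, b - <a, s> mod q yields the error (centered mod q)."""
    d = (b - sum(ai * si for ai, si in zip(a, secret))) % q
    return d - q if d > q // 2 else d


rng = random.Random(2014)          # fixed seed so the sketch is repeatable
q, n, bound = 97, 8, 2             # toy parameters (assumptions)
secret = [rng.randrange(q) for _ in range(n)]
samples, errors = lwe_samples(secret, 16, q, bound, rng)
recovered = [recover_error(secret, a, b, q) for a, b in samples]
```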

*00:17* [Pub][ePrint]
S-box pipelining using genetic algorithms for high-throughput AES implementations: How fast can we go?, by Lejla Batina and Domagoj Jakobovic and Nele Mentens and Stjepan Picek and Antonio de la Piedr
In the last few years, several practitioners have proposed a wide range of approaches for reducing the implementation area of the AES in hardware. However, an area-throughput trade-off that undermines high speed is not realistic for real-time cryptographic applications. In this manuscript, we explore how Genetic Algorithms (GAs) can be used for pipelining the AES substitution box based on composite field arithmetic. We implemented a framework that parses and analyzes a Verilog netlist, abstracts it as a graph of interconnected cells and generates circuit statistics on its elements and paths. With this information, the GA extracts the appropriate arrangement of Flip-Flops (FFs) that maximizes the throughput of the given netlist. In doing so, we show that it is possible to achieve a 50 % improvement in throughput with only an 18 % increase in area in the UMC 0.13 µm low-leakage standard cell library.
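
The kind of path statistic such a framework might compute can be sketched as follows. This is not the authors' tool: the cell names, delays and netlist are invented for illustration, and only the longest combinational path is computed, since that path bounds the clock period and therefore the throughput. A genetic algorithm would score candidate flip-flop placements with a measure of this kind.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path_delay(delays, preds):
    """Longest accumulated delay through a DAG of combinational cells.

    delays: cell name -> propagation delay
    preds:  cell name -> set of cells feeding it
    """
    arrival = {}
    for cell in TopologicalSorter(preds).static_order():
        start = max((arrival[p] for p in preds.get(cell, ())), default=0.0)
        arrival[cell] = start + delays[cell]
    return max(arrival.values())


# Hypothetical four-cell chain standing in for an S-box datapath.
delays = {"inv": 1.0, "mul": 3.0, "xor": 1.5, "affine": 2.0}
preds = {"inv": set(), "mul": {"inv"}, "xor": {"mul"}, "affine": {"xor"}}
unpipelined = critical_path_delay(delays, preds)

# Placing a flip-flop between "mul" and "xor" splits the chain into two stages;
# the clock period is then set by the slower stage.
stage1 = critical_path_delay(delays, {"inv": set(), "mul": {"inv"}})
stage2 = critical_path_delay(delays, {"xor": set(), "affine": {"xor"}})
pipelined = max(stage1, stage2)
```

In this toy chain the period drops from 7.5 to 4.0 delay units at the cost of one extra register stage, which is the area-throughput trade-off the abstract describes.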

*00:17* [Pub][ePrint]
Cube Attacks and Cube-attack-like Cryptanalysis on the Round-reduced Keccak Sponge Function, by Itai Dinur and Pawel Morawiecki and Josef Pieprzyk and Marian Srebrny and Michal Straus
In this paper, we comprehensively study the resistance of keyed variants of SHA-3 (Keccak) against algebraic attacks. This analysis covers a wide range of key recovery, MAC forgery and other types of attacks, breaking up to 9 rounds (out of the full 24) of the Keccak internal permutation much faster than exhaustive search. Moreover, some of our attacks on the 6-round Keccak are completely practical and were verified on a desktop PC. Our methods combine cube attacks (an algebraic key recovery attack) and related algebraic techniques with structural analysis of the Keccak permutation. These techniques should be useful in future cryptanalysis of Keccak and similar designs.

Although our attacks break more rounds than previously published techniques, the security margin of Keccak remains large. For Keyak -- a Keccak-based authenticated encryption scheme -- the nominal number of rounds is 12 and therefore its security margin is smaller (although still sufficient).
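
The basic cube-attack idea can be shown on a toy example. The Boolean polynomial below is invented for illustration and has nothing to do with Keccak. XOR-ing the output over every assignment of a chosen set of public bits (the cube) cancels all terms that do not contain the full cube monomial, leaving a low-degree "superpoly" in the key bits that the attacker can solve.

```python
from itertools import product


def f(key, public):
    """Hypothetical keyed Boolean function over GF(2); not a real cipher."""
    k0, k1, k2 = key
    v0, v1, v2 = public
    return (v0 & v1 & (k0 ^ k2)) ^ (v0 & k1) ^ (v1 & v2 & k0) ^ (k1 & k2) ^ v2


def cube_sum(key, cube, n_public=3):
    """XOR f over all assignments of the cube variables, other public bits 0."""
    total = 0
    for bits in product((0, 1), repeat=len(cube)):
        public = [0] * n_public
        for index, bit in zip(cube, bits):
            public[index] = bit
        total ^= f(key, public)
    return total


# For the cube {v0, v1} the superpoly of this toy function is k0 XOR k2,
# so each cube sum leaks one linear equation in the key bits.
leaks = {key: cube_sum(key, (0, 1)) for key in product((0, 1), repeat=3)}
```

In an actual attack the key is unknown and the cube sum is obtained by querying the keyed primitive, with the superpoly determined in an offline phase.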