International Association for Cryptologic Research

International Association
for Cryptologic Research


Thomas Espitau


Guessing Bits: Improved Lattice Attacks on (EC)DSA with Nonce Leakage
The lattice reduction attack on (EC)DSA (and other Schnorr-like signature schemes) with partially known nonces, originally due to Howgrave-Graham and Smart, has been at the core of many concrete cryptanalytic works, side-channel based or otherwise, in the past 20 years. The attack itself has seen limited development, however: improved analyses have been carried out, and the use of stronger lattice reduction algorithms has pushed the range of practically vulnerable parameters further, but the lattice construction based on the signatures and known nonce bits remain the same.In this paper, we propose a new idea to improve the attack based on the same data in exchange for additional computation: carry out an exhaustive search on some bits of the secret key. This turns the problem from a single bounded distance decoding (BDD) instance in a certain lattice to multiple BDD instances in a fixed lattice of larger volume but with the same bound (making the BDD problem substantially easier). Furthermore, the fact that the lattice is fixed lets us use batch/preprocessing variants of BDD solvers that are far more efficient than repeated lattice reductions on non-preprocessed lattices of the same size. As a result, our analysis suggests that our technique is competitive or outperforms the state of the art for parameter ranges corresponding to the limit of what is achievable using lattice attacks so far (around 2-bit leakage on 160-bit groups, or 3-bit leakage on 256-bit groups).We also show that variants of this idea can also be applied to bits of the nonces (leading to a similar improvement) or to filtering signature data (leading to a data-time trade-off for the lattice attack). Finally, we use our technique to obtain an improved exploitation of the TPM–FAIL dataset similar to what was achieved in the Minerva attack.
Mitaka: A Simpler, Parallelizable, Maskable Variant of Falcon 📺
This work describes the Mitaka signature scheme: a new hash-and-sign signature scheme over NTRU lattices which can be seen as a variant of NIST finalist Falcon. It achieves comparable efficiency but is considerably simpler, online/offline, and easier to parallelize and protect against side-channels, thus offering significant advantages from an implementation standpoint. It is also much more versatile in terms of parameter selection. We obtain this signature scheme by replacing the FFO lattice Gaussian sampler in Falcon by the “hybrid” sampler of Ducas and Prest, for which we carry out a detailed and corrected security analysis. In principle, such a change can result in a substantial security loss, but we show that this loss can be largely mitigated using new techniques in key generation that allow us to construct much higher quality lattice trapdoors for the hybrid sampler relatively cheaply. This new approach can also be instantiated on a wide variety of base fields, in contrast with Falcon's restriction to power-of-two cyclotomics. We also introduce a new lattice Gaussian sampler with the same quality and efficiency, but which is moreover compatible with the integral matrix Gram root technique of Ducas et al., allowing us to avoid floating point arithmetic. This makes it possible to realize the same signature scheme as Mitaka efficiently on platforms with poor support for floating point numbers. Finally, we describe a provably secure masking of Mitaka. More precisely, we introduce novel gadgets that allow provable masking at any order at much lower cost than previous masking techniques for Gaussian sampling-based signature schemes, for cheap and dependable side-channel protection.
Shorter Hash-and-Sign Lattice-Based Signatures 📺
Lattice-based digital signature schemes following the hash-and-sign design paradigm of Gentry, Peikert and Vaikuntanathan (GPV) tend to offer an attractive level of efficiency, particularly when instantiated with structured compact trapdoors. In particular, NIST postquantum finalist Falcon is both quite fast for signing and verification and quite compact: NIST notes that it has the smallest bandwidth (as measured in combined size of public key and signature) of all round 2 digital signature candidates. Nevertheless, while Falcon--512, for instance, compares favorably to ECDSA--384 in terms of speed, its signatures are well over 10 times larger. For applications that store large number of signatures, or that require signatures to fit in prescribed packet sizes, this can be a critical limitation. In this paper, we explore several approaches to further improve the size of hash-and-sign lattice-based signatures, particularly instantiated over NTRU lattices like Falcon and its recent variant Mitaka. In particular, while GPV signatures are usually obtained by sampling lattice points according to some *spherical* discrete Gaussian distribution, we show that it can be beneficial to sample instead according to a suitably chosen *ellipsoidal* discrete Gaussian: this is because only half of the sampled Gaussian vector is actually output as the signature, while the other half is recovered during verification. Making the half that actually occurs in signatures shorter reduces signature size at essentially no security loss (in a suitable range of parameters). Similarly, we show that reducing the modulus $q$ with respect to which signatures are computed can improve signature size as well as verification key size almost ``for free''; this is particularly true for constructions like Falcon and Mitaka that do not make substantial use of NTT-based multiplication (and rely instead on transcendental FFT). Finally, we show that the Gaussian vectors in signatures can be represented in a more compact way with appropriate coding-theoretic techniques, improving signature size by an additional 7 to 14%. All in all, we manage to reduce the size of, e.g., Falcon signatures by 30--40% at the cost of only 4--6 bits of Core-SVP security.
Towards faster polynomial-time lattice reduction 📺
The LLL algorithm is a polynomial-time algorithm for reducing d-dimensional lattice with exponential approximation factor. Currently, the most efficient variant of LLL, by Neumaier and Stehl\'e, has a theoretical running time in $d^4\cdot B^{1+o(1)}$ where $B$ is the bitlength of the entries, but has never been implemented. This work introduces new asymptotically fast, parallel, yet heuristic, reduction algorithms with their optimized implementations. Our algorithms are recursive and fully exploit fast block matrix multiplication. We experimentally demonstrate that by carefully controlling the floating-point precision during the recursion steps, we can reduce euclidean lattices of rank d in time $\tilde{O}(d^\omega\cdot C)$, i.e., almost a constant number of matrix multiplications, where $\omega$ is the exponent of matrix multiplication and C is the log of the condition number of the matrix. For cryptographic applications, C is close to B, while it can be up to d times larger in the worst case. It improves the running-time of the state-of-the-art implementation fplll by a multiplicative factor of order $d^2\cdot B$. Further, we show that we can reduce structured lattices, the so-called knapsack lattices, in time $\tilde{O}(d^{\omega-1}\cdot C)$ with a progressive reduction strategy. Besides allowing reducing huge lattices, our implementation can break several instances of Fully Homomorphic Encryption schemes based on large integers in dimension 2,230 with 4 millions of bits.
Fast reduction of algebraic lattices over cyclotomic fields 📺
We introduce a framework generalizing lattice reduction algorithms to module lattices in order to \emph{practically} and \emph{efficiently} solve the $\gamma$-Hermite Module-SVP problem over arbitrary cyclotomic fields. The core idea is to exploit the structure of the subfields for designing a recursive strategy of reduction in the tower of fields we are working in. Besides, we demonstrate how to leverage the inherent symplectic geometry existing such fields to provide a significant speed-up of the reduction for rank two modules. As a byproduct, we also generalize to all cyclotomic fields and provide speedups for many previous number theoretical algorithms, in particular to the rounding in the so-called Log-unit lattice. Quantitatively, we show that a module of rank 2 over a cyclotomic field of degree $n$ can be heuristically reduced within approximation factor $2^{\tilde{O}(n)}$ in time $\tilde{O}(n^2B)$, where $B$ is the bitlength of the entries. For $B$ large enough, this complexity shrinks to $\tilde{O}(n^{\log_2 3}B)$. This last result is particularly striking as it goes below the estimate of $n^2B$ swaps given by the classical analysis of the LLL algorithm using the decrease of the \emph{potential} of the basis. Finally, all this framework is fully parallelizable, and we provide a full implementation. We apply it to break multilinear cryptographic candidates on concrete proposed parameters. We were able to reduce matrices of dimension 4096 with 6675-bit integers in 4 days, which is more than a million times faster than previous state-of-the-art implementations. Eventually, we demonstrate a quasicubic time for the Gentry-Szydlo algorithm which finds a generator given the relative norm and a basis of an ideal. This algorithm is important in cryptanalysis and requires efficient ideal multiplications and lattice reductions; as such we can practically use it in dimension 1024.
LWE Without Modular Reduction and Improved Side-Channel Attacks Against BLISS
This paper is devoted to analyzing the variant of Regev’s learning with errors (LWE) problem in which modular reduction is omitted: namely, the problem (ILWE) of recovering a vector $$\mathbf {s}\in \mathbb {Z}^n$$ given polynomially many samples of the form $$(\mathbf {a},\langle \mathbf {a},\mathbf {s}\rangle + e)\in \mathbb {Z}^{n+1}$$ where $$\mathbf { a}$$ and e follow fixed distributions. Unsurprisingly, this problem is much easier than LWE: under mild conditions on the distributions, we show that the problem can be solved efficiently as long as the variance of e is not superpolynomially larger than that of $$\mathbf { a}$$. We also provide almost tight bounds on the number of samples needed to recover $$\mathbf {s}$$.Our interest in studying this problem stems from the side-channel attack against the BLISS lattice-based signature scheme described by Espitau et al. at CCS 2017. The attack targets a quadratic function of the secret that leaks in the rejection sampling step of BLISS. The same part of the algorithm also suffers from a linear leakage, but the authors claimed that this leakage could not be exploited due to signature compression: the linear system arising from it turns out to be noisy, and hence key recovery amounts to solving a high-dimensional problem analogous to LWE, which seemed infeasible. However, this noisy linear algebra problem does not involve any modular reduction: it is essentially an instance of ILWE, and can therefore be solved efficiently using our techniques. This allows us to obtain an improved side-channel attack on BLISS, which applies to 100% of secret keys (as opposed to $${\approx }7\%$$ in the CCS paper), and is also considerably faster.

Program Committees

PKC 2022
CHES 2022