## CryptoDB

### Yu Yu

#### ORCID: 0000-0002-9278-4521

#### Publications

**Year**

**Venue**

**Title**

2024

PKC

ReSolveD: Shorter Signatures from Regular Syndrome Decoding and VOLE-in-the-Head
Abstract

We present ReSolveD, a new candidate post-quantum signature scheme under the regular syndrome decoding (RSD) assumption for random linear codes, which is a well-established variant of the well-known syndrome decoding (SD) assumption. Our signature scheme is obtained by designing a new zero-knowledge proof for proving knowledge of a solution to the RSD problem in the recent VOLE-in-the-head framework using a sketching scheme to verify that a vector has weight exactly one. We achieve a signature size of 3.99 KB with a signing time of 27.3 ms and a verification time of 23.1 ms on a single core of a standard desktop for a 128-bit security level. Compared to the state-of-the-art code-based signature schemes, our signature scheme achieves 1.5X ~ 2X improvement in terms of the common “signature size + public-key size” metric, while keeping the computational efficiency competitive.

2024

EUROCRYPT

The Hardness of LPN over Any Integer Ring and Field for PCG Applications
Abstract

Learning parity with noise (LPN) has been widely studied and used in cryptography.
It was recently brought to new prosperity since Boyle et al. (CCS'18), putting LPN to a central role in designing secure multi-party computation, zero-knowledge proofs, private set intersection, and many other protocols. In this paper, we thoroughly studied security of LPN problems in this particular context. We found that some important aspects are long ignored and many conclusions from classical LPN cryptanalysis do not apply to this new setting, due to the low noise rates, extremely high dimensions, various types (in addition to $\FF_2$) and noise distributions.
For LPN over a field, we give a parameterized reduction from exact-noise LPN to regular-noise LPN. Compared to the recent result by Feneuil, Joux and Rivain (Crypto'22), we significantly reduce the security loss by paying only a small additive price in dimension and number of samples.
We analyze the security of LPN over a ring $\ZZ_{2^\lambda}$. Existing protocols based on LPN over integer rings use parameters as if they are over fields, but we found an attack that effectively reduces the weight of a noise by half compared to LPN over fields. Consequently, prior works that use LPN over $\ZZ_{2^\lambda}$ overestimate up to 40 bits of security.
We provide a complete picture of the hardness of LPN over integer rings by showing: 1) the equivalence between its search and decisional versions; 2) an efficient reduction from LPN over $\FF_{2}$ to LPN over $\ZZ_{2^\lambda}$; and 3) generalization of our results to any integer ring.
Finally, we provide an all-in-one estimator tool for the bit security of LPN parameters in the context of PCG, incorporating the recent advanced attacks.

2024

TCHES

Efficient Table-Based Masking with Pre-processing
Abstract

Masking is one of the most investigated countermeasures against sidechannel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. Particularly, the software implementation of masking largely slows down the cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the< overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantages of generality to any lookup tables. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online (using the precomputed tables) phase takes place to calculate the result. Obviously, practicality of pre-processing paradigm relies heavily on the efficiency of online phase and the size of precomputed tables.In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to the AES-128, and provide an implementation on the ARM Cortex architecture. Particularly, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to the side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because that the true random number generator operates in parallel with the processor’s execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.

2023

EUROCRYPT

Actively Secure Half-Gates with Minimum Overhead under Duplex Networks
Abstract

Actively secure two-party computation (2PC) is one of the canonical building blocks
in modern cryptography. One main goal for designing actively secure 2PC protocols is to reduce the communication overhead, compared to semi-honest 2PC protocols.
In this paper, we propose a new actively secure constant-round 2PC protocol with one-way communication of $2\kappa+5$ bits per AND gate (for $\kappa$-bit computational
security and any statistical security), essentially matching the one-way communication of semi-honest half-gates protocol. This is achieved by two new techniques:
- The recent compression technique by Dittmer et al. (Crypto 2022) shows that a relaxed preprocessing is sufficient for authenticated garbling that does not reveal masked wire values to the garbler. We introduce a new form of authenticated bits and propose a new technique of generating authenticated AND triples to reduce the one-way communication of preprocessing from $5\rho+1$ bits to $2$ bits per AND gate for $\rho$-bit statistical security.
- Unfortunately, the above compressing technique is only compatible
with a less compact authenticated garbled circuit of size $2\kappa+3\rho$ bits per AND gate.
We designed a new authenticated garbling that does not use information
theoretic MACs but rather dual execution without leakage to authenticate wire values in the circuit.
This allows us to use a more compact half-gates based authenticated garbled circuit of size $2\kappa+1$ bits per AND gate, and meanwhile keep compatible
with the compression technique. Our new technique can achieve one-way communication of $2\kappa+5$ bits per AND gate.
Our technique of yielding authenticated AND triples can also be used to optimize the two-way communication (i.e., the total communication) by combining it with the authenticated garbled circuits by Dittmer et al., which results in an actively secure 2PC protocol with two-way communication of $2\kappa+3\rho+4$ bits per AND gate.

2023

TCHES

Efficient Private Circuits with Precomputation
Abstract

At CHES 2022, Wang et al. described a new paradigm for masked implementations using private circuits, where most intermediates can be precomputed before the input shares are accessed, significantly accelerating the online execution of masked functions. However, the masking scheme they proposed mainly featured (and was designed for) the cost amortization, leaving its (limited) suitability in the above precomputation-based paradigm just as a bonus. This paper aims to provide an efficient, reliable, easy-to-use, and precomputation-compatible masking scheme. We propose a new masked multiplication over the finite field Fq suitable for the precomputation, and prove its security in the composable notion called Probing-Isolating Non-Inference (PINI). Particularly, the operations (e.g., AND and XOR) in the binary field can be achieved by assigning q = 2, allowing the bitsliced implementation that has been shown to be quite efficient for the software implementations. The new masking scheme is applied to leverage the masking of AES and SKINNY block ciphers on ARM Cortex M architecture. The performance results show that the new scheme contributes to a significant speed-up compared with the state-of-the-art implementations. For SKINNY with block size 64, the speed and RAM requirement can be significantly improved (saving around 45% cycles in the online-computation and 60% RAM space for precomputed values) from AES-128, thanks to its smaller number of AND gates. Besides the security proof by hand, we provide formal verifications for the multiplication and T-test evaluations for the masked implementations of AES and SKINNY. Because of the structure of the new masked multiplication, our formal verification can be performed for security orders up to 16.

2023

CRYPTO

Revisiting the Constant-sum Winternitz One-time Signature with Applications to SPHINCS+ and XMSS
Abstract

Hash-based signatures offer a conservative alternative to post-quantum signatures with arguably better-understood security than other post-quantum candidates. As a core building block of hash-based signatures, the efficiency of one-time signature (OTS) largely dominates that of hash-based signatures. The WOTS$^{+}$ signature scheme (Africacrypt 2013) is the current state-of-the-art OTS adopted by the signature schemes standardized by NIST---XMSS, LMS, and SPHINCS$^+$.
A natural question is whether there is (and how much) room left for improving one-time signatures (and thus standard hash-based signatures). In this paper, we show that WOTS$^{+}$ one-time signature, when adopting the constant-sum encoding scheme (Bos and Chaum, Crypto 1992), is size-optimal not only under Winternitz's OTS framework, but also among all tree-based OTS designs. Moreover, we point out a flaw in the DAG-based OTS design previously shown to be size-optimal at Asiacrypt 1996, which makes the constant-sum WOTS$^{+}$ the most size-efficient OTS to the best of our knowledge. Finally, we evaluate the performance of constant-sum WOTS$^{+}$ integrated into the SPHINCS$^+$ (CCS 2019) and XMSS (PQC 2011) signature schemes which exhibit certain degrees of improvement in both signing time and signature size.

2023

ASIACRYPT

Algebraic Attacks on Round-Reduced Rain and Full AIM-III
Abstract

Picnic is a NIST PQC Round 3 Alternate signature candidate that builds upon symmetric primitives following the MPC-in-the-head paradigm. Recently, researchers have been exploring more secure/efficient signature schemes from conservative one-way functions based on AES, or new low complexity one-way functions like Rain (CCS 2022) and AIM (CCS 2023). The signature schemes based on Rain and AIM are currently the most efficient among MPC-in-the-head-based schemes, making them promising post-quantum digital signature candidates.
However, the exact hardness of these new one-way functions deserves further study and scrutiny. This work presents algebraic attacks on Rain and AIM for certain instances, where one-round Rain can be compromised in $2^{n/2}$ for security parameter $n\in \{128,192,256\}$, and two-round Rain can be broken in $2^{120.3}$, $2^{180.4}$, and $2^{243.1}$ encryptions, respectively. Additionally, we demonstrate an attack on AIM-III (which aims at 192-bit security) with a complexity of $2^{186.5}$ encryptions. These attacks exploit the algebraic structure of the power function over fields with characteristic 2, which provides potential insights into the algebraic structures of some symmetric primitives and thus might be of independent interest.

2023

TCC

Security Proofs for Key-Alternating Ciphers with Non-Independent Round Permutations
Abstract

This work studies the key-alternating ciphers (KACs) whose round permutations are not necessarily independent. We revisit existing security proofs for key-alternating ciphers with a single permutation (KACSPs), and extend their method to an arbitrary number of rounds. In particular, we propose new techniques that can significantly simplify the proofs, and also remove two unnatural restrictions in the known security bound of 3-round KACSP (Wu et al., Asiacrypt 2020). With these techniques, we prove the first tight security bound for t-round KACSP, which was an open problem. We stress that our techniques apply to all variants of KACs with non-independent round permutations, as well as to the standard KACs.

2022

EUROCRYPT

Cryptanalysis of Candidate Obfuscators for Affine Determinant Programs
📺
Abstract

At ITCS 2020, Bartusek et al. proposed a candidate indistinguishability obfuscator (iO) for affine determinant programs (ADPs). The candidate is special since it is the only unbroken candidate iO to date that does not rely on the hardness of traditional cryptographic assumptions like discrete-log or learning with errors. Instead, it directly applies specific randomization techniques to the underlying ADP. It is relatively efficient compared to the rest of the iO candidates. However, the obfuscation scheme requires further cryptanalysis since it was not known to be based on any well-formed mathematical assumptions.
In this paper, we show cryptanalytic attacks on the iO candidate provided by Bartusek et al. Our attack exploits the weakness of one of the randomization steps in the candidate. The attack applies to a fairly general class of programs. At the end of the paper we discuss plausible countermeasures to defend against our attacks.

2022

TCHES

Side-Channel Masking with Common Shares
Abstract

To counter side-channel attacks, a masking scheme randomly encodes keydependent variables into several shares, and transforms operations into the masked correspondence (called gadget) operating on shares. This provably achieves the de facto standard notion of probing security.We continue the long line of works seeking to reduce the overhead of masking. Our main contribution is a new masking scheme over finite fields in which shares of different variables have a part in common. This enables the reuse of randomness / variables across different gadgets, and reduces the total cost of masked implementation. For security order d and circuit size l, the randomness requirement and computational complexity of our scheme are Õ(d2) and Õ(ld2) respectively, strictly improving upon the state-of-the-art Õ(d2) and Õ(ld3) of Coron et al. at Eurocrypt 2020.A notable feature of our scheme is that it enables a new paradigm in which many intermediates can be precomputed before executing the masked function. The precomputation consumes Õ(ld2) and produces Õ(ld) variables to be stored in RAM. The cost of subsequent (online) computation is reduced to Õ(ld), effectively speeding up e.g., challenge-response authentication protocols. We showcase our method on the AES on ARM Cortex M architecture and perform a T-test evaluation. Our results show a speed-up during the online phase compared with state-of-the-art implementations, at the cost of acceptable RAM consumption and precomputation time.To prove security for our scheme, we propose a new security notion intrinsically supporting randomness / variables reusing across gadgets, and bridging the security of parallel compositions of gadgets to general compositions, which may be of independent interest.

2022

ASIACRYPT

A Third is All You Need: Extended Partial Key Exposure Attack on CRT-RSA with Additive Exponent Blinding
📺
Abstract

At Eurocrypt 2022, May et al. proposed a partial key exposure (PKE) attack on CRT-RSA that efficiently factors $N$ knowing only a $\frac{1}{3}$-fraction of either most significant bits (MSBs) or least significant bits (LSBs) of private exponents $d_p$ and $d_q$ for public exponent $e \approx N^{\frac{1}{12}}$. In practice, PKE attacks typically rely on the side-channel leakage of these exponents, while a side-channel resistant implementation of CRT-RSA often uses additively blinded exponents $d^{\prime}_p = d_p + r_p(p-1)$ and $d^{\prime}_q = d_q + r_q(q-1)$ with unknown random blinding factors $r_p$ and $r_q$, which makes PKE attacks more challenging.
Motivated by the above, we extend the PKE attack of May et al. to CRT-RSA with additive exponent blinding. While admitting $r_pe\in(0,N^{\frac{1}{4}})$, our extended PKE works ideally when $r_pe \approx N^{\frac{1}{12}}$, in which case the entire private key can be recovered using only $\frac{1}{3}$ known MSBs or LSBs of the blinded CRT exponents $d^{\prime}_p$ and $d^{\prime}_q$. Our extended PKE follows their novel two-step approach to first compute the key-dependent constant $k^{\prime}$ ($ed^{\prime}_p = 1 + k^{\prime}(p-1)$, $ed^{\prime}_q = 1 + l^{\prime}(q-1)$), and then to factor $N$ by computing the root of a univariate polynomial modulo $k^{\prime}p$.
We extend their approach as follows. For the MSB case, we propose two options for the first step of the attack, either by obtaining a single estimate $k^{\prime}l^{\prime}$ and calculating $k^{\prime}$ via factoring, or by obtaining multiple estimates $k^{\prime}l^{\prime}_1,\ldots,k^{\prime}l^{\prime}_z$ and calculating $k^{\prime}$ probabilistically via GCD.
For the LSB case, we extend their approach by constructing a different univariate polynomial in the second step of the LSB attack. A formal analysis shows that our LSB attack runs in polynomial time under the standard Coppersmith-type assumption, while our MSB attack either runs in sub-exponential time with a reduced input size (the problem is reduced to factor a number of size $e^2r_pr_q\approx N^{\frac{1}{6}}$) or in probabilistic polynomial time under a novel heuristic assumption. Under the settings of the most common key sizes (1024-bit, 2048-bit, and 3072-bit) and blinding factor lengths (32-bit, 64-bit, and 128-bit), our experiments verify the validity of the Coppersmith-type assumption and our own assumption, as well as the feasibility of the factoring step.
To the best of our knowledge, this is the first PKE on CRT-RSA with experimentally verified effectiveness against 128-bit unknown exponent blinding factors. We also demonstrate an application of the proposed PKE attack using real partial side-channel key leakage targeting a Montgomery Ladder exponentiation CRT implementation.

2022

ASIACRYPT

A Non-heuristic Approach to Time-space Tradeoffs and Optimizations for BKW
📺
Abstract

Blum, Kalai and Wasserman (JACM 2003) gave the first sub-exponential algorithm to solve the Learning Parity with Noise (LPN) problem. In particular, consider the LPN problem with constant noise and dimension $n$. The BKW solves it with space complexity $2^{\frac{(1+\epsilon)n}{\log n}}$ and time/sample complexity $2^{\frac{(1+\epsilon)n}{\log n}}\cdot 2^{\Omega(n^{\frac{1}{1+\epsilon}})}$ for small constant $\epsilon\to 0^+$.
We propose a variant of the BKW by tweaking Wagner's generalized birthday problem (Crypto 2002) and adapting the technique to a $c$-ary tree structure. In summary, our algorithm achieves the following:
\begin{enumerate}
\item {\bf (Time-space tradeoff).} We obtain the same time-space tradeoffs for LPN and LWE as those given by Esser et al. (Crypto 2018), but without resorting to any heuristics. For any $2\leq c\in\mathbb{N}$, our algorithm solves the LPN problem with time complexity $2^{\frac{\log c(1+\epsilon)n}{\log n}}\cdot 2^{\Omega(n^{\frac{1}{1+\epsilon}})}$ and space complexity $2^{\frac{\log c(1+\epsilon)n}{(c-1)\log n}}$, where one can use Grover's quantum algorithm or Dinur et al.'s dissection technique (Crypto 2012) to further accelerate/optimize the time complexity.
\item {\bf (Time/sample optimization).}
A further adjusted variant of our algorithm solves the LPN problem with sample, time and space complexities all kept at $2^{\frac{(1+\epsilon)n}{\log n}}$ for $\epsilon\to 0^+$, saving factor $2^{\Omega(n^{\frac{1}{1+\epsilon}})}$ in time/sample compared to the original BKW, and the variant of Devadas et al. (TCC 2017).
\item {\bf (Sample reduction).} Our algorithm provides an alternative to Lyubashevsky's BKW variant (RANDOM 2005) for LPN with a restricted amount of samples. In particular, given $Q=n^{1+\epsilon}$ (resp., $Q=2^{n^{\epsilon}}$) samples, our algorithm saves a factor of $2^{\Omega(n)/(\log n)^{1-\kappa}}$ (resp., $2^{\Omega(n^{\kappa})}$) for constant $\kappa \to 1^-$ in running time while consuming roughly the same space, compared with Lyubashevsky's algorithm.
\end{enumerate}
In particular, the time/sample optimization benefits from a careful analysis of the error distribution among the correlated candidates, which was not studied by previous rigorous approaches such as the analysis of Minder and Sinclair (J.Cryptology 2012) or Devadas et al. (TCC 2017).

2021

TOSC

Provable Security of SP Networks with Partial Non-Linear Layers
📺
Abstract

Motivated by the recent trend towards low multiplicative complexity blockciphers (e.g., Zorro, CHES 2013; LowMC, EUROCRYPT 2015; HADES, EUROCRYPT 2020; MALICIOUS, CRYPTO 2020), we study their underlying structure partial SPNs, i.e., Substitution-Permutation Networks (SPNs) with parts of the substitution layer replaced by an identity mapping, and put forward the first provable security analysis for such partial SPNs built upon dedicated linear layers. For different instances of partial SPNs using MDS linear layers, we establish strong pseudorandom security as well as practical provable security against impossible differential attacks. By extending the well-established MDS code-based idea, we also propose the first principled design of linear layers that ensures optimal differential propagation. Our results formally confirm the conjecture that partial SPNs achieve the same security as normal SPNs while consuming less non-linearity, in a well-established framework.

2021

CRYPTO

Pushing the Limits of Valiant's Universal Circuits: Simpler, Tighter and More Compact
📺
Abstract

A universal circuit (UC) is a general-purpose circuit that can simulate arbitrary circuits (up to a certain size $n$). Valiant provides a $k$-way recursive construction of UCs (STOC 1976), where $k$ tunes the complexity of the recursion. More concretely, Valiant gives theoretical constructions of 2-way and 4-way UCs of asymptotic (multiplicative) sizes $5n\log n$ and $4.75 n\log n$ respectively, which matches the asymptotic lower bound $\Omega(n\log n)$ up to some constant factor.
Motivated by various privacy-preserving cryptographic applications, Kiss et al. (Eurocrypt 2016) validated the practicality of $2$-way universal circuits by giving example implementations for private function evaluation. G{\"{u}}nther et al. (Asiacrypt 2017) and Alhassan et al. (J. Cryptology 2020) implemented the 2-way/4-way hybrid UCs with various optimizations in place towards making universal circuits more practical. Zhao et al. (Asiacrypt 2019) optimized Valiant's 4-way UC to asymptotic size $4.5 n\log n$ and proved a lower bound $3.64 n\log n$ for UCs under the Valiant framework.
As the scale of computation goes beyond 10-million-gate ($n=10^7$) or even billion-gate level ($n=10^9$), the constant factor in UCs size plays an increasingly important role in application performance. In this work, we investigate Valiant's universal circuits and present an improved framework for constructing universal circuits with the following advantages.
[Simplicity.] Parameterization is no longer needed. In contrast to that previous implementations resorted to a hybrid construction combining $k=2$ and $k=4$ for a tradeoff between fine granularity and asymptotic size-efficiency, our construction gets the best of both worlds when configured at the lowest complexity (i.e., $k=2$).
[Compactness.] Our universal circuits have asymptotic size $3n\log n$, improving upon the best previously known $4.5n\log n$ by 33\% and beating the $3.64n\log n$ lower bound for UCs constructed under Valiant's framework (Zhao et al., Asiacrypt 2019).
[Tightness.] We show that under our new framework the UCs size is lower bounded by $2.95 n\log n$, which almost matches the $3n\log n$ circuit size of our $2$-way construction.
We implement the 2-way universal circuits and evaluate its performance with other implementations, which confirms our theoretical analysis.

2021

CRYPTO

Smoothing Out Binary Linear Codes and Worst-case Sub-exponential Hardness for LPN
📺
Abstract

Learning parity with noise (LPN) is a notorious (average-case) hard problem that has been well studied in learning theory, coding theory and cryptography since the early 90's. It further inspires the Learning with Errors (LWE) problem [Regev, STOC 2005], which has become one of the central building blocks for post-quantum cryptography and advanced cryptographic. Unlike LWE whose hardness can be reducible from worst-case lattice problems, no corresponding worst-case hardness results were known for LPN until very recently. At Eurocrypt 2019, Brakerski et al. [BLVW19] established the first feasibility result that the worst-case hardness of nearest codeword problem (NCP) (on balanced linear code) at the extremely low noise rate $\frac{\log^2 n}{n}$ implies the quasi-polynomial hardness of LPN at the extremely high noise rate $1/2-1/\poly(n)$. It remained open whether a worst-case to average-case reduction can be established for standard (constant-noise) LPN, ideally with sub-exponential hardness.
We start with a simple observation that the hardness of high-noise LPN over large fields is implied by that of the LWE of the same modulus, and is thus reducible from worst-case hardness of lattice problems. We then revisit [BLVW19], which is the main focus of this work. We first expand the underlying binary linear codes (of the NCP) to not only the balanced code considered in [BLVW19] but also to another code (in some sense dual to balanced code). At the core of our reduction is a new variant of smoothing lemma (for both binary codes) that circumvents the barriers (inherent in the underlying worst-case randomness extraction) and admits tradeoffs for a wider spectrum of parameter choices. In addition to the worst-case hardness result obtained in [BLVW19], we show that for any constant $0<c<1$ the constant-noise LPN problem is ($T=2^{\Omega(n^{1-c})},\epsilon=2^{-\Omega(n^{\min(c,1-c)})},q=2^{\Omega(n^{\min(c,1-c)})}$)-hard assuming that the NCP at the low-noise rate $\tau=n^{-c}$ is ($T'={2^{\Omega(\tau n)}}$, $\epsilon'={2^{-\Omega(\tau n)}}$,$m={2^{\Omega(\tau n)}}$)-hard in the worst case, where $T$, $\epsilon$, $q$ and $m$ are time complexity, success rate, sample complexity, and codeword length respectively. Moreover, refuting the worst-case hardness assumption would imply arbitrary polynomial speedups over the current state-of-the-art algorithms for solving the NCP (and LPN), which is a win-win result. Unfortunately, public-key encryptions and collision resistant hash functions need constant-noise LPN with ($T={2^{\omega(\sqrt{n})}}$, $\epsilon'={2^{-\omega(\sqrt{n})}}$,$q={2^{\sqrt{n}}}$)-hardness (Yu et al., CRYPTO 2016 \& ASIACRYPT 2019), which is almost (up to an arbitrary $\omega(1)$ factor in the exponent) what is reducible from the worst-case NCP when $c= 0.5$. We leave it as an open problem whether the gap can be closed or there is a separation in place.

2021

TCHES

Learning Parity with Physical Noise: Imperfections, Reductions and FPGA Prototype
📺
Abstract

Hard learning problems are important building blocks for the design of various cryptographic functionalities such as authentication protocols and post-quantum public key encryption. The standard implementations of such schemes add some controlled errors to simple (e.g., inner product) computations involving a public challenge and a secret key. Hard physical learning problems formalize the potential gains that could be obtained by leveraging inexact computing to directly generate erroneous samples. While they have good potential for improving the performances and physical security of more conventional samplers when implemented in specialized integrated circuits, it remains unknown whether physical defaults that inevitably occur in their instantiation can lead to security losses, nor whether their implementation can be viable on standard platforms such as FPGAs. We contribute to these questions in the context of the Learning Parity with Physical Noise (LPPN) problem by: (1) exhibiting new (output) data dependencies of the error probabilities that LPPN samples may suffer from; (2) formally showing that LPPN instances with such dependencies are as hard as the standard LPN problem; (3) analyzing an FPGA prototype of LPPN processor that satisfies basic security and performance requirements.

2020

TOSC

Efficient Side-Channel Secure Message Authentication with Better Bounds
📺
Abstract

We investigate constructing message authentication schemes from symmetric cryptographic primitives, with the goal of achieving security when most intermediate values during tag computation and verification are leaked (i.e., mode-level leakage-resilience). Existing efficient proposals typically follow the plain Hash-then-MAC paradigm T = TGenK(H(M)). When the domain of the MAC function TGenK is {0, 1}128, e.g., when instantiated with the AES, forgery is possible within time 264 and data complexity 1. To dismiss such cheap attacks, we propose two modes: LRW1-based Hash-then-MAC (LRWHM) that is built upon the LRW1 tweakable blockcipher of Liskov, Rivest, and Wagner, and Rekeying Hash-then-MAC (RHM) that employs internal rekeying. Built upon secure AES implementations, LRWHM is provably secure up to (beyond-birthday) 278.3 time complexity, while RHM is provably secure up to 2121 time. Thus in practice, their main security threat is expected to be side-channel key recovery attacks against the AES implementations. Finally, we benchmark the performance of instances of our modes based on the AES and SHA3 and confirm their efficiency.

2020

PKC

Tweaking the Asymmetry of Asymmetric-Key Cryptography on Lattices: KEMs and Signatures of Smaller Sizes
📺
Abstract

Currently, lattice-based cryptosystems are less efficient than their number-theoretic counterparts (based on RSA, discrete logarithm, etc.) in terms of key and ciphertext (signature) sizes. For adequate security the former typically needs thousands of bytes while in contrast the latter only requires at most hundreds of bytes. This significant difference has become one of the main concerns in replacing currently deployed public-key cryptosystems with lattice-based ones. Observing the inherent asymmetries in existing lattice-based cryptosystems, we propose asymmetric variants of the (module-)LWE and (module-)SIS assumptions, which yield further size-optimized KEM and signature schemes than those from standard counterparts. Following the framework of Lindner and Peikert (CT-RSA 2011) and the Crystals-Kyber proposal (EuroS&P 2018), we propose an IND-CCA secure KEM scheme from the hardness of the asymmetric module-LWE (AMLWE), whose asymmetry is fully exploited to obtain shorter public keys and ciphertexts. To target at a 128-bit quantum security, the public key (resp., ciphertext) of our KEM only has 896 bytes (resp., 992 bytes). Our signature scheme bears most resemblance to and improves upon the Crystals-Dilithium scheme (ToCHES 2018). By making full use of the underlying asymmetric module-LWE and module-SIS assumptions and carefully selecting the parameters, we construct an SUF-CMA secure signature scheme with shorter public keys and signatures. For a 128-bit quantum security, the public key (resp., signature) of our signature scheme only has 1312 bytes (resp., 2445 bytes). We adapt the best known attacks and their variants to our AMLWE and AMSIS problems and conduct a comprehensive and thorough analysis of several parameter choices (aiming at different security strengths) and their impacts on the sizes, security and error probability of lattice-based cryptosystems. Our analysis demonstrates that AMLWE and AMSIS problems admit more flexible and size-efficient choices of parameters than the respective standard versions.

2020

CRYPTO

Better Concrete Security for Half-Gates Garbling (in the Multi-Instance Setting)
📺
Abstract

We study the concrete security of high-performance implementations of half-gates garbling, which all rely on (hardware-accelerated) AES. We find that current instantiations using k-bit wire labels can be completely broken—in the sense that the circuit evaluator learns all the inputs of the circuit garbler—in time O(2k/C), where C is the total number of (non-free) gates that are garbled, possibly across multiple independent executions. The attack can be applied to existing circuit-garbling libraries using k = 80 when C ≈ $10^9$, and would require 267 machine-months and cost about $3500 to implement on the Google Cloud Platform. Since the attack can be entirely parallelized, the attack could be carried out in about a month using ≈ 250 machines.
With this as our motivation, we seek a way to instantiate the hash function in the half-gates scheme so as to achieve better concrete security. We present a construction based on AES that achieves optimal security in the single-instance setting (when only a single circuit is garbled). We also show how to modify the half-gates scheme so that its concrete security does not degrade in the multi-instance setting. Our modified scheme is as efficient as prior work in networks with up to 2 Gbps bandwidth.

2020

ASIACRYPT

Packed Multiplication: How to Amortize the Cost of Side-channel Masking?
📺
Abstract

Higher-order masking countermeasures provide strong provable security against side-channel attacks at the cost of incurring significant overheads, which largely hinders its applicability. Previous works towards remedying cost mostly concentrated on ``local'' calculations, i.e., optimizing the cost of computation units such as a single AND gate or a field multiplication. This paper explores a complementary ``global'' approach, i.e., considering multiple operations in the masked domain as a batch and reducing randomness and computational cost via amortization. In particular, we focus on the amortization of $\ell$ parallel field multiplications for appropriate integer $\ell > 1$, and design a kit named {\it packed multiplication} for implementing such a batch.
Higher-order masking countermeasures provide strong provable security against side-channel attacks at the cost of incurring significant overheads, which largely hinders its applicability. Previous works towards remedying cost mostly concentrated on ``local'' calculations, i.e., optimizing the cost of computation units such as a single AND gate or a field multiplication. This paper explores a complementary ``global'' approach, i.e., considering multiple operations in the masked domain as a batch and reducing randomness and computational cost via amortization.
In particular, we focus on the amortization of $\ell$ parallel field multiplications for appropriate integer $\ell > 1$, and design a kit named {\it packed multiplication} for implementing such a batch.
For $\ell+d\leq2^m$, when $\ell$ parallel multiplications over $\mathbb{F}_{2^{m}}$ with $d$-th order probing security are implemented, packed multiplication consumes $d^2+2\ell d + \ell$ bilinear multiplications and $2d^2 + d(d+1)/2$ random field variables, outperforming the state-of-the-art results with $O(\ell d^2)$ multiplications and $\ell \left \lfloor d^2/4\right \rfloor + \ell d$ randomness. To prove $d$-probing security for packed multiplications, we introduce some weaker security notions for multiple-inputs-multiple-outputs gadgets and use them as intermediate steps, which may be of independent interest.
As parallel field multiplications exist almost everywhere in symmetric cryptography, lifting optimizations from ``local'' to ``global'' substantially enlarges the space of improvements. To demonstrate, we showcase the method on the AES Subbytes step, GCM and TET (a popular disk encryption). Notably, when $d=8$, our implementation of AES Subbytes in ARM Cortex M architecture achieves a gain of up to $33\%$ in total speeds and saves up to $68\%$ random bits than the state-of-the-art bitsliced implementation reported at ASIACRYPT~2018.

2019

ASIACRYPT

Valiant’s Universal Circuits Revisited: An Overall Improvement and a Lower Bound
Abstract

A universal circuit (UC) is a general-purpose circuit that can simulate arbitrary circuits (up to a certain size n). At STOC 1976 Valiant presented a graph theoretic approach to the construction of UCs, where a UC is represented by an edge universal graph (EUG) and is recursively constructed using a dedicated graph object (referred to as supernode). As a main end result, Valiant constructed a 4-way supernode of size 19 and an EUG of size $$4.75n\log n$$ (omitting smaller terms), which remained the most size-efficient even to this day (after more than 4 decades).Motivated by the emerging applications of UCs in various privacy preserving computation scenarios, we revisit Valiant’s universal circuits, and propose a 4-way supernode of size 18, and an EUG of size $$4.5n\log n$$. As confirmed by our implementations, we reduce the size of universal circuits (and the number of AND gates) by more than 5% in general, and thus improve upon the efficiency of UC-based cryptographic applications accordingly. Our approach to the design of optimal supernodes is computer aided (rather than by hand as in previous works), which might be of independent interest. As a complement, we give lower bounds on the size of EUGs and UCs in Valiant’s framework, which significantly improves upon the generic lower bound on UC size and therefore reduces the gap between theory and practice of universal circuits.

2019

ASIACRYPT

Collision Resistant Hashing from Sub-exponential Learning Parity with Noise
Abstract

The Learning Parity with Noise (LPN) problem has recently found many cryptographic applications such as authentication protocols, pseudorandom generators/functions and even asymmetric tasks including public-key encryption (PKE) schemes and oblivious transfer (OT) protocols. It however remains a long-standing open problem whether LPN implies collision resistant hash (CRH) functions. Inspired by the recent work of Applebaum et al. (ITCS 2017), we introduce a general construction of CRH from LPN for various parameter choices. We show that, just to mention a few notable ones, under any of the following hardness assumptions (for the two most common variants of LPN) 1.constant-noise LPN is $$2^{n^{0.5+\varepsilon }}$$-hard for any constant $$\varepsilon >0$$;2.constant-noise LPN is $$2^{\varOmega (n/\log n)}$$-hard given $$q=\mathsf {poly}(n)$$ samples;3.low-noise LPN (of noise rate $$1/\sqrt{n}$$) is $$2^{\varOmega (\sqrt{n}/\log n)}$$-hard given $$q=\mathsf {poly}(n)$$ samples. there exists CRH functions with constant (or even poly-logarithmic) shrinkage, which can be implemented using polynomial-size depth-3 circuits with NOT, (unbounded fan-in) AND and XOR gates. Our technical route LPN $$\rightarrow $$ bSVP $$\rightarrow $$ CRH is reminiscent of the known reductions for the large-modulus analogue, i.e., LWE $$\rightarrow $$ SIS $$\rightarrow $$ CRH, where the binary Shortest Vector Problem (bSVP) was recently introduced by Applebaum et al. (ITCS 2017) that enables CRH in a similar manner to Ajtai’s CRH functions based on the Short Integer Solution (SIS) problem.Furthermore, under additional (arguably minimal) idealized assumptions such as small-domain random functions or random permutations (that trivially imply collision resistance), we still salvage a simple and elegant collision-resistance-preserving domain extender combining the best of the two worlds, namely, maximized (depth one) parallelizability and polynomial shrinkage. In particular, assume $$2^{n^{0.5+\varepsilon }}$$-hard constant-noise LPN or $$2^{n^{0.25+\varepsilon }}$$-hard low-noise LPN, we obtain a collision resistant hash function that evaluates in parallel only a single layer of small-domain random functions (or random permutations) and shrinks polynomially.

2015

CRYPTO

2013

ASIACRYPT

#### Program Committees

- Asiacrypt 2024
- Crypto 2023
- Asiacrypt 2023
- Eurocrypt 2022
- CHES 2022
- PKC 2022
- TCC 2022
- Asiacrypt 2021
- Crypto 2021
- Eurocrypt 2021
- Asiacrypt 2020
- Eurocrypt 2020
- PKC 2019
- TCC 2019
- Asiacrypt 2018
- TCC 2017

#### Coauthors

- Boaz Barak (1)
- Davide Bellizia (1)
- Zhenfu Cao (1)
- Gaëtan Cassiers (1)
- Yilei Chen (1)
- Hongrui Cui (4)
- Yevgeniy Dodis (2)
- Xiaolei Dong (1)
- Shuqin Fan (1)
- Rong Fu (1)
- Dawu Gu (3)
- Zheng Guo (1)
- Chun Guo (7)
- Clément Hoffmann (1)
- Zhenkai Hu (1)
- Fanjie Ji (3)
- Dina Kamel (1)
- Jonathan Katz (1)
- Hugo Krawczyk (1)
- Lu Li (1)
- Xiangxue Li (4)
- Wenling Liu (1)
- Junrong Liu (1)
- Hanlin Liu (6)
- Pierrick Méaux (1)
- Olivier Pereira (2)
- Krzysztof Pietrzak (1)
- Joop van de Pol (1)
- François-Xavier Standaert (8)
- John P. Steinberger (1)
- Yang Su (1)
- Yiteng Sun (1)
- Weijia Wang (7)
- Qingju Wang (1)
- Taoyun Wang (1)
- Bohan Wang (1)
- Xiao Wang (4)
- Jian Weng (4)
- Chenkai Weng (1)
- Yusai Wu (1)
- Sen Xu (1)
- Di Yan (1)
- Kang Yang (4)
- Li Yao (1)
- Liqing Yu (1)
- Yu Yu (33)
- Kaiyi Zhang (1)
- Kai Zhang (2)
- Jiang Zhang (7)
- Juelin Zhang (2)
- Zhenfeng Zhang (1)
- Shuoyao Zhao (2)
- Yuanyuan Zhou (1)