LogStack: Stacked Garbling with O(b log b) Computation
Secure two party computation (2PC) of arbitrary programs can be efficiently achieved using garbled circuits (GC). Until recently, it was widely believed that a GC proportional to the entire program, including parts of the program that are entirely discarded due to conditional branching, must be transmitted over a network. Recent work shows that this belief is false, and that communication proportional only to the longest program execution path suffices (Heath and Kolesnikov, CRYPTO 20, [HK20a]). Although this recent work reduces needed communication, it increases computation. For a conditional with b branches, the players use O(b^2) computation (traditional GC uses only O(b)). Our scheme LogStack reduces stacked garbling computation from O(b^2) to O(b log b) with no increase in communication over [HK20a]. The cause of [HK20a]'s increased computation is the oblivious collection of garbage labels that emerge during the evaluation of inactive branches. Garbage is collected by a multiplexer that is costly to generate. At a high level, we redesign stacking and garbage collection to avoid quadratic scaling. Our construction is also more space efficient: [HK20a] algorithms require O(b) space, while ours use only O(log b) space. This space efficiency allows even modest setups to handle large numbers of branches. [HK20a] assumes a random oracle (RO). We track the source of this need, formalize a simple and natural added assumption on the base garbling scheme, and remove reliance on RO: LogStack is secure in the standard model. Nevertheless, LogStack can be instantiated with typical GC tricks based on non-standard assumptions, such as free XOR and half-gates, and hence can be implemented with high efficiency. We implemented LogStack (in the RO model, based on half-gates garbling) and report performance. In terms of wall-clock time and for fewer than 16 branches, our performance is comparable to [HK20a]'s; for larger branching factors, our approach clearly outperforms [HK20a]. For example, given 1024 branches, our approach is 31x faster.
Masked Triples: Amortizing Multiplication Triples across Conditionals 📺
A classic approach to MPC uses preprocessed multiplication triples to evaluate arbitrary Boolean circuits. If the target circuit features conditional branching, e.g. as the result of a IF program statement, then triples are wasted: one triple is consumed per AND gate, even if the output of the gate is entirely discarded by the circuit’s conditional behavior. In this work, we show that multiplication triples can be re-used across conditional branches. For a circuit with b branches, each having n AND gates, we need only a total of n triples, rather than the typically required bn. Because preprocessing triples is often the most expensive step in protocols that use them, this significantly improves performance. Prior work similarly amortized oblivious transfers across branches in the classic GMW protocol (Heath et al., Asiacrypt 2020, [HKP20]). In addition to demonstrating conditional improvements are possible for a different class of protocols, we also concretely improve over [HKP20]: their maximum improvement is bounded by the topology of the circuit. Our protocol yields improvement independent of topology: we need triples proportional to the size of the program’s longest execution path, regardless of the structure of the program branches. We implemented our approach in C++. Our experiments show that we significantly improve over a "naive" protocol and over prior work: for a circuit with 16 branches and in terms of total communication, we improved over naive by 12x and over [HKP20] by an average of 2.6x. Our protocol is secure against the semi-honest corruption of p-1 parties.
Stacked Garbling for Disjunctive Zero-Knowledge Proofs 📺
Zero-knowledge (ZK) proofs (ZKP) have received wide attention, focusing on non-interactivity, short proof size, and fast verification time. We focus on the fastest total proof time, in particular for large Boolean circuits. Under this metric, Garbled Circuit (GC)-based ZKP (Jawurek et al., [JKO], CCS 2013) remained the state-of-the-art technique due to the low-constant linear scaling of computing the garbling. We improve GC-ZKP for proof statements with conditional clauses. Our communication is proportional to the longest branch rather than to the entire proof statement. This is most useful when the number m of branches is large, resulting in up to factor m× improvement over JKO. In our proof-of-concept illustrative application, prover P demonstrates knowledge of a bug in a codebase consisting of any number of snippets of actual C code. Our computation cost is linear in the size of the code- base and communication is constant in the number of snippets. That is, we require only enough communication for a single largest snippet! Our conceptual contribution is stacked garbling for ZK, a privacy-free circuit garbling scheme that can be used with the JKO GC-ZKP protocol to construct more efficient ZKP. Given a Boolean circuit C and computational security parameter k, our garbling is L · k bits long, where L is the length of the longest execution path in C. All prior concretely efficient garbling schemes produce garblings of size |C| · k. The computational cost of our scheme is not increased over prior state-of-the-art. We implement our GC-ZKP and demonstrate significantly improved (m× over JKO) ZK performance for functions with branching factor m. Compared with recent ZKP (STARK, Libra, KKW, Ligero, Aurora, Bulletproofs), our scheme offers much better proof times for larger circuits (35-1000× or more, depending on circuit size and compared scheme). For our illustrative application, we consider four C code snippets, each of about 30-50 LOC; one snippet allows an invalid memory dereference. The entire proof takes 0.15 seconds and communication is 1.5 MB.
Stacked Garbling: Garbled Circuit Proportional to Longest Execution Path 📺
Secure two party computation (2PC) of arbitrary programs can be efficiently achieved using garbled circuits (GC). The bottleneck of GC efficiency is communication. It is widely believed that it is necessary to transmit the entire GC during 2PC, even for conditional branches that are not taken. This folklore belief is false. We propose a novel GC technique, stacked garbling, that eliminates the communication cost of inactive conditional branches. We extend the ideas of conditional GC evaluation explored in (Kolesnikov, Asiacrypt 18) and (Heath and Kolesnikov, Eurocrypt 20). Unlike these works, ours is for general 2PC where no player knows which conditional branch is taken. Our garbling scheme, Stack, requires communication proportional to the longest execution path rather than to the entire circuit. Stack is compatible with state-of-the-art techniques, such as free XOR and half-gates. Stack is a garbling scheme. As such, it can be plugged into a variety of existing protocols, and the resulting round complexity is the same as that of standard GC. The approach does incur computation cost quadratic in the conditional branching factor vs linear in standard schemes, but the tradeoff is beneficial for most programs: GC computation even on weak hardware is faster than GC transmission on fast channels. We implemented Stack in C++. Stack reduces communication cost by approximately the branching factor: for 16 branches, communication is reduced by 10.5x. In terms of wall-clock time for circuits with branching factor 16 over a 50 Mbps WAN on a laptop, Stack outperforms state-of- the-art half-gates-based 2PC by more than 4x.
MOTIF: (Almost) Free Branching in GMW via Vector-Scalar Multiplication 📺
MPC functionalities are increasingly specified in high-level languages, where control-flow constructions such as conditional statements are extensively used. Today, concretely efficient MPC protocols are circuit-based and must evaluate all conditional branches at high cost to hide the taken branch. The Goldreich-Micali-Wigderson, or GMW, protocol is a foundational circuit-based technique that realizes MPC for p players and is secure against up to p-1 semi-honest corruptions. While GMW requires communication rounds proportional to the computed circuit’s depth, it is effective in many natural settings. Our main contribution is MOTIF (Minimizing OTs for IFs), a novel GMW extension that evaluates conditional branches almost for free by amortizing Oblivious Transfers (OTs) across branches. That is, we simultaneously evaluate multiple independent AND gates, one gate from each mutually exclusive branch, by representing them as a single cheap vector-scalar multiplication (VS) gate. For 2PC with b branches, we simultaneously evaluate up to b AND gates using only two 1-out-of-2 OTs of b-bit secrets. This is a factor ~b improvement over the state-of-the-art 2b 1-out-of-2 OTs of 1-bit secrets. Our factor b improvement generalizes to the multiparty setting as well: b AND gates consume only p(p ? 1) 1-out-of-2 OTs of b-bit secrets. We implemented our approach and report its performance. For 2PC and a circuit with 16 branches, each comparing two length-65000 bitstrings, MOTIF outperforms standard GMW in terms of communication by ~9.4x. Total wall-clock time is improved by 4.1 - 9.2x depending on network settings. Our work is in the semi-honest model, tolerating all-but-one corruptions.