## CryptoDB

### Ni Trieu

#### Publications

Year
Venue
Title
2021
CRYPTO
Many recent private set intersection (PSI) protocols encode input sets as polynomials. We consider the more general notion of an oblivious key-value store (OKVS), which is a data structure that compactly represents a desired mapping $k_i$ to $v_i$. When the $v_i$ values are random, the OKVS data structure hides the $k_i$ values that were used to generate it. The simplest (and size-optimal) OKVS is a polynomial $p$ that is chosen using interpolation such that $p(k_i)=v_i$. We initiate the formal study of oblivious key-value stores, and show new constructions resulting in the fastest OKVS to date. Similarly to cuckoo hashing, current analysis techniques are insufficient for finding *concrete* parameters to guarantee a small failure probability for our OKVS constructions. Moreover, it would cost too much to run experiments to validate a small upperbound on the failure probability. We therefore show novel techniques to amplify an OKVS construction which has a failure probability $p$, to an OKVS with a similar overhead and failure probability $p^c$. Setting $p$ to be moderately small enables to validate it by running a relatively small number of $O(1/p)$ experiments. This validates a $p^c$ failure probability for the amplified OKVS. Finally, we describe how OKVS can significantly improve the state of the art of essentially all variants of PSI. This leads to the fastest two-party PSI protocols to date, for both the semi-honest and the malicious settings. Specifically, in networks with moderate bandwidth (e.g., 30 - 300 Mbps) our malicious two-party PSI protocol has 40\% less communication and is 20-40% faster than the previous state of the art protocol, even though the latter only has heuristic confidence.
2021
ASIACRYPT
The private join and compute (PJC) functionality enables secure computation over data distributed across different databases, and is applicable to a wide range of applications, many of which address settings where the input databases are of significantly different sizes. We introduce the notion of private information retrieval (PIR) with default, which enables two-party PJC functionalities in a way that hides the size of the intersection of the two databases and incurs sublinear communication cost in the size of the bigger database. We provide two constructions for this functionality, one of which requires offline linear communication, which can be amortized across queries, and one that provides sublinear cost for each query but relies on more computationally expensive tools. We construct inner-product PJC, which has applications to ads conversion measurement and contact tracing, relying on an extension of PIR with default. We evaluate the efficiency of our constructions, which can enable $\mathbf{2^{8}}$ PIR with default lookups on a database of size $\mathbf{2^{25}}$ (or inner-product PJC on databases with such sizes) with the communication of $\mathbf{44}$MB, which costs less than $\mathbf{0.17}$c. for the client and $\mathbf{26.48}$c. for the server.
2020
EUROCRYPT
We present a 2-party private set intersection (PSI) protocol which provides security against malicious participants, yet is almost as fast as the fastest known semi-honest PSI protocol of Kolesnikov et al. (CCS 2016). Our protocol is based on a new approach for two-party PSI, which can be instantiated to provide security against either malicious or semi-honest adversaries. The protocol is unique in that the only difference between the semi-honest and malicious versions is an instantiation with different parameters for a linear error-correction code. It is also the first PSI protocol which is concretely efficient while having linear communication and security against malicious adversaries, while running in the OT-hybrid model (assuming a non-programmable random oracle). State of the art semi-honest PSI protocols take advantage of cuckoo hashing, but it has proven a challenge to use cuckoo hashing for malicious security. Our protocol is the first to use cuckoo hashing for malicious- secure PSI. We do so via a new data structure, called a probe-and-XOR of strings (PaXoS), which may be of independent interest. This abstraction captures important properties of previous data structures, most notably garbled Bloom filters. While an encoding by a garbled Bloom filter is larger by a factor of $\Omega(\lambda)$ than the original data, we describe a significantly improved PaXoS based on cuckoo hashing that achieves constant rate while being no worse in other relevant efficiency measures.
2020
ASIACRYPT
Private Set Intersection Cardinality (PSI-CA) allows two parties, each holding a set of items, to learn the size of the intersection of those sets without revealing any additional information. To the best of our knowledge, this work presents the first protocol that allows one of the parties to delegate PSI-CA computation to untrusted servers. At the heart of our delegated PSI-CA protocol is a new oblivious distributed key PRF (Odk-PRF) abstraction, which may be of independent interest. We explore in detail how to use our delegated PSI-CA protocol to perform privacy-preserving contact tracing. It has been estimated that a significant percentage of a given population would need to use a contact tracing app to stop a disease’s spread. Prior privacy-preserving contact tracing systems, however, impose heavy bandwidth or computational demands on client devices. These demands present an economic disincentive to participate for end users who may be billed per MB by their mobile data plan or for users who want to save battery life. We propose Catalic (ContAct TrAcing for LIghtweight Clients), a new contact tracing system that minimizes bandwidth cost and computation workload on client devices. By applying our new delegated PSI-CA protocol, Catalic shifts most of the client-side computation of contact tracing to untrusted servers, and potentially saves each user hundreds of megabytes of mobile data per day while preserving privacy.
2019
EUROCRYPT
We improve the attack of Durak and Vaudenay (CRYPTO’17) on NIST Format-Preserving Encryption standard FF3, reducing the running time from $O(N^5)$O(N5) to $O(N^{17/6})$O(N17/6) for domain $\mathbb {Z}_N \times \mathbb {Z}_N$ZN×ZN. Concretely, DV’s attack needs about $2^{50}$250 operations to recover encrypted 6-digit PINs, whereas ours only spends about $2^{30}$230 operations. In realizing this goal, we provide a pedagogical example of how to use distinguishing attacks to speed up slide attacks. In addition, we improve the running time of DV’s known-plaintext attack on 4-round Feistel of domain $\mathbb {Z}_N \times \mathbb {Z}_N$ZN×ZN from $O(N^3)$O(N3) time to just $O(N^{5/3})$O(N5/3) time. We also generalize our attacks to a general domain $\mathbb {Z}_M \times \mathbb {Z}_N$ZM×ZN, allowing one to recover encrypted SSNs using about $2^{50}$250 operations. Finally, we provide some proof-of-concept implementations to empirically validate our results.
2019
CRYPTO
We describe a novel approach for two-party private set intersection (PSI) with semi-honest security. Compared to existing PSI protocols, ours has a more favorable balance between communication and computation. Specifically, our protocol has the lowest monetary cost of any known PSI protocol, when run over the Internet using cloud-based computing services (taking into account current rates for CPU + data). On slow networks (e.g., 10 Mbps) our protocol is actually the fastest.Our novel underlying technique is a variant of oblivious transfer (OT) extension that we call sparse OT extension. Conceptually it can be thought of as a communication-efficient multipoint oblivious PRF evaluation. Our sparse OT technique relies heavily on manipulating high-degree polynomials over large finite fields (i.e. elements whose representation requires hundreds of bits). We introduce extensive algorithmic and engineering improvements for interpolation and multi-point evaluation of such polynomials, which we believe will be of independent interest.Finally, we present an extensive empirical comparison of state-of-the-art PSI protocols in several application scenarios and along several dimensions of measurement: running time, communication, peak memory consumption, and—arguably the most relevant metric for practice—monetary cost.
2019
ASIACRYPT
We present a new efficient protocol for computing private set union (PSU). Here two semi-honest parties, each holding a dataset of known size (or of a known upper bound), wish to compute the union of their sets without revealing anything else to either party. Our protocol is in the OT hybrid model. Beyond OT extension, it is fully based on symmetric-key primitives. We motivate the PSU primitive by its direct application to network security and other areas.At the technical core of our PSU construction is the reverse private membership test (RPMT) protocol. In RPMT, the sender with input $x^*$ interacts with a receiver holding a set X. As a result, the receiver learns (only) the bit indicating whether $x^* \in X$, while the sender learns nothing about the set X. (Previous similar protocols provide output to the opposite party, hence the term “reverse” private membership.) We believe our RPMT abstraction and constructions may be a building block in other applications as well.We demonstrate the practicality of our proposed protocol with an implementation. For input sets of size $2^{20}$ and using a single thread, our protocol requires 238 s to securely compute the set union, regardless of the bit length of the items. Our protocol is amenable to parallelization. Increasing the number of threads from 1 to 32, our protocol requires only 13.1 s, a factor of $18.25{\times }$ improvement.To the best of our knowledge, ours is the first protocol that reports on large-size experiments, makes code available, and avoids extensive use of computationally expensive public-key operations. (No PSU code is publicly available for prior work, and the only prior symmetric-key-based work reports on small experiments and focuses on the simpler 3-party, 1-corruption setting.) Our work improves reported PSU state of the art by factor up to $7,600{\times }$ for large instances.
2018
CRYPTO
Format-preserving encryption (FPE) produces ciphertexts which have the same format as the plaintexts. Building secure FPE is very challenging, and recent attacks (Bellare, Hoang, Tessaro, CCS ’16; Durak and Vaudenay, CRYPTO ’17) have highlighted security deficiencies in the recent NIST SP800-38G standard. This has left the question open of whether practical schemes with high security exist.In this paper, we continue the investigation of attacks against FPE schemes. Our first contribution are new known-plaintext message recovery attacks against Feistel-based FPEs (such as FF1/FF3 from the NIST SP800-38G standard) which improve upon previous work in terms of amortized complexity in multi-target scenarios, where multiple ciphertexts are to be decrypted. Our attacks are also qualitatively better in that they make no assumptions on the correlation between the targets to be decrypted and the known plaintexts. We also surface a new vulnerability specific to FF3 and how it handles odd length domains, which leads to a substantial speedup in our attacks.We also show the first attacks against non-Feistel based FPEs. Specifically, we show a strong message-recovery attack for FNR, a construction proposed by Cisco which replaces two rounds in the Feistel construction with a pairwise-independent permutation, following the paradigm by Naor and Reingold (JoC, ’99). We also provide a strong ciphertext-only attack against a variant of the DTP construction by Brightwell and Smith, which is deployed by Protegrity within commercial applications. All of our attacks show that existing constructions fall short of achieving desirable security levels. For Feistel and the FNR schemes, our attacks become feasible on small domains, e.g., 8 bits, for suggested round numbers. Our attack against the DTP construction is practical even for large domains. We provide proof-of-concept implementations of our attacks that verify our theoretical findings.