## CryptoDB

### Adam Smith

#### Affiliation: Pennsylvania State University, USA

#### Publications

**Year**

**Venue**

**Title**

2019

EUROCRYPT

Distributed Differential Privacy via Shuffling
Abstract

We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the “central” model, in which a trusted server collects users’ data in the clear, which allows greater accuracy; and the “local” model, in which users individually randomize their data, and need not trust the server, but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server have so far focused on variants of cryptographic multiparty computation (MPC), which limits scalability.In this paper, we initiate the analytic study of a shuffled model for distributed differentially private algorithms, which lies between the local and central models. This simple-to-implement model, a special case of the ESA framework of [5], augments the local model with an anonymous channel that randomly permutes a set of user-supplied messages. For sum queries, we show that this model provides the power of the central model while avoiding the need to trust a central server and the complexity of cryptographic secure function evaluation. More generally, we give evidence that the power of the shuffled model lies strictly between those of the central and local models: for a natural restriction of the model, we show that shuffled protocols for a widely studied selection problem require exponentially higher sample complexity than do central-model protocols.

2015

JOFC

2015

EPRINT

2015

TCC

2010

EPRINT

Robust Fuzzy Extractors and Authenticated Key Agreement from Close Secrets
Abstract

Consider two parties holding samples from correlated distributions W and W', respectively, that are within distance t of each other in some metric space. These parties wish to agree on a uniformly distributed secret key R by sending a single message over an insecure channel controlled by an all-powerful adversary. We consider both the keyless case, where the parties share no additional secret information, and the keyed case, where the parties share a long-term secret SK that they can use to generate a sequence of session keys {R_j} using multiple pairs {W_j, W'_j}. The former has applications to, e.g., biometric authentication, while the latter arises in, e.g., the bounded storage model with errors.
Our results improve upon previous work in several respects:
-- The best previous solution for the keyless case with no errors (i.e., t=0) requires the min-entropy of W to exceed 2n/3, where n is the bit-length of W. Our solution applies whenever min-entropy of W exceeds the minimal possible} threshold n/2, and yields a longer key.
-- Previous solutions for the keyless case in the presence of errors (i.e., t>0) required random oracles. We give the first constructions (for certain metrics) in the standard model.
-- Previous solutions for the keyed case were stateful. We give the first stateless solution.

2008

EPRINT

A Note on Differential Privacy: Defining Resistance to Arbitrary Side Information
Abstract

In this note we give a precise formulation of "resistance to arbitrary side information" and show that several relaxations of differential privacy imply it. The formulation follows the ideas originally due to Dwork and McSherry, stated implicitly in [Dwork06]. This is, to our knowledge, the first place such a formulation appears explicitly. The proof that relaxed definitions satisfy the Bayesian formulation is new.

2006

CRYPTO

2006

EPRINT

Analyzing the HB and HB+ Protocols in the ``Large Error'' Case
Abstract

HB and HB+ are two shared-key, unidirectional authentication protocols whose extremely low computational cost makes them potentially well-suited for severely resource-constrained devices. Security of these protocols is based on the conjectured hardness of learning parity with noise; that is, learning a secret $s$ given ``noisy'' dot products of $s$ that are incorrect with probability $\epsilon$.
Although the problem of learning parity with noise is meaningful for any constant $\epsilon < 1/2$, existing proofs of security for HB and HB+ only imply security when $\epsilon < 1/4$. In this note, we show how to extend these proofs to the case of arbitrary $\epsilon < 1/2$.

2006

EPRINT

Scrambling Adversarial Errors Using Few Random Bits, Optimal Information Reconciliation, and Better Private Codes
Abstract

When communicating over a noisy channel, it is typically much easier to deal with random, independent errors with a known distribution than with adversarial errors. This paper looks at how one can use schemes designed for random errors in an adversarial context, at the cost of relatively few additional random bits and without using unproven computational assumptions.
The basic approach is to permute the positions of a bit string using a permutation drawn from a $t$-wise independent family, where $t=o(n)$. This leads to two new results:
1. We construct *computationally efficient* information reconciliation protocols correcting $pn$ adversarial binary Hamming errors with optimal communication and entropy loss $n(h(p)+o(1))$ bits, where $n$ is the length of the strings and $h()$ is the binary entropy function. Information reconciliation protocols are important tools for dealing with noisy secrets in cryptography; they are also used to synchronize remote copies of large files.
2. We improve the randomness complexity (key length) of efficiently decodable capacity-approaching "private codes" from $\Theta(n\log n)$ to $n+o(n)$.
We also present a simplified proof of an existential result on private codes due to Langberg (FOCS '04).

2006

EPRINT

Tight Bounds for Unconditional Authentication Protocols in the Manual Channel and Shared Key Models
Abstract

We address the message authentication problem in two seemingly
different communication models. In the first model, the sender and
receiver are connected by an insecure channel and by a
low-bandwidth auxiliary channel, that enables the sender to
``manually'' authenticate one short message to the receiver (for
example, by typing a short string or comparing two short strings).
We consider this model in a setting where no computational
assumptions are made, and prove that for any $0 < \epsilon < 1$
there exists a $\log^* n$-round protocol for authenticating
$n$-bit messages, in which only $2 \log(1 / \epsilon) + O(1)$ bits
are manually authenticated, and any adversary (even
computationally unbounded) has probability of at most $\epsilon$
to cheat the receiver into accepting a fraudulent message.
Moreover, we develop a proof technique showing that our protocol
is essentially optimal by providing a lower bound of $2 \log(1 /
\epsilon) - O(1)$ on the required length of the manually
authenticated string.
The second model we consider is the traditional message
authentication model. In this model the sender and the receiver
share a short secret key; however, they are connected only by an
insecure channel. We apply the proof technique above to obtain a
lower bound of $2 \log(1 / \epsilon) - 2$ on the required Shannon
entropy of the shared key. This settles an open question posed by
Gemmell and Naor (CRYPTO '93).
Finally, we prove that one-way functions are {\em necessary} (and
sufficient) for the existence of protocols breaking the above
lower bounds in the computational setting.

2004

EPRINT

Efficient Consistency Proofs for Generalized Queries on a Committed Database
Abstract

A *consistent query protocol* (CQP) allows a database owner to publish a very short string $c$ which *commits* her and everybody else to a particular database $D$, so that any copy of the database can later be used to answer queries and give short proofs that the answers are consistent with the commitment $c$. Here *commits* means that there is at most one database $D$ that anybody can find (in polynomial time) which is consistent with $c$. (Unlike in some previous work, this strong guarantee holds even for owners who try to cheat while creating $c$.) Efficient CQPs for membership and one-dimensional range queries are known \cite{BRW02,K98,MR}: given a query pair $a,b\in \mathbb{R}$, the server answers with all the keys in the database which lie in the interval $[a,b]$ and a proof that the answer is correct.
This paper explores CQPs for more general types of databases. We put forward a general technique for constructing CQPs for any type of query, assuming the existence of a data structure/algorithm with certain inherent robustness properties that we define (called a *data robust algorithm*). We illustrate our technique by constructing an efficient protocol for *orthogonal range queries*, where the database keys are points in $\mathbb{R}^d$ and a query asks for all keys in a rectangle $[a_1,b_1]\times\ldots \times [a_d,b_d]$. Our data-robust algorithm is within a $O(\log N)$ factor of the best known standard data structure (a range tree, due to Bentley (1980)).
We modify our protocol so that it is also private, that is, the proofs leak no information about the database beyond the query answers. We show a generic modification to ensure privacy based on zero-knowledge proofs, and also give a new, more efficient protocol tailored to hash trees.

2004

EPRINT

Entropic Security and the Encryption of High Entropy Messages
Abstract

Russell and Wang (Eurocrypt 2002) recently introduced an elegant,
information-theoretic notion called entropic security of
encryption: they required that the cipher text leak no predicate of
the plaintext (similar to semantic security (Goldwasser and Micali,
JCSS 1984)) but only as long as the distribution on messages has high
entropy from the adversary's point of view. They show that this
notion of security can be achieved with very short keys for
entropically rich message spaces. Canetti, Miccianci and Reingold
(STOC, 1998) had previously constructed hash functions which satisfy a
similar entropic security condition. The output of such hash function
leaks no partial information about the input, provided the input has
sufficiently high entropy. This paper studies entropic security in
general, and its application to the encryption of high-entropy
messages.
(1) We elucidate the notion of entropic security. Our results apply to
all entropically-secure primitives, including both encryption and hash
functions. We strengthen the formulation of Canetti and Russell and
Wang: we require that an entropically-secure primitive leak no
function whasoever of its input, while the previous works focus only
on predicates. This stronger formulation is much closer to the
original notion of semantic security. We also show that entropic
security is equivalent to indistinguishability on pairs of input
distributions of sufficiently high entropy. This equivalence
generalizes the conventional equivalence between indistinguishability
and semantic security. Indistinguishability, in turn, is related to
randomness extraction from non-uniform distributions. The proof of
the equivalence of hiding predicates to hiding all functions is quite
involved, and requires very different techniques from those of
Goldwasser and Micali.
(2) We use the equivalence above, and the connection to randomness
extraction, to prove several new results on entropically-secure
encryption. We obtain:
(a) Two general frameworks for constructing entropically secure
encryption schemes, one based on expander graphs and the other on
XOR-universal hash functions. These schemes generalize the schemes of
Russell and Wang, yielding simpler constructions and proofs as well as
improved parameters. The best scheme uses a key of length
k=n-t+2log(1/eps)+O(1), where eps is a measure of leakage and
t is the initial min-entropy of the message.
(b) Lower bounds on the key length k for entropic security and
indistinguishability. In particular, we show near tightness of
Russell-Wang constructions: k > n-t. (In fact, for a large class
of schemes k >= n-t + log(1/eps).)

2003

EPRINT

Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data
Abstract

We provide formal definitions and efficient secure techniques for
-- turning noisy information into keys usable for any cryptographic application, and, in particular,
-- reliably and securely authenticating biometric data.
Our techniques apply not just to biometric information, but to any keying material that, unlike traditional cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose two primitives: a fuzzy extractor reliably extracts nearly uniform randomness R from its input; the extraction is error-tolerant in the sense that R will be the same even if the input changes, as long as it remains reasonably close to the original. Thus, R can be used as a key in a cryptographic application. A secure sketch produces public information about its input w that does not reveal w, and yet allows exact recovery of w given another value that is close to w. Thus, it can be used to reliably reproduce error-prone biometric inputs without incurring the security risk inherent in storing them.
We define the primitives to be both formally secure and versatile, generalizing much prior work. In addition, we provide nearly optimal constructions of both primitives for various measures of "closeness" of input data, such as Hamming distance, edit distance, and set difference.

2002

EPRINT

Authentication of Quantum Messages
Abstract

Authentication is a well-studied area of classical cryptography: a sender A and a receiver B sharing a classical private key want to exchange a classical message with the guarantee that the message has not been modified or replaced by a dishonest party with control of the communication line. In this paper we study the authentication of messages composed of quantum states.
We give a formal definition of authentication in the quantum setting. Assuming A and B have access to an insecure quantum channel and share a private, classical random key, we provide a non-interactive scheme that both enables A to encrypt and authenticate (with unconditional security) an m qubit message by encoding it into m+s qubits, where the probability decreases exponentially in the security parameter s. The scheme requires a private key of size 2m+O(s). To achieve this, we give a highly efficient protocol for testing the purity of shared EPR pairs.
It has long been known that learning information about a general quantum state will necessarily disturb it. We refine this result to show that such a disturbance can be done with few side effects, allowing it to circumvent cryptographic protections. Consequently, any scheme to authenticate quantum messages must also encrypt them. In contrast, no such constraint exists classically: authentication and encryption are independent tasks, and one can authenticate a message while leaving it publicly readable.
This reasoning has two important consequences: On one hand, it allows us to give a lower bound of 2m key bits for authenticating m qubits, which makes our protocol asymptotically optimal. On the other hand, we use it to show that digitally signing quantum states is impossible, even with only computational security.

2001

EPRINT

Efficient and Non-Interactive Non-Malleable Commitment
Abstract

We present new constructions of non-malleable commitment schemes, in
the public parameter model (where a trusted party makes parameters
available to all parties), based on the discrete logarithm or RSA
assumptions. The main features of our schemes are: they achieve
near-optimal communication for arbitrarily-large messages and are
non-interactive. Previous schemes either required (several rounds of)
interaction or focused on achieving non-malleable commitment based on
general assumptions and were thus efficient only when committing to a
single bit. Although our main constructions are for the case of
perfectly-hiding commitment, we also present a communication-efficient,
non-interactive commitment scheme (based on general assumptions) that
is perfectly binding.

#### Program Committees

- Eurocrypt 2020
- Eurocrypt 2018
- TCC 2016
- TCC 2014
- Crypto 2013
- Crypto 2011
- TCC 2010
- Crypto 2009
- Crypto 2008
- Crypto 2007
- TCC 2006
- Crypto 2005

#### Coauthors

- Howard Barnum (1)
- Xavier Boyen (1)
- Ran Canetti (1)
- Shuchi Chawla (1)
- Albert Cheu (1)
- Claude Crépeau (2)
- Giovanni Di Crescenzo (2)
- Ivan Damgård (1)
- Yevgeniy Dodis (9)
- Cynthia Dwork (4)
- Benjamin Fuller (2)
- Craig Gentry (1)
- Daniel Gottesman (2)
- Vipul Goyal (1)
- Jens Groth (1)
- Sean Hallgren (2)
- Yuval Ishai (2)
- Shiva Kasiviswanathan (1)
- Shiva Prasad Kasiviswanathan (1)
- Jonathan Katz (9)
- Eike Kiltz (2)
- Mikkel Krøigaard (1)
- Mark Lewko (1)
- Moses Liskov (1)
- Anna Lysyanskaya (1)
- Frank McSherry (3)
- Silvio Micali (1)
- Payman Mohassel (1)
- Moni Naor (2)
- Jesper Buus Nielsen (1)
- Kobbi Nissim (3)
- Adam O'Neill (3)
- Rafail Ostrovsky (6)
- Omer Paneth (1)
- Chris Peikert (1)
- Charles Rackoff (1)
- Sofya Raskhodnikova (1)
- Leonid Reyzin (7)
- Amit Sahai (2)
- Gil Segev (2)
- Ronen Shaltiel (1)
- Ji Sun Shin (1)
- Fang Song (2)
- Alain Tapp (1)
- Luca Trevisan (1)
- Jonathan Ullman (1)
- Shabsi Walfish (1)
- Hoeteck Wee (1)
- David Zeber (1)
- Ye Zhang (2)
- Maxim Zhilyaev (1)