## CryptoDB

### Jonathan Ullman

#### Affiliation: Harvard University SEAS

#### Publications

**Year**

**Venue**

**Title**

2020

JOFC

PCPs and the Hardness of Generating Synthetic Data
Abstract

Assuming the existence of one-way functions, we show that there is no polynomial-time differentially private algorithm $${\mathcal {A}}$$ A that takes a database $$D\in (\{0,1\}^d)^n$$ D ∈ ( { 0 , 1 } d ) n and outputs a “synthetic database” $${\hat{D}}$$ D ^ all of whose two-way marginals are approximately equal to those of D . (A two-way marginal is the fraction of database rows $$x\in \{0,1\}^d$$ x ∈ { 0 , 1 } d with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS ‘07), who gave an algorithm running in time $$\mathrm {poly}(n,2^d)$$ poly ( n , 2 d ) . Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC ‘09) with encodings based on the PCP theorem. We also present both negative and positive results for generating “relaxed” synthetic data, where the fraction of rows in D satisfying a predicate c are estimated by applying c to each row of $${\hat{D}}$$ D ^ and aggregating the results in some way.

2019

EUROCRYPT

Distributed Differential Privacy via Shuffling
Abstract

We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the “central” model, in which a trusted server collects users’ data in the clear, which allows greater accuracy; and the “local” model, in which users individually randomize their data, and need not trust the server, but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server have so far focused on variants of cryptographic multiparty computation (MPC), which limits scalability.In this paper, we initiate the analytic study of a shuffled model for distributed differentially private algorithms, which lies between the local and central models. This simple-to-implement model, a special case of the ESA framework of [5], augments the local model with an anonymous channel that randomly permutes a set of user-supplied messages. For sum queries, we show that this model provides the power of the central model while avoiding the need to trust a central server and the complexity of cryptographic secure function evaluation. More generally, we give evidence that the power of the shuffled model lies strictly between those of the central and local models: for a natural restriction of the model, we show that shuffled protocols for a widely studied selection problem require exponentially higher sample complexity than do central-model protocols.

2018

CRYPTO

Hardness of Non-interactive Differential Privacy from One-Way Functions
📺
Abstract

A central challenge in differential privacy is to design computationally efficient non-interactive algorithms that can answer large numbers of statistical queries on a sensitive dataset. That is, we would like to design a differentially private algorithm that takes a dataset $$D \in X^n$$D∈Xn consisting of some small number of elements n from some large data universe X, and efficiently outputs a summary that allows a user to efficiently obtain an answer to any query in some large family Q.Ignoring computational constraints, this problem can be solved even when X and Q are exponentially large and n is just a small polynomial; however, all algorithms with remotely similar guarantees run in exponential time. There have been several results showing that, under the strong assumption of indistinguishability obfuscation, no efficient differentially private algorithm exists when X and Q can be exponentially large. However, there are no strong separations between information-theoretic and computationally efficient differentially private algorithms under any standard complexity assumption.In this work we show that, if one-way functions exist, there is no general purpose differentially private algorithm that works when X and Q are exponentially large, and n is an arbitrary polynomial. In fact, we show that this result holds even if X is just subexponentially large (assuming only polynomially-hard one-way functions). This result solves an open problem posed by Vadhan in his recent survey [52].

#### Program Committees

- TCC 2015

#### Coauthors

- Albert Cheu (1)
- Anupam Gupta (1)
- Lucas Kowalczyk (2)
- Tal Malkin (2)
- Aaron Roth (1)
- Adam Smith (1)
- Salil Vadhan (1)
- Salil P. Vadhan (1)
- Daniel Wichs (1)
- David Zeber (1)
- Mark Zhandry (1)
- Maxim Zhilyaev (1)