## CryptoDB

### Kasper Green Larsen

#### ORCID: 0000-0001-8841-5929

#### Publications

**Year**

**Venue**

**Title**

2023

EUROCRYPT

How to Compress Encrypted Data
Abstract

We study the task of obliviously compressing a vector comprised of $n$ ciphertexts of size $\xi$ bits each, where at most $t$ of the corresponding plaintexts are non-zero.
This problem commonly features in applications involving encrypted outsourced storages, such as searchable encryption or oblivious message retrieval.
We present two new algorithms with provable worst-case guarantees, solving this problem by using only homomorphic additions and multiplications by constants.
Both of our new constructions improve upon the state of the art asymptotically and concretely.
Our first construction, based on sparse polynomials, is perfectly correct and the first to achieve an asymptotically optimal compression rate by compressing the input vector into $\bigO{t \xi}$ bits.
Compression can be performed homomorphically by performing $\bigO{n \log n}$ homomorphic additions and multiplications by constants.
The main drawback of this construction is a decoding complexity of $\Omega(\sqrt{n})$.
Our second construction is based on a novel variant of invertible bloom lookup tables and is correct with probability $1-2^{-\kappa}$.
It has a slightly worse compression rate compared to our first construction as it compresses the input vector into $\bigO{\xi\kappa t /\log t}$ bits, where $\kappa \geq \log t$.
In exchange, both compression and decompression of this construction are highly efficient.
The compression complexity is dominated by $\bigO{n \kappa/\log t}$ homomorphic additions and multiplications by constants.
The decompression complexity is dominated by $\bigO{\kappa t /\log t}$ decryption operations and equally many inversions of a pseudorandom permutation.

2022

EUROCRYPT

Property-Preserving Hash Functions for Hamming Distance from Standard Assumptions
📺
Abstract

Property-preserving hash functions allow for compressing long inputs $x_0$ and $x_1$ into short hashes $h(x_0)$ and $h(x_1)$ in a manner that allows for computing a predicate $P(x_0, x_1)$ given only the two hash values without having access to the original data.
Such hash functions are said to be adversarially robust if an adversary that gets to pick $x_0$ and $x_1$ after the hash function has been sampled, cannot find inputs for which the predicate evaluated on the hash values outputs the incorrect result.
In this work we construct robust property-preserving hash functions for the hamming-distance predicate which distinguishes inputs with a hamming distance at least some threshold $t$ from those with distance less than $t$. The security of the construction is based on standard lattice hardness assumptions.
Our construction has several advantages over the best known previous construction by Fleischhacker and Simkin (Eurocrypt 2021).
Our construction relies on a single well-studied hardness assumption from lattice cryptography whereas the previous work relied on a newly introduced family of computational hardness assumptions.
In terms of computational effort, our construction only requires a small number of modular additions per input bit, whereas the work of Fleischhacker and Simkin required several exponentiations per bit as well as the interpolation and evaluation of high-degree polynomials over large fields.
An additional benefit of our construction is that the description of the hash function can be compressed to $\lambda$ bits assuming a random oracle.
Previous work has descriptions of length $\bigO{\ell \lambda}$ bits for input bit-length $\ell$.
We prove a lower bound on the output size of any property-preserving hash function for the hamming distance predicate.
The bound shows that the size of our hash value is not far from optimal.

2020

TCC

Lower Bounds for Multi-Server Oblivious RAMs
📺
Abstract

In this work, we consider oblivious RAMs (ORAM) in a setting with multiple servers and the adversary may corrupt a subset of the servers. We present an $\Omega(log n)$ overhead lower bound for any k-server ORAM that limits any PPT adversary to distinguishing advantage at most $1/4k$ when only one server is corrupted. In other words, if one insists on negligible distinguishing advantage, then multi-server ORAMs cannot be faster than single-server ORAMs even with polynomially many servers of which only one unknown server is corrupted. Our results apply to ORAMs that may err with probability at most 1/128 as well as scenarios where the adversary corrupts larger subsets of servers. We also extend our lower bounds to other important data structures including oblivious stacks, queues, deques, priority queues and search trees.

2019

CRYPTO

Communication Lower Bounds for Statistically Secure MPC, With or Without Preprocessing
📺
Abstract

We prove a lower bound on the communication complexity of unconditionally secure multiparty computation, both in the standard model with
$$n=2t+1$$
parties of which t are corrupted, and in the preprocessing model with
$$n=t+1$$
. In both cases, we show that for any
$$g \in \mathbb {N}$$
there exists a Boolean circuit C with g gates, where any secure protocol implementing C must communicate
$$\varOmega (n g)$$
bits, even if only passive and statistical security is required. The results easily extends to constructing similar circuits over any fixed finite field. This shows that for all sizes of circuits, the O(n) overhead of all known protocols when t is maximal is inherent. It also shows that security comes at a price: the circuit we consider could namely be computed among n parties with communication only O(g) bits if no security was required. Our results extend to the case where the threshold t is suboptimal. For the honest majority case, this shows that the known optimizations via packed secret-sharing can only be obtained if one accepts that the threshold is
$$t= (1/2 - c)n$$
for a constant c. For the honest majority case, we also show an upper bound that matches the lower bound up to a constant factor (existing upper bounds are a factor
$$\lg n$$
off for Boolean circuits).

2018

CRYPTO

Yes, There is an Oblivious RAM Lower Bound!
📺 ★
Abstract

An Oblivious RAM (ORAM) introduced by Goldreich and Ostrovsky [JACM’96] is a (possibly randomized) RAM, for which the memory access pattern reveals no information about the operations performed. The main performance metric of an ORAM is the bandwidth overhead, i.e., the multiplicative factor extra memory blocks that must be accessed to hide the operation sequence. In their seminal paper introducing the ORAM, Goldreich and Ostrovsky proved an amortized
$$\varOmega (\lg n)$$
bandwidth overhead lower bound for ORAMs with memory size n. Their lower bound is very strong in the sense that it applies to the “offline” setting in which the ORAM knows the entire sequence of operations ahead of time.However, as pointed out by Boyle and Naor [ITCS’16] in the paper “Is there an oblivious RAM lower bound?”, there are two caveats with the lower bound of Goldreich and Ostrovsky: (1) it only applies to “balls in bins” algorithms, i.e., algorithms where the ORAM may only shuffle blocks around and not apply any sophisticated encoding of the data, and (2), it only applies to statistically secure constructions. Boyle and Naor showed that removing the “balls in bins” assumption would result in super linear lower bounds for sorting circuits, a long standing open problem in circuit complexity. As a way to circumventing this barrier, they also proposed a notion of an “online” ORAM, which is an ORAM that remains secure even if the operations arrive in an online manner. They argued that most known ORAM constructions work in the online setting as well.Our contribution is an
$$\varOmega (\lg n)$$
lower bound on the bandwidth overhead of any online ORAM, even if we require only computational security and allow arbitrary representations of data, thus greatly strengthening the lower bound of Goldreich and Ostrovsky in the online setting. Our lower bound applies to ORAMs with memory size n and any word size
$$r \ge 1$$
. The bound therefore asymptotically matches the known upper bounds when
$$r = \varOmega (\lg ^2 n)$$
.

#### Coauthors

- Carsten Baum (1)
- Ivan Damgård (2)
- Nils Fleischhacker (2)
- Kasper Green Larsen (6)
- Michael Nielsen (1)
- Jesper Buus Nielsen (2)
- Mark Simkin (3)
- Kevin Yeo (1)