International Association for Cryptologic Research

International Association
for Cryptologic Research

CryptoDB

Subhadeep Banik

Publications

Year
Venue
Title
2024
TCHES
Compact Circuits for Efficient Möbius Transform
Subhadeep Banik Francesco Regazzoni
The Möbius transform is a linear circuit used to compute the evaluations of a Boolean function over all points on its input domain. The operation is very useful in finding the solution of a system of polynomial equations over GF(2) for obvious reasons. However the operation, although linear, needs exponential number of logic operations (around n · 2n−1 bit xors) for an n-variable Boolean function. As such, the only known hardware circuit to efficiently compute the Möbius Transform requires silicon area that is exponential in n. For Boolean functions whose algebraic degree is bound by some parameter d, recursive definitions of the Möbius Transform exist that requires only O(nd+1) space in software. However converting the mathematical definition of this space-efficient algorithm into a hardware architecture is a non-trivial task, primarily because the recursion calls notionally lead to a depth-first search in a transition graph that requires context switches at each recursion call for which straightforward mapping to hardware is difficult. In this paper we look to overcome these very challenges in an engineering sense. We propose a space efficient sequential hardware circuit for the Möbius Transform that requires only polynomial circuit area (i.e. O(nd+1)) provided the algebraic degree of the Boolean function is limited to d. We show how this circuit can be used as a component to efficiently solve polynomial equations of degree at most d by using fast exhaustive search. We propose three different circuit architectures for this, each of which uses the Möbius Transform circuit as a core component. We show that asymptotically, all the solutions of a system of m polynomials in n unknowns and algebraic degree d over GF(2) can be found using a circuit of silicon area proportional to m · nd+1 and circuit depth proportional to 2 · log2(n − d).In the second part of the paper we introduce a fourth hardware solver that additionally aims to achieve energy efficiency. The main idea is to reduce the solution space to a small enough value by parallel application of Möbius Transform circuits over the first few equations of the system. This is done so that one can check individually whether the vectors of this reduced solution space satisfy each of the remaining equations of the system using lower power consumption. The new circuit has area also bound by m · nd+1 and has circuit depth proportional to d · log2 n. We also show that further optimizations with respect to energy consumption may be obtained by using depth-bound Möbius circuits that exponentially decrease run time at the cost of additional logic area and depth.
2024
TCHES
Gleeok: A Family of Low-Latency PRFs and its Applications to Authenticated Encryption
In this paper, we propose a new family of low-latency pseudorandom functions (PRFs), dubbed Gleeok.Gleeok utilizes three 128-bit branches to achieve a 256-bit key size while maintaining low latency. The first two branches are specifically designed to defend against statistical attacks, especially for differential attacks, while the third branch provides resilience against algebraic attacks. This unique design enables Gleeok to offer ultralow latency while supporting 256-bit keys, setting it apart from existing ciphers dedicated to low-latency requirements. In addition, we propose wide-block variants having three 256-bit branches. We also present an application of Gleeok to short-input authenticated encryption which is crucial for memory encryption and various realtime communication applications. Furthermore, we present comprehensive hardware implementation results that establish the capabilities of Gleeok and demonstrate its competitiveness against related schemes in the literature. In particular, Gleeok achieves a minimum latency of roughly 360 ps with the NanGate 15 nm cell library and is thus on par with related low-latency schemes that only feature 128-bit keys while maintaining minimal overhead when equipped in an authenticated mode of operation.
2023
TOSC
The QARMAv2 Family of Tweakable Block Ciphers
We introduce the QARMAv2 family of tweakable block ciphers. It is a redesign of QARMA (from FSE 2017) to improve its security bounds and allow for longer tweaks, while keeping similar latency and area. The wider tweak input caters to both specific use cases and the design of modes of operation with higher security bounds. This is achieved through new key and tweak schedules, revised S-Box and linear layer choices, and a more comprehensive security analysis. QARMAv2 offers competitive latency and area in fully unrolled hardware implementations.Some of our results may be of independent interest. These include: new MILP models of certain classes of diffusion matrices; the comparative analysis of a full reflection cipher against an iterative half-cipher; our boomerang attack framework; and an improved approach to doubling the width of a block cipher.
2022
TOSC
Cryptanalysis of Draco
Subhadeep Banik
Draco is a lightweight stream cipher designed by Hamann et al. in IACR ToSC 2022. It has a Grain-like structure with two state registers of size 95 and 33 bits. In addition, the cipher uses a 128-bit secret key and a 96-bit IV. The first 32 bits of the key and the IV forms a non-volatile internal state that does not change during the time that the cipher produces keystream bits.The authors claim that the cipher is provably secure against Time-Memory-Data (TMD) Tradeoff attacks. However in this paper, we first present two TMD tradeoff attacks against Draco. Both attacks leverage the fact that for certain judiciously chosen IVs the state update function of the cipher depend on only a small fraction of the non-volatile internal state. This makes the state update function in Draco essentially a one way function over a much smaller domain and range. The first attack requires around 2114.2 Draco iterations and requires that the adversary has access to 232 chosen IVs. The second attack is such that the attack parameters can be tuned as per the requirements of the attacker. If the attacker prioritizes that the number of different chosen IVs is limited to 220 say, then the attack can be done in around time proportional to 2126 Draco rounds. However if the total attack complexity is to be optimized, then the attack can be performed in 2107 time using around 240 chosen IVs.
2021
TOSC
Atom: A Stream Cipher with Double Key Filter 📺
It has been common knowledge that for a stream cipher to be secure against generic TMD tradeoff attacks, the size of its internal state in bits needs to be at least twice the size of the length of its secret key. In FSE 2015, Armknecht and Mikhalev however proposed the stream cipher Sprout with a Grain-like architecture, whose internal state was equal in size with its secret key and yet resistant against TMD attacks. Although Sprout had other weaknesses, it germinated a sequence of stream cipher designs like Lizard and Plantlet with short internal states. Both these designs have had cryptanalytic results reported against them. In this paper, we propose the stream cipher Atom that has an internal state of 159 bits and offers a security of 128 bits. Atom uses two key filters simultaneously to thwart certain cryptanalytic attacks that have been recently reported against keystream generators. In addition, we found that our design is one of the smallest stream ciphers that offers this security level, and we prove in this paper that Atom resists all the attacks that have been proposed against stream ciphers so far in literature. On the face of it, Atom also builds on the basic structure of the Grain family of stream ciphers. However, we try to prove that by including the additional key filter in the architecture of Atom we can make it immune to all cryptanalytic advances proposed against stream ciphers in recent cryptographic literature.
2021
TOSC
Orthros: A Low-Latency PRF 📺
We present Orthros, a 128-bit block pseudorandom function. It is designed with primary focus on latency of fully unrolled circuits. For this purpose, we adopt a parallel structure comprising two keyed permutations. The round function of each permutation is similar to Midori, a low-energy block cipher, however we thoroughly revise it to reduce latency, and introduce different rounds to significantly improve cryptographic strength in a small number of rounds. We provide a comprehensive, dedicated security analysis. For hardware implementation, Orthros achieves the lowest latency among the state-of-the-art low-latency primitives. For example, using the STM 90nm library, Orthros achieves a minimum latency of around 2.4 ns, while other constructions like PRINCE, Midori-128 and QARMA9-128- σ0 achieve 2.56 ns, 4.10 ns, 4.38 ns respectively.
2021
ASIACRYPT
New Attacks on LowMC instances with a Single Plaintext/Ciphertext pair 📺
Cryptanalysis of the LowMC block cipher when the attacker has access to a single known plaintext/ciphertext pair is a mathematically challenging problem. This is because the attacker is unable to employ most of the standard techniques in symmetric cryptography like linear and differential cryptanalysis. This scenario is particularly relevant while arguing the security of the Picnic digital signature scheme in which the plaintext/ciphertext pair generated by the LowMC block cipher serves as the public (verification) key and the corresponding LowMC encryption key also serves as the secret (signing) key of the signature scheme. In the paper by Banik et al. (IACR ToSC 2020:4), the authors used a linearization technique of the LowMC S-box to mount attacks on some instances of the block cipher. In this paper, we first make a more precise complexity analysis of the linearization attack. Then, we show how to perform a 2-stage MITM attack on LowMC. The first stage reduces the key candidates corresponding to a fraction of key bits of the master key. The second MITM stage between this reduced candidate set and the remaining fraction of key bits successfully recovers the master key. We show that the combined computational complexity of both these stages is significantly lower than those reported in the ToSC paper by Banik et al.
2021
TOSC
Perfect Trees: Designing Energy-Optimal Symmetric Encryption Primitives 📺
Energy efficiency is critical in battery-driven devices, and designing energyoptimal symmetric-key ciphers is one of the goals for the use of ciphers in such environments. In the paper by Banik et al. (IACR ToSC 2018), stream ciphers were identified as ideal candidates for low-energy solutions. One of the main conclusions of this paper was that Trivium, when implemented in an unrolled fashion, was by far the most energy-efficient way of encrypting larger quantity of data. In fact, it was shown that as soon as the number of databits to be encrypted exceeded 320 bits, Trivium consumed the least amount of energy on STM 90 nm ASIC circuits and outperformed the Midori family of block ciphers even in the least energy hungry ECB mode (Midori was designed specifically for energy efficiency).In this work, we devise the first heuristic energy model in the realm of stream ciphers that links the underlying algebraic topology of the state update function to the consumptive behaviour. The model is then used to derive a metric that exhibits a heavy negative correlation with the energy consumption of a broad range of stream cipher architectures, i.e., the families of Trivium-like, Grain-like and Subterranean-like constructions. We demonstrate that this correlation is especially pronounced for Trivium-like ciphers which leads us to establish a link between the energy consumption and the security guarantees that makes it possible to find several alternative energy-optimal versions of Trivium that meet the requirements but consume less energy. We present two such designs Trivium-LE(F) and Trivium-LE(S) that consume around 15% and 25% less energy respectively making them the to date most energy-efficient encryption primitives. They inherit the same security level as Trivium, i.e., 80-bit security. We further present Triad-LE as an energy-efficient variant satisfying a higher security level. The simplicity and wide applicability of our model has direct consequences for the conception of future hardware-targeted stream ciphers as for the first time it is possible to optimize for energy during the design phase. Moreover, we extend the reach of our model beyond plain encryption primitives and propose a novel energy-efficient message authentication code Trivium-LE-MAC.
2020
TOSC
Swap and Rotate: Lightweight Linear Layers for SPN-based Blockciphers 📺
In CHES 2017, Jean et al. presented a paper on “Bit-Sliding” in which the authors proposed lightweight constructions for SPN based block ciphers like AES, PRESENT and SKINNY. The main idea behind these constructions was to reduce the length of the datapath to 1 bit and to reformulate the linear layer for these ciphers so that they require fewer scan flip-flops (which have built-in multiplexer functionality and so larger in area as compared to a simple flip-flop). In this paper, we develop their idea even further in few separate directions.First, we prove that given an arbitrary linear transformation, it is always possible to construct the linear layer using merely 2 scan flip-flops. This points to an optimistic venue to follow to gain further GE reductions, yet the straightforward application of the techniques in our proof to PRESENT and GIFT leads to inefficient implementations of the linear layer, as reducing ourselves to 2 scan flip-flops setting requires thousands of clock cycles and leads to very high latency.Equipped with the well-established formalism on permutation groups, we explore whether we can reduce the number of clock cycles to a practical level, i.e. few hundreds, by adding few more pairs of scan flip flops. For PRESENT, we show that 4 (resp. 8, 12) scan flip-flops are sufficient to complete the permutation layer in 384 (resp. 256, 128) clock cycles. For GIFT, we show that 4 (resp. 8, 10) scan flip flops correspond to 320 (resp. 192, 128) clock cycles. Finally, in order to provide the best of the two worlds (i.e. circuit area and latency), we push our scan flip-flop choices even further to completely eliminate the latency incurred by the permutation layer, without compromising our stringent GE budget. We show that not only 12 scan flip flops are sufficient to execute PRESENT permutation in 64 clock cycles, but also the same scan flip flops can be used readily in a combined encryption decryption circuit. Our final design of PRESENT and GIFT beat the record of Jean et al. and Banik et al. in both latency and in circuit-size metric. We believe that the techniques presented in our work can also be used at choosing bit-sliding-friendly linear layer permutations for the future SPN-based designs.
2020
TCHES
The Area-Latency Symbiosis: Towards Improved Serial Encryption Circuits 📺
Fatih Balli Andrea Caforio Subhadeep Banik
The bit-sliding paper of Jean et al. (CHES 2017) showed that the smallest-size circuit for SPN based block ciphers such as AES, SKINNY and PRESENT can be achieved via bit-serial implementations. Their technique decreases the bit size of the datapath and naturally leads to a significant loss in latency (as well as the maximum throughput). Their designs complete a single round of the encryption in 168 (resp. 68) clock cycles for 128 (resp. 64) bit blocks. A follow-up work by Banik et al. (FSE 2020) introduced the swap-and-rotate technique that both eliminates this loss in latency and achieves even smaller footprints.In this paper, we extend these results on bit-serial implementations all the way to four authenticated encryption schemes from NIST LWC. Our first focus is to decrease latency and improve throughput with the use of the swap-and-rotate technique. Our block cipher implementations have the most efficient round operations in the sense that a round function of an n-bit block cipher is computed in exactly n clock cycles. This leads to implementations that are similar in size to the state of the art, but have much lower latency (savings up to 20 percent). We then extend our technique to 4- and 8-bit implementations. Although these results are promising, block ciphers themselves are not end-user primitives, as they need to be used in conjunction with a mode of operation. Hence, in the second part of the paper, we use our serial block ciphers to bootstrap four active NIST authenticated encryption candidates: SUNDAE-GIFT, Romulus, SAEAES and SKINNY-AEAD. In the wake of this effort, we provide the smallest block-cipher-based authenticated encryption circuits known in the literature so far.
2020
TOSC
Cryptanalysis of LowMC instances using single plaintext/ciphertext pair
Arguably one of the main applications of the LowMC family ciphers is in the post-quantum signature scheme PICNIC. Although LowMC family ciphers have been studied from a cryptanalytic point of view before, none of these studies were directly concerned with the actual use case of this cipher in PICNIC signature scheme. Due to the design paradigm of PICNIC, an adversary trying to perform a forgery attack on the signature scheme instantiated with LowMC would have access to only a single given plaintext/ciphertext pair, i.e. an adversary would only be able to perform attacks with data complexity 1 in a known-plaintext attack scenario. This restriction makes it impossible to employ classical cryptanalysis methodologies such as differential and linear cryptanalysis. In this paper we introduce two key-recovery attacks, both in known-plaintext model and of data complexity 1 for two variants of LowMC, both instances of the LowMC cryptanalysis challenge.
2019
TOSC
Cryptanalysis of Plantlet 📺
Plantlet is a lightweight stream cipher designed by Mikhalev, Armknecht and Müller in IACR ToSC 2017. It has a Grain-like structure with two state registers of size 40 and 61 bits. In spite of this, the cipher does not seem to lose in security against generic Time-Memory-Data Tradeoff attacks due to the novelty of its design. The cipher uses a 80-bit secret key and a 90-bit IV. In this paper, we first present a key recovery attack on Plantlet that requires around 276.26 Plantlet encryptions. The attack leverages the fact that two internal states of Plantlet that differ in the 43rd LFSR location are guaranteed to produce keystream that are either equal or unequal in 45 locations with probability 1. Thus an attacker can with some probability guess that when 2 segments of keystream blocks possess the 45 bit difference just mentioned, they have been produced by two internal states that differ only in the 43rd LFSR location. Thereafter by solving a system of polynomial equations representing the keystream bits, the attacker can find the secret key if his guess was indeed correct, or reach some kind of contradiction if his guess was incorrect. In the latter event, he would repeat the procedure for other keystream blocks with the given difference. We show that the process when repeated a finite number of times, does indeed yield the value of the secret key. In the second part of the paper, we observe that the previous attack was limited to internal state differences that occurred at time instances that were congruent to 0 mod 80. We further observe that by generalizing the attack to include internal state differences that are congruent to all equivalence classed modulo 80, we lower the total number of keystream bits required to perform the attack and in the process reduce the attack complexity to 269.98 Plantlet encryptions.
2018
TOSC
Towards Low Energy Stream Ciphers 📺
Energy optimization is an important design aspect of lightweight cryptography. Since low energy ciphers drain less battery, they are invaluable components of devices that operate on a tight energy budget such as handheld devices or RFID tags. At Asiacrypt 2015, Banik et al. presented the block cipher family Midori which was designed to optimize the energy consumed per encryption and which reduces the energy consumption by more than 30% compared to previous block ciphers. However, if one has to encrypt/decrypt longer streams of data, i.e. for bulk data encryption/decryption, it is expected that a stream cipher should perform even better than block ciphers in terms of energy required to encrypt. In this paper, we address the question of designing low energy stream ciphers. To this end, we analyze for common stream cipher design components their impact on the energy consumption. Based on this, we give arguments why indeed stream ciphers allow for encrypting long data streams with less energy than block ciphers and validate our findings by implementations. Afterwards, we use the analysis results to identify energy minimizing design principles for stream ciphers.
2018
TOSC
SUNDAE: Small Universal Deterministic Authenticated Encryption for the Internet of Things 📺
Lightweight cryptography was developed in response to the increasing need to secure devices for the Internet of Things. After significant research effort, many new block ciphers have been designed targeting lightweight settings, optimizing efficiency metrics which conventional block ciphers did not. However, block ciphers must be used in modes of operation to achieve more advanced security goals such as data confidentiality and authenticity, a research area given relatively little attention in the lightweight setting. We introduce a new authenticated encryption (AE) mode of operation, SUNDAE, specially targeted for constrained environments. SUNDAE is smaller than other known lightweight modes in implementation area, such as CLOC, JAMBU, and COFB, however unlike these modes, SUNDAE is designed as a deterministic authenticated encryption (DAE) scheme, meaning it provides maximal security in settings where proper randomness is hard to generate, or secure storage must be minimized due to expense. Unlike other DAE schemes, such as GCM-SIV, SUNDAE can be implemented efficiently on both constrained devices, as well as the servers communicating with those devices. We prove SUNDAE secure relative to its underlying block cipher, and provide an extensive implementation study, with results in both software and hardware, demonstrating that SUNDAE offers improved compactness and power consumption in hardware compared to other lightweight AE modes, while simultaneously offering comparable performance to GCM-SIV on parallel high-end platforms.
2017
TOSC
Analysis of Software Countermeasures for Whitebox Encryption
Whitebox cryptography aims to ensure the security of cryptographic algorithms in the whitebox model where the adversary has full access to the execution environment. To attain security in this setting is a challenging problem: Indeed, all published whitebox implementations of standard symmetric-key algorithms such as AES to date have been practically broken. However, as far as we know, no whitebox implementation in real-world products has suffered from a key recovery attack. This is due to the fact that commercial products deploy additional software protection mechanisms on top of the whitebox implementation. This makes practical attacks much less feasible in real-world applications. There are numerous software protection mechanisms which protect against standard whitebox attacks. One such technique is control flow obfuscation which randomizes the order of table lookups for each execution of the whitebox encryption module. Another technique is randomizing the locations of the various Look up tables (LUTs) in the memory address space. In this paper we investigate the effectiveness of these countermeasures against two attack paradigms. The first known as Differential Computational Analysis (DCA) attack was developed by Bos, Hubain, Michiels and Teuwen in CHES 2016. The attack passively collects software execution traces for several plaintext encryptions and uses the collected data to perform an analysis similar to the well known differential power attacks (DPA) to recover the secret key. Since the software execution traces contain time demarcated physical addresses of memory locations being read/written into, they essentially leak the values of the inputs to the various LUTs accessed during the whitebox encryption operation, which as it turns out leaks sufficient information to perform the power attack. We found that if in addition to control flow obfuscation, one were to randomize the locations of the LUTs in the memory, then it is very difficult to perform the DCA on the resultant system using such table inputs and extract the secret key in reasonable time. As an alternative, we investigate the version of the DCA attack which uses the outputs of the tables instead of the inputs to mount the power analysis attack. This modified DCA is able to extract the secret key from the flow obfuscated and location randomized versions of several whitebox binaries available in crypto literature. We develop another attack called the Zero Difference Enumeration (ZDE) attack. The attack records software traces for several pairs of strategically selected plaintexts and performs a simple statistical test on the effective difference of the traces to extract the secret key. We show that ZDE is able to recover the keys of whitebox systems. Finally we propose a new countermeasure for protecting whitebox binaries based on insertion of random delays which aims to make both the ZDE and DCA attackspractically difficult by adding random noise in the information leaked to the attacker.
2017
TOSC
Some cryptanalytic results on Lizard
Lizard is a lightweight stream cipher proposed by Hamann, Krause and Meier in IACR ToSC 2017. It has a Grain-like structure with two state registers of size 90 and 31 bits. The cipher uses a 120-bit secret key and a 64-bit IV. The authors claim that Lizard provides 80-bit security against key recovery attacks and a 60-bit security against distinguishing attacks. In this paper, we present an assortment of results and observations on Lizard. First, we show that by doing 258 random trials it is possible to find a set of 264 triplets (K, IV0, IV1) such that the Key-IV pairs (K, IV0) and (K, IV1) produce identical keystream bits. Second, we show that by performing only around 228 random trials it is possible to obtain 264 Key-IV pairs (K0, IV0) and (K1, IV1) that produce identical keystream bits. Thereafter, we show that one can construct a distinguisher for Lizard based on IVs that produce shifted keystream sequences. The process takes around 251.5 random IV encryptions (with encryption required to produce 218 keystream bits) and around 276.6 bits of memory. Next, we propose a key recovery attack on a version of Lizard with the number of initialization rounds reduced to 223 (out of 256) based on IV collisions. We then outline a method to extend our attack to 226 rounds. Our results do not affect the security claims of the designers.
2017
CHES
GIFT: A Small Present
In this article, we revisit the design strategy of PRESENT, leveraging all the advances provided by the research community in construction and cryptanalysis since its publication, to push the design up to its limits. We obtain an improved version, named GIFT, that provides a much increased efficiency in all domains (smaller and faster), while correcting the well-known weakness of PRESENT with regards to linear hulls. GIFT is a very simple and clean design that outperforms even SIMON or SKINNY for round-based implementations, making it one of the most energy efficient ciphers as of today. It reaches a point where almost the entire implementation area is taken by the storage and the Sboxes, where any cheaper choice of Sbox would lead to a very weak proposal. In essence, GIFT is composed of only Sbox and bit-wiring, but its natural bitslice data flow ensures excellent performances in all scenarios, from area-optimised hardware implementations to very fast software implementation on high-end platforms.We conducted a thorough analysis of our design with regards to state-of-the-art cryptanalysis, and we provide strong bounds with regards to differential/linear attacks.
2016
FSE
2015
ASIACRYPT
2013
CHES
2012
CHES

Program Committees

FSE 2023
FSE 2022
FSE 2020
FSE 2019