Cube attacks are an important type of key recovery attacks against NFSR-based cryptosystems. The key step in cube attacks closely related to key recovery is recovering superpolies. However, in the previous well-known cube attacks including original, division property based, and correlation cube attacks, the algebraic normal forms of superpolies could hardly be proved to be exact, which involves a small failure probability or unpractical computations. In this paper, we propose a new variant of cube attacks called deterministic cube attacks, which aims at recovering the exact algebraic normal forms of superpolies efficiently and practically. These new attacks are developed based on degree evaluation method proposed by Liu in CRYPTO2017. We apply our new cube attacks to the round-reduced Trivium. As a result, we recover the exact algebraic normal forms of some superpolies for the 818-, 819-, 837-, and 838-round Trivium. By the way, it is proved that the best cube of size 37 given by Liu in CRYPTO2017 is not a zero-sum distinguisher but a zero-biased distinguisher by recovering its exact superpoly for the first time. To the best of our knowledge, it is the first time that superpolies of cubes with the sizes less than 40 could be practically recovered for Trivium up to 838 rounds. Hopefully, our new attacks would provide some new insights on cube attacks against NFSR-based ciphers.

Parallel cryptographic implementations are generally considered to be more advantageous than their non-parallel counterparts in mitigating side-channel attacks because of their higher noise-level. So far as we know, the side-channel security of GPU-based cryptographic implementations have been studied in recent years, and those implementations then turn out to be susceptible to some side-channel attacks. Unfortunately, the target parallel implementations in their work do not achieve strict parallelism because of the occurrence of cached memory accesses or the use of conditional branches, so how strict parallelism affects the side-channel security of cryptographic implementations is still an open problem. In this work, we make a case study of the side-channel security of a GPU-based bitsliced AES implementation in terms of bit-level parallelism and thread-level parallelism in order to show the way that works to reduce the side-channel security of strict parallel implementations. We present GPU-based bitsliced AES implementation as the study case because (1) it achieves strict parallelism so as to be resistant to cache-based attacks and timing attacks; and (2) it achieves both bit-level parallelism and thread-level parallelism (a.k.a. task-level parallelism), which enables us to research from multiple perspectives. More specifically, we first set up our testbed and collect electro-magnetic (EM) traces with some special techniques. Then, the measured traces are analyzed in two granularity. In bit-level parallelism, we give a non-profiled leakage detection test before mounting attacks with our proposed bit-level fusion techniques like multi-bits feature-level fusion attacks (MBFFA) and multi-bits decision-level fusion attacks (MBDFA). In thread-level parallelism, a profiled leakage detection test is employed to extract some special information from multi-threads leakages, and with the help of those information our proposed multi-threads hybrid fusion attack (MTHFA) method takes effect. Last, we propose a simple metric to quantify the side-channel security of parallel cryptographic implementations. Our research shows that the secret key of our target implementation can be recovered with less cost than expected, which suggests that the side-channel security of parallel cryptographic implementations should be reevaluated before application.

Most classical consensus protocols rely on a leader to coordinate nodes’ voting efforts. One
novel idea that stems from blockchain-style consensus is to rely, instead, on a “longest-chain”
idea for such coordination. Such a longest-chain idea was initially considered in randomized
protocols, where in each round, a node has some probability of being elected a leader who can
propose the next block. Recently, well-known systems have started implementing the deterministic counterpart of such longest-chain protocols — the deterministic counterpart is especially
attractive since it is even simpler to implement than their randomized cousins. A notable
instantiation is the Aura protocol which is shipped with Parity’s open-source Ethereum implementation.
Interestingly, mathematical analyses of deterministic, longest-chain protocols are lacking
even though there exist several analyses of randomized versions. In this paper, we provide the
first formal analysis of deterministic, longest-chain-style consensus. We show that a variant of
the Aura protocol can defend against a Byzantine adversary that controls fewer than 1/3 fraction
of the nodes, and this resilience parameter is tight by some technical interpretation. Based
on insights gained through our mathematical treatment, we point out that Aura’s concrete
instantiation actually fails to achieve the resiliene level they claim.
Finally, while our tight proof for the longest-chain protocol is rather involved and non-trivial;
we show that a variant of the “longest-chain” idea which we call “largest-set” enables a textbook
construction that admits a simple proof (albeit with slower confirmation).

We provide the first constructions of two round information-theoretic (IT) secure multiparty computation (MPC) protocols in the plain model that tolerate any $t<n/2$ malicious corruptions. Our protocols satisfy the strongest achievable standard notions of security in two rounds in different communication models.
Previously, IT-MPC protocols in the plain model either required a larger number of rounds, or a smaller minority of corruptions.

We develop new constructions of lattice-based PRFs using keyed pseudorandom synthesizers. We generalize all of the known `basic' parallel lattice-based PRFs--those of [BPR12], [BLMR13], and [BP14]--to build highly parallel lattice-based PRFs with smaller modulus (and thus better reductions from worst-case lattice problems) while still maintaining computational efficiency asymptotically equal to the fastest known lattice-based PRFs at only the cost of larger key sizes.
In particular, we build several parallel (in $NC^{2}$) lattice-based PRFs with modulus independent of the number of PRF input bits based on both standard LWE and ring LWE. Our modulus for these PRFs is just $O \left(m^{ f \left(m \right)} \right)$ for lattice dimension $m$ and any function $f \left(m \right) \in \omega \left(1 \right)$. The only known parallel construction of a lattice-based PRF with such a small modulus is a construction from Banerjee's thesis, and some of our parallel PRFs with equivalently small modulus have smaller key sizes and are very slightly faster (when using FFT multiplication). These PRFs also asymptotically match the computational efficiency of the most efficient PRFs built from any LWE- or ring LWE-based assumptions known (potentially excluding some concurrent work), respectively, and concretely require less computation per output than any known parallel lattice-based PRFs (again when using FFT multiplication).
We additionally use our techniques to build other efficient PRFs with very low circuit complexity (but higher modulus) which improve known results on highly parallel lattice PRFs. For instance, for input length $\lambda$, we show that there exists a ring LWE-based PRF in $NC^{1}$ with modulus proportional to $m^{\lambda^{c}}$ for any $c \in \left(0, 1 \right)$. Constructions from lattices with this circuit depth were only previously known from larger moduli.

Coin toss has been extensively studied in the cryptography literature, and the well-accepted notion of fairness (henceforth called strong fairness) requires that a corrupt coalition cannot cause non-negligible bias. It is well-understood that two-party coin toss is impossible if one of the parties can prematurely abort; further, this impossibility generalizes to multiple parties with a corrupt majority (even if the adversary is computationally bounded and fail-stop only).
Interestingly, the original proposal of (two-party) coin toss protocols by Blum in fact considered a weaker notion of fairness: imagine that the (randomized) transcript of the coin toss protocol defines a winner among the two parties. Now Blum's notion requires that a corrupt party cannot bias the outcome in its favor (but self-sacrificing bias is allowed). Blum showed that this weak notion is indeed attainable for two parties assuming the existence of one-way functions.
In this paper, we ask a very natural question which, surprisingly, has been overlooked by the cryptography literature: can we achieve Blum's weak fairness notion in multi-party coin toss? What is particularly interesting is whether this relaxation allows us to circumvent the corrupt majority impossibility that pertains to strong fairness. Even more surprisingly, in answering this question, we realize that it is not even understood how to define weak fairness for multi-party coin toss. We propose several natural notions drawing inspirations from game theory, all of which equate to Blum's notion for the special case of two parties. We show, however, that for multiple parties, these notions vary in strength and lead to different feasibility and infeasibility results.

The rapid distribution of lightweight devices raised the demand for efficient encryption and authenticated encryption schemes for small messages. For this purpose, Andreeva et al. recently proposed forkciphers, which fork the middle state within a cipher and encrypt it twice further under two smaller independent permutations. So, forkciphers can produce two output blocks which can allow to authenticate and encrypt small messages more efficiently.
As instance of particular interest, Andreeva et al. proposed ForkAES, a tweakable forkcipher based on the AES-128 round function, which forks the state after five out of ten rounds. While their authenticated encrypted schemes were accompanied by proofs, the security discussion for ForkAES could not be covered in their work, and founded on existing results on the AES and KIASU-BC; so, the study of advanced differential attacks remained to be filled by the community.
This work tries to foster the understanding of the security of ForkAES. It outlines a rectangle and an impossible-differential attack on nine rounds in the single-key related-tweak model; moreover, it describes a rectangle attack on ten rounds for a fraction of approximately $2^{32}$ keys. We emphasize that our results do not break ForkAES in the single-key setting, but shed more light on its security margin.

It is well known that Canright’s tower field construction leads to a very small, unprotected AES S-box circuit by recursively embedding Galois Field operations into smaller fields. The current size record for the AES S-box by Boyar, Matthews and Peralta improves the original design with optimal subcomponents, while maintaining the overall tower-field structure. Similarly, all small state-of-the-art first-order SCA-secure AES S-box constructions are based on a tower field structure.
We demonstrate that a smaller first-order secure AES S-box is achievable by representing the field inversion as a multiplication chain of length 4. Based on this representation, we showcase a very compact S-box circuit with only one GF($2^8$)-multiplier instance. Thereby, we introduce a new high-level representation of the AES S-box and set a new record for the smallest first-order secure implementation

In this work, we propose a faster homomorphic linear transform algorithm for structured matrices such as the discrete Fourier transform (DFT) and linear transformations in bootstrapping.
First, we proposed new method to evaluate the DFT homomorphically for a given packed ciphertext from the Cooley-Tukey fast Fourier transform algorithm. While the previous method requires $O(\sqrt n)$ rotations and $O(n)$ constant vector multiplications, our method only needs $O(\log n)$ rotations/multiplications by consuming $O(\log n)$ depth for the length of input vector $n$.
Second, we apply the same method to the linear transform of bootstrapping for $\textsf{HEAAN}$. To achieve this, we construct a recursive relation of matrices in those linear transformations. Accordingly, we can highly accelerate the linear transformation part of bootstrapping: the number of homomorphic operations becomes logarithmic to the number of slots, as in homomorphic DFT.
We also implement both algorithms. Our homomorphic DFT with length $2^{14}$ only takes about 8 seconds which is about 150 times faster result than previous one. The bootstrapping for $\textsf{HEAAN}$ with our linear transform algorithm takes about 2 minutes for $\mathbb{C}^{32768}$ plaintext space with 8 bit precision, which takes 26 hours using the previous method.

This paper investigates the construction of lightweight MDS matrices with generalized Feistel structures (GFS). The approach developed by this paper consists in deriving MDS matrices from the product of several sparser ones. This can be seen as a generalization to several matrices of the recursive construction which derives MDS matrices as the powers of a single companion matrix. The first part of this paper gives some theoretical results on the iteration of GFS and the second part gives concrete instantiations. The results match the best known lightweight $4\times 4$ MDS matrix and improve the best known $6\times 6$ and $8\times 8$ MDS matrices.
Based on GFS structure, we propose some types of sparse matrices that are called EGFS matrices. Then, by applying binary linear functions to several round of EGFS matrices, we propose lightweight $4\times 4$, $6\times 6$ and $8\times 8$ MDS matrices which are implemented with 67, 158 and 272 XOR for 8-bit input, respectively. The major work of this paper is the design of an $8\times 8$ MDS matrix with 272 XOR for 8-bit input, since the best known result is 392 XOR.

The FHE (fully homomorphic encryption) schemes [7, 13] based on the modified AGCD problem (noise-free AGCD problem) are vulnerable to quantum attacks, because its security relies partly on the hardness of factoring, and some FHE schemes based on the decisional AGCD without the noise-free assumption, for example [1], has a drawback that its ciphertexts are very large.
In this paper, we construct a new batch FHE scheme based on the decisional AGCD problem to overcome these weaknesses and prove its security.

Non-malleable codes were introduced by Dziembowski, Pietrzak, and Wichs (JACM 2018) as a generalization of standard error correcting codes to handle severe forms of tampering on codewords. This notion has attracted a lot of recent research, resulting in various explicit constructions, which have found applications in tamper-resilient cryptography and connections to other pseudorandom objects in theoretical computer science.
We continue the line of investigation on explicit constructions of non-malleable codes in the information theoretic setting, and give explicit constructions for several new classes of tampering functions. These classes strictly generalize several previously studied classes of tampering functions, and in particular extend the well studied split-state model which is a "compartmentalized" model in the sense that the codeword is partitioned a prior into disjoint intervals for tampering. Specifically, we give explicit non-malleable codes for the following classes of tampering functions.
(1) Interleaved split-state tampering: Here the codeword is partitioned in an unknown way by an adversary, and then tampered with by a split-state tampering function.
(2) Linear function composed with split-state tampering: In this model, the codeword is first tampered with by a split-state adversary, and then the whole tampered codeword is further tampered with by a linear function. In fact our results are stronger, and we can handle linear function composed with interleaved split-state tampering.
(3) Bounded communication split-state tampering: In this model, the two split-state tampering adversaries are allowed to participate in a communication protocol with a bounded communication budget.
Our results are the first explicit constructions of non-malleable codes in any of these tampering models. We derive all these results from explicit constructions of seedless non-malleable extractors, which we believe are of independent interest.
Using our techniques, we also give an improved seedless extractor for an unknown interleaving of two independent sources.

We initiate the study of partial key exposure in ring-LWE-based cryptosystems.
Specifically, we
- Introduce the search and decision Leaky-RLWE assumptions (Leaky-SRLWE, Leaky-DRLWE), to formalize the hardness of search/decision RLWE under leakage of some fraction of coordinates of the NTT transform of the RLWE secret and/or error.
- Present and implement an efficient key exposure attack that, given certain $1/4$-fraction of the coordinates of the NTT transform of the RLWE secret, along with RLWE instances,
recovers the full RLWE secret for standard parameter settings.
- Present a search-to-decision reduction for Leaky-RLWE for certain types of key exposure.
- Analyze the security of NewHope key exchange under partial key exposure of $1/8$-fraction of the secrets and error.
We show that, assuming that Leaky-DRLWE is hard for these parameters, the shared key $v$ (which is then hashed using a random oracle) is computationally indistinguishable from a random variable with average min-entropy $238$, conditioned on transcript and leakage, whereas without leakage the min-entropy is $256$.

At Crypto 2016, Kaplan et al. proposed the first quantum exponential acceleration of a classical
symmetric cryptanalysis technique: they showed that, in the superposition query model, Simon's algorithm could be applied to accelerate the slide attack on the alternate-key cipher. This allows to recover an n-bit key with O(n) quantum time and queries.
In this paper we propose many other types of quantum slide attacks. First, we are able to quantize classical advanced slide attacks on Feistel networks. With modular additions inside branch or key-addition operations, these attacks reach up to two round self-similarity. With only XOR operations, they reach up to four rounds self-similarity, with a cost at most quadratic in the block size.
Moreover, some of these variants combined with whitening keys (FX construction) can be successfully attacked. We show how these results relate to general quantization principles of classical techniques including sliding with a twist, complementation slide and mirror slidex.
Furthermore, we show that some quantum slide attacks can be composed with other quantum attacks to perform efficient key-recoveries even when the round founction is a strong function classically.
Finally, we analyze the case of quantum slide attacks exploiting cycle-finding, that were thought to enjoy an exponential speed up in a paper by Bar-On et al. in 2015, where these attacks were introduced. We show that the speed-up is smaller than expected and less impressive than the above variants, but nevertheless provide improved complexities on the previous known quantum attacks in the superposition model for some self-similar SPN and Feistel constructions.

Since the celebrated work of Impagliazzo and Rudich (STOC 1989), a number of black-box impossibility results have been established. However, these works only ruled out classical black-box reductions among cryptographic primitives. Therefore it may be possible to overcome these impossibility results by using quantum reductions. To exclude such a possibility, we have to extend these impossibility results to the quantum setting.
In this paper, we initiate the study of black-box impossibility in the quantum setting. We first formalize a quantum counterpart of fully-black-box reduction following the formalization by Reingold, Trevisan and Vadhan (TCC 2004). Then we prove that there is no quantum fully-black-box reduction from collision-resistant hash function to one-way permutation (or even trapdoor permutation). This is an extension to the quantum setting of the work of Simon (Eurocrypt 1998) who showed a similar result in the classical setting.

Homomorphic secret sharing (HSS) allows $n$ clients to secret-share data to $m$ servers, who can then homomorphically evaluate public functions over the shares. A natural application is outsourced computation over private data. In this work, we present the first plain-model homomorphic secret sharing scheme that supports the evaluation of polynomials with degree higher than 2. Our construction relies on any degree-$k$ (multi-key) homomorphic encryption scheme and can evaluate degree-$\left( (k+1)m -1 \right)$ polynomials, for any polynomial number of inputs $n$ and any sub-logarithmic (in the security parameter) number of servers $m$. At the heart of our work is a series of combinatorial arguments on how a polynomial can be split into several low-degree polynomials over the shares of the inputs, which we believe is of independent interest.

Similar to digital circuits, analog and mixed-signal (AMS) circuits are also susceptible to supply-chain attacks such as piracy, overproduction, and Trojan insertion. However, unlike digital circuits,
supply-chain security of AMS circuits is less explored. In this work,
we propose to perform “logic locking” on digital section of the AMS
circuits. The idea is to make the analog design intentionally suffer
from the effects of process variations, which impede the operation of the circuit. Only on applying the correct key, the effect of process
variations are mitigated, and the analog circuit performs as desired.
We provide the theoretical guarantees of the security of the circuit,
and along with simulation results for the band-pass filter, low-noise
amplifier, and low-dropout regulator, we also show experimental
results of our technique on a band-pass filter.

A large number of studies on passwords make use of passwords leaked by attackers
who compromised online services. Frequently, these leaks contain only
the passwords themselves, or basic information such as usernames or email addresses.
While metadata-rich leaks exist, they are often limited in the variety of demographics they cover.
In this work, we analyze a meta-data rich data leak from a Middle Eastern
bank with a demographically-diverse user base. We provide an analysis of passwords
created by groups of people of different cultural backgrounds, some of
which are under-represented in existing data leaks, e.g., Arab, Filipino, Indian,
and Pakistani.
The contributions provided by this work are many-fold. First, our results
contribute to the existing body of knowledge regarding how users include personal
information in their passwords. Second, we illustrate the differences that
exist in how users from different cultural/linguistic backgrounds create passwords.
Finally, we study the (empirical and theoretical) guessability of the
dataset based on two attacker models, and show that a state of the art password
strength estimator inflates the strength of passwords created by users from
non-English speaking backgrounds. We improve its estimations by training it
with contextually relevant information.

Unspent Transaction Outputs (UTXOs) are the internal mechanism used in many cryp-
tocurrencies to represent coins. Such representation has some clear benefits, but also entails some complexities that, if not properly handled, may leave the system in an inefficient state. Specifically, inefficiencies arise when wallets (the software responsible for transferring coins between parties) do not manage UTXOs properly when performing payments. In this paper, we study three cryptocurrencies: Bitcoin, Bitcoin Cash and Litecoin, by analyzing the actual state of their UTXO sets, that is, the status of their sets of spendable coins. These three cryptocurrencies are the top-3 UTXO based cryptocurrencies by market capitalization. Our analysis shows that the usage of each cryptocurrency is quite different, and let to different results. Furthermore, it also points out that the management of the transactions has not been always performed efficiently and then the actual state of the UTXO set for each cryptocurrency is far from ideal.

The threat of side-channels is becoming increasingly prominent for resource-constrained internet-connected devices. While numerous power side-channel countermeasures have been proposed, a promising approach to protect the non-invasive electromagnetic side-channel attacks has been relatively scarce. Today's availability of high-resolution electromagnetic (EM) probes mandates the need for a low-overhead solution to protect EM side-channel analysis (SCA) attacks. This work, for the first time, performs a white-box analysis to root-cause the origin of the EM leakage from an integrated circuit. System-level EM simulations with Intel 32 nm CMOS technology interconnect stack, as an example, reveals that the EM leakage from metals above layer 8 can be detected by an external non-invasive attacker with the commercially available state-of-the-art EM probes. Equipped with this `white-box' understanding, this work proposes \textit{STELLAR}: Signature aTtenuation Embedded CRYPTO with Low-Level metAl Routing, which is a two-stage solution to eliminate the critical signal radiation from the higher-level metal layers. Firstly, we propose routing of the entire cryptographic cores power traces using the local lower-level metal layers, whose leakage cannot be picked up by an external attacker. Then, the entire crypto IP is embedded within a Signature Attenuation Hardware (SAH) which in turn suppresses the critical encryption signature before it routes the current signature to the highly radiating top-level metal layers. System-level implementation of the STELLAR hardware with local lower-level metal routing in TSMC 65 nm CMOS technology, with an AES-128 encryption engine (as an example cryptographic block) operating at 40 MHz, shows that the system remains secure against EM SCA attack even after $1 M$ encryptions, with $67\%$ energy efficiency and $1.23\times$ area overhead compared to the unprotected AES.