It is well established that the method of choice for implementing a side-channel-secure modular inversion is Fermat's little theorem: $1/x = x^{p-2} \bmod p$. This can be calculated using any multiply-and-square method, safe in the knowledge that no branching on, or indexing with, potentially secret data (such as $x$) is required. However, in the case where the modulus $p$ is a Mersenne or pseudo-Mersenne prime of the form $p=2^n-c$, where $c$ is small, this process can be optimized to greatly reduce the number of multiplications required. Unfortunately, an optimal solution must, it appears, be tailored specifically to $n$ and $c$. What appears to be missing from the literature is a near-optimal heuristic method that works reasonably well in all cases.
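As a minimal sketch of the baseline technique described above (the choice of $p = 2^{255}-19$ is just an illustrative pseudo-Mersenne prime, not one singled out by the text), a plain left-to-right square-and-multiply inversion might look like this; note that the only branch is on bits of the public exponent $p-2$, never on secret data:

```python
# Sketch: side-channel-friendly inversion via Fermat's little theorem,
# illustrated for the pseudo-Mersenne prime p = 2^255 - 19 (c = 19).
# The exponent p-2 is public and fixed, so iterating over its bits
# involves no secret-dependent branching or indexing.

P = 2**255 - 19  # example pseudo-Mersenne prime

def fermat_inverse(x, p=P):
    """Compute 1/x mod p as x^(p-2) mod p by left-to-right
    square-and-multiply over the fixed, public exponent."""
    e = p - 2
    acc = 1
    for i in reversed(range(e.bit_length())):
        acc = (acc * acc) % p          # always square
        if (e >> i) & 1:               # branch on a PUBLIC exponent bit only
            acc = (acc * x) % p
    return acc

assert (fermat_inverse(12345) * 12345) % P == 1
```

The tailored solutions the text alludes to replace this generic bit-by-bit scan with an addition chain exploiting the special shape of $p-2$, trading a one-off derivation effort per $(n, c)$ for far fewer multiplications.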

Signal is a famous secure messaging protocol used by billions of people through its adoption in many secure text messaging applications, including Signal itself, WhatsApp, Facebook Messenger, Skype, and Google Allo. At its core it uses the concept of "double ratcheting," where every message is encrypted and authenticated using a fresh symmetric key; it has many attractive properties, such as forward security, post-compromise security, and "immediate (no-delay) decryption," which had never been achieved in combination by prior messaging protocols.
While the formal analysis of the Signal protocol, and ratcheting in general, has attracted a lot of recent attention, we argue that none of the existing analyses is fully satisfactory. To address this problem, we give a clean and general definition of secure messaging, which clearly indicates the types of security we expect, including forward security, post-compromise security, and immediate decryption. We are the first to explicitly formalize and model the immediate decryption property, which implies (among other things) that parties seamlessly recover if a given message is permanently lost---a property not achieved by any of the recent "provable alternatives to Signal." We build a modular "generalized Signal protocol" from the following components: (a) continuous key agreement (CKA), a clean primitive we introduce and which can be easily and generically built from public-key encryption (not just Diffie-Hellman as is done in the current Signal protocol) and roughly models "public-key ratchets;" (b) forward-secure authenticated encryption with associated data (FS-AEAD), which roughly captures "symmetric-key ratchets;" and (c) a two-input hash function that is a pseudorandom function (resp. generator with input) in its first (resp. second) input, which we term PRF-PRNG. As a result, in addition to instantiating our framework in a way resulting in the existing, widely-used Diffie-Hellman based Signal protocol, we can easily get post-quantum security and not rely on random oracles in the analysis.
We further show that our design can be elegantly extended to include other forms of "fine-grained state compromise" recently studied at CRYPTO'18, but without sacrificing the immediate decryption property. However, we argue that the additional security offered by these modifications is unlikely to justify the efficiency hit of using much heavier public-key cryptography in place of symmetric-key cryptography.
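As a rough illustration of the PRF-PRNG interface described above, the following sketch shows a two-input function that absorbs fresh input into a state and emits an output; the instantiation via HMAC-SHA256 and the domain-separation labels are our own assumptions for the example, not the paper's construction:

```python
# Hypothetical PRF-PRNG sketch: P(state, fresh) -> (new_state, output).
# Intended behavior: a PRF when keyed by its first input, and a PRNG that
# refreshes its state from its second input. HMAC-SHA256 is one plausible
# (assumed) instantiation; the labels b"state"/b"output" are arbitrary
# domain separators chosen for this example.
import hmac
import hashlib

def prf_prng(state: bytes, fresh: bytes):
    block = hmac.new(state, fresh, hashlib.sha256).digest()
    new_state = hmac.new(block, b"state", hashlib.sha256).digest()   # chains the ratchet
    output = hmac.new(block, b"output", hashlib.sha256).digest()     # per-epoch key material
    return new_state, output
```

In the generalized protocol, such a function would glue the CKA output (as the fresh input) to the evolving root state, producing the keys that seed each FS-AEAD epoch.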

Whether there exist Almost Perfect Non-linear (APN) permutations operating on an even number of bits is the so-called Big APN Problem. It was solved in the 6-bit case by Dillon et al. in 2009 but, since then, the general case has remained an open problem.
In 2016, Perrin et al. discovered the butterfly structure, which contains Dillon et al.'s permutation over $\mathbb{F}_{2^6}$. Later, Canteaut et al. generalised this structure and proved that no other butterflies with exponent $3$ can be APN. Recently, Yongqiang et al. further generalised the structure with Gold exponents and obtained more differentially 4-uniform permutations with optimal nonlinearity. However, the existence of more APN permutations in their generalisation was left as an open problem.
In this paper, we adapt the proof technique of Canteaut et al. to handle all Gold exponents and prove that a generalised butterfly with Gold exponents over $\mathbb{F}_{2^{2n}}$ can never be APN when $n>3$. More precisely, we prove that such a generalised butterfly being APN implies that the branch size is strictly smaller than 5. Hence, the only APN butterflies operate on 3-bit branches, i.e. on 6 bits in total.

In this paper we focus on Polynomial Learning with Errors (PLWE). This problem is parametrized by a polynomial, and we are interested in relating the hardness of the $\text{PLWE}^f$ and $\text{PLWE}^h$ problems for different polynomials $f$ and $h$. More precisely, our main result shows that for a fixed monic polynomial $f$, $\text{PLWE}^{f\circ g}$ is at least as hard as $\text{PLWE}^f$, in both search and decision variants, for any monic polynomial $g$. As a consequence, $\text{PLWE}^{\phi_n}$ is harder than $\text{PLWE}^{f}$ for a minimal polynomial $f$ of an algebraic integer from the cyclotomic field $\mathbb{Q}(\zeta_n)$ with specific properties. Moreover, we prove, in the decision variant, that in the case of power-of-2 polynomials, $\text{PLWE}^{\phi_n}$ is at least as hard as $\text{PLWE}^f$ for a minimal polynomial $f$ of algebraic integers from the $n$th cyclotomic field with weaker requirements than those of the previous result.

We propose two one-round authenticated group-key exchange protocols from newly employed cryptographic invariant maps (CIMs): one is secure in the quantum random oracle model and the other resists maximum exposure, where a non-trivial combination of secret keys is revealed. The security of the former (resp. latter) is proved under the n-way decisional Diffie-Hellman (resp. n-way gap Diffie-Hellman) assumption on the CIMs in the quantum random (resp. random) oracle model.
We instantiate the proposed protocols on hard homogeneous spaces, with the limitation that the user group has size two. In particular, the protocols instantiated using the CSIDH (commutative supersingular isogeny Diffie-Hellman) key exchange are currently more realistic than the general n-party CIM-based ones due to their implementability. Our two-party one-round protocols are secure against quantum adversaries.

The purpose of homomorphic encryption is to allow computations on encrypted data, without the need for any decryption other than that of the final result. This could provide an elegant solution to the problem of privacy preservation in data-based applications, such as those provided and/or facilitated by machine learning techniques, but several limitations and open issues hamper the fulfillment of this plan. In this work we assess the possibility for homomorphic encryption to fully implement its program without relying on other techniques, such as multiparty computation, which may be impossible in many actual use cases (for instance due to the high level of communication required).
We proceed in two steps: i) on the basis of the well-known structured program theorem (Böhm and Jacopini) we identify the minimal set of operations homomorphic encryption must be able to perform in order to implement any algorithm; and ii) we analyse the possibility to solve (and propose an implementation for) the most fundamentally relevant issue that emerges from our analysis, namely the implementation of conditionals (which in turn require comparison and selection/jump operations) in fully homomorphic encryption. We show that this issue has a serious impact and clashes with the fundamental requirements of homomorphic encryption. This could represent a drawback for its use as a complete solution in data analysis applications, in particular machine learning, and may thus require a deep re-thinking of the homomorphic encryption program for privacy preservation.
We note that our approach to comparisons is novel and, for the first time, completely embedded in homomorphic encryption, differently from what was proposed in previous studies (and beyond that, we supplement it with the necessary selection/jump operation). A number of studies have indeed dealt with comparisons, but have not managed to perform them in pure homomorphic encryption.
Typically their comparison protocols do not use homomorphic encryption for the comparison itself, but rely on other cryptographic techniques, such as secure multiparty computation, which a) require a high level of communication between parties (each single comparison in a machine learning training and prediction process must be performed by exchanging several messages), which may not be possible in various use cases, and b) require the data owner to decrypt intermediate results, extract the significant bits for the comparison, re-encrypt, and send the result back to the other party for the completion of the algorithm. Such ``decryption'' in the middle foils the purpose of homomorphic encryption.
Besides requiring only homomorphic encryption, and no intermediate decryption, our protocol is also provably secure (it inherits the security of the underlying homomorphic encryption scheme), differently from other techniques such as OPE/ORE and their variations, which have been shown to be insecure.
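As a minimal illustration of the selection operation discussed above, the standard arithmetic identity that emulates a conditional without branching can be sketched as follows; plain integers stand in for ciphertexts here, and under an HE scheme each addition and multiplication below would be a homomorphic operation:

```python
# Branchless selection, the arithmetic identity that replaces a
# conditional jump under homomorphic encryption:
#   select(b, x, y) = b*x + (1-b)*y,  with b an encrypted bit in {0, 1}.
# Crucially, b (e.g. the output of an encrypted comparison) is never
# decrypted and never drives control flow.

def he_select(b, x, y):
    """Data-independent multiplexer: returns x if b == 1, y if b == 0,
    using only additions and multiplications (no branching on b)."""
    return b * x + (1 - b) * y

assert he_select(1, 42, 7) == 42
assert he_select(0, 42, 7) == 7
```

This shows why conditionals are costly in HE: both branches must always be evaluated and then blended, since the encrypted bit cannot select a code path.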

The efficient verification of the security of masked hardware implementations is an important issue that hinders the development and deployment of randomness-efficient masking techniques. At EUROCRYPT 2018, Bloem et al. [6] introduced the first practical formal tool to prove the side-channel resilience of masked circuits in the probing model with glitches. Most recently, Barthe et al. [2] introduced a more efficient formal tool that builds upon the findings of Bloem et al. for modeling the effects of glitches. While Barthe et al.'s approach greatly improves the first-order verification performance, it shows that higher-order verification in the probing model with glitches is still enormously time-consuming for larger circuits like a second-order AES S-box, for instance. Furthermore, the results of Barthe et al. underline the discrepancy between state-of-the-art formal security notions that allow for faster verification of circuits, namely the strong non-interference (SNI) notion, and existing masked hardware implementations that are secure in the probing model with glitches. In this work, we extend and improve the formal approaches of Bloem et al. and Barthe et al. on multiple levels. We first introduce a so-called sharing independence notion which helps to reason about the independence of shared variables. We then show how to use this notion to test for the independence of input and output sharings of a module, which allows speeding up the formal verification of circuits that do not fulfill the SNI notion. With this extension, we are for the first time able to verify the security of a second-order masked DOM AES S-box, which takes about 3 seconds, and up to a fifth-order AES S-box, which requires about 47 days for verification. Furthermore, we discuss in which cases the independence of input and output sharings leads to composability.

Indistinguishability obfuscation has become one of the most exciting cryptographic primitives due to its far-reaching applications in cryptography and other fields. However, to date, obtaining a plausibly secure construction has been an elusive task, thus motivating the study of seemingly weaker primitives that imply it, with the possibility that they will be easier to construct.
In this work, we provide a systematic study of compressing obfuscation, one of the most natural and simple to describe primitives that is known to imply indistinguishability obfuscation when combined with other standard assumptions. A compressing obfuscator is roughly an indistinguishability obfuscator that outputs just a slightly compressed encoding of the truth table. This generalizes notions introduced by Lin et al.~(PKC 2016) and Bitansky et al.~(TCC 2016) by allowing for a broader regime of parameters.
We view compressing obfuscation as an independent cryptographic primitive and show various positive and negative results concerning its power and plausibility of existence, demonstrating significant differences from full-fledged indistinguishability obfuscation.
First, we show that as a cryptographic building block, compressing obfuscation is weak. In particular, when combined with one-way functions, it cannot be used (in a black-box way) to achieve public-key encryption, even under (sub-)exponential security assumptions. This is in sharp contrast to indistinguishability obfuscation, which together with one-way functions implies almost all cryptographic primitives.
Second, we show that to construct compressing obfuscation with perfect correctness, one only needs to assume its existence with a very weak correctness guarantee and polynomial hardness. Namely, we show a correctness amplification transformation with optimal parameters that relies only on polynomial hardness assumptions. This implies a universal construction assuming only polynomially secure compressing obfuscation with approximate correctness. In the context of indistinguishability obfuscation, we know how to achieve such a result only under sub-exponential security assumptions together with derandomization assumptions.
Lastly, we characterize the existence of compressing obfuscation with \emph{statistical} security. We show that in some range of parameters and for some classes of circuits such an obfuscator exists, whereas it is unlikely to exist with better parameters or for larger classes of circuits. These positive and negative results reveal a deep connection between compressing obfuscation and various concepts in complexity theory and learning theory.

The CCA-secure lattice-based post-quantum key encapsulation scheme Saber is a candidate in NIST's post-quantum cryptography standardization process. In this paper, we study the implementation aspects of Saber on resource-constrained microcontrollers from the ARM Cortex-M series, which are very popular for realizing IoT applications. We carefully optimize various parts of Saber for speed and memory. We exploit digital signal processing instructions and efficient memory access for a fast implementation of polynomial multiplication. We also use memory-efficient Karatsuba multiplication and a just-in-time strategy for generating the public matrix of the module lattice to reduce the memory footprint. We further show that our optimizations can be combined with each other seamlessly to provide various speed-memory trade-offs. Our speed-optimized software takes just 1,147K, 1,444K, and 1,543K clock cycles on a Cortex-M4 platform for key generation, encapsulation, and decapsulation respectively. Our memory-efficient software takes 4,786K, 6,328K, and 7,509K clock cycles on an ultra resource-constrained Cortex-M0 platform for key generation, encapsulation, and decapsulation respectively, while consuming at most 6.2 KB of memory. These results show that lattice-based key encapsulation schemes are perfectly practical for securing IoT devices against quantum computing attacks.

SM9 is a Chinese official cryptography standard which defines a set of identity-based cryptographic schemes from pairings. This report describes the technical specification of SM9. The security of the schemes is also analyzed.

It is well-known that non-comparison-based techniques
can allow us to sort $n$ elements in $o(n \log n)$ time
on a Random-Access Machine (RAM). On the other hand, it is a long-standing open
question whether
(non-comparison-based) circuits can sort
$n$ elements from the domain $[1..2^k]$
with $o(k n \log n)$ boolean gates.
We consider weakened forms of this question: first, we consider
a restricted class of sorting where the number of distinct keys
is much smaller than the input length; and second, we
explore Oblivious RAMs and probabilistic circuit families, i.e.,
computational models that are
somewhat more powerful than circuits but much weaker than RAM.
We show that Oblivious RAMs and probabilistic circuit families
can sort $o(\log n)$-bit keys in $o(n \log n)$ time or $o(k n \log n)$ circuit
complexity where $n$ is the input length.
Our algorithms work in the balls-and-bins model: not only can they
sort an array of numerical keys, but if each key additionally
carries an opaque ball, they can also move the balls into the
correct order.
We further show that
in such a balls-and-bins model, it is impossible
to sort $\Omega(\log n)$-bit keys
in $o(n \log n)$ time, and thus
the $o(\log n)$-bit-key assumption
is necessary for overcoming the $n \log n$ barrier.
Finally, we optimize the IO efficiency of our oblivious algorithms
for RAMs --- we show that even the $1$-bit special
case of our algorithm can solve open questions
regarding whether there exist oblivious
algorithms for tight compaction and selection in linear IO.
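To illustrate why small keys make the $n \log n$ barrier surmountable at all, the following is a standard counting-sort sketch in the balls-and-bins model on a plain RAM; it is only a baseline illustration of the parameter regime, not the oblivious construction of the paper:

```python
# Stable counting sort in the balls-and-bins model: each element is a
# (key, ball) pair and the opaque ball is moved along with its key.
# For k-bit keys this runs in O(n + 2^k) time on a RAM, i.e. o(n log n)
# whenever k = o(log n) -- the same small-key regime the result targets.

def counting_sort_balls(pairs, k):
    """Stably sort (key, ball) pairs with keys in [0, 2^k)."""
    buckets = [[] for _ in range(1 << k)]
    for key, ball in pairs:
        buckets[key].append((key, ball))   # stable: preserves input order
    return [pair for bucket in buckets for pair in bucket]

data = [(3, 'a'), (1, 'b'), (3, 'c'), (0, 'd')]
assert counting_sort_balls(data, 2) == [(0, 'd'), (1, 'b'), (3, 'a'), (3, 'c')]
```

The difficulty addressed by the paper is achieving this complexity obliviously, where the bucket accesses above would themselves leak the keys through the memory access pattern.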

Database-as-a-service (DBaaS) allows the client to store and manage structured data on the cloud remotely. Despite its merits, DBaaS also brings significant privacy issues. Existing encryption techniques (e.g., SQL-aware encryption) can mitigate privacy concerns, but they still leak information through access patterns, which are vulnerable to statistical inference attacks. Oblivious Random Access Machine (ORAM) can seal such leakages; however, recent studies have shown significant challenges in integrating ORAM into databases. That is, the direct usage of ORAM on databases is not only costly but also permits very limited query functionality. In this paper, we propose new oblivious data structures called Oblivious Matrix Structure (OMAT) and Oblivious Tree Structure (OTREE), which allow tree-based ORAM to be integrated into database systems more efficiently and with support for diverse query functionalities. OMAT provides special ORAM packaging strategies for table structures, which not only offer significantly better performance but also enable a broad range of query types that may not be efficient in existing frameworks. OTREE, on the other hand, allows oblivious conditional queries to be performed on tree-indexed databases more efficiently than existing techniques. We implemented our proposed techniques and evaluated their performance on a real cloud database with various metrics, comparing with state-of-the-art counterparts.

The GGH15 multilinear maps have served as the foundation for a number of cutting-edge cryptographic proposals. Unfortunately, many schemes built on GGH15 have been explicitly broken by so-called ``zeroizing attacks,'' which exploit leakage from honest zero-test queries. The precise settings in which zeroizing attacks are possible have remained unclear. Most notably, none of the current indistinguishability obfuscation (iO) candidates from GGH15 have any formal security guarantees against zeroizing attacks.
In this work, we demonstrate that all known zeroizing attacks on GGH15 implicitly construct algebraic relations between the results of zero-testing and the encoded plaintext elements. We then propose a ``GGH15 zeroizing model'' as a new general framework which greatly generalizes known attacks.
Our second contribution is to describe a new GGH15 variant, which we formally analyze in our GGH15 zeroizing model. We then construct a new iO candidate using our multilinear map, which we prove secure in the GGH15 zeroizing model. This implies resistance to all known zeroizing strategies. The proof relies on the Branching Program Un-Annihilatability (BPUA) Assumption of Garg et al. [TCC 16-B] (which is implied by PRFs in NC^1 secure against P/Poly) and the complexity-theoretic p-Bounded Speedup Hypothesis of Miles et al. [ePrint 14] (a strengthening of the Exponential Time Hypothesis).

All known multilinear map candidates have suffered from a class of attacks known as ``zeroizing'' attacks, which render them unusable for many applications. We provide a new construction of polynomial-degree multilinear maps and show that our scheme is provably immune to zeroizing attacks under a strengthening of the Branching Program Un-Annihilatability Assumption (Garg et al., TCC 2016-B).
Concretely, we build our scheme on top of the CLT13 multilinear maps (Coron et al., CRYPTO 2013). In order to justify the security of our new scheme, we devise a weak multilinear map model for CLT13 that captures zeroizing attacks and generalizations, reflecting all known classical polynomial-time attacks on CLT13. In our model, we show that our new multilinear map scheme achieves ideal security, meaning no known attacks apply to our scheme. Using our scheme, we give a new multiparty key agreement protocol that is several orders of magnitude more efficient than what was previously possible.
We also demonstrate the general applicability of our model by showing that several existing obfuscation and order-revealing encryption schemes, when instantiated with the CLT13 maps, are secure against known attacks. These are schemes that are actually being implemented for experimentation, but until our work had no rigorous justification for security.

Secure multi-party computation (MPC) allows mutually distrusting parties to compute securely over their private data.
The hardness of MPC, essentially, lies in performing secure multiplications over suitable algebras. Parties use diverse cryptographic resources, like computational hardness assumptions or physical resources, to securely compute these multiplications.
There are several cryptographic resources that help securely compute one multiplication over a large finite field, say $\mathbb{G}\mathbb{F}[2^n]$, with linear communication complexity; one example is the computational hardness assumption that noisy Reed-Solomon codewords are pseudorandom. However, it is not known whether we can securely compute, say, a linear number of AND-gates from such resources, i.e., a linear number of multiplications over the base field $\mathbb{G}\mathbb{F}[2]$. Before our work, we could only perform $o(n)$ secure AND-evaluations. This example highlights the general inefficiency of multiplying over the base field using one multiplication over the extension field. Our objective is to remove this hurdle and enable the secure computation of boolean circuits with constant communication overhead based on more diverse cryptographic resources.
Technically, we construct a perfectly secure protocol that realizes a linear number of multiplication gates over the base field using one multiplication gate over a degree-$n$ extension field. This construction relies on the toolkit provided by algebraic function fields.
Using this construction, we obtain the following results.
If we can perform one multiplication over $\mathbb{G}\mathbb{F}[2^n]$ with linear communication using a particular cryptographic resource, then we can also evaluate linear-size boolean circuits with linear communication using the same cryptographic resource. In particular, we provide the first construction that computes a linear number of oblivious transfers with linear communication complexity from computational hardness assumptions such as the pseudorandomness of noisy Reed-Solomon codewords, or arithmetic analogues of LPN-style assumptions. Next, we highlight the potential of our result for other applications to MPC by constructing the first correlation extractor that has $1/2$ resilience and produces a linear number of oblivious transfers.

This paper presents efficient formulas to compute Miller doubling and Miller addition utilizing degree-3 twists on curves with j-invariant 0 written in Hessian form. We give the formulas for both odd and even embedding degrees and for pairings on both $\mathbb{G}_1\times\mathbb{G}_2$ and $\mathbb{G}_2\times\mathbb{G}_1$. We propose the use of embedding degrees 15 and 21 for 128-bit and 192-bit security respectively in light of the NFS attacks and their variants. We give a comprehensive comparison with other curve models; our formulas give the fastest known pairing computation for embedding degrees 15, 21, and 24.

As genome sequencing technology develops rapidly, there has lately been an increasing need to keep genomic data secure even when stored in the cloud and still used for research. In this paper, we are interested in designing a protocol for the secure outsourced matching problem on encrypted data. We propose an efficient method to securely search for a matching position with the query data and extract some information at that position. After decryption, we only perform a small amount of comparison with the query information in plaintext. We apply this method to find a set of biomarkers in encrypted genomes.
The important feature of our method is that it encodes a genomic database as a single element of a polynomial ring. It also requires only a single homomorphic multiplication for the query computation. Thus, this method has an advantage over previous methods in parameter size, computational complexity, and communication cost.
We evaluate the performance of our method and verify that computation on large-scale personal data can be securely and practically outsourced to a cloud environment during data analysis. It takes about 3.9 seconds to search-and-extract the reference and alternate sequences of the queried position in a database of size 4M.

Remote hacks are the most common threat on the Internet. We therefore initiate the study of incorporating very simple remotely unhackable hardware modules, such as air-gap switches and data diodes, into the field of multi-party computation. As a result, we are able to construct MPC protocols with very strong and composable security guarantees against remote hacks. Our use of remotely unhackable hardware modules is motivated by the fact that hardware modules with very limited functionality can be implemented securely as fixed-function circuits and verified for correctness.
Such hardware modules can therefore not be hacked remotely.
Using only very few and very simple remotely unhackable hardware modules, we construct protocols where mounting remote attacks does not enable an adversary to learn or modify a party's inputs and outputs unless he hacks a party via the input port before it has received its (first) input (or gains control over all parties).
Hence, our protocols protect against all remote attacks, except for hacks via the input port while a party is waiting for input. To achieve this level of security, the parties' inputs and outputs are authenticated, masked and shared in our protocols in such a way that an adversary is unable to learn or modify them when gaining control over a party via a remote hack. For simplicity we assume erasing parties in our constructions. This is, however, not necessary and we show that this assumption can be dropped.
The remotely unhackable hardware modules applied in this work are based on substantially weaker assumptions than the hardware tokens proposed by Katz at EUROCRYPT '07. In particular, they are not assumed to be physically tamper-proof, can thus not be passed to other (possibly malicious) parties, and are therefore not sufficient to circumvent the impossibility results in the Universal Composability (UC) framework. Therefore, our protocols still rely on additional, well-established setup assumptions.
Since the advantages provided by unhackable hardware modules, e.g. isolation properties, cannot be adequately captured in existing composable security frameworks, we have conceived a new security framework based on the UC framework. We call our framework Fortified UC.

The notion of Registration-Based Encryption (RBE) was recently introduced by Garg, Hajiabadi, Mahmoody, and Rahimi [TCC'18] with the goal of removing the private-key generator (PKG) from IBE. Specifically, RBE allows encrypting to identities using a (compact) master public key, like how IBE is used, with the benefit that the PKG is substituted with a weaker entity called "key curator" who has no knowledge of any secret keys. Here individuals generate their secret keys on their own and then publicly register their identities and their corresponding public keys to the key curator. Finally, individuals obtain "rare" decryption-key updates from the key curator as the population grows. In their work, they gave a construction of RBE schemes based on the combination of indistinguishability obfuscation and somewhere statistically binding hash functions. However, they left open the problem of constructing RBE schemes based on standard assumptions.
In this work, we resolve the above problem and construct RBE schemes based on standard assumptions (e.g., CDH or LWE). Furthermore, we show a new application of RBE in a novel context. In particular, we show that anonymous variants of RBE (which we also construct under standard assumptions) can be used for realizing abstract forms of anonymous messaging tasks in simple scenarios in which the parties communicate by writing messages on a shared board in a synchronized way.

In this paper, we propose a new general construction to reduce the public key size of McEliece-based schemes based on Goppa codes. In particular, we generalize the ideas of automorphism-induced Goppa codes by considering nontrivial subsets of automorphism groups to construct Goppa codes with a nice block structure. By considering additive and multiplicative automorphism subgroups, we provide explicit constructions to demonstrate our technique. In addition, we show that our technique can be applied to automorphism-induced Goppa codes to further reduce their key sizes.