We show that it is possible to achieve perfect forward secrecy in two-message or one-round key exchange (KE) protocols that satisfy even stronger security properties than those provided by the extended Canetti-Krawczyk (eCK) security model. In particular, we consider perfect forward secrecy in the presence of adversaries that can reveal ephemeral secret keys and the long-term secret keys of the actor of a session (similar to key compromise impersonation).
We propose two new game-based security models for KE protocols. First, we formalize a slightly stronger variant of the eCK security model that we call eCKw. Second, we integrate perfect forward secrecy into eCKw, which gives rise to the even stronger eCK-PFS model. We propose a security-strengthening transformation (i.e., a compiler) between our new models. Given a two-message Diffie-Hellman type protocol secure in eCKw, our transformation yields a two-message protocol that is secure in eCK-PFS. As an example, we show how our transformation can be applied to the NAXOS protocol.
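As a toy illustration of the kind of two-message Diffie-Hellman protocol involved, here is a Python sketch of a NAXOS-style exchange, in which each ephemeral exponent is derived by hashing the ephemeral and long-term secret keys together. The group parameters and hash choices below are illustrative only and far too weak for real use:

```python
import hashlib, secrets

# Toy NAXOS-style one-round key exchange (illustration only: the group
# parameters below are far too small for real security).
p = 0xFFFFFFFFFFFFFFC5  # the prime 2^64 - 59; a toy group, NOT secure
g = 5

def H1(*vals):
    """Hash ephemeral and long-term secrets to an exponent (the NAXOS trick)."""
    h = hashlib.sha256(b"|".join(str(v).encode() for v in vals)).digest()
    return int.from_bytes(h, "big") % (p - 1)

def H2(*vals):
    """Session-key derivation hash."""
    return hashlib.sha256(b"|".join(str(v).encode() for v in vals)).hexdigest()

# Long-term key pairs
a = secrets.randbelow(p - 1); A = pow(g, a, p)   # Alice
b = secrets.randbelow(p - 1); B = pow(g, b, p)   # Bob

# One round: each side sends a single ephemeral public value
esk_a = secrets.randbelow(p - 1); x = H1(esk_a, a); X = pow(g, x, p)
esk_b = secrets.randbelow(p - 1); y = H1(esk_b, b); Y = pow(g, y, p)

# Both sides derive the same session key from three shared DH values
K_alice = H2(pow(Y, a, p), pow(B, x, p), pow(Y, x, p), "Alice", "Bob")
K_bob   = H2(pow(A, y, p), pow(X, b, p), pow(X, y, p), "Alice", "Bob")
assert K_alice == K_bob
```

Because the exponent mixes the ephemeral and long-term secrets, revealing either one alone does not expose the DH exponent, which is the property the eCK-style models exercise.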

Private set intersection (PSI) is a cryptographic technique that is applicable to many privacy-sensitive scenarios. For decades, researchers have been focusing on improving its efficiency in both communication and computation. However, most of the existing solutions are inefficient when the two input sets differ greatly in size, which is common in conventional client-server settings.
In this paper, we analyze and optimize the efficiency of existing PSI protocols to support precomputation so that they can efficiently deal with such input sets. We transform four existing PSI protocols into the precomputation form such that in the setup phase the communication is linear only in the size of the larger input set, while in the online phase the communication is linear in the size of the smaller input set. We implement all four protocols and run experiments between two PCs and between a PC and a smartphone and give a systematic comparison of their performance. Our experiments show that a protocol based on securely evaluating a garbled AES circuit achieves the fastest setup time by several orders of magnitude, and the fastest online time in the PC setting where AES-NI acceleration is available. In the mobile setting, the fastest online time is achieved by a protocol based on the Diffie-Hellman assumption.
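The Diffie-Hellman-based variant lends itself to a compact sketch of the precomputation idea: the setup message depends only on the larger (server) set, while the online messages scale with the smaller (client) set. The following Python toy uses an illustrative group and identifiers, not the paper's implementation:

```python
import hashlib, secrets

# Toy DH-based PSI with precomputation (a real implementation would use an
# elliptic-curve group and a proper hash-to-group map).
p = 2**127 - 1  # a Mersenne prime; toy group, not a secure choice

def hash_to_group(x):
    # Simplified stand-in for hashing an element into the group.
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % p

server_set = ["alice@a.com", "bob@b.com", "carol@c.com", "dave@d.com"]
client_set = ["bob@b.com", "erin@e.com", "carol@c.com"]

s = secrets.randbelow(p - 2) + 1  # server's secret exponent
c = secrets.randbelow(p - 2) + 1  # client's secret exponent

# Setup phase (precomputable; linear in the LARGER set):
# server publishes H(x)^s for each of its elements.
setup_msg = [pow(hash_to_group(x), s, p) for x in server_set]

# Online phase (linear in the SMALLER set):
# client -> server: H(y)^c ; server -> client: (H(y)^c)^s = H(y)^{cs}
online_req = [pow(hash_to_group(y), c, p) for y in client_set]
online_rsp = [pow(m, s, p) for m in online_req]

# Client raises each setup element to c and intersects with H(y)^{cs}.
masked_server = {pow(m, c, p) for m in setup_msg}
intersection = [y for y, m in zip(client_set, online_rsp) if m in masked_server]
assert sorted(intersection) == ["bob@b.com", "carol@c.com"]
```

Since exponentiation commutes, $(H(x)^s)^c = (H(x)^c)^s$, matching elements collide in the masked domain while everything else stays hidden behind the secret exponents.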

This paper presents a post-quantum secure, efficient, and tunable FPGA implementation of the key-generation algorithm for the Niederreiter cryptosystem using binary Goppa codes. Our key-generator implementation requires as few as 896,052 cycles to produce both the public and private portions of a key, and can achieve an estimated frequency Fmax of over 240 MHz when synthesized for Stratix V FPGAs. To the best of our knowledge, this work is the first hardware-based implementation that works with parameters equivalent to, or exceeding, the recommended 128-bit ``post-quantum security'' level. The key generator can produce a key pair for parameters $m=13$, $t=119$, and $n=6960$ in only $3.7$ ms when no systemization failure occurs, and in $3.5 \cdot 3.7$ ms on average. To achieve such performance, we implemented an optimized and parameterized Gaussian systemizer for matrix systemization, which works for any large matrix over any binary field GF$(2^m)$. Our work also presents an FPGA-based implementation of the Gao-Mateer additive FFT, which takes only about 1000 clock cycles to evaluate a degree-119 polynomial at $2^{13}$ data points. The Verilog HDL code of our key generator is parameterized and partly code-generated using Python and Sage, so it can be synthesized for parameters other than the ones shown in this paper. We tested the design against a Sage reference implementation, in iVerilog simulation, and on real FPGA hardware.
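A small software model may help fix ideas about the Gaussian systemization step. The toy below works over GF(2) only (the paper's hardware unit is parameterized for any GF$(2^m)$); it brings a matrix to the form $[I \mid T]$ and reports the systemization failures that the key generator must restart on:

```python
# Toy Gaussian systemizer over GF(2): rows are Python-int bitmasks, and
# row addition over GF(2) is XOR. Illustrative model, not the paper's design.
def systemize_gf2(rows, ncols):
    """Bring a list of bitmask rows to reduced form [I | T].
    Returns None on systemization failure (no pivot in a leading column)."""
    rows = list(rows)
    n = len(rows)
    for col in range(n):  # want the identity in the leading n columns
        bit = 1 << (ncols - 1 - col)
        pivot = next((r for r in range(col, n) if rows[r] & bit), None)
        if pivot is None:
            return None  # failure: the caller restarts with fresh randomness
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):  # clear the column everywhere else
            if r != col and rows[r] & bit:
                rows[r] ^= rows[col]
    return rows

# 3x5 example: the leading 3x3 block becomes the identity.
m = systemize_gf2([0b11010, 0b01111, 0b11101], 5)
assert m == [0b10010, 0b01000, 0b00111]

# A matrix whose leading block is singular fails to systemize.
assert systemize_gf2([0b10110, 0b11001, 0b01111], 5) is None
```

The restart-on-failure behavior is why the average key-generation time quoted above is a multiple of the failure-free time.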

Privacy for arbitrary encrypted remote computation in the cloud depends on the code running on the server being obfuscated from the standpoint of the operator in the computer room. This paper shows, formally as well as practically, that this can be arranged on a platform with an appropriate machine code architecture, given the obfuscating compiler described here.

We provide an alternative method for constructing lattice-based digital signatures which does not use the ``hash-and-sign'' methodology of Gentry, Peikert, and Vaikuntanathan (STOC 2008). Our resulting signature scheme is secure, in the random oracle model, based on the worst-case hardness of the $\tilde{O}(n^{1.5})$-SIVP problem in general lattices. The secret key, public key, and signature size of our scheme are smaller than in all previous instantiations of hash-and-sign signatures, and our signing algorithm is also much simpler, requiring just a few matrix-vector multiplications and rejection-sampling steps. We then also show that by slightly changing the parameters, one can get even more efficient signatures that are based on the hardness of the Learning With Errors problem. Our construction naturally transfers to the ring setting, where the size of the public and secret keys can be significantly shrunk, which results in the most practical to-date provably secure signature scheme based on lattices.
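The rejection-sampling step can be illustrated in one dimension: a uniform mask $y$ is added to the secret-dependent term, and the response is resampled until it lands in a range on which its distribution is independent of the secret. A hedged Python toy with illustrative parameters:

```python
import secrets

# One-dimensional illustration of Lyubashevsky-style rejection sampling
# ("Fiat-Shamir with aborts"). Parameters are illustrative only.
B = 1000      # range of the masking term y
k = 30        # public bound on |c*s|, the secret-dependent term

def sign_response(cs):
    """Return z = y + c*s, retrying until z lies in the safe range.
    The accepted z is uniform on [-(B-k), B-k], independent of c*s."""
    assert abs(cs) <= k
    while True:
        y = secrets.randbelow(2 * B + 1) - B  # y uniform in [-B, B]
        z = y + cs
        if abs(z) <= B - k:  # reject values that would leak info about c*s
            return z

zs = [sign_response(cs=17) for _ in range(2000)]
assert all(abs(z) <= B - k for z in zs)
```

Since $y + cs$ is uniform on $[cs-B,\, cs+B]$ and the acceptance window $[-(B-k),\, B-k]$ lies inside that interval for every admissible $cs$, accepted responses carry no information about the secret.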

In this paper, we study some possible generalizations of the famous Diffie-Hellman algorithm. As we will see, most of these generalizations turn out to be insecure or equivalent to classical schemes. However, these results are not always obvious, and our analysis reveals some interesting connections between the concepts of commutativity, associativity, and public-key cryptography.

We explore a new security model for secure computation on large datasets. We assume that two servers have been employed to compute on private data that was collected from many users, and, in order to improve the efficiency of their computation, we establish a new tradeoff with privacy. Specifically, instead of claiming that the servers learn nothing about the input values, we claim that what they do learn from the computation preserves the differential privacy of the input. Leveraging this relaxation of the security model allows us to build a protocol that leaks some information in the form of access patterns to memory, while also providing a formal bound on what is learned from the leakage.
We then demonstrate that this leakage is useful in a broad class of computations. We show that computations such as histograms, PageRank, and matrix factorization, which can be performed in common graph-parallel frameworks such as MapReduce or Pregel, benefit from our relaxation. We implement a protocol for securely executing graph-parallel computations, and evaluate its performance on the three examples mentioned above. We demonstrate marked improvement over prior implementations for these computations.

Since its introduction by Jao and De Feo in 2011, the supersingular isogeny Diffie-Hellman (SIDH) key exchange protocol has positioned itself as a promising candidate for post-quantum cryptography. One salient feature of the SIDH protocol is that it requires exceptionally short key sizes. However, the latency associated with SIDH is higher than that reported for other post-quantum cryptosystem proposals. Aiming to accelerate the SIDH runtime performance, we present in this work several algorithmic optimizations targeting both elliptic-curve and field arithmetic operations. We introduce in the context of the SIDH protocol a more efficient approach for calculating the elliptic curve operation P + [k]Q. Our strategy achieves a factor-1.4 speedup compared with the popular variable-three-point ladder algorithm regularly used in the SIDH shared secret phase. Moreover, profiting from precomputation techniques, our algorithm yields a factor-1.7 acceleration for the computation of this operation in the SIDH key generation phase. We also present an optimized evaluation of the point tripling formula, and discuss several algorithmic and implementation techniques that lead to faster field arithmetic computations. A software implementation of the above improvements on an Intel Skylake Core i7-6700 processor gives a factor-1.33 speedup against the state-of-the-art software implementation of the SIDH protocol reported by Costello, Longa, and Naehrig at CRYPTO 2016.

Many digital signature schemes rely on random numbers that are unique and non-predictable per signature. Failures of random number generators may have catastrophic effects such as compromising private signature keys. In recent years, many widely-used cryptographic technologies adopted deterministic signature schemes because they are presumed to be safer to implement.
In this paper, we analyze the security of the deterministic ECDSA and EdDSA signature schemes and show that the elimination of random number generators in these schemes enables new kinds of fault attacks. We formalize these attacks and introduce practical attack scenarios against EdDSA using the Rowhammer fault attack. EdDSA is used in many widely deployed protocols such as TLS, SSH, and IPsec, and we show that these protocols are not vulnerable to our attack. We formalize the requirements that a protocol using these deterministic signature schemes must meet to be vulnerable, and discuss mitigation strategies and their effect on fault attacks against deterministic signature schemes.
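The algebra behind such fault attacks is the classic nonce-reuse computation: if a fault changes the hashed message while the deterministic nonce stays fixed, two signatures sharing the same $(k, r)$ reveal the private key. A Python sketch with illustrative numbers (the $r$ value is a numeric stand-in, not a real curve-point coordinate):

```python
# Toy algebra of a fault attack on deterministic ECDSA: the fault alters the
# message hash while the deterministic nonce k stays fixed, producing two
# signatures (r, s1), (r, s2) from which the private key d follows.
# n plays the role of the curve order (here the NIST P-256 order).
n = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

d = 0x123456789ABCDEF0          # hypothetical private key
k = 0x0FEDCBA987654321          # deterministic nonce, identical in both runs
r = pow(7, k, n)                # stand-in for the x-coordinate of k*G

h1, h2 = 0xAAAA, 0xBBBB         # correct and faulted message hashes
s1 = pow(k, -1, n) * (h1 + r * d) % n
s2 = pow(k, -1, n) * (h2 + r * d) % n

# Attacker's computation from the two signatures:
# s1 - s2 = k^{-1}(h1 - h2), so k = (h1 - h2)/(s1 - s2),
# then d = (s1*k - h1)/r, all mod n.
k_rec = (h1 - h2) * pow(s1 - s2, -1, n) % n
d_rec = (s1 * k_rec - h1) * pow(r, -1, n) % n
assert k_rec == k and d_rec == d
```

This is also why protocols that never sign the exact same message twice under attacker control (because the signed data includes session-specific randomness) resist this class of attack.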

In 2014, Smart and Vercauteren introduced a packing technique for homomorphic encryption schemes by decomposing the plaintext space using the Chinese Remainder Theorem (CRT). This technique makes it possible to encrypt multiple data values simultaneously into one ciphertext and to execute Single Instruction Multiple Data (SIMD) operations homomorphically. In this paper we improve and generalize their results by introducing a flexible Laurent polynomial encoding technique and by using a more fine-grained CRT decomposition of the plaintext space. The Laurent polynomial encoding provides a convenient common framework for all conventional ways in which input data types can be represented, e.g., finite field elements, integers, rationals, floats, and complex numbers. Our methods greatly increase the packing capacity of the plaintext space, as well as one's flexibility in optimizing the system parameters with respect to efficiency and/or security.
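The CRT packing idea can be sketched over the integers (the actual schemes decompose a polynomial plaintext ring, but the slot-wise behavior is analogous): several values live in one residue modulo $M = m_1 m_2 \cdots$, and a single operation mod $M$ acts on every slot at once.

```python
from math import prod

# Toy CRT plaintext packing: one integer mod M holds one value per coprime
# modulus ("slot"); arithmetic mod M is slot-wise SIMD arithmetic.
moduli = [5, 7, 11, 13]           # pairwise coprime slot moduli
M = prod(moduli)

def pack(values):
    """CRT-combine one value per slot into a single integer mod M."""
    x = 0
    for v, m in zip(values, moduli):
        Mi = M // m
        x = (x + v * Mi * pow(Mi, -1, m)) % M   # standard CRT reconstruction
    return x

def unpack(x):
    """Read the slots back out by reducing modulo each slot modulus."""
    return [x % m for m in moduli]

a = pack([1, 2, 3, 4])
b = pack([4, 6, 5, 2])
# One multiplication mod M multiplies every slot simultaneously.
assert unpack(a * b % M) == [(1*4) % 5, (2*6) % 7, (3*5) % 11, (4*2) % 13]
```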

This paper evaluates the security of River Keyak against cube-like attacks. River Keyak is the only lightweight instance of Keyak, a Keccak-permutation-based authenticated encryption cipher and one of the 16 survivors of the third round of the CAESAR competition. Dinur et al. gave a seven-round cube-like attack on Lake Keyak (1600-bit state) using a divide-and-conquer method at EUROCRYPT 2015; Huang et al. then improved the result to eight rounds with a new conditional cube attack at EUROCRYPT 2017. For River Keyak, however, the 800-bit state is so small that the equivalent key (256-bit capacity) occupies two lanes, so these attacks cannot be applied trivially.
In this paper, we comprehensively explore conditional cube attacks on the small-state (800-bit) River Keyak. First, we find a new conditional cube variable with much weaker diffusion than Huang et al.'s, which makes conditional cube attacks possible for small-state (800-bit) River Keyak. We then find enough cube variables for 6/7-round River Keyak and successfully launch key recovery attacks on 6/7-round River Keyak with time complexities $2^{33}$ and $2^{49}$, respectively. We also verify the 6- and 7-round attacks on a laptop. Finally, since it is difficult to find enough cube variables for an 8-round attack on the 800-bit state, we combine the linear structure technique with our new conditional cube variable to greatly increase the degrees of freedom available for finding cube variables. With the variables obtained by this method we launch an 8-round conditional cube attack with time complexity $2^{81}$. These are the first cryptanalysis results on round-reduced River Keyak. Our attacks do not threaten the full-round (12) River Keyak.

Bitcoin has not only attracted many users but has also been considered a technical breakthrough by academia. However, the expanding potential of Bitcoin is largely untapped due to its limited throughput. The Bitcoin community is now facing its biggest crisis in history as it splits on how to increase the throughput. Among various proposals, Bitcoin Unlimited (BU) recently became the most popular candidate, as it allows miners to collectively decide the block size limit according to the real network capacity. However, the security of BU is hotly debated and no consensus has been reached, as the issue is discussed under different miner incentive models. In this paper, we systematically evaluate BU's security under three incentive models by testing the two major arguments of BU supporters: that a block validity consensus is not necessary for BU's security, and that such consensus would emerge in BU out of economic incentives. Our results invalidate both arguments and therefore disprove BU's security claims. Our paper further contributes to the field by establishing the necessity of a prescribed block validity consensus for cryptocurrencies.

We propose privacy-preserving protocols for computing linear regression models, in the setting where the training dataset is vertically distributed among several parties. Our main contribution is a hybrid multi-party computation protocol that combines Yao's garbled circuits with tailored protocols for computing inner products. Like many machine learning tasks, building a linear regression model involves solving a system of linear equations. We conduct a comprehensive evaluation and comparison of different techniques for securely performing this task, including a new Conjugate Gradient Descent (CGD) algorithm. This algorithm is suitable for secure computation because it uses an efficient fixed-point representation of real numbers while maintaining accuracy and convergence rates comparable to what can be obtained with a classical solution using floating point numbers. Our technique improves on Nikolaenko et al.'s method for privacy-preserving ridge regression (S&P 2013), and can be used as a building block in other analyses. We implement a complete system and demonstrate that our approach is highly scalable, solving data analysis problems with one million records and one hundred features in less than one hour of total running time.
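The structure of the conjugate gradient solver can be sketched in plain floating-point arithmetic; the paper's contribution is a fixed-point variant of this loop suitable for secure computation, which this sketch does not attempt. Example data below is illustrative:

```python
# Classical conjugate gradient for the normal equations A^T A w = A^T b of
# least-squares linear regression (float sketch; the paper's CGD variant
# replaces this arithmetic with an efficient fixed-point representation).
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(M, y, iters=25):
    """Solve M w = y for symmetric positive-definite M."""
    w = [0.0] * len(y)
    r = y[:]               # residual
    p = r[:]               # search direction
    rs = dot(r, r)
    for _ in range(iters):
        Mp = matvec(M, p)
        alpha = rs / dot(p, Mp)
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        if rs_new < 1e-30:   # converged; avoid a division by zero below
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return w

# Fit y = 2*x1 + 3*x2 from three consistent samples; A^T A is 2x2 SPD.
A = [[1.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
b = [5.0, 7.0, 11.0]
AtA = [[dot([row[i] for row in A], [row[j] for row in A]) for j in range(2)]
       for i in range(2)]
Atb = matvec(list(zip(*A)), b)
w = conjugate_gradient(AtA, Atb)
assert abs(w[0] - 2.0) < 1e-6 and abs(w[1] - 3.0) < 1e-6
```

CG is attractive in this setting because each iteration needs only matrix-vector products and inner products, exactly the primitives the hybrid protocol evaluates securely.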

A secure comparison protocol allows players to evaluate the greater-than predicate on hidden values; it addresses a problem that belongs to the field of multiparty computation, in which players wish to jointly and privately evaluate a function on secret inputs. Introduced by Yao under the name millionaires' problem, secure comparisons have received a great deal of attention. They have proven to be one of the most fundamental building blocks in a large variety of multiparty computation protocols. However, due to their inherent non-arithmetic structure, they are in general less efficient than other fundamental primitives, and as such, are often a major bottleneck in multiparty computation protocols.
In this work, we design new two-party protocols for the greater-than functionality, secure against honest-but-curious adversaries (who follow the specifications of the protocol), improving over the state of the art. They can be readily used in a large variety of applications in which secure comparisons constitute the main efficiency bottleneck. Our protocols are defined in the preprocessing model, and are extremely efficient during the online phase. They are based solely on oblivious transfers, and can therefore use oblivious transfer extensions to get rid of all but a constant amount of expensive computations. Toward our goal of secure comparison, we also design protocols for testing equality between private inputs, which improve similarly over the state of the art. The latter contribution is of independent interest.

Subversion zero knowledge for non-interactive proof systems demands that zero knowledge (ZK) be maintained even when the common reference string (CRS) is chosen maliciously. SNARKs are proof systems with succinct proofs, which are at the core of the cryptocurrency Zcash, whose anonymity relies on ZK-SNARKs, and they are used for ZK contingent payments in Bitcoin.
We show that under a plausible hardness assumption, the most efficient SNARK schemes proposed in the literature, including the one underlying Zcash and contingent payments, satisfy subversion ZK or can be made to satisfy it at very little cost. In particular, we prove subversion ZK of the original SNARKs by Gennaro et al. and of the most efficient construction, by Groth; for the Pinocchio scheme implemented in libsnark we show that it suffices to add 4 group elements to the CRS. We also argue that Zcash remains anonymous even if its parameters were set up maliciously.

Fuzzy extractors were proposed in 2004 by Dodis et al. as a secure way to generate cryptographic keys from noisy sources. In recent years, fuzzy extractors have become an important building block in hardware security due to their use in secure key generation based on Physical Unclonable Functions (PUFs). Fuzzy extractors are provably secure against passive attackers. A year later, Boyen et al. introduced robust fuzzy extractors, which are also provably secure against active attackers, i.e., attackers that can manipulate the helper data.
In this paper we show that the original provably secure robust fuzzy extractor construction by Boyen et al. does not actually fulfill the error-correction requirements of practical PUF applications. Conversely, the fuzzy extractors that have been proposed for PUF-based key generation and do fulfill the error-correction requirements cannot be extended to robust fuzzy extractors, due to a strict bound $t$ on the number of correctable errors. While it is therefore tempting to simply ignore this strict bound, we present novel helper data manipulation attacks on fuzzy extractors that also work if a ``robust fuzzy extractor-like'' construction without this strict bound is used.
Hence, this paper can be seen as a call to action to revisit this seemingly solved problem of building robust fuzzy extractors. The new focus should be on building more efficient solutions in terms of error-correction capability, even if this might come at the cost of a proof in a weaker security model.
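For reference, the baseline (non-robust) fuzzy extractor can be sketched with the code-offset construction and a repetition code; real PUF key generators use far stronger codes, and robustness against helper-data manipulation is exactly what this plain construction lacks:

```python
import hashlib, secrets

# Toy code-offset fuzzy extractor with a 5x repetition code: public helper
# data masks the noisy reading w with a random codeword, and up to 2 bit
# flips per 5-bit block are corrected by majority vote. Illustration only.
REP = 5

def encode(bits):                       # repetition-code encoder
    return [b for b in bits for _ in range(REP)]

def decode(bits):                       # majority vote per block
    return [int(sum(bits[i:i + REP]) > REP // 2)
            for i in range(0, len(bits), REP)]

def gen(w):
    """Enrollment: derive a key and public helper data from reading w."""
    k = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    helper = [wi ^ ci for wi, ci in zip(w, encode(k))]   # w XOR codeword
    key = hashlib.sha256(bytes(k)).hexdigest()
    return key, helper

def rep(w_noisy, helper):
    """Reproduction: recover the same key from a noisy reading of w."""
    k = decode([wi ^ hi for wi, hi in zip(w_noisy, helper)])
    return hashlib.sha256(bytes(k)).hexdigest()

w = [secrets.randbelow(2) for _ in range(20)]      # noisy source, 4 key bits
key, helper = gen(w)
w_noisy = w[:]; w_noisy[3] ^= 1; w_noisy[11] ^= 1  # two bit flips, one per block
assert rep(w_noisy, helper) == key
```

Because `helper` is public and unauthenticated, an active attacker who modifies it can steer the decoded key, which is the class of manipulation that robust constructions are meant to rule out.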

Oblivious Transfer (OT) is a simple, yet fundamental primitive which suffices to achieve almost every cryptographic application. In a recent work (Latincrypt `15), Chou and Orlandi (CO) present the most efficient, fully UC-secure OT protocol to date and argue its security under the CDH assumption. Unfortunately, a subsequent work by Genc et al. (Eprint `17) exposes a flaw in their proof which renders the CO protocol insecure. In this work, we make the following contributions: We first point out two additional, previously undiscovered flaws in the CO protocol and then show how to patch the proof with respect to static and malicious corruptions in the UC model under the stronger Gap Diffie-Hellman (GDH) assumption. With the proof failing for adaptive corruptions even under the GDH assumption, we then present a novel OT protocol which builds on ideas from the CO protocol and can be proven fully UC-secure under the CDH assumption. Interestingly, our new protocol is actually significantly more efficient (roughly by a factor of two) than the CO protocol. This improvement is made possible by avoiding costly redundancy in the symmetric encryption scheme used in the CO protocol. Our ideas can also be applied to the original CO protocol, which yields a similar gain in efficiency.

In this work, we study unconditionally secure multi-party computation (MPC) tolerating $t < n/3$ corruptions, where $n$ is the total number of parties involved. In this setting, it is well known that if the underlying network is completely asynchronous, then one can achieve only statistical security; moreover, it is impossible to ensure input provision, i.e., to consider the inputs of all honest parties. The best known statistically secure asynchronous MPC (AMPC) protocol with $t<n/3$ requires a communication of $O(n^5)$ field elements per multiplication. We consider a partially synchronous setting, where the parties are assumed to be globally synchronized initially for a few rounds, after which the network becomes completely asynchronous. In this setting, we present an MPC protocol that requires $O(n^2)$ communication per multiplication while ensuring input provision. Our MPC protocol relies on a new four-round, communication-efficient statistical verifiable secret-sharing (VSS) protocol whose broadcast communication complexity is independent of the number of secret-shared values.

Digital rights management (DRM) is an important technique for protecting digital contents from abuse. It is usually confronted with severe challenges because of the untrusted environments its applications execute in. This condition is formally described as the white-box attack model. White-box cryptography aims at protecting software implementations of cryptographic algorithms against white-box attacks, and hence can be employed to provide security guarantees for DRM. Key extraction, code lifting, and illegal distribution are three major threats in DRM applications that severely compromise the interests of content producers. In this paper, we propose the first white-box-cryptography-based solution that addresses all three threats simultaneously, by adding traceability to a white-box scheme that already provides unbreakability and incompressibility. Specifically, we constructively show that there exists a secure white-box compiler that generates traceable white-box programs, by hiding slight perturbations in the lookup-table-based white-box implementation. Security and performance analyses show that our solution can be effectively realized in practice.

On the basis of a software implementation of Kummer-based HECC over $\mathbb{F}_p$ presented in 2016, we propose new hardware architectures. Our main objectives are: the definition of architecture parameters (type, size, and number of units for arithmetic operations, memory, and internal communications), and the optimization of the architecture style to exploit internal parallelism. Several architectures have been designed and implemented on FPGAs for scalar multiplication acceleration in embedded systems. Our results show a significant area reduction, at similar computation times, compared to the best state-of-the-art hardware implementations of curve-based solutions.