Post-quantum cryptography (PQC), a popular term for cryptography that aims to provide ‘post-quantum’ alternatives to the existing number-theoretic cryptography 1–3, has attracted great attention in recent years. In essence, PQC overlaps many existing branches of cryptography, including coding-based cryptography 4, 5, lattice-based cryptography, hash-based cryptography, and multivariate-quadratic-equations cryptography 6. Driven largely by the possible invention of a large quantum computer in the near future, PQC has become a new buzzword in the cryptography community, and these non-number-theoretic branches, especially coding-based cryptography, have attracted renewed attention.
In coding-based cryptography, there are two well-known public key encryption schemes, namely the McEliece and Niederreiter cryptosystems 4, 5. The McEliece cryptosystem, proposed in 1978 4, is the first public key encryption scheme based on linear error-correcting codes. Compared with the classical RSA cryptosystem 7, the McEliece cryptosystem has two advantages: (i) both the encryption and decryption algorithms are faster; and (ii) the security level grows much faster as the key size increases. The Niederreiter cryptosystem 5 is a dual encryption scheme proposed in 1986, which is not only ten times faster than the McEliece cryptosystem in terms of encryption speed, but also equivalent to the McEliece cryptosystem in terms of security. Following these two seminal works, much effort has been devoted to coding-based cryptography over the past years 8–14. For example, Stern proposed a coding-based zero-knowledge identification scheme in 1993 14; Courtois, Finiasz, and Sendrier presented the first practical coding-based signature scheme in 2001 10. More recently, how to reduce the public key size and how to choose secure parameters in coding-based cryptography have also been deeply explored 15–19.
Semantic security (a.k.a. indistinguishability) against adaptive chosen ciphertext attacks (IND-CCA2) is the strongest known notion of security for public key encryption schemes. However, in coding-based cryptography, IND-CCA2 security has not been widely discussed. To the best of our knowledge, only a few papers have touched on this issue 20–22. Because the McEliece cryptosystem has a special structure, some general IND-CCA2 conversions 23, 24, though they yield IND-CCA2 versions of the McEliece cryptosystem, may incur some redundancy. Therefore, Kobara and Imai have proposed two specific conversions to reduce the redundancy 20. Recently, Nojima et al. 21 have studied the semantic security of the McEliece cryptosystem without random oracles. However, they only achieve semantic security against chosen plaintext attacks, and the tightness of their reductions is also questionable, especially for the Niederreiter cryptosystem. In Reference 22, Dowsley et al. have also discussed a CCA2-secure public key encryption scheme based on the McEliece assumption in the standard model, but their scheme needs some special constructions. Therefore, how to design an efficient and IND-CCA2 secure coding-based cryptosystem with or without random oracles is still worth investigating.
In this paper, we propose an efficient IND-CCA2 secure public key encryption scheme based on coding theory. Concretely, we design our scheme based on the syndrome decoding (SD) problem, and use the provable security technique to get a tight reduction in the random oracle model 25. Compared with Niederreiter cryptosystem, only two additional hash operations are required in the proposed scheme. Thus, our scheme achieves fast encryption speed.
The remainder of this paper is organized as follows. In Section 2, we formalize the definition of public key encryption and the corresponding security model. In Section 3, we review the coding theory and the complexity assumption that form the basis of our proposed scheme. In Section 4, we present our efficient public key encryption scheme based on coding theory, followed by its formal security proof and parameter selection in Section 5 and Section 6, respectively. Finally, we draw our conclusions in Section 7.
DEFINITION AND SECURITY MODEL
Let $\mathbb{N}$ denote the set of natural numbers, and $\kappa \in \mathbb{N}$ be a security parameter. An event is said to be negligible if it happens with probability less than the inverse of any polynomial in $\kappa$. If $n \in \mathbb{N}$, then $0^n$ denotes the string of $n$ zeros. If $x, y$ are strings, then $|x|$ denotes the length of $x$, $[x]_n$ denotes the $n$ least significant bits of $x$, $[x]^n$ denotes the $n$ most significant bits of $x$, and $x \oplus y$ denotes the bit XOR if $|x| = |y|$. If $S$ is a finite set, then $|S|$ is its cardinality, and $s \xleftarrow{R} S$ indicates the process of selecting $s$ uniformly at random in $S$. If $\mathcal{A}$ is a randomized algorithm, then $y \leftarrow \mathcal{A}(x_1, x_2, \ldots)$ denotes the processing of $\mathcal{A}$ on inputs $x_1, x_2, \ldots$, and letting $y$ denote its output.
In general, a public key encryption scheme consists of four algorithms:
The randomized setup algorithm Setup takes a security parameter $\kappa$ as input, and returns the system public parameters $params$ in time polynomial in $\kappa$; we write $params \leftarrow \mathrm{Setup}(\kappa)$.
The randomized key generation algorithm Kgen takes the system public parameters $params$ as input, and returns a pair $(pk, sk)$ consisting of a public key and a corresponding private key in time polynomial in $\kappa$; we write $(pk, sk) \leftarrow \mathrm{Kgen}(params)$.
The randomized encryption algorithm Enc takes a public key $pk$, a random number $r$, and a plaintext $m$ as input, and returns a ciphertext $c$ in time polynomial in $\kappa$; we write $c \leftarrow \mathrm{Enc}(pk, r, m)$.
The deterministic decryption algorithm Dec takes the private key $sk$ and a ciphertext $c$ as input, and returns the corresponding plaintext $m$ or a special symbol $\perp$ indicating that the ciphertext was invalid in time polynomial in $\kappa$; we write $x \leftarrow \mathrm{Dec}(sk, c)$, where $x \in \{m, \perp\}$.
All algorithms should satisfy the standard consistency constraint of public key encryption, i.e., for any message $m$, $\mathrm{Dec}(sk, \mathrm{Enc}(pk, r, m)) = m$.
We recall the standard notion of security of public key encryption schemes in terms of indistinguishability 26. Concretely, we consider the security notion in which a public key encryption scheme is indistinguishable against adaptive chosen ciphertext attacks, and call it the ‘IND-CCA2’ security model for brevity.
Definition 1. (IND-CCA2) Let and be integers and a real number in , and PKE a secure public key encryption scheme with the security parameter . Let be an IND-CCA2 adversary, which is allowed to access the decryption oracle (and some random oracles in the random oracle model), against the indistinguishability of PKE. We consider the following random experiment:
We define the success probability of via
is said to be -IND-CCA2 secure, if no adversary running in time has a success .
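For reference, the IND-CCA2 experiment and the associated success probability are commonly written in the following form. This is a sketch of the standard textbook formulation, not a reproduction of the displayed experiment of Definition 1; the symbols $\mathcal{A}$, $\mathcal{O}_D$, $\kappa$, and the state string $st$ are notational assumptions.

```latex
\begin{align*}
&\textbf{Experiment } \mathrm{Exp}^{\mathrm{IND\text{-}CCA2}}_{\mathrm{PKE},\mathcal{A}}(\kappa):\\
&\quad params \leftarrow \mathrm{Setup}(\kappa);\quad (pk, sk) \leftarrow \mathrm{Kgen}(params)\\
&\quad (m_0, m_1, st) \leftarrow \mathcal{A}^{\mathcal{O}_D}(pk), \quad |m_0| = |m_1|\\
&\quad b \xleftarrow{R} \{0,1\};\quad c^\ast \leftarrow \mathrm{Enc}(pk, r, m_b)\\
&\quad b' \leftarrow \mathcal{A}^{\mathcal{O}_D}(c^\ast, st), \quad \text{where } c^\ast \text{ is never submitted to } \mathcal{O}_D\\
&\quad \text{return } 1 \text{ if } b' = b, \text{ and } 0 \text{ otherwise}\\[4pt]
&\mathrm{Succ}^{\mathrm{IND\text{-}CCA2}}_{\mathrm{PKE},\mathcal{A}}(\kappa)
  = \left|\, 2\Pr\!\left[\mathrm{Exp}^{\mathrm{IND\text{-}CCA2}}_{\mathrm{PKE},\mathcal{A}}(\kappa) = 1\right] - 1 \,\right|
\end{align*}
```

Under this formulation, an adversary that guesses $b$ at random succeeds with probability $1/2$, which gives a success value of zero.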
CODING THEORY AND COMPLEXITY ASSUMPTION
Let be the finite field with 2 elements , be a security parameter, and denote an -binary linear code of length and dimension , i.e., a subspace of dimension of the vector space . Elements of are called words, and elements of are called codewords. An -binary linear code is usually given in the form of a binary matrix , the rows of which form a basis of the code. The syndrome of a word is the quantity computed by
If the quantity , i.e., , the word is a codeword. The Hamming weight of a word is the number of its non-zero positions, denoted as ; the Hamming distance between two words and is the number of positions where they differ, denoted as ; and the minimal distance of an -binary linear code is defined by . Then, the -binary linear code is called code. All codes satisfy the Singleton bound, which states that 27. A binary linear code is guaranteed to exist as long as
This is called the Gilbert-Varshamov (GV) bound. Note that, random binary codes are known to meet the GV bound, in the sense that the above inequality comes very close to being an equality 28, and no available family of binary codes can be decoded in subexponential time up to the GV bound 27.
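The notions of syndrome, Hamming weight, Hamming distance, and minimal distance can be illustrated on a very small code. The following is a minimal sketch, assuming the syndrome is computed as $H x^T$ over the binary field and using the [7,4] Hamming code as a stand-in example (it is not a code used in this paper). The function `gv_condition` assumes the bound takes the common Varshamov form $\sum_{j=0}^{d-2} \binom{n-1}{j} < 2^{n-k}$; all function names are hypothetical.

```python
from itertools import product
from math import comb

# Parity-check matrix of the [7,4] Hamming code (illustrative choice only).
# Column j (1-based) is the binary expansion of j.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(H, x):
    """Return H * x^T over GF(2) as a tuple of bits."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def weight(x):
    """Hamming weight: number of non-zero positions."""
    return sum(1 for xi in x if xi != 0)

def distance(x, y):
    """Hamming distance: number of positions where x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def minimal_distance(H):
    """Smallest weight of a non-zero codeword, found by exhaustive search."""
    n = len(H[0])
    codewords = [x for x in product((0, 1), repeat=n) if not any(syndrome(H, x))]
    return min(weight(c) for c in codewords if any(c))

def gv_condition(n, k, d):
    """Assumed form of the bound: sum_{j=0}^{d-2} C(n-1, j) < 2^(n-k)."""
    return sum(comb(n - 1, j) for j in range(d - 1)) < 2 ** (n - k)

codeword = (1, 1, 1, 0, 0, 0, 0)
noisy = (1, 1, 1, 0, 1, 0, 0)      # same word with one flipped position
print(syndrome(H, codeword))       # (0, 0, 0): a codeword has zero syndrome
print(syndrome(H, noisy))          # (1, 0, 1): equals the fifth column of H
print(distance(codeword, noisy))   # 1
print(minimal_distance(H))         # 3
print(gv_condition(7, 4, 3))       # True
```

The exhaustive search in `minimal_distance` enumerates all $2^n$ words, so it is only usable for toy lengths.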
Syndrome Decoding Problem 27: Let , we know that, for any syndrome , there exists at most one word such that and . A syndrome is said to be -decodable in the -binary linear code defined by if there exists such a word . The SD problem is stated as follows: given an binary matrix and a syndrome , compute a word such that and . Note that, to ensure the hardness of the SD problem, the parameters should be carefully chosen 27, 29, 30.
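To make the SD problem concrete, the sketch below solves it by exhaustive search on a toy instance. It assumes the problem is read as: given a parity-check matrix $H$, a syndrome $s$, and a weight bound $t$, find a word $x$ with $H x^T = s$ and Hamming weight at most $t$. The [7,4] Hamming code and the function names are illustrative assumptions, and the search is a naive baseline, not an attack considered in this paper.

```python
from itertools import combinations

# Toy parity-check matrix ([7,4] Hamming code), used only for illustration.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(H, x):
    """Return H * x^T over GF(2) as a tuple of bits."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def solve_sd(H, s, t):
    """Search all words of weight <= t for one whose syndrome equals s.

    Returns the word as a tuple, or None if s is not t-decodable.
    """
    n = len(H[0])
    for w in range(t + 1):
        for support in combinations(range(n), w):
            x = [0] * n
            for i in support:
                x[i] = 1
            if syndrome(H, x) == tuple(s):
                return tuple(x)
    return None

print(solve_sd(H, (1, 0, 1), 1))   # (0, 0, 0, 0, 1, 0, 0)
print(solve_sd(H, (1, 0, 1), 0))   # None: no word of weight 0 has this syndrome
```

The number of candidates examined is $\sum_{w \le t} \binom{n}{w}$, which grows exponentially for cryptographic sizes of $n$ and $t$; this is why a trapdoor decoder is needed for decryption.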
Definition 2. (SD Assumption) Let be an -binary linear code defined by a binary matrix with the minimal distance , and . An adversary that takes an input of a syndrome , returns a word . We consider the following random experiment on the SD problem.
We define the corresponding success probability of in solving the SD problem via
Let and . We call SD to be -secure if no polynomial algorithm running in time has success .
Parameters of Goppa Codes: Goppa codes are subfield subcodes of particular alternant codes. For given integers $m$ and $t$, binary Goppa codes are of length $n = 2^m$ and of dimension $k = n - mt$. Let denote the family of such Goppa codes, then we have . Since their algebraic structure can be efficiently hidden and they admit a good $t$-decoding algorithm, Goppa codes are good candidates for constructing efficient cryptographic algorithms. In the next section, we will use Goppa codes to design our efficient and provably secure public key encryption scheme.
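The relations $n = 2^m$ and $k = n - mt$ fix the basic sizes of a Niederreiter-type system. The sketch below computes them, assuming the public key is stored as a raw $(n-k) \times n$ binary parity-check matrix with no compression. The function name is hypothetical, and the sample pairs $(m, t)$ other than McEliece's original $(10, 50)$ are illustrative assumptions, not values taken from Table 1.

```python
def goppa_parameters(m, t):
    """Sizes derived from n = 2^m and k = n - m*t for a binary Goppa code."""
    n = 2 ** m
    k = n - m * t
    return {
        "n": n,                              # code length
        "k": k,                              # code dimension
        "syndrome_bits": n - k,              # length of a syndrome, equal to m*t
        "parity_check_bits": (n - k) * n,    # raw (n-k) x n parity-check matrix
    }

for m, t in [(10, 50), (11, 32), (12, 41)]:
    p = goppa_parameters(m, t)
    print(m, t, p["n"], p["k"], p["syndrome_bits"], p["parity_check_bits"] // 8, "bytes")
```

For $(m, t) = (10, 50)$ this gives $n = 1024$, $k = 524$, and a 500-bit syndrome, with a raw parity-check matrix of 512000 bits.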
PROPOSED PUBLIC KEY ENCRYPTION SCHEME BASED ON CODING THEORY
In this section, we present our public key encryption PKE scheme based on coding theory, which can be regarded as the CCA2 version of Niederreiter cryptosystem 5 and mainly consists of four algorithms, namely Setup, Kgen, Enc, and Dec, as shown in Figure 1.
Setup. Given the security parameter , four integers are chosen such that the -decoding in a Goppa code of length , of dimension has complexity at least 10. In addition, two secure cryptographic hash functions are also chosen, where and . In the end, the system parameters are published.
Kgen. Given the system parameters , choose a random binary Goppa code from the Goppa code family . Let be a parity check matrix of and be a -decoding algorithm in . In addition, a random non-singular binary matrix and a random permutation matrix of size are also chosen. Set the private key and the corresponding public key as .
Enc. Given a message and the public key , choose a random number , and execute the following steps:
compute , where ,
compute such that , ,
set the ciphertext .
Dec. Given a ciphertext and the private key , the following steps are executed:
compute , , and ,
compute , if is , parse as , i.e., ; otherwise output indicating an invalid ciphertext.
Compared with the Niederreiter cryptosystem 5, only two additional hash operations are required. As a result, the encryption speed of the proposed scheme is almost as fast as that of the Niederreiter cryptosystem.
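The trapdoor underlying Kgen and the decoding step of Dec can be sketched on a toy code. The example below is an assumption-laden illustration of a plain Niederreiter-style trapdoor only: it replaces the Goppa code with the [7,4] Hamming code (so $t = 1$), uses a fixed scrambling matrix and permutation instead of random ones, and omits the two hash functions and the validity check of the proposed scheme entirely. It offers no security, and every name in it is hypothetical.

```python
def mat_mul(A, B):
    """Multiply two binary matrices over GF(2)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def mat_vec(A, x):
    """Multiply a binary matrix by a column vector over GF(2)."""
    return [sum(a * xi for a, xi in zip(row, x)) % 2 for row in A]

N = 7
# Private parity-check matrix: [7,4] Hamming code, decodable up to t = 1.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
# Fixed non-singular scrambler and its inverse over GF(2) (normally random).
S = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
S_INV = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
# Fixed permutation (normally random): P maps unit vector e_j to e_PERM[j].
PERM = [3, 0, 6, 1, 5, 2, 4]
P = [[1 if PERM[j] == i else 0 for j in range(N)] for i in range(N)]

# Public key: the scrambled, permuted parity-check matrix S * H * P.
H_PUB = mat_mul(mat_mul(S, H), P)

def encrypt(h_pub, e):
    """Ciphertext is the public syndrome of a word e of weight at most 1."""
    assert sum(e) <= 1, "toy code only corrects t = 1 error"
    return mat_vec(h_pub, e)

def decode(H, y):
    """t = 1 decoder: the word of weight <= 1 whose syndrome under H is y."""
    z = [0] * len(H[0])
    if any(y):
        z[list(zip(*H)).index(tuple(y))] = 1
    return z

def decrypt(c):
    """Undo S, decode with the private code, then undo the permutation."""
    y = mat_vec(S_INV, c)                   # y = H * (P e)
    z = decode(H, y)                        # z = P e
    return [z[PERM[j]] for j in range(N)]   # e_j = z_PERM[j]

e = [0, 0, 1, 0, 0, 0, 0]
c = encrypt(H_PUB, e)
print(c, decrypt(c) == e)
```

In a CCA2 conversion of this kind, the decoded word would additionally be checked against hash values before any plaintext is released; that layer is deliberately left out of this sketch.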
SECURITY PROOF
In this section, we prove that the proposed scheme is IND-CCA2-secure in the random oracle model, where the hash functions and are modelled as random oracles 25.
Theorem 1. Let be an adversary against the proposed scheme in the random oracle model, where the hash functions and behave as random oracles. Assume that has the success probability to break the indistinguishability of the ciphertext within the running time , after and queries to the random oracles , and the decryption oracle, respectively. Then, there exist and as follows
such that the SD problem can be solved with probability within time , where is the time complexity for the simulation.
Proof. We define a sequence of games , , of modified attacks starting from the actual adversary 31, 32. All the games operate on the same underlying probability space: the system parameters and public key , and the coin tosses of . Let be a random instance of the SD problem. We will use these incremental games to reduce the SD instance to an attack by the adversary against the IND-CCA2 security of the ciphertext in the proposed scheme.
This is the real attack game. In the game, the adversary is fed with the system parameters and public key . In the first phase, the adversary can access the random oracles , and the decryption oracle for any input. At some point, the adversary chooses a pair of messages . Then, we randomly choose a bit and produce the message 's ciphertext as the challenge to the adversary . The challenge comes from the public key and one random number , and , with . In the second phase, the adversary is still allowed to access the random oracles , and the decryption oracle for any input, except the challenge to . Finally, the adversary outputs a bit . In any , we denote by the event . Then, we have
In this game, we simulate the random oracles , , and the decryption oracle , by maintaining the lists -List, -List and -List to deal with the identical queries. In addition, we also simulate the way that the challenge is generated as the challenger would do. The detailed simulation in this game is described in Figure 2. Because the distribution of is unchanged in the eye of the adversary , the simulation is perfect, and we have
In this game, we modify the simulation of the decryption oracle by outputting a random message when the ciphertext has not been ‘correctly’ encrypted.
The two games and are perfectly indistinguishable unless is already in . Because is queried from and behaves uniformly, we can consider a uniform random variable as well. So, the probability that has already been queried to is bounded by . Then,
In this game, we modify the simulation of the decryption oracle without resorting to the random oracle .
The two games and are perfectly indistinguishable unless is already in . Because is randomly chosen, we consider as a uniform random variable. So, the probability that has been queried to is bounded by . Then,
In this game, we modify the rule Dec-noR in the decryption oracle simulation without resorting to the random oracle .
The two games and are perfectly indistinguishable unless is already in . Because is known to the adversary due to , we consider as a uniform random variable, so the probability that has been queried to is bounded by . Then,
In this game, we modify the rule Dec-Init in the decryption oracle simulation.
The two games and are perfectly indistinguishable. If is found in -List, the answer of the decryption oracle is the same as that in . If is not found, i.e., , and , the answer of the decryption oracle is returning a random message as that in . Therefore, we have
In this game, we manufacture the challenge by first choosing the random value of ahead of time.
The two games and are perfectly indistinguishable unless has been asked for . We define this event , then we have
In this game, is only used in , but does not appear in the computation since is not defined to be . Then, the distribution of doesn't depend on . As a result, we have
In this game, instead of defining from , we first randomly choose and then define from . Because is randomly chosen, we give a random answer for the question to .
The two games and are perfectly indistinguishable unless has been asked for . We define this event , then we have
In this game, is uniformly distributed, and independently of the view of the adversary , since hasn't been revealed. Therefore, we have
In this game, instead of defining from , we randomly choose and then we define from .
In this game, the distribution of is unchanged. Therefore, we have
In this game, we embed the SD challenge in the game by setting .
Clearly, the distribution of is still unchanged. Therefore, we have
In this game, when the event takes place, there exists an such that has been queried to , and such an solves the SD challenge. As a result, we have
Summarizing all the above cases, we have
and the running time , where is the time complexity for the simulation. This completes the proof.
SELECTION OF PARAMETERS
Parameter selection is imperative for the security of coding-based cryptography. If the parameters are not properly chosen, a coding-based system could suffer from serious attacks based on either Information Set Decoding (ISD) or the Generalized Birthday Algorithm (GBA) 18, 33. Since the decryption of the proposed scheme requires knowing one and only one solution for an SD problem, the GBA-based attacks can be ruled out 18. Therefore, to resist the possible ISD-based attacks, some typical parameters used in McEliece can be chosen for the proposed scheme. Then, the sizes of the public key, plaintext, and ciphertext can be calculated, as shown in Table 1.
Table 1. The sizes of plaintext/ciphertext and public key under the typical McEliece/Niederreiter parameters.
$n = 2^m$ | $k = n - mt$ | Plaintext size M | Ciphertext size C | Public key size
k1 < 1024 bits
k1 < 2048 bits
k1 < 4096 bits
From the table, we can see that, although the public keys are relatively large, the construction of the proposed scheme makes CCA2-secure encryption almost as fast as encryption in the McEliece/Niederreiter cryptosystems. Furthermore, the recent works by Berger et al. 15 and Misoczki and Barreto 19 can be used to reduce the public key sizes in coding-based cryptography, which can make coding-based cryptosystems more practical.
CONCLUSIONS
In this paper, we have proposed an efficient public key encryption scheme based on coding theory, and formally shown its IND-CCA2 security in the random oracle model. Since the size of the public key in the proposed scheme is relatively large, our future work will focus on reducing the key size 15, 19.
This work was supported in part by the Natural Sciences and Engineering Research Council (NSERC) Strategic Projects of Canada.