
An efficient and provably secure public key encryption scheme based on coding theory

Abstract

Although coding-based public key encryption schemes such as the McEliece and Niederreiter cryptosystems have been well studied, it is not a trivial task to design an efficient coding-based cryptosystem with semantic security against adaptive chosen ciphertext attacks (IND-CCA2). To tackle this challenging issue, in this paper, we first propose an efficient IND-CCA2-secure public key encryption scheme based on coding theory. We then use the provable security technique to formally prove that the security of the proposed scheme is tightly related to the syndrome decoding (SD) problem in the random oracle model. Compared with the previously reported schemes, the proposed scheme has the merits of a simple construction and fast encryption. Copyright © 2010 John Wiley & Sons, Ltd.

INTRODUCTION

Post-quantum cryptography (PQC), a popular term for cryptography that aims to provide ‘post-quantum’ alternatives to the existing number-theoretic cryptography 1–3, has received great attention in recent years. In essence, PQC overlaps many existing branches of cryptography, including coding-based cryptography 4, 5, lattice-based cryptography, hash-based cryptography, and multivariate-quadratic-equations cryptography 6. Driven largely by the possible invention of a large quantum computer in the near future, PQC has become a new buzzword in cryptography communities, and these non-number-theoretic branches, especially coding-based cryptography, have attracted renewed attention.

In coding-based cryptography, there are two well-known public key encryption schemes, namely the McEliece and Niederreiter cryptosystems 4, 5. The McEliece cryptosystem, proposed in 1978 4, is the first public key encryption scheme based on linear error-correcting codes. Compared with the classical RSA cryptosystem 7, the McEliece cryptosystem has two advantages: (i) both the encryption and decryption algorithms are faster; and (ii) the security level grows much faster as the key size increases. The Niederreiter cryptosystem 5 is a dual encryption scheme proposed in 1986, which is not only ten times faster than the McEliece cryptosystem in terms of encryption speed, but also equivalent to the McEliece cryptosystem in terms of security. Following these two seminal works, much effort has been devoted to coding-based cryptography over the past years 8–14. For example, Stern proposed a coding-based zero knowledge identification scheme in 1993 14; Courtois, Finiasz, and Sendrier presented the first practical coding-based signature scheme in 2001 10. More recently, how to reduce the public key size and how to choose secure parameters in coding-based cryptography have also been deeply explored 15–19.

Semantic security (a.k.a. indistinguishability) against adaptive chosen ciphertext attacks (IND-CCA2) is the strongest known notion of security for public key encryption schemes. However, in coding-based cryptography, ‘IND-CCA2’ has not been widely discussed. To the best of our knowledge, only a few papers have touched on this research issue 20–22. Because the McEliece cryptosystem has a special structure, some general IND-CCA2 conversions 23, 24, though they achieve IND-CCA2 versions of the McEliece cryptosystem, may incur some redundancy. Therefore, Kobara and Imai have proposed two specific conversions to reduce the redundancy 20. Recently, Nojima et al. 21 have studied the semantic security of the McEliece cryptosystem without random oracles. However, they only achieve semantic security against chosen plaintext attacks, and the tightness of the reductions is also questionable, especially for the Niederreiter cryptosystem. In Reference 22, Dowsley et al. have also discussed a CCA2-secure public key encryption scheme based on the McEliece assumption in the standard model, but their scheme needs some special constructions. Therefore, how to design an efficient and IND-CCA2-secure coding-based cryptosystem with or without random oracles is still worthy of investigation.

In this paper, we propose an efficient IND-CCA2 secure public key encryption scheme based on coding theory. Concretely, we design our scheme based on the syndrome decoding (SD) problem, and use the provable security technique to get a tight reduction in the random oracle model 25. Compared with Niederreiter cryptosystem, only two additional hash operations are required in the proposed scheme. Thus, our scheme achieves fast encryption speed.

The remainder of this paper is organized as follows. In Section 2, we formalize the definition of public key encryption and the corresponding security model. In Section 3, we review the coding theory and the complexity assumption on which our proposed scheme is based. In Section 4, we present our efficient public key encryption scheme based on coding theory, followed by its formal security proof and parameter selection in Section 5 and Section 6, respectively. Finally, we draw our conclusions in Section 7.

DEFINITION AND SECURITY MODEL

Notation

Let equation image denote the set of natural numbers, and equation image be a security parameter. An event is said to be negligible if it happens with probability less than the inverse of any polynomial in equation image. If equation image, then equation image denotes the string of equation image zeros. If equation image are strings, then equation image denotes the length of equation image, equation image denotes the equation image least significant bits of equation image, equation image denotes the equation image most significant bits of equation image, and equation image denotes the bit XOR if equation image, while if equation image is a finite set, then equation image is its cardinality, and equation image indicates the process of selecting equation image uniformly and at random in equation image. If equation image is a randomized algorithm, then equation image denotes the processing of equation image on inputs equation image, and letting equation image denote its output.

Definition

In general, a public key encryption scheme equation image consists of four algorithms:

  • The randomized setup algorithm Setup takes a security parameter equation image as input, and returns the system public parameters equation image in a polynomial time of equation image; we write equation image.

  • The randomized key generation algorithm Kgen takes the system public parameters params as input, and returns a pair equation image consisting of a public key and a corresponding private key in a polynomial time of equation image; we write equation image.

  • The randomized encryption algorithm Enc takes a public key equation image, a random number equation image, and a plaintext equation image as input, and returns a ciphertext equation image in a polynomial time of equation image; we write equation image.

  • The deterministic decryption algorithm Dec takes the private key equation image and a ciphertext equation image as input, and returns the corresponding plaintext equation image or a special symbol equation image indicating that the ciphertext was invalid in a polynomial time of equation image; we write equation image, where equation image.

All algorithms should satisfy the standard consistency constraint of public key encryption, i.e., for any message equation image, equation image.

Security model

We recall the standard notion of security of public key encryption schemes in terms of indistinguishability 26. Concretely, we consider indistinguishability of a public key encryption scheme against adaptive chosen ciphertext attacks, and call it the ‘IND-CCA2’ security model for brevity.

Definition 1. (IND-CCA2) Let equation image and equation image be integers and equation image a real number in equation image, and PKE a secure public key encryption scheme with the security parameter equation image. Let equation image be an IND-CCA2 adversary, which is allowed to access the decryption oracle equation image (and some random oracles equation image in the random oracle model), against the indistinguishability of PKE. We consider the following random experiment:

Experiment equation image

equation image
equation image
equation image
equation image
equation image

if equation image then return equation image else equation image

return equation image

We define the success probability of equation image via

equation image

equation image is said to be equation image-IND-CCA2 secure, if no adversary equation image running in time equation image has a success equation image.
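The experiment above follows the usual two-phase shape of an IND-CCA2 game, which can be sketched in Python as follows. This is only an illustrative harness under stated assumptions: the names `ind_cca2_experiment`, `ToyIdentityScheme`, and `ReadingAdversary`, and the `find`/`guess` method names, are hypothetical and do not come from the paper, and the toy scheme is deliberately insecure so that the flow of the game is easy to follow.

```python
import secrets


def ind_cca2_experiment(scheme, adversary):
    """Run one IND-CCA2 experiment; return True if the adversary guesses b."""
    pk, sk = scheme.kgen()
    challenge = {"c": None}

    def dec_oracle(c):
        # Once the challenge exists, the adversary may not ask for its decryption.
        if challenge["c"] is not None and c == challenge["c"]:
            raise ValueError("decryption of the challenge ciphertext is forbidden")
        return scheme.dec(sk, c)

    # First phase: the adversary picks two messages of equal length.
    m0, m1, state = adversary.find(pk, dec_oracle)
    if len(m0) != len(m1):
        raise ValueError("challenge messages must have equal length")

    # The challenger flips a fair coin and encrypts the selected message.
    b = secrets.randbits(1)
    challenge["c"] = scheme.enc(pk, (m0, m1)[b])

    # Second phase: the adversary outputs its guess for b.
    b_guess = adversary.guess(challenge["c"], state, dec_oracle)
    return b_guess == b


class ToyIdentityScheme:
    """Hypothetical, deliberately insecure placeholder: ciphertext == plaintext."""

    def kgen(self):
        return b"pk", b"sk"

    def enc(self, pk, m):
        return bytes(m)

    def dec(self, sk, c):
        return bytes(c)


class ReadingAdversary:
    """Hypothetical adversary that wins by reading the plaintext off the ciphertext."""

    def find(self, pk, dec_oracle):
        return b"\x00", b"\x01", None

    def guess(self, c, state, dec_oracle):
        return 1 if c == b"\x01" else 0
```

Against the toy scheme the adversary wins every run, whereas for a secure scheme no efficient adversary should win noticeably more often than half of the time.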

CODING THEORY AND COMPLEXITY ASSUMPTION

Let equation image be the finite field with 2 elements equation image, equation image be a security parameter, and equation image denote an equation image-binary linear code of length equation image and dimension equation image, i.e., a subspace of dimension equation image of the vector space equation image. Elements of equation image are called words, and elements of equation image are called codewords. An equation image-binary linear code is usually given in the form of a equation image binary matrix equation image, the rows of which form a basis of the code. The syndrome of a word equation image is the quantity equation image computed by

equation image

If the quantity equation image, i.e., equation image, the word equation image is a codeword. The Hamming weight of a word equation image is the number of its non-zero positions, denoted as equation image; the Hamming distance between two words equation image and equation image is the number of positions where they differ, denoted as equation image; and the minimal distance of an equation image-binary linear code equation image is defined by equation image. Then, the equation image-binary linear code equation image is called an equation image code. All equation image codes satisfy the Singleton bound, which states that equation image 27. A binary linear equation image code is ensured to exist as long as

equation image

This is called the Gilbert-Varshamov (GV) bound. Note that random binary codes are known to meet the GV bound, in the sense that the above inequality comes very close to being an equality 28, and no available family of binary codes can be decoded in subexponential time up to the GV bound 27.

Syndrome Decoding Problem 27: Let equation image. For any syndrome equation image, there exists at most one word equation image such that equation image and equation image. A syndrome equation image is said to be equation image-decodable in the equation image-binary linear code equation image defined by equation image if there exists such a word equation image. The SD problem is stated as follows: given an equation image binary matrix equation image and a syndrome equation image, compute a word equation image such that equation image and equation image. Note that, to ensure the hardness of the SD problem, the parameters should be carefully chosen 27, 29, 30.
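To make the notions of syndrome, Hamming weight, and the SD problem concrete, the following Python sketch works on a toy example. It assumes the common convention that the syndrome of a word x is H·xᵀ over the binary field; the [7,4,3] Hamming code and the function names are choices made here for illustration and are not taken from the paper. The exhaustive search is only feasible because the code is tiny.

```python
from itertools import combinations

# Parity-check matrix of the [7,4,3] Hamming code: column j encodes j+1 in binary.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(H, x):
    """Return H * x^T over GF(2) as a tuple of bits."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)


def hamming_weight(x):
    """Number of non-zero positions of the word x."""
    return sum(1 for xi in x if xi != 0)


def solve_sd_bruteforce(H, s, t):
    """Search every word of weight at most t for one whose syndrome equals s."""
    n = len(H[0])
    for w in range(t + 1):
        for support in combinations(range(n), w):
            x = [0] * n
            for i in support:
                x[i] = 1
            if syndrome(H, x) == tuple(s):
                return x
    return None  # s is not t-decodable for this H
```

For example, `solve_sd_bruteforce(H, (1, 0, 1), 1)` recovers the weight-1 word whose single non-zero position is the fifth one. The number of candidate words grows combinatorially in the length and the weight bound, which is why such a search is hopeless at cryptographic sizes.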

Definition 2. (SD Assumption) Let equation image be an equation image-binary linear code defined by a equation image binary matrix equation image with the minimal distance equation image, and equation image. An adversary that takes an input of a syndrome equation image, returns a word equation image. We consider the following random experiment on SD problem.

Experiment equation image

equation image
equation image

then equation image else equation image

return equation image

We define the corresponding success probability of equation image in solving the SD problem via

equation image

Let equation image and equation image. We say that SD is equation image-secure if no polynomial algorithm equation image running in time equation image has success equation image.

Parameters of Goppa Codes: Goppa codes are subfield subcodes of particular alternant codes. For given integers equation image, binary Goppa codes are of length equation image and dimension equation image. Let equation image denote the family of such Goppa codes, then we have equation image. Since their algebraic structure can be efficiently hidden and they admit a good equation image-decoding algorithm, equation image are good candidates for constructing efficient cryptographic algorithms. In the next section, we will use Goppa codes to design our efficient and provably secure public key encryption scheme.

PROPOSED PUBLIC KEY ENCRYPTION SCHEME BASED ON CODING THEORY

In this section, we present our public key encryption PKE scheme based on coding theory, which can be regarded as the CCA2 version of Niederreiter cryptosystem 5 and mainly consists of four algorithms, namely Setup, Kgen, Enc, and Dec, as shown in Figure 1.

Figure 1.

Proposed public key encryption scheme based on coding theory.

Setup. Given the security parameter equation image, four integers equation image are chosen such that the equation image-decoding in a Goppa code of length equation image, of dimension equation image has complexity at least equation image 10. In addition, two secure cryptographic hash functions equation image are also chosen, where equation image and equation image. In the end, the system parameters equation image are published.

Kgen. Given the system parameters equation image, choose a random binary Goppa code equation image from the Goppa code family equation image. Let equation image be a parity check matrix of equation image and equation image be a equation image-decoding algorithm in equation image. In addition, a random non-singular equation image binary matrix equation image and a random permutation matrix equation image of size equation image are also chosen. Set the private key equation image and the corresponding public key equation image as equation image.

Enc. Given a message equation image and the public key equation image, choose a random number equation image, and execute the following steps:

  • compute equation image, where equation image,

  • compute equation image such that equation image, equation image,

  • set the ciphertext equation image.

Dec. Given a ciphertext equation image and the private key equation image, the following steps are executed:

  • compute equation image,

  • compute equation image, equation image, and equation image,

  • compute equation image, if equation image is equation image, parse equation image as equation image, i.e., equation image; otherwise output equation image indicating an invalid ciphertext.

Compared with the Niederreiter cryptosystem 5, only two additional hash operations are required. As a result, the encryption speed of the proposed scheme is almost as fast as that of the Niederreiter cryptosystem.

SECURITY PROOF

In this section, we prove that the proposed scheme is IND-CCA2-secure in the random oracle model, where the hash functions equation image and equation image are modelled as random oracles 25.

Theorem 1. Let equation image be an adversary against the proposed equation image scheme in the random oracle model, where the hash functions equation image and equation image behave as random oracles. Assume that equation image has the success probability equation image to break the indistinguishability of the ciphertext equation image within the running time equation image, after equation image and equation image queries to the random oracles equation image, equation image and the decryption oracle equation image, respectively. Then, there exist equation image and equation image as follows

equation image(1)

such that the SD problem can be solved with probability equation image within time equation image, where equation image is the time complexity for the simulation.

Proof. We define a sequence of games equation image, equation image, equation image of modified attacks starting from the actual adversary equation image 31, 32. All the games operate on the same underlying probability space: the system parameters equation image and public key equation image, the coin tosses of equation image. Let equation image be a random instance of the SD problem. We will use these incremental games to reduce the SD instance to the adversary equation image against the IND-CCA2 security of the ciphertext equation image in the proposed equation image scheme.

equation image This is the real attack game. In the game, the adversary equation image is fed with the system parameters equation image and public key equation image. In the first phase, the adversary equation image can access the random oracles equation image, equation image and the decryption oracle equation image for any input. At some point, the adversary equation image chooses a pair of messages equation image. Then, we randomly choose a bit equation image and produce the ciphertext equation image of the message equation image as the challenge to the adversary equation image. The challenge comes from the public key equation image and one random number equation image, and equation image, equation image with equation image. In the second phase, the adversary equation image is still allowed to access the random oracles equation image, equation image and the decryption oracle equation image for any input, except the challenge equation image to equation image. Finally, the adversary equation image outputs a bit equation image. In any equation image, we denote by equation image the event equation image. Then, we have

equation image(2)

equation image In this game, we simulate the random oracles equation image, equation image, and the decryption oracle equation image, by maintaining the lists equation image-List, equation image-List and equation image-List to deal with the identical queries. In addition, we also simulate the way that the challenge equation image is generated as the challenger would do. The detailed simulation in this game is described in Figure 2. Because the distribution of equation image is unchanged in the eye of the adversary equation image, the simulation is perfect, and we have

equation image(3)
Figure 2.

Formal simulation of the IND-CCA2 game against the proposed PKE based on coding theory.
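The list-based simulation of a random oracle mentioned in this game is usually realized by lazy sampling: a fresh uniform answer is drawn the first time a value is queried and then replayed from the list for identical queries. A minimal Python sketch follows; the class name `RandomOracle` and the byte-oriented interface are assumptions made for illustration and do not reproduce the rules of Figure 2.

```python
import secrets


class RandomOracle:
    """Lazily sampled random oracle with a fixed output length in bytes."""

    def __init__(self, out_len):
        self.out_len = out_len
        self.table = {}  # plays the role of the oracle's list: query -> answer

    def query(self, x):
        x = bytes(x)
        if x not in self.table:
            # First time this input is seen: draw a fresh uniform answer.
            self.table[x] = secrets.token_bytes(self.out_len)
        # Identical queries always receive the recorded answer.
        return self.table[x]
```

Because each answer is uniform and independent until it is recorded, the simulator can later inspect or program the list, which is the property the subsequent games rely on.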

equation image In this game, we modify the simulation of the decryption oracle equation image by outputting a random message equation image when the ciphertext equation image has not been ‘correctly’ encrypted.

equation image

The two games equation image and equation image are perfectly indistinguishable unless equation image is already in equation image. Because equation image is queried from equation image and behaves uniformly, we can consider equation image a uniform random variable as well. So, the probability that equation image has already been queried to equation image is bounded by equation image, and thus

equation image(4)
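Steps of the form "the two games are perfectly indistinguishable unless some event occurs" rest on the standard difference lemma of game-based proofs. It is stated here in generic notation as a reminder; the symbols S_i, S_{i+1}, and F are placeholders chosen for this note and are not the paper's own notation.

```latex
% Difference lemma: if two games proceed identically unless a failure event F occurs,
% i.e. S_i \wedge \neg F \iff S_{i+1} \wedge \neg F, then
\[
  \bigl|\Pr[S_i] - \Pr[S_{i+1}]\bigr| \;\le\; \Pr[F].
\]
```

Each hop therefore only requires an upper bound on the probability of the corresponding failure event.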

equation image In this game, we modify the simulation of the decryption oracle equation image without resorting to the random oracle equation image.

equation image

The two games equation image and equation image are perfectly indistinguishable unless equation image is already in equation image. Because equation image is randomly chosen, we consider equation image as a uniform random variable. So, the probability that equation image has been queried to equation image is bounded by equation image, and thus

equation image(5)

equation image In this game, we modify the rule Dec-noR in the decryption oracle equation image simulation without resorting to the random oracle equation image.

equation image

The two games equation image and equation image are perfectly indistinguishable unless equation image is already in equation image. Because equation image is known to the adversary equation image due to equation image, we consider equation image as a uniform random variable, so the probability that equation image has been queried to equation image is bounded by equation image, and thus

equation image(6)

equation image In this game, we modify the rule Dec-Init in the decryption oracle equation image simulation.

equation image

The two games equation image and equation image are perfectly indistinguishable. If equation image is found in equation image-List, the answer of the decryption oracle equation image is the same as that in equation image. If equation image is not found, i.e., equation image, and equation image, the decryption oracle equation image returns a random message equation image, as in equation image. Therefore, we have

equation image(7)

equation image In this game, we manufacture the challenge equation image by first choosing the random value of equation image ahead of time.

equation image

The two games equation image and equation image are perfectly indistinguishable unless equation image has been asked for equation image. We define this event equation image, then we have

equation image(8)

In this game, equation image is only used in equation image, but does not appear in the computation since equation image is not defined to be equation image. Then, the distribution of equation image does not depend on equation image. As a result, we have

equation image(9)

equation image In this game, instead of defining equation image from equation image, we first randomly choose equation image and define equation image from equation image. Because equation image is randomly chosen, we give a random answer for the question equation image to equation image.

equation image

The two games equation image and equation image are perfectly indistinguishable unless equation image has been asked for equation image. We define this event equation image, then we have

equation image(10)

In this game, equation image is uniformly distributed and independent of the view of the adversary equation image, since equation image has not been revealed. Therefore, we have

equation image(11)

equation image In this game, instead of defining equation image from equation image, we randomly choose equation image and then we define equation image from equation image.

equation image

In this game, the distribution of equation image is unchanged. Therefore, we have

equation image(12)

equation image In this game, we embed the SD challenge equation image in the game by setting equation image.

equation image

Clearly, the distribution of equation image is still unchanged. Therefore, we have

equation image(13)

In this game, when the event equation image takes place, there exists an equation image such that equation image has been queried to equation image, and such an equation image is exactly the answer to the SD challenge. As a result, we have

equation image(14)

Summarizing all the above cases, we have

equation image(15)

and the running time equation image, where equation image is the time complexity for the simulation. This completes the proof.

SELECTION OF PARAMETERS

Parameter selection is imperative for the security of coding-based cryptography. If the parameters are not properly chosen, a coding-based system could suffer from serious attacks based on either Information Set Decoding (ISD) or the Generalized Birthday Algorithm (GBA) 18, 33. Since the decryption of the proposed scheme requires knowing one and only one solution for an SD problem, the GBA-based attacks can be ruled out 18. Therefore, to resist the possible ISD-based attacks, some typical parameters used in McEliece can be chosen for the proposed scheme. Then, the sizes of the public key, plaintext, and ciphertext can be calculated, as shown in Table 1.

Table 1. The sizes of plaintext/ciphertext and public key under the typical McEliece/Niederreiter parameters.

| (m, t) | n = 2^m | k = n − mt | Plaintext size M | Ciphertext size C | Public key size equation image |
|---|---|---|---|---|---|
| (10, 50) | 1024 bits | 524 bits | k1 < 1024 bits | 1524 bits | 62.5 kbytes |
| (11, 32) | 2048 bits | 1696 bits | k1 < 2048 bits | 2400 bits | 88 kbytes |
| (12, 41) | 4096 bits | 3604 bits | k1 < 4096 bits | 4588 bits | 246 kbytes |

From the table, we can see that, though the sizes of the public keys are relatively large, the construction of the proposed scheme makes the speed of CCA2-secure encryption almost as fast as that of the McEliece/Niederreiter cryptosystems. Furthermore, the recent works by Bender et al. 15 and Misoczki and Barreto 19 can be used to reduce the public key sizes in coding-based cryptography, which can make coding-based cryptosystems more practical.
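The entries of Table 1 can be reproduced with a few lines of arithmetic. The sketch below uses n = 2^m and k = n − mt as stated for binary Goppa codes. The ciphertext formula n + mt bits and the public key formula mt × n bits are assumptions inferred here from the tabulated values, not formulas quoted from the text, and the function name `goppa_sizes` is hypothetical.

```python
def goppa_sizes(m, t):
    """Sizes for binary Goppa parameters (m, t), following the Table 1 columns."""
    n = 2 ** m                # code length in bits
    r = m * t                 # number of parity checks (syndrome length in bits)
    k = n - r                 # code dimension in bits
    ciphertext_bits = n + r   # assumption: matches the ciphertext column of Table 1
    public_key_kbytes = r * n / 8 / 1024  # assumption: an (mt x n)-bit matrix
    return {
        "n": n,
        "k": k,
        "ciphertext_bits": ciphertext_bits,
        "public_key_kbytes": public_key_kbytes,
    }


for params in [(10, 50), (11, 32), (12, 41)]:
    print(params, goppa_sizes(*params))
```

Under these assumptions the three parameter sets give 62.5, 88, and 246 kbytes for the public key, in agreement with the table.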

CONCLUSIONS

In this paper, we have proposed an efficient public key encryption scheme based on coding theory, and formally shown its IND-CCA2 security in the random oracle model. Since the size of the public key equation image in the proposed scheme is relatively large, our future work will focus on reducing the key size 15, 19.

Acknowledgements

This work was supported in part by the Natural Sciences and Engineering Research Council (NSERC) Strategic Projects of Canada.
