Efficient lattice-based authenticated key exchange based on key encapsulation mechanism and signature

Authenticated key exchange protocols are widely applied in Internet services. Lattice-based key exchange protocols turn out to be quantum-resistant and have hence attracted tremendous attention. We construct a lattice-based explicit authenticated key exchange protocol by combining an IND-CPA key encapsulation mechanism with a strong EUF-CMA digital signature in message-recovery mode. Concrete parameter specifications are suggested at the 102- and 218-bit post-quantum security levels, respectively. Compared with the implicit authenticated key exchange derived directly from the key encapsulation mechanism, our proposals reduce the communication costs by 21.7% and 25.7%, respectively, at the same post-quantum security level. A rough analysis shows that the proposals also obtain some advantages over previous constructions in terms of computational efficiency. Moreover, our scheme achieves perfect forward secrecy, while the original scheme only satisfies weak forward secrecy.


| INTRODUCTION
A key exchange (KE) protocol is a fundamental cryptographic primitive that establishes a shared session key between two communicating parties over a public channel in a single round. Since the first proposal by Diffie and Hellman in 1976, a variety of KE protocols based on classical hard problems have been proposed, for example the elliptic curve Diffie-Hellman protocol, which is widely used in modern Internet protocols. However, Diffie-Hellman-like KE protocols provide only passive security: they can neither resist active attacks (such as man-in-the-middle and replay attacks) nor provide identity authentication. Therefore, an authenticated key exchange (AKE) protocol is desirable, which enables the communicating parties to authenticate each other's identities and resists active attacks such as deletion, replay, injection and tampering. An AKE protocol thus remains secure even in the malicious attack model.
Plenty of AKE protocols have been proposed since the concept of AKE was introduced. In general, AKEs can be classified into explicit and implicit protocols according to how authentication is achieved. An explicit AKE needs extra cryptographic primitives such as signatures, message authentication codes or hash functions to provide authentication, which brings additional computational and communication overhead and makes the protocol more complicated. Such AKEs include several commonly used protocols: IKE [1], SIGMA [2], SSL [3] and TLS [4]. An implicit AKE provides authentication by directly exploiting the algebraic structure of the scheme. Typical implicit AKEs include HMQV [5] and OAKE [6].
To measure the security of different AKE protocols, several security models have been proposed in which the abilities of the adversary are well defined. Normally, an AKE protocol being secure means that no adversary can extract any useful information about the session key. The most commonly used security models for AKE are the BR model [7] and the CK model [8]. Security in the BR model requires that no adversary can distinguish the actual session key from a random string. The CK model additionally considers scenarios in which the adversary can obtain information about the static secret key of a participant or a session state.
A significant security property of an AKE protocol is perfect forward secrecy (PFS). The PFS property requires that an adversary corrupting one of the parties cannot compromise the security of previously completed sessions. However, Krawczyk [5] showed that two-pass implicit AKE protocols cannot achieve PFS. Therefore, the notion of weak PFS (wPFS) is usually considered for two-pass implicit AKEs: the session key of an honest session remains secure even if the static keys are compromised after the session is completed.
With the development of quantum computers and the proposal of quantum algorithms such as Grover's and Shor's algorithms, classical hard problems such as the large integer factorization problem and the discrete logarithm problem can be solved efficiently in the quantum computing model. Thus, public-key cryptosystems based on classical number-theoretic problems are threatened. Due to the widespread use of KE protocols in modern Internet security protocols, the need for post-quantum KE protocols has become urgent. There are several approaches to constructing post-quantum KE protocols, and lattice-based schemes are strong candidates due to their advantages in efficiency and flexibility.

| Related work
Lattice-based KE protocols are generally constructed using the learning with errors (LWE) problem and its variants. In 1996, Ajtai [9] proved that the average-case hardness of certain lattice problems is equivalent to their worst-case hardness, which laid the foundation for designing lattice-based public-key cryptosystems. Regev [10] proposed LWE in 2005 and proved that it is related to hard problems over lattices. The proposal of the LWE problem was a revolutionary step, since subsequent lattice-based schemes mostly rely on LWE-like hard problems, which makes constructing schemes much easier. Lyubashevsky et al. [11] introduced ring-LWE, the ring-based variant of LWE, and proved that its hardness is related to the hardness of problems over ideal lattices. Langlois et al. [12] studied module lattices. Lattice-based KE protocols usually fall into two classes: protocols using an error reconciliation mechanism and protocols using a key encapsulation mechanism (KEM). KE protocols based on the LWE problem and its variants are mostly constructed with error reconciliation, such as Ding's KE [13], BCNS [14], NewHope [15], Frodo [16] and so on. A KEM can also be used to build KE protocols, which makes constructing protocols more efficient and convenient; protocols using a KEM include NewHope-simple [17], Kyber.KE [18] and so on. The communication cost of lattice-based post-quantum KE protocols is generally larger than that of classical protocols such as RSA and ECDH; at the same post-quantum security level, it is also much larger than that of the post-quantum protocol SIDH [19].
Lattice-based AKE protocols comprise explicit and implicit protocols. Explicit lattice-based protocols include the BCNS [14] protocol and the protocol proposed by Peikert [20]. Following the SIGMA protocol [2], Peikert combined a passively secure KEM with a digital signature and a message authentication code to construct a post-quantum AKE protocol. Bos et al. embedded a lattice-based passively secure protocol into a traditional authentication system (the TLS protocol) and used RSA signatures, elliptic curve digital signatures and other traditional authentication mechanisms to achieve identity authentication; this is not a completely lattice-based AKE protocol. In particular, Del Pino et al. [21] used a signature scheme in message-recovery mode to construct an explicit AKE, which saves communication cost while achieving authentication.
Implicit protocols are mainly built from the general construction of AKE proposed by Fujioka et al. [22], which transforms a CCA-secure KEM into an AKE that is proved secure in the CK model. Many practical schemes are obtained this way. Bos et al. [18] proposed a lattice-based cryptography toolkit called Kyber, which consists of a public-key encryption algorithm, a KEM and a KE protocol. In 2018, D'Anvers et al. [23] proposed the Saber scheme, which is based on the module-LWR problem. Both are very practical and efficient. However, the protocol obtained from the general construction is not symmetric, and its communication cost is large; besides, the security of this type of protocol relies on the random oracle model. On the other hand, Zhang et al. [24] constructed an AKE over ideal lattices similar to HMQV. This method does not involve other cryptographic primitives, such as signatures, which simplifies the protocol and shows that its security relies directly on the hardness of the ring learning with errors problem. Inspired by KEA, Wang et al. [25] proposed a KEA-style lattice-based AKE. In 2018, Ding et al. [26] proposed an RLWE-based KE resistant to signal leakage attacks. In 2019, Zhang et al. [27] proposed asymmetric variants of the module-LWE and module-SIS assumptions, which yield further size-optimized KEM and signature schemes compared with their standard counterparts. Most of the existing lattice-based AKEs are constructed by the implicit method. A competitive candidate among them is the Kyber [18] scheme, which has entered the second round of the NIST post-quantum standardization process. Kyber is a highly practical scheme due to its flexibility and efficiency.
Lattice-based cryptosystems are less efficient than their number-theoretic competitors in terms of key and ciphertext sizes. For adequate security, the former usually need thousands of bytes, whereas the latter only require at most hundreds of bytes. For KE protocols, this difference causes greater communication overhead, which has delayed the deployment of lattice-based protocols.

| Our contribution
We construct a lattice-based explicit AKE protocol. By combining a KEM with a digital signature scheme, the communicating parties can obtain a common session key while authenticating each other's identities. Our KEM is obtained from the IND-CPA-secure public-key encryption algorithm proposed in Kyber [18]; its security relies on the hardness of the module-LWE problem. The digital signature, proposed by Del Pino et al. [21], is based on the NTRU problem, and its message-recovery mode can be used to reduce the communication complexity. Compared with the authenticated KE protocol proposed in Kyber, our scheme reduces the communication overhead and has additional security advantages.

| Communication overhead
There are three given parameter sets in Kyber [18]; our scheme targets the parameter sets 'Light' and 'Paranoid', which provide 102 and 218 bits of post-quantum security, respectively. With our two parameter sets, one can reduce the communication cost by 21.7% and 25.7%, respectively, compared with Kyber.AKE at the same post-quantum security level. This is mainly due to the message-recovery mode of the signature, which avoids sending the message separately and makes Huffman encoding possible.

| Computational efficiency
The Kyber.AKE is constructed following the FSXY transformation [22]. This transformation builds an AKE from a chosen-ciphertext-secure KEM, which is usually obtained via the Fujisaki-Okamoto transformation from a CPA-secure public-key encryption scheme. Our KEM is derived directly from the PKE without the Fujisaki-Okamoto transformation, which avoids the efficiency loss caused by the re-encryption procedure. This makes our scheme more efficient than Kyber.AKE.

| Security benefits
A special security property of AKE protocols is PFS, which means an adversary cannot compromise session keys of completed sessions, even if it obtains the parties' static secret keys. Based on the research by Krawczyk in Ref. [5], PFS is not achievable for two-pass implicit AKEs (but this may not hold for two-pass AKEs with explicit authentication). Kyber.AKE follows a generic construction from a CCA-secure KEM, which achieves wPFS in the Canetti-Krawczyk model. Our scheme is proved to achieve PFS, provided that the KEM is IND-CPA secure and the signature is strong EUF-CMA secure.

| Paper organization
Section 2 presents some basic notations we use. In Section 3, we introduce a module-LWE-based KEM. In Section 4, we describe the signature with message recovery mode. In Section 5, we construct an AKE with a KEM and a digital signature scheme and analyse the AKE's performance. Furthermore, we give a security proof of that AKE. Section 6 is the conclusion of our work.

| Rings and polynomials
Throughout this paper, we use N to denote a power of 2, and Q a prime subject to Q ≡ 1 mod 2N. The 2-tuple (N, Q) is the public parameter of the signature scheme, and (n, q) is that of the KEM.
In the KEM, the symbols R and R_q represent the rings Z[x]/(x^n + 1) and Z_q[x]/(x^n + 1), respectively. We use italic lower-case letters to denote elements of R or R_q. Bold lower-case letters represent column vectors with coefficients in R or R_q. Bold upper-case letters denote matrices. We write v^T (or A^T) for the transpose of a vector v (or a matrix A).
In the signature scheme, R denotes the ring Z[x]/(x^N + 1), and fg denotes the polynomial multiplication of f and g in R. (f) is the vector whose coefficients are f_0, …, f_{N−1}. (f, g) ∈ Z^{2N} is the concatenation of (f) and (g).

| Module-LWE
The module-LWE problem was proposed by Langlois and Stehlé [12] in 2015. As an algebraic structure, the module is a generalization of the concepts of ring and vector space; thus, the module-LWE problem is an extension of LWE and ring-LWE. The hard problem underlying the security of our KEM is the decisional module-LWE problem.
Definition 1 (Decisional module-LWE) Let n, k, q be positive integers, and let β_η denote the distribution of small elements of R_q with coefficients bounded by η. Sample a secret s ← β_η^k. The decisional module-LWE assumption states that no PPT algorithm A can distinguish samples (a, a^T s + e), where a ← R_q^k is uniform and e ← β_η, from uniform samples over R_q^k × R_q.

| Gaussian sampling
Gaussian sampling was introduced in Ref. [28] as a method to sample a short lattice vector using a short basis as a trapdoor. The discrete Gaussian distribution is defined as follows.
Definition 2 For σ > 0 and centre c ∈ R^n, define the Gaussian function ρ_{σ,c}(x) = exp(−‖x − c‖^2 / (2σ^2)). Restricting ρ_{σ,c} to a lattice Λ and normalizing by ρ_{σ,c}(Λ) = Σ_{v∈Λ} ρ_{σ,c}(v), we obtain the probability mass function of the discrete Gaussian distribution D_{Λ,σ,c}(v) = ρ_{σ,c}(v)/ρ_{σ,c}(Λ).
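As an illustration, the one-dimensional case Λ = Z can be sampled by simple rejection: propose a uniform integer in a truncated range around the centre and accept with probability proportional to the Gaussian weight. This is a didactic sketch (not constant-time and not suitable for production samplers), with the truncation width chosen here for illustration.

```python
import math
import random

def sample_discrete_gaussian(sigma, c=0.0, tail=12):
    """Rejection sampler for the 1-D discrete Gaussian D_{Z,sigma,c}:
    propose a uniform integer within tail*sigma of the centre c and
    accept with probability exp(-(x - c)^2 / (2 sigma^2))."""
    lo = math.floor(c - tail * sigma)
    hi = math.ceil(c + tail * sigma)
    while True:
        x = random.randint(lo, hi)
        if random.random() < math.exp(-((x - c) ** 2) / (2.0 * sigma * sigma)):
            return x
```

Samples concentrate around the centre c with standard deviation close to σ; the truncation at tail·σ loses only a negligible fraction of the mass.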

| Compression algorithm
Ciphertext compression is an effective tool to reduce communication cost. By dropping several least significant bits, one can compress the elements of the ciphertext while still ensuring its recovery. The following compression algorithm maps elements of the ciphertext from Z_q to Z_{2^d}, where d < log_2 q.

Definition 3 The compression algorithm consists of two polynomial-time algorithms, Compress_q(x, d) and Decompress_q(x, d), defined as Compress_q(x, d) = ⌈(2^d/q) · x⌋ mod 2^d and Decompress_q(x, d) = ⌈(q/2^d) · x⌋. For x′ = Decompress_q(Compress_q(x, d), d), the distribution of |x′ − x mod q| is roughly uniform over the integers of magnitude at most B_q = ⌈q/2^{d+1}⌋.
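The Compress/Decompress pair is easy to sketch. The following is a didactic Python implementation for a single integer input in Z_q (the real scheme applies it coefficient-wise to polynomials); the parameter values are illustrative, and the final check confirms the round-trip error bound B_q = ⌈q/2^{d+1}⌋ exhaustively.

```python
def compress_q(x, d, q):
    """Compress_q(x, d) = round((2^d / q) * x) mod 2^d: keep d high 'digits'."""
    return ((x * (1 << d) + q // 2) // q) % (1 << d)

def decompress_q(y, d, q):
    """Decompress_q(y, d) = round((q / 2^d) * y): approximate inverse."""
    return (y * q + (1 << (d - 1))) >> d

q, d = 3329, 10                                      # illustrative parameters
B_q = (q + (1 << (d + 1)) // 2) // (1 << (d + 1))    # round(q / 2^(d+1)) = 2
worst = max(min((decompress_q(compress_q(x, d, q), d, q) - x) % q,
                (x - decompress_q(compress_q(x, d, q), d, q)) % q)
            for x in range(q))
```

Here `worst` is the largest centred distance |x′ − x mod q| over all of Z_q, and it never exceeds B_q, so dropping the low bits is safe as long as the decryption noise budget accounts for B_q.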

| Anticirculant matrices
A_N(f) denotes the N-dimensional anticirculant matrix of a polynomial f; when there is no ambiguity in the context, we simply write A(f). Anticirculant matrices satisfy A(f) + A(g) = A(f + g) and A(f)A(g) = A(fg).

Shoup [29] proposed the first KEM in 2000, and in 2004, Cramer and Shoup [30] proposed a hybrid encryption scheme. This encryption scheme is divided into a KEM and a data encapsulation mechanism (DEM); the two parts are combined for data transmission and encryption, which is called the KEM-DEM mode. Due to the simplicity and efficiency of the KEM, it is very convenient to construct a KE protocol using this modularized approach.

| KEY ENCAPSULATION MECHANISM
A KEM KEM consists of three probabilistic algorithms: KEMKeyGen, Encaps and Decaps. The key generation algorithm KEMKeyGen takes as input a string whose length is the security parameter and outputs an encapsulation key K_e and a decapsulation key K_d. The encapsulation algorithm Encaps takes a public key K_e and produces a ciphertext c and a key k. The deterministic decapsulation algorithm Decaps takes a secret key K_d and a ciphertext c, and outputs either a key or a special symbol ⊥ to indicate rejection. We note that Pr[Decaps(K_d, c) = k : (c, k) ← Encaps(K_e)] ≥ ɛ, where the probability is taken over (K_d, K_e) ← KEMKeyGen and the random coins of Encaps; such a KEM is called an ɛ-KEM. We say a key encapsulation c that decapsulates to the expected key k is valid.

| IND-CPA-secure KEM
We slightly modify Kyber.PKE into an IND-CPA-secure KEM. The public-key encryption scheme Kyber.CPA, based on the module-LWE problem, was proposed in Ref. [18] and transformed into a CCA-secure KEM via the Fujisaki-Okamoto transformation. That transformation is a convenient way to build a CCA-secure KEM for further constructions, but it is less efficient because of the re-encryption procedure performed during decapsulation. Since our scheme only requires an IND-CPA-secure KEM, we abandon the F-O transformation to gain some efficiency. We give a concrete description of the KEM below; the correctness and security proofs are given in Ref. [18].
Suppose that k, d_t, d_u and d_v are public parameters, all positive integers, and note that n = 256. Every message m ∈ {0, 1}^256 is a bitstring, which can also be viewed as a polynomial in R with coefficients in {0, 1}. Consider the public-key encryption scheme Kyber.CPA = (KeyGen, Enc, Dec) as described in Algorithms 1-3.
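Since Algorithms 1-3 are given in Ref. [18], we only sketch their structure here. The toy Python version below keeps the module-LWE shape of Kyber.CPA used as a KEM (t = As + e; u = A^T r + e_1, v = t^T r + e_2 + ⌈q/2⌋·m; session key H(m)), but it drops the NTT, the compression of t, u and v, and the centred binomial sampler, and it shrinks N so that schoolbook multiplication stays fast. It is a sketch of the mechanism under these simplifying assumptions, not the real scheme.

```python
import hashlib
import random

# Toy parameters echoing the shape of Kyber.CPA (illustrative only).
N, Q, K, ETA = 64, 3329, 2, 2

def poly_mul(a, b):
    """Schoolbook multiplication in Z_Q[x]/(x^N + 1)."""
    res = [0] * N
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            idx = i + j
            if idx < N:
                res[idx] = (res[idx] + ai * bj) % Q
            else:                       # x^N = -1 wraps with a sign flip
                res[idx - N] = (res[idx - N] - ai * bj) % Q
    return res

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def small(rng):
    """'Error' polynomial with coefficients in [-ETA, ETA]."""
    return [rng.randint(-ETA, ETA) % Q for _ in range(N)]

def dot(row, vec):
    """Inner product of two K-vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(row, vec):
        acc = poly_add(acc, poly_mul(a, b))
    return acc

def keygen(rng):
    A = [[[rng.randrange(Q) for _ in range(N)] for _ in range(K)] for _ in range(K)]
    s = [small(rng) for _ in range(K)]
    e = [small(rng) for _ in range(K)]
    t = [poly_add(dot(A[i], s), e[i]) for i in range(K)]    # t = A s + e
    return (A, t), s

def encaps(pk, rng):
    A, t = pk
    m = [rng.randint(0, 1) for _ in range(N)]               # random message bits
    r = [small(rng) for _ in range(K)]
    e1 = [small(rng) for _ in range(K)]
    e2 = small(rng)
    At = [[A[j][i] for j in range(K)] for i in range(K)]    # transpose of A
    u = [poly_add(dot(At[i], r), e1[i]) for i in range(K)]  # u = A^T r + e1
    v = poly_add(poly_add(dot(t, r), e2),
                 [(Q // 2) * b % Q for b in m])             # v = t^T r + e2 + q/2*m
    return (u, v), hashlib.sha3_256(bytes(m)).digest()      # session key = H(m)

def decaps(sk, ct):
    u, v = ct
    w = poly_add(v, [(-c) % Q for c in dot(sk, u)])         # w = v - s^T u
    m = [1 if Q // 4 < c < 3 * Q // 4 else 0 for c in w]    # decode to bits
    return hashlib.sha3_256(bytes(m)).digest()

rng = random.Random(2024)
pk, sk = keygen(rng)
ct, key_enc = encaps(pk, rng)
key_dec = decaps(sk, ct)
```

The decryption noise e^T r − s^T e_1 + e_2 stays far below q/4 with these toy parameters, so the encapsulated and decapsulated keys agree with overwhelming probability; the real scheme additionally compresses t, u and v as in Definition 3.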

| Parameters and security analysis
The Kyber scheme proposed three recommended parameter sets: the default parameter set aims at 128 bits of post-quantum (and classical) security, the paranoid parameter set aims at 218 bits of post-quantum security, and the light one is designed for the 102-bit security level. The hard problem underlying the security of the KEM is module-LWE, and we measure the hardness of the MLWE problem as an LWE problem. The best-known attacks against the LWE problem that are relevant to our parameter sets are the primal attack and the dual attack; the concrete analysis is given in Ref. [18]. It shows that an attack on the default parameter set would invoke BKZ with blocksize 610 to 615, and the cost of BKZ with blocksize 610 is dominated by a polynomial number of calls to a dimension-610 SVP solver. According to a very conservative analysis, our KEM with the default parameter set offers 161 bits of security against the best-known quantum attacks targeting the underlying lattice problem. All the parameter sets and their performances are listed in Table 1.

| DIGITAL SIGNATURE FROM NTRU
The second component of our AKE is a digital signature scheme. There are two approaches to constructing lattice-based signature schemes: Fiat-Shamir and hash-and-sign. The second approach, proposed by Gentry et al. [28], is more commonly used due to its efficiency. We introduce the message-recovery signature scheme [21], which is based on the hardness of finding short vectors in NTRU lattices. A digital signature scheme SIG consists of three probabilistic algorithms: SigKeyGen, Sig and Ver. The key generation algorithm SigKeyGen takes as input a string whose length is the security parameter and outputs a signing key K_s and a verification key K_v. The signing algorithm Sig takes a signing key K_s and a message m and produces a signature σ. The deterministic verification algorithm Ver takes a verification key K_v and a signature σ, and outputs either a message m or a special symbol ⊥ to indicate rejection.
The digital signature scheme is correct if for all pairs (K_s, K_v) output by the key generation algorithm and all messages m in the message space, we have Ver(K_v, Sig(K_s, m)) = m. The security of a digital signature scheme is usually defined by standard existential unforgeability against adaptive chosen-message attacks [31]: the adversary is allowed to query a signing oracle and then attempts to produce a valid signature on a new message (one whose signature was not queried during the signing stage).

| Signature with Gaussian sampling
We briefly describe the Gaussian sampling algorithm SampleD, which can sample vectors from an arbitrary lattice Λ according to the Gaussian distribution D_{Λ,σ,c}. Its input consists of a basis B of the n-dimensional lattice Λ, a Gaussian parameter σ > 0 and a centre c ∈ R^n. We suppose SampleD can invoke, as a subroutine, an algorithm SampleZ that samples from the one-dimensional Gaussian distribution D_{Z,σ′,c′}.
The output of SampleD follows a Gaussian distribution, and the distance of the output vector to the centre depends on the Gram-Schmidt norm of the basis B. Without a trapdoor basis of small norm as input, SampleD cannot output lattice vectors near the centre. We use the preimage sampling function (PSF) to construct a signature; we give a brief introduction to the PSF. According to Ref. [28], we have the following lemma.

Lemma 3 Let m ≥ n, let q be prime, and let Λ^⊥(A) be the lattice defined by a matrix A ∈ Z_q^{n×m} with a trapdoor basis B. If s ≥ ‖B̃‖ · ω(√(log n)), then for all u ∈ Z_q^n there exists a PPT algorithm SamplePre(A, B, s, u) which samples a vector v ∈ Λ_u(A) satisfying ‖v‖ ≤ s√m with overwhelming probability.

| Message-recovery signature
The message-recovery signature was first proposed by Nyberg and Rueppel [32]. Compared with a standard digital signature scheme, a message-recovery signature scheme does not require the signer to send the message along with the signature: since the message is embedded in the signature, anyone who receives the signature can recover the message and then verify the correctness of the signature with the public key. This characteristic reduces the required communication complexity and thus the energy expenditure, so the scheme can be applied to energy-limited devices such as handheld terminals or smart cards. Ducas et al. [33] constructed an identity-based encryption scheme and obtained a hash-and-sign digital signature scheme as a by-product. Del Pino et al. [21] presented its message-recovery variant and constructed an AKE from that message-recovery signature and an NTRU-based KEM. The message-recovery signature has the benefit of smaller communication complexity: instead of sending a signature and a message, one can simply send a larger signature from which the entire message can be recovered. We give a brief outline of the message-recovery signature scheme of Ref. [21]. Suppose N, Q are public parameters, where N is a power of 2 and Q is a prime satisfying Q ≡ 1 mod 2N. Consider the message-recovery signature scheme SIG = (SigKeyGen, Sig, Ver) as described in Algorithms 5-7.

| Compressing the signature
The signature of the message-recovery scheme consists of a pair of polynomials (s_1, s_2) belonging to Z_Q[x]/(x^N + 1). The coefficients of the two polynomials follow the Gaussian distribution D_s. Simply sending the signature would require 2N log(Q) bits of communication. Each coefficient of s_1 and s_2 is distributed according to a distribution very close to a one-dimensional Gaussian of parameter s. As suggested in Ref. [34], the entropy of the coefficients is much lower than log(Q), which allows us to use Huffman coding to encode them. Experiments show that the Huffman encoding of Gaussian-distributed coefficients comes close to the entropy lower bound, giving an expected code length of E(|C(X)|) = 10.06 bits per coefficient. According to the security analysis in Ref. [21], the two parameter sets and their performance are as shown in Table 2.

TABLE 1 Parameter sets and their performance: key size, ciphertext size and post-quantum security
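The entropy argument can be checked experimentally. The sketch below builds a Huffman code for a truncated one-dimensional discrete Gaussian and compares the expected code length with the Shannon entropy; the Gaussian parameter `sigma` here is an arbitrary illustrative value, not the scheme's actual parameter s.

```python
import heapq
import math

def gaussian_pmf(sigma, tail=14):
    """PMF of the 1-D discrete Gaussian D_{Z,sigma}, truncated at +/- tail*sigma."""
    B = int(math.ceil(tail * sigma))
    w = {x: math.exp(-x * x / (2.0 * sigma * sigma)) for x in range(-B, B + 1)}
    Z = sum(w.values())
    return {x: p / Z for x, p in w.items()}

def huffman_lengths(pmf):
    """Code length per symbol of an optimal (Huffman) prefix code for pmf."""
    heap = [(p, i, (x,)) for i, (x, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(pmf, 0)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, xs1 = heapq.heappop(heap)
        p2, _, xs2 = heapq.heappop(heap)
        for x in xs1 + xs2:            # every symbol in the merged subtree
            lengths[x] += 1            # moves one level deeper
        heapq.heappush(heap, (p1 + p2, counter, xs1 + xs2))
        counter += 1
    return lengths

sigma = 256.0                          # illustrative Gaussian parameter
pmf = gaussian_pmf(sigma)
lengths = huffman_lengths(pmf)
expected = sum(pmf[x] * lengths[x] for x in pmf)
entropy = -sum(p * math.log2(p) for p in pmf.values())
# Huffman is within one bit of the entropy: H(X) <= E(|C(X)|) < H(X) + 1.
```

Since the entropy of a discrete Gaussian grows like log_2(s) plus a small constant, the expected Huffman length stays well below log(Q) whenever s is much smaller than Q, which is exactly the saving exploited above.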

| AKE PROTOCOL
Given a KEM and a signature scheme, we can construct an AKE generically. In this section, we present a two-round forward-secure ɛ-AKE built from an ɛ-KEM and a digital signature. The security of an AKE combined from diverse cryptographic primitives follows the 'cask principle': a cask's volume is determined by its shortest stave. In our scenario, the scheme's security strength depends on the weaker of the two components; thus we require the security levels of the KEM and the digital signature to match. We give two parameter sets and their security levels in Table 3.
The concrete description of our AKE is in Figure 1. Note that we use an extra hash function H_2 to verify whether the decapsulation has failed; once a decapsulation failure occurs, we restart the protocol. For an ɛ-KEM, the error rate (1 − ɛ) is therefore not an issue. In Ref. [21], the probability of decapsulation failure is raised to 2^{−30} to gain extra security. We could reduce the communication cost further as shown in Section 6 of Ref. [18]: setting the parameters d_u = d_t = 10 reduces the public-key and ciphertext sizes while increasing the failure probability to 2^{−71.9}. However, if the message being signed is short, our technique of sending a longer signature instead of a message becomes counterproductive and should not be used; if we keep shrinking the public-key and ciphertext sizes, the communication cost will not decrease any further. Since the message we send under the given parameter sets is already short enough, it is not necessary to increase the probability of decapsulation failure.
Since we restart the protocol when a decapsulation failure happens, we could further reduce the communication overhead by raising the error rate. For example, one can reduce the KEM's rounding parameters to d_u = d_t = 10, which results in shorter public-key and ciphertext sizes while increasing the failure probability. In this case the message-recovery signature no longer saves communication, since the message is too short, and simply sending the message along with the signature would be more efficient. This approach reduces the communication overhead by less than 10%, and we do not consider such a small benefit worth the extra decapsulation failures.
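To make the two-pass flow concrete, the following Python sketch wires deliberately insecure mock primitives (hash-based stand-ins for the module-LWE KEM and the NTRU message-recovery signature; the mock "verification key" even equals the signing key) into the protocol shape described above: A sends a signed ephemeral encapsulation key, B returns a signed ciphertext together with the confirmation tag H_2(k), and both sides derive the session key as H_1(k, transcript). Only the message flow mirrors the scheme; none of the primitives here are real.

```python
import hashlib
import hmac
import os

# Mock KEM (NOT secure): models only the Encaps/Decaps interface.
def kem_keygen():
    kd = os.urandom(32)                            # decapsulation key
    ke = hashlib.sha3_256(b'pk' + kd).digest()     # encapsulation key
    return ke, kd

def kem_encaps(ke):
    coins = os.urandom(32)
    k = hashlib.sha3_256(b'key' + ke + coins).digest()
    return coins, k                                # (ciphertext, key)

def kem_decaps(kd, c):
    ke = hashlib.sha3_256(b'pk' + kd).digest()
    return hashlib.sha3_256(b'key' + ke + c).digest()

# Mock message-recovery "signature": the signature carries the message.
def sign(ks, m):
    return hmac.new(ks, m, hashlib.sha3_256).digest() + m

def verify(kv, sigma):
    tag, m = sigma[:32], sigma[32:]
    ok = hmac.compare_digest(tag, hmac.new(kv, m, hashlib.sha3_256).digest())
    return m if ok else None

def H(label, *parts):
    h = hashlib.sha3_256(label)
    for p in parts:
        h.update(p)
    return h.digest()

# Long-term keys (mock: verification key equals signing key).
ks_a = kv_a = os.urandom(32)
ks_b = kv_b = os.urandom(32)

# Pass 1 (A -> B): signed ephemeral encapsulation key.
ke, kd = kem_keygen()
msg1 = sign(ks_a, ke)

# B: recover A's ephemeral key, encapsulate, sign ciphertext plus the
# confirmation tag H2(k) that lets A detect a decapsulation failure.
ke_b = verify(kv_a, msg1)
c, k_b = kem_encaps(ke_b)
msg2 = sign(ks_b, c + H(b'H2', k_b))

# A: recover, decapsulate, check the confirmation tag, derive session key.
rec = verify(kv_b, msg2)
c_a, tag = rec[:32], rec[32:]
k_a = kem_decaps(kd, c_a)
assert hmac.compare_digest(tag, H(b'H2', k_a))     # else restart the protocol
sk_a = H(b'H1', k_a, msg1, msg2)
sk_b = H(b'H1', k_b, msg1, msg2)
```

Because the transcript (msg1, msg2) enters the key derivation, both sides only compute the same session key when they agree on everything that was signed and exchanged.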

| Security analysis
The security of an AKE can be measured by several security properties. Our scheme satisfies the property called PFS: an adversary cannot distinguish the actual session key from a random bitstring after a completed session, even if the static secret keys are compromised. We define the following security game. An honest player P can run several sessions of the protocol, and each session has a unique identifier sid. The adversary has access to the following queries:

1. Send(sid, m): This models active attacks. The adversary sends a message m to the session sid and obtains the response.
2. Reveal(sid): The adversary can obtain the session key of the terminated session sid by querying this oracle.
3. Corrupt(P): This models the compromise of the long-term authentication means. The adversary can get the static secret key of the player P.

We say a session sid is fresh if the players run the protocol honestly and neither sid nor its matching session has been asked a Reveal-query. Moreover, we call a session sid forward-secure fresh if neither sid nor its matching session has been asked a Reveal-query, and neither of them has been asked a Corrupt-query before the protocol execution terminates. The adversary can initialize a session and query any oracle above. At some point, the adversary may ask a Test-query to a forward-secure fresh and terminated session.
Test(sid): One chooses a random bit b ∈ {0, 1}; if b = 1, the actual session key sk is sent back; otherwise a random bitstring is returned.
The Test-query can be asked only once. At the end of the game, the adversary outputs its guess b′ for the bit b in the Test-query. The quality of the adversary A is measured by the advantage Adv_AKE^{fs-ind}(A) = 2 · Pr[b′ = b] − 1. As shown in Ref. [21], an AKE is forward-secure if it is constructed as in Figure 1 and all components satisfy the security requirements.

Theorem 1
The authenticated key exchange protocol AKE described in Figure 1 is a forward-secure AKE, when H_1 and H_2 are modelled by random oracles onto {0, 1}^{l_1} and {0, 1}^{l_2} respectively, if SIG is a strong EUF-CMA signature scheme and KEM is an IND-CPA-secure KEM:

Adv_AKE^{fs-ind}(A) ≤ n · Succ_SIG^{suf-cma}(A) + q_s · Adv_KEM^{ind-cpa}(A) + (q_s · q_h)/2^{l_1},
where n is the number of players involved in the protocol, q_s is the number of Send-queries, and q_h is the number of hash queries.
Proof. The security analysis proceeds through a sequence of games. It starts with the real security game between an adversary A and a challenger. After a series of small modifications, each changing the adversary's advantage by only a bounded amount, we reach a final game in which the advantage of the adversary is nearly 0.
Game G_0: The initial game corresponds to the real attack game, in which all honest players have signing key pairs (K_s, K_v). The Test-query to a forward-secure fresh session sid is answered, after flipping a coin b, by either the real session key or a random bitstring. According to the definition, we have:

Adv_AKE^{fs-ind}(A) = 2 · Pr_{G_0}[b′ = b] − 1.
Game G_1: This game is basically the same as Game G_0; the only difference is that one no longer checks the validity of the signature, but aborts if the signature was not generated by the simulation of the player. If the adversary itself generates a valid signature, it is rejected because the signature was not generated by the simulator. Supposing there are n players in the system, the probability that this event occurs is at most n · Succ_SIG^{suf-cma}(A); then we have: