"> "> 密码学-消息验证码 | Yufei Luo's Blog

密码学-消息验证码

Basic concepts

Message integrity

Message integrity (Message authentication): each party should be able to identify when a message it receives was sent by the party claiming to send it, and it was not modified in transit.

E.g. A bank should make sure a transfer request is sent by a legitimate user, and the received request should not be modified.

Remember that encryption does not privide any integrity unless it is specifically designed with that purpose!

E.g. For a stream cipher, the attacker can filp one or more bits of the ciphertext \(c\) to modify the decrypted plaintext. Since the attacker do not know the plaintext, it does not contradict the concept of secrecy.

Message Authentication Codes (MAC)

Definition

A message authentication code (MAC) consists of three probabilistic polynomial-time algorithms \((Gen,Mac,Vrfy)\) such that

  1. The key-generation algorithm \(Gen\) takes as input the security parameter \(1^n\) and outputs a key \(k\) with \(|k|\ge n\).
  2. The tag-generation algorithm \(Mac\) takes as input a key \(k\) and a message \(m\in \{0,1\}^*\), and outputs a tag \(t\). We write it as \(t\leftarrow Mac_{k}(m)\) since the algorithm might be randomized.
  3. The deterministic verification algorithm \(Vrfy\) takes as input a key \(k\), a message \(m\) and a tag \(t\). It outputs a bit \(b\), with \(b=1\) meaning valid and \(b=0\) meaning invalid. We write this as \(b:=Vrfy_{k}(m,t)\).

It is required that for every \(n\), every key \(k\) output by \(Gen(1^{n})\) and every \(m\in \{0,1\}^*\), it holds that \(Vrfy_{k}(m,Mac_{k}(m))=1\).

Security of MACs

The message authentication experiment \(Mac-forge_{\mathcal{A},\pi}(n)\):

  1. A key \(k\) is generated by running \(Gen(1^{n})\).
  2. The adversary \(\mathcal{A}\) is given input \(1^n\) and oracle access to \(Mac_{k}(\cdot)\). The adversary eventually outputs \((m,t)\) (\(m\) could be any messages of its choice). Let \(\mathcal{Q}\) denote the set of all queries that \(\mathcal{A}\) asked its oracle.
  3. \(\mathcal{A}\) succeeds if and only if (1) \(Vrfy_{k}(m,t)=1\) and (2) \(m \notin \mathcal{Q}\). In that case the output of the experiment is defined to be 1.

A message authentication code \(\Pi =(Gen, Mac, Vrfy)\) is existentilly unforgeable under an adaptive chosen-message attack, or just secure, if for all probabilistic polynomial-time adversaries \(\mathcal{A}\), there is a negligible function \(negl\) such that \[ Pr[Mac-forge_{\cal{A},\pi}(n)=1]\le negl(n) \]

This definition of security is strong in two respects. First, the adversary is allowed to request \(MAC\) tags for any messages of its choice. Second, the adversary is considered to have broken the scheme if it can output a valid tag on any previously unauthenticated message.

By making the definition of security for MACs as strong as possible, we ensure that secure MACs can be used for a wide range of purposes without worring about compatibility of the MAC with the semantics of the application.

Remember the MAC itself offers no protection against replay attacks! To prevent replay attacks, two common techniques are used:

  • Sequence numbers (or counters): the communicating users maintain(synchronize) a state. It can be problematic when users communicate over a lossy channel.
  • Time stamps: the sender prepends the current time \(T\) to the message before authenticating, and sends \(T\) along with the mesage and the and the resulting tag \(t\). The receiver verifies that \(t\) is a valid tag for \(T||m\) and \(T\) is within acceptable clock skew from current time. This method requires maintaining closely synchronized clocks. And a replay attack can still take place if it is done quickly enough.

Strong MACs

Consider a modified experiment \(Mac-sforge\) that defined the same way as \(Mac-forge\), except that the set \(\mathcal{Q}\) contains pairs of oracle queries and their associated responses (i.e. \((m,t)\in \mathcal{Q}\)). The adversary \(\mathcal{A}\) succeeds if and only if \(\mathcal{A}\) outputs \((m,t)\) such that \(Vrfy_{k}(m,t)=1\) and \((m,t)\notin \mathcal{Q}\).

A message authentication code \(\Pi=(Gen,Mac,Vrfy)\) is strongly secure, or a strong MAC, if for all probabilistic polynomial-time adversaries \(\mathcal{A}\), there is a negligible function \(negl\) such that: \[ Pr[Mac-sforge_{\cal{A},\pi}(n)=1]\le negl(n) \]

Let \(\Pi=(Gen,Mac,Vrfy)\) be a secure MAC that uses canonical verification (the receiver re-compute the tag and check for equality). Then \(\Pi\) is a strong MAC. (Attention that the definition of the MAC do not have any restriciton to how \(Vrfy\) algorithm works.)

A potential timing attack (an example of side channel attack): Consider an adversary who can send message/tag pairs to the receiver. The adversary can learn not only whether the receiver accepts or not, but also the time to make decisions. If there is a MAC using canonical verification, the time to reject differs depending on the position of the first unequal byte. The attacker can send \((m,t_{0}),\cdots,(m,t_{255})\) to the receiver, where \(t_{j}\) is a string with the first \(i\) bytes set correctly and the \((i+1)st\) byte equal to \(j\). The tag with slightly longer time for rejection has the correct \((i+1)st\) byte.

MAC verification should use time-independent string comparison that always compares all bytes!

Construct secure MACs

A fixed-length MAC

Construction

Let \(F\) be a pseudorandom function. Define a fixed-length MAC for messages of length \(n\) as follows:

  • Mac: on input a key \(k\in \{0,1\}^n\) and a message \(m\in \{0,1\}^n\), output the tag \(t:=F_{k}(m)\). (If \(|m|\ne |k|\) then output nothing.)
  • Vrfy: on input a key \(k\in \{0,1\}^n\), a message \(m\in \{0,1\}^n\) and a tag \(t\in \{0,1\}^n\), output 1 if and only if \(t=F_{k}(m)\). (If \(|m|\ne |k|\) then output 0.)

If F is a pseudorandom function, then the above construction is a secure fixed-length MAC for messages of length n.

Proof

Consider the MAC \(\tilde{\Pi}\) using a truly random function \(f\) instead of pseudorandom function \(F_{k}\). For any message \(m\notin \mathcal{Q}\), the value \(t=f(m)\) is uniformly distributed in \(\{0,1\}^n\) from the view of adversary \(\mathcal{A}\), so we can get: \[ Pr[Mac-forge_{\cal{A},\tilde{\pi}}(n)=1]\le 2^{-n} \]

Then construct a distinguisher \(D\) that is given input \(1^n\) and access to an oracle \(\mathcal{O}:\{0,1\}^{n}\rightarrow \{0,1\}^{n}\). (1) Run \(\mathcal{A}(1^{n})\). Whenever \(\mathcal{A}\) queries its MAC oracle on a message \(m\), \(\mathcal{A}\) obtain a response \(t\). (2) When \(\mathcal{A}\) outputs \((m,t)\) at the end of its execution, query \(\mathcal{O}\) with \(m\) and obtain response \(\hat{t}\). If \(\hat{t}=t\) and \(\mathcal{A}\) never queried its MAC oracle on \(m\), then output 1; otherwise output 0.

If \(D\)'s oracle is a pseudorandom function, then the view of \(\mathcal{A}\) when run as a sub-routine by \(D\) is distributed identically to the view of \(\mathcal{A}\) in experiment \(Mac-forge_{\mathcal{A},\pi}(n)\). Therefore, \[ Pr[D^{F_{k}(\cdot)}(1^{n})=1]=Pr[Mac-forge_{\cal{A},\pi}(n)=1] \]

If \(D\)'s oracle is a truly random function, then the view of \(\mathcal{A}\) when run as a sub-routine by \(D\) is distributed identically to the view of \(\mathcal{A}\) in experiment \(Mac-forge_{\mathcal{A},\tilde{\pi}}(n)\). Therefore, \[ Pr[D^{f(\cdot)}(1^{n})=1]=Pr[Mac-forge_{\cal{A},\tilde{\pi}}(n)=1] \]

Since \(F\) is a pseudorandom function and \(D\) runs in polynomial time, so there exists a negligible function \(negl\) that \[ |Pr[D^{F_{k}(\cdot)}(1^{n})=1]-Pr[D^{f(\cdot)}(1^{n})=1]|\le negl(n) \]

Combining the above four equations, we finally get \[ Pr[Mac-forge_{\cal{A},\pi}(n)=1]\le 2^{-n}+negl(n) \] which proves the theorem.

Domain extension for MACs

Construction

Let \(\Pi'=(Mac',Vrfy')\) be a fixed-length MAC for messages of length \(n\). Define a MAC as follows:

  • \(Mac\): on input a key \(k\in \{0,1\}^n\) and a message \(m\in \{0,1\}^*\) of nonzero length \(\ell < 2^{n/4}\), parse \(m\) into \(d\) blocks \(m_{1},\cdots,m_{d}\), each of length \(n/4\).(The final block is padded with 0s if necessary.) Choose a uniform identifier \(r\in \{0,1\}^{n/4}\). For \(i=1,\cdots,d\), compute \(t_{i}\leftarrow Mac_{k}'(r||\ell||i||m_{i})\), where \(i,\ell\) are encoded as strings of length \(n/4\). Output the tag \(t:=<r,t_{1},\cdots,t_{d}>\).
  • \(Vrfy\): on input a key \(k\in \{0,1\}^n\), a message \(m\in \{0,1\}^*\) of nonzero length \(\ell < 2^{n/4}\), and a tag \(t:=<r,t_{1},\cdots,t_{d'}>\), parse \(m\) into \(d\) blocks \(m_{1},\dots,m_{d}\), each of length \(n/4\).(The final block is padded with 0s if necessary.) Output 1 if and only if \(d'=d\) and \(Vrfy_{k}'(r||\ell||i||m_{i},t_{i})=1\) for \(1\le i \le d\).

If \(\Pi'\) is a secure fixed-length MAC for messages of length \(n\), then the following construction is a secure MAC (for arbitrary-length messages).

Note:

  1. \(i\) is used for preventing block reordering attack (the attacker shuffles the order of blocks in an authenticated message).

  2. \(\ell\) is intended to thwart truncation attack (the attacker drop blocks and corresponding tags from the end of the message).

  3. \(r\) aims at preventing "mix-and-match" attack (the attacker combines blocks from different messages).

Proof

Let \(Repeat\) denote the event that the same random identifier appears in two of the tags returned by the MAC oracle in experiment \(MAC-forge_{\mathcal{A},\pi'}(n)\). \(NewBlock\) denote the event that \(\mathcal{A}\) tries to output a valid tag on a block that was never previously authenticated by \(Mac'\) in the course of answering \(\mathcal{A}\)'s \(Mac'\) queries.

We have \[ \begin{align} Pr[MAC-forge_{\cal{A},\pi'}(n)=1]=& Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge Repeat] \notag \\ + & Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge \overline{Repeat}\wedge NewBlock] \notag \\ + & Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge \overline{Repeat}\wedge \overline{NewBlock}] \notag \\ \le & Pr[Repeat]+ \notag \\ & Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge NewBlock] + \notag \\ & Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge \overline{Repeat}\wedge \overline{NewBlock}] \end{align} \]

  1. \(Pr[Repeat]\) is negligible. Applying the "birthday bound" (search "birthday problem" for detail), we have \(Pr[Repeat]\le \frac{q(n)^{2}}{2^{n/4}}\). Since \(\mathcal{A}\) makes only polynomially many queries (i.e., \(q(n)\) is polynomial), this value is negligible.

  2. \(Pr[MAC-forge_{\mathcal{A},\pi'}(n)=1 \wedge \overline{Repeat}\wedge \overline{NewBlock}]=0\). If \(MAC-forge_{\mathcal{A},\pi'}(n)=1\) and \(Repeat\) did not occur, then it must be the case that \(NewBlock\) happens. So \(\overline{Repeat}\) and \(\overline{NewBlock}\) can not occur at the same time.

  3. \(Pr[MAC-forge_{\mathcal{A},\pi'}(n)=1 \wedge NewBlock]\) is negligible. Construct a PPT adversary \(\mathcal{A}'\) attacks the fixed-length MAC \(\Pi\). \(\mathcal{A'}\) runs as a subroutine of \(\mathcal{A}\): answering the request by \(\mathcal{A}\) for a tag on \(m\) by choosing \(r\leftarrow \{0,1\}^{n/4}\), parsing \(m\) and making necessary queries to \(Mac_{k}'(\cdot)\); check whether \(NewBlock\) occur when outputs. If \(MAC-forge_{\mathcal{A},\pi'}(n)=1\) then the tag for every block is valid; if \(NewBlock\) occurs then \(\mathcal{A}'\) outputs a block that was never previously authenticated. So \(\mathcal{A'}\) succeeds in outputting a valid forgery on a previously unauthenticated message with probability \[ Pr[Mac-forge_{\cal{A'},\pi}(n)=1]\ge Pr[MAC-forge_{\cal{A},\pi'}(n)=1 \wedge NewBlock] \] The left-hand side is negligible, so proving the claim.

So from the above we have proved that \(Pr[MAC-forge_{\mathcal{A},\pi'}(n)=1]\) is negligible.

CBC-MAC

Let \(F\) be a pseudorandom function, and fix a length function \(\ell >0\). The basic CBC-MAC construction is as follows:

  • \(Mac\): on input a key \(k\in \{0,1\}^n\) and a message \(m\) of length \(\ell(n)\cdot n\), parse \(m\) as \(m=m_{1},\cdots,m_{\ell}\) where each \(m_{i}\) is of length \(n\), and set \(t_{0}:=0^n\) and \(t_{i}:=F_{k}(T_{k-1}~XOR~m_{i})\). Output \(t_{\ell}\) as the tag.
  • \(Vrfy\): on input a key \(k\in \{0,1\}^n\), a message \(m\), a tag \(t\), if \(m\) is not of length \(\ell(n)\cdot n\) the output 0. Otherwise, output 1 if and only if \(t=Mac_{k}(m)\).

Let \(\ell\) be a polynomial. If \(F\) is a pseudorandom function, then the above construction is a secure MAC for messages of length \(\ell(n)\cdot n\). Note that \(t_{0}:=0^n\) is crucial for security, and it is no longer secure when using a random \(t_0\).

For arbitrary-length messages, the fixed-length CBC-MAC can be modified in two ways:

  • Prepend the message \(m\) with its length \(|m|\) (encoded as an n-bit string) and compute basic CBC-MAC on the result.
  • Choose two independent, uniform keys \(k_1 \in \{0,1\}^n\) and \(k_2 \in \{0,1\}^n\). Then to authenticate a message \(m\), first compute the result \(t\) of basic CBC-MAC of \(m\) using \(k_1\), and the tag is computed as \(\hat{t}=F_{k_2}(t)\).

Authenticated encryption

Definitions

Let \(\Pi=(Gen,Enc,Dec)\) be a private-key encryption scheme, \(\mathcal{A}\) be an adversary, and \(n\) be the security parameter. The unforgeable encryption experiment \(Enc-Forge_{\mathcal{A},\pi}(n)\) is defined as follows:

  1. Run \(Gen(1^n)\) to obtain a key \(k\).
  2. The adversary \(\mathcal{A}\) is given input \(1^n\) and access to an encryption oracle \(Enc_{k}(\cdot)\). The adversary outputs a ciphertext \(c\).
  3. Let \(m=Dec_{k}(c)\), and let \(\mathcal{Q}\) denote the set of all queries that \(\mathcal{A}\) asked its encryption oracle. The output of the experiment is 1 if and only if (1) \(m\ne \text{error}\) and (2) \(m\notin \mathcal{Q}\).

A private-key encryption scheme \(\Pi\) is unforgeable if for all probabilistic polynomial-time adversaries \(\mathcal{A}\), there is a negligible function \(negl\) such that: \[ Pr[Enc-Forge_{\mathcal{A},\pi}(n)=1]\le negl(n) \] And a private-key encryption scheme \(\Pi\) is authenticated encryption scheme if it is CCA-secure and unforgeable.

Generic constructions

Let \(\Pi_{E}=(Enc,Dec)\) be a CPA-secure encryption scheme and \(\Pi_{M}=(Mac,Vrfy)\) denote a message authentication code, where key generation in both schemes simply involves choosing a uniform n-bit key. There are three natural approaches to combining encryption and message authentication using independent keys \(k_E\) and \(k_M\) for \(\Pi_{E}\) and \(\Pi_{M}\):

  1. Encrypt-and-authenticate: Given a message \(m\), the transmitted value is \(<c,t>\) where \[c\leftarrow Enc_{k_E}(m)~\text{and}~t\leftarrow Mac_{k_M}(m)\].

This approach may not achieve even the most basic level of secrecy, since MAC itself does not auarantee any secrecy so it is possible that \(t\) might leak information about \(m\) to an eavesdropper. Also, it is not CPA-secure.

  1. Authenticate-then-encrypt: A MAC tag \(t\leftarrow MAC_{k_M}(m)\) is first computed, then \(m||t\) is encrypted and the resulting value \(Enc_{k_E}(m||t)\) is transmitted. This combination does not necessarily yield an authenticated encryption scheme, since the decryption has two steps.

  2. Encrypt-then-authenticate: The message \(m\) is first encrypted by \(c\leftarrow Enc_{k_E}(m)\), then the a MAC is computed by \(t\leftarrow Mac_{k_M}(c)\), and the ciphertext is \(<c,t>\). Decryption of \(<c,t>\) is done by outputting error if \(Vrfy_{k_M}(c,t)\ne 1\), and otherwise outputting \(Dec_{k_E}(c)\).

If \(\Pi_E\) is a CPA-secure private-key encryption scheme, and \(\Pi_{M}\) is a strongly secure MAC, then the encrypt-then-authenticate approach is an authenticated encryption scheme. Note that the \(k_E\) and \(k_M\) must be independent to ensure CCA-secure, and this is a basic principle of cryptography: different instances of cryptographic primitives should always use independent keys.

Secure communication sessions

The authenticated encryption scheme does not suffice the following potential attacks:

  • Re-ordering attack: An attacker can swap the order of the messages.
  • Replay attack: An attacker can replay a valid ciphertext \(c\) sent previously by one of the parties.
  • Reflection attack: An attacker can take a ciphertext \(c\) sent from A to B and send it back to A.

To prevent the above attacks, a counter is used to prevent the first two and a directionality bit is used to prevent the third.

References

  1. Introduction to Modern Cryptography, Second Edition.
  2. Understanding Cryptography: A Textbook for Students and Practitioners (Chinese Version).