Cryptography and Network Security: Principles and Practice

Eighth Edition

Chapter 11

Cryptographic Hash Functions

Lecture slides prepared for “Cryptography and Network Security”, 8/e, by William Stallings. Chapter 11, “Cryptographic Hash Functions”.

This chapter begins with a discussion of the wide variety of applications for cryptographic hash functions. Next, we look at the security requirements for such functions. Then we look at the use of cipher block chaining to implement a cryptographic hash function. The remainder of the chapter is devoted to the most important and widely used family of cryptographic hash functions, the Secure Hash Algorithm (SHA) family.

Learning Objectives

Summarize the applications of cryptographic hash functions.

Explain why a hash function used for message authentication needs to be secured.

Understand the differences among preimage resistant, second preimage resistant, and collision resistant properties.

Present an overview of the basic structure of cryptographic hash functions.

Describe how cipher block chaining can be used to construct a hash function.

Understand the operation of SHA-512.

Hash Functions

A hash function H accepts a variable-length block of data M as input and produces a fixed-size hash value

h = H(M)

Principal object is data integrity

Cryptographic hash function

An algorithm for which it is computationally infeasible to find either:

(a) a data object that maps to a pre-specified hash result (the one-way property)

(b) two data objects that map to the same hash result (the collision-free property)

A hash function H accepts a variable-length block of data M as input and produces a fixed-size result h = H(M), referred to as a hash value or a hash code. A “good” hash function has the property that the results of applying the function to a large set of inputs will produce outputs that are evenly distributed and apparently random. In general terms, the principal object of a hash function is data integrity. A change to any bit or bits in M results, with high probability, in a change to the hash value.

The kind of hash function needed for security applications is referred to as a cryptographic hash function. A cryptographic hash function is an algorithm for which it is computationally infeasible (because no attack is significantly more efficient than brute force) to find either (a) a data object that maps to a pre-specified hash result (the one-way property) or (b) two data objects that map to the same hash result (the collision-free property). Because of these characteristics, hash functions are often used to determine whether or not data has changed.
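The two behaviors described above (a fixed-size result regardless of input length, and a changed hash value when the input changes) can be observed with a short sketch. It assumes Python's standard `hashlib` module and uses SHA-256 purely as a convenient example; the helper names `hash_hex` and `differing_bits` are illustrative and not part of the chapter.

```python
import hashlib

def hash_hex(message: bytes) -> str:
    """Return the SHA-256 hash value of message as a hex string."""
    return hashlib.sha256(message).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Variable-length input, fixed-size output: both digests are 32 bytes long.
short = hashlib.sha256(b"M").digest()
long_ = hashlib.sha256(b"M" * 10_000).digest()
print(len(short), len(long_))

# Changing one character of the input changes the hash value.
h1 = hashlib.sha256(b"pay Bob $100").digest()
h2 = hashlib.sha256(b"pay Bob $900").digest()
print(hash_hex(b"pay Bob $100"))
print(hash_hex(b"pay Bob $900"))
print("bits that differ:", differing_bits(h1, h2))
```

For a well-behaved hash function, roughly half of the output bits would be expected to differ between the two digests, which is what "apparently random" suggests.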

Figure 11.1 Cryptographic Hash Function; h = H(M)

Figure 11.1 depicts the general operation of a cryptographic hash function. Typically, the input is padded out to an integer multiple of some fixed length (e.g., 1024 bits), and the padding includes the value of the length of the original message in bits. The length field is a security measure to increase the difficulty for an attacker to produce an alternative message with the same hash value.
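The padding step can be sketched as follows. The passage does not fix the exact layout, so this sketch assumes a SHA-512-style scheme: 1024-bit (128-byte) blocks, a single 1 bit followed by zero bits, and a 128-bit big-endian length field at the end. The function name `pad_message` and its parameters are illustrative.

```python
def pad_message(message: bytes, block_bytes: int = 128, length_bytes: int = 16) -> bytes:
    """Pad message to a multiple of block_bytes, ending with its bit length.

    Assumed layout: message, one 1 bit (byte 0x80), zero bytes, then the
    original length in bits as a length_bytes-wide big-endian integer.
    """
    bit_length = len(message) * 8
    padded = message + b"\x80"
    # Number of zero bytes needed so that the length field ends on a block boundary.
    zero_count = -(len(padded) + length_bytes) % block_bytes
    padded += b"\x00" * zero_count
    padded += bit_length.to_bytes(length_bytes, "big")
    return padded

padded = pad_message(b"abc")
print(len(padded) * 8)                          # total size in bits
print(int.from_bytes(padded[-16:], "big"))      # recorded message length in bits
```

Because the original length is part of the hashed input, an attacker cannot simply extend or truncate a message and keep the padding consistent.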

Figure 11.2 Attack Against Hash Function

Message authentication is a mechanism or service used to verify the integrity of a message. Message authentication assures that data received are exactly as sent (i.e., there is no modification, insertion, deletion, or replay). In many cases, there is a requirement that the authentication mechanism assures that the purported identity of the sender is valid. When a hash function is used to provide message authentication, the hash function value is often referred to as a message digest.

The essence of the use of a hash function for message authentication is as follows. The sender computes a hash value as a function of the bits in the message and transmits both the hash value and the message. The receiver performs the same hash calculation on the message bits and compares this value with the incoming hash value. If there is a mismatch, the receiver knows that the message (or possibly the hash value) has been altered (Figure 11.2a).

The hash value must be transmitted in a secure fashion. That is, the hash value must be protected so that if an adversary alters or replaces the message, it is not feasible for the adversary to also alter the hash value to fool the receiver. This type of attack is shown in Figure 11.2b. In this example, Alice transmits a data block and attaches a hash value. Darth intercepts the message, alters or replaces the data block, and calculates and attaches a new hash value. Bob receives the altered data with the new hash value and does not detect the change. To prevent this attack, the hash value generated by Alice must be protected.
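The attack of Figure 11.2b can be reproduced in a few lines. This is a minimal sketch that assumes SHA-256 from Python's `hashlib` as the hash function; the function names and the message contents are invented for illustration.

```python
import hashlib

def attach_hash(message: bytes) -> tuple[bytes, bytes]:
    """Sender side: return the message together with its unprotected hash value."""
    return message, hashlib.sha256(message).digest()

def receiver_accepts(message: bytes, received_hash: bytes) -> bool:
    """Receiver side: recompute the hash and compare it with the received one."""
    return hashlib.sha256(message).digest() == received_hash

# Alice sends a message with an attached hash value.
alice_msg, alice_hash = attach_hash(b"Transfer $10 to Carol")

# Darth replaces the message and simply recomputes the hash; no secret is needed.
darth_msg, darth_hash = attach_hash(b"Transfer $10,000 to Darth")

print(receiver_accepts(darth_msg, darth_hash))  # True: Bob does not detect the change
print(receiver_accepts(darth_msg, alice_hash))  # False: detected only if the hash is protected
```

The second check shows why protecting the hash value matters: if Darth cannot replace Alice's hash value, the substitution is detected.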

Figure 11.3 Simplified Examples of the Use of a Hash Function for Message Authentication

Figure 11.3 illustrates a variety of ways in which a hash code can be used to provide message authentication, as follows:

a. The message plus concatenated hash code is encrypted using symmetric encryption. Because only A and B share the secret key, the message must have come from A and has not been altered. The hash code provides the structure or redundancy required to achieve authentication. Because encryption is applied to the entire message plus hash code, confidentiality is also provided.

b. Only the hash code is encrypted, using symmetric encryption. This reduces the processing burden for those applications that do not require confidentiality.

c. It is possible to use a hash function but no encryption for message authentication. The technique assumes that the two communicating parties share a common secret value S. A computes the hash value over the concatenation of M and S and appends the resulting hash value to M. Because B possesses S, it can recompute the hash value to verify. Because the secret value itself is not sent, an opponent cannot modify an intercepted message and cannot generate a false message.

d. Confidentiality can be added to the approach of method (c) by encrypting the entire message plus the hash code.
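Method (c) can be sketched as follows, assuming SHA-256 as the hash function and a byte string as the shared secret S. The function names are illustrative, and `hmac.compare_digest` is used only for a constant-time comparison. This is a sketch of the idea in the text, not a recommended construction; dedicated MAC algorithms are normally used in practice.

```python
import hashlib
import hmac  # used here only for its constant-time comparison helper

def tag_with_secret(message: bytes, secret: bytes) -> bytes:
    """Method (c): hash value over the concatenation of M and S."""
    return hashlib.sha256(message + secret).digest()

def verify_with_secret(message: bytes, tag: bytes, secret: bytes) -> bool:
    """B recomputes the hash over M and S and compares it with the received value."""
    return hmac.compare_digest(tag_with_secret(message, secret), tag)

S = b"shared secret value"           # known to A and B only; never transmitted
M = b"meet at noon"
tag = tag_with_secret(M, S)           # A sends M together with tag

print(verify_with_secret(M, tag, S))                   # True
print(verify_with_secret(b"meet at nine", tag, S))     # False: message was altered
```

An opponent who does not know S cannot produce a matching hash value for a modified message, which is the property the method relies on.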

When confidentiality is not required, method (b) has an advantage over methods (a) and (d), which encrypt the entire message, in that less computation is required. Nevertheless, there has been growing interest in techniques that avoid encryption (Figure 11.3c). Several reasons for this interest are pointed out in [TSUD92].

• Encryption software is relatively slow. Even though the amount of data to be encrypted per message is small, there may be a steady stream of messages into and out of a system.

• Encryption hardware costs are not negligible. Low-cost chip implementations of DES are available, but the cost adds up if all nodes in a network must have this capability.

• Encryption hardware is optimized toward large data sizes. For small blocks of data, a high proportion of the time is spent in initialization/invocation overhead.

• Encryption algorithms may be covered by patents, and there is a cost associated with licensing their use.

Message Authentication Code (MAC)

Also known as a keyed hash function

Typically used between two parties that share a secret key to authenticate information exchanged between those parties

Takes as input a secret key and a data block and produces a hash value (MAC) which is associated with the protected message

If the integrity of the message needs to be checked, the MAC function can be applied to the message and the result compared with the associated MAC value

An attacker who alters the message will be unable to alter the associated MAC value without knowledge of the secret key

More commonly, message authentication is achieved using a message authentication code (MAC), also known as a keyed hash function. Typically, MACs are used between two parties that share a secret key to authenticate information exchanged between those parties. A MAC function takes as input a secret key and a data block and produces a hash value, referred to as the MAC, which is associated with the protected message. If the integrity of the message needs to be checked, the MAC function can be applied to the message and the result compared with the associated MAC value. An attacker who alters the message will be unable to alter the associated MAC value without knowledge of the secret key. Note that the verifying party also knows who the sending party is because no one else knows the secret key.

Note that the combination of hashing and encryption results in an overall function that is, in fact, a MAC (Figure 11.3b). That is, E(K, H(M)) is a function of a variable-length message M and a secret key K, and it produces a fixed-size output that is secure against an opponent who does not know the secret key. In practice, specific MAC algorithms are designed that are generally more efficient than an encryption algorithm.
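As an illustration of a keyed hash function, the sketch below uses HMAC with SHA-256 from Python's standard `hmac` module. HMAC is one specific MAC algorithm and is not named in this passage, so its use here is an assumption; the key and messages are made up.

```python
import hashlib
import hmac

def compute_mac(key: bytes, message: bytes) -> bytes:
    """Take a secret key and a data block and produce a fixed-size MAC value."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_mac(key: bytes, message: bytes, mac: bytes) -> bool:
    """Recompute the MAC over the message and compare it with the associated value."""
    return hmac.compare_digest(compute_mac(key, message), mac)

key = b"key shared by the two parties"
message = b"balance: 250"
mac = compute_mac(key, message)

print(check_mac(key, message, mac))                    # True
print(check_mac(key, b"balance: 950", mac))            # False: message altered
print(check_mac(b"attacker's guess", message, mac))    # False: wrong key
```

The last two checks reflect the points in the text: altering the message invalidates the MAC, and producing a valid MAC requires knowledge of the secret key.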

Digital Signature

Operation is similar to that of the MAC

The hash value of a message is encrypted with a user’s private key

Anyone who knows the user’s public key can verify the integrity of the message

An attacker who wishes to alter the message would need to know the user’s private key

Implications of digital signatures go beyond just message authentication

Another important application, which is similar to the message authentication application, is the digital signature. The operation of the digital signature is similar to that of the MAC. In the case of the digital signature, the hash value of a message is encrypted with a user’s private key. Anyone who knows the user’s public key can verify the integrity of the message that is associated with the digital signature. In this case, an attacker who wishes to alter the message would need to know the user’s private key. As we shall see in Chapter 14, the implications of digital signatures go beyond just message authentication.

Figure 11.4 Simplified Examples of Digital Signatures

Figure 11.4 illustrates, in a simplified fashion, how a hash code is used to provide a digital signature.

a. The hash code is encrypted, using public-key encryption with the sender’s private key. As with Figure 11.3b, this provides authentication. It also provides a digital signature, because only the sender could have produced the encrypted hash code. In fact, this is the essence of the digital signature technique.

b. If confidentiality as well as a digital signature is desired, then the message plus the private-key-encrypted hash code can be encrypted using a symmetric secret key. This is a common technique.
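The idea in Figure 11.4a can be sketched with textbook RSA on toy numbers. Everything specific here is an assumption made for illustration: the tiny primes, the exponents, and the reduction of the hash value modulo N so that it fits the toy modulus. Real signature schemes use large keys and a padding scheme, so this shows only the hash-then-private-key structure.

```python
import hashlib

# Toy RSA parameters (illustrative assumption; far too small to be secure).
P, Q = 61, 53
N = P * Q        # modulus, 3233
E = 17           # public exponent
D = 2753         # private exponent: E * D = 1 mod (P - 1)(Q - 1)

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the result into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: apply the private key to the hash code."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone: apply the public key to the signature and compare with the hash code."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"I agree to the contract"
signature = sign(message)
print(verify(message, signature))   # True
```

Verification needs only the public values (E, N), while producing the signature requires D, which mirrors the statement that only the sender could have produced the encrypted hash code.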

Other Hash Function Uses

Commonly used to create a one-way password file

When a user enters a password, the hash of that password is compared to the stored hash value for verification

This approach to password protection is used by most operating systems

Can be used for intrusion and virus detection

Store H(F) for each file on a system and secure the hash values

One can later determine if a file has been modified by recomputing H(F)

An intruder would need to change F without changing H(F)

Can be used to construct a pseudorandom function (PRF) or a pseudorandom number generator (PRNG)

A common application for a hash-based PRF is for the generation of symmetric keys

Hash functions are commonly used to create a one-way password file. Chapter 24 explains a scheme in which a hash of a password is stored by an operating system rather than the password itself. Thus, the actual password is not retrievable by a hacker who gains access to the password file. In simple terms, when a user enters a password, the hash of that password is compared to the stored hash value for verification. This approach to password protection is used by most operating systems.

Hash functions can be used for intrusion detection and virus detection. Store H(F) for each file on a system and secure the hash values (e.g., on a CD-R that is kept secure). One can later determine if a file has been modified by recomputing H(F). An intruder would need to change F without changing H(F).

A cryptographic hash function can be used to construct a pseudorandom function (PRF) or a pseudorandom number generator (PRNG). A common application for a hash-based PRF is for the generation of symmetric keys. We discuss this application in Chapter 12.
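The intrusion-detection use described above (store H(F) for each file, then recompute later) can be sketched as follows. To stay self-contained, the sketch uses an in-memory dictionary of file contents as a stand-in for a real file system, and SHA-256 as H; the file names and contents are made up.

```python
import hashlib

def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Compute H(F) for every file; the result must be stored securely."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def modified_files(files: dict[str, bytes], baseline: dict[str, str]) -> list[str]:
    """Recompute H(F) and report files whose hash no longer matches the baseline."""
    return sorted(
        name
        for name, data in files.items()
        if hashlib.sha256(data).hexdigest() != baseline.get(name)
    )

system = {"login": b"original login binary", "ls": b"original ls binary"}
baseline = snapshot(system)

system["login"] = b"trojaned login binary"     # the intruder changes F
print(modified_files(system, baseline))        # ['login']
```

The check is only as trustworthy as the stored hash values, which is why the text asks for them to be secured.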

Two Simple Hash Functions

Consider two simple insecure hash functions that operate using the following general principles:

The input is viewed as a sequence of n-bit blocks

The input is processed one block at a time in an iterative fashion to produce an n-bit hash value

Bit-by-bit exclusive-OR (XOR) of every block

$C_i = b_{i1} \oplus b_{i2} \oplus \cdots \oplus b_{im}$

Produces a simple parity for each bit position and is known as a longitudinal redundancy check

Reasonably effective for random data as a data integrity check

Perform a one-bit circular shift on the hash value after each block is processed

Has the effect of randomizing the input more completely and overcoming any regularities that appear in the input

To get some feel for the security considerations involved in cryptographic hash functions, we present two simple, insecure hash functions in this section. All hash functions operate using the following general principles. The input (message, file, etc.) is viewed as a sequence of n-bit blocks. The input is processed one block at a time in an iterative fashion to produce an n-bit hash value.

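One of the simplest hash functions is the bit-by-bit exclusive-OR (XOR) of every block, which can be expressed as $C_i = b_{i1} \oplus b_{i2} \oplus \cdots \oplus b_{im}$, where $C_i$ is the $i$th bit of the hash code ($1 \le i \le n$), $m$ is the number of $n$-bit blocks in the input, and $b_{ij}$ is the $i$th bit in the $j$th block.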

This operation produces a simple parity for each bit position and is known as a longitudinal redundancy check. It is reasonably effective for random data as a data integrity check. Each n-bit hash value is equally likely. Thus, the probability that a data error will result in an unchanged hash value is $2^{-n}$. With more predictably formatted data, the function is less effective. For example, in most normal text files, the high-order bit of each octet is always zero. So if a 128-bit hash value is used, 16 of its bits (one per octet) are always zero, and instead of an effectiveness of $2^{-128}$, the hash function on this type of data has an effectiveness of $2^{-112}$.

A simple way to improve matters is to perform a one-bit circular shift, or rotation, on the hash value after each block is processed.

Figure 11.5 Two Simple Hash Functions

Figure 11.5 illustrates these two types of hash functions for 16-bit hash values.
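The two functions can be sketched for 16-bit hash values as follows. Some details are assumptions, since the text does not fix them: blocks are read big-endian, a trailing odd byte is zero-padded, the rotation is to the left, and the rotation is applied to the running hash value before each block is XORed in. The function names are illustrative.

```python
BLOCK_BITS = 16
MASK = (1 << BLOCK_BITS) - 1

def blocks16(data: bytes) -> list[int]:
    """Split data into 16-bit blocks, zero-padding a trailing odd byte (assumption)."""
    if len(data) % 2:
        data += b"\x00"
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def xor_hash(data: bytes) -> int:
    """Bit-by-bit XOR of every block (longitudinal redundancy check)."""
    h = 0
    for block in blocks16(data):
        h ^= block
    return h

def rotl16(value: int) -> int:
    """One-bit circular left shift of a 16-bit value."""
    return ((value << 1) | (value >> (BLOCK_BITS - 1))) & MASK

def rxor_hash(data: bytes) -> int:
    """Rotated XOR: rotate the running hash by one bit, then XOR in the block."""
    h = 0
    for block in blocks16(data):
        h = rotl16(h) ^ block
    return h

# Plain XOR ignores block order; the rotation makes the order matter.
print(hex(xor_hash(b"abcd")), hex(xor_hash(b"cdab")))    # 0x206 0x206
print(hex(rxor_hash(b"abcd")), hex(rxor_hash(b"cdab")))  # 0xa1a0 0xa7aa
```

The example shows one regularity that the rotation overcomes: swapping two blocks leaves the plain XOR hash unchanged but alters the rotated one.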

Although the second procedure provides a good measure of data integrity, it is virtually useless for data security when an encrypted hash code is used with a plaintext message, as in Figures 11.3b and 11.4a. Given a message, it is an easy matter to produce a new message that yields that hash code: Simply prepare the desired alternate message and then append an n-bit block that forces the new message plus block to yield the desired hash code.
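The forgery just described takes only a few lines against a rotated-XOR hash. The sketch assumes 16-bit blocks given as integers and a rotate-left-then-XOR update rule; `forge` and the sample block values are made up for illustration.

```python
MASK = 0xFFFF

def rotl16(value: int) -> int:
    """One-bit circular left shift of a 16-bit value."""
    return ((value << 1) | (value >> 15)) & MASK

def rxor_hash(blocks: list[int]) -> int:
    """Rotated XOR over a list of 16-bit blocks (rotate, then XOR in the block)."""
    h = 0
    for block in blocks:
        h = rotl16(h) ^ block
    return h

def forge(alternate_blocks: list[int], target_hash: int) -> list[int]:
    """Append one block that forces the alternate message to hash to target_hash."""
    h = rxor_hash(alternate_blocks)
    # Processing the extra block computes rotl16(h) ^ block, so choose the
    # block that cancels rotl16(h) and leaves exactly target_hash.
    forcing_block = rotl16(h) ^ target_hash
    return alternate_blocks + [forcing_block]

original = [0x1234, 0xABCD, 0x0F0F]
target = rxor_hash(original)

forged = forge([0xDEAD, 0xBEEF], target)
print(rxor_hash(forged) == target)   # True: same hash code, different message
```

Since the attacker never needs the key that encrypts the hash code, the encrypted hash code of the original message can be reused unchanged with the forged message.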

Although a simple XOR or rotated XOR (RXOR) is insufficient if only the hash code is encrypted, you may still feel that such a simple function could be useful when the message together with the hash code is encrypted (Figure 11.3a). But you must be careful.

Requirements and Security

Preimage

x is the preimage of h for a hash value h = H(x)

Is a data block whose hash value, using the function H, is h

Because H is a many-to-one mapping, for any given hash value h, there will in general be multiple preimages

Collision

Occurs if we have x ≠ y and H(x) = H(y)

Because we are using hash functions for data integrity, collisions are clearly undesirable

Before proceeding, we need to define two terms. For a hash value h = H(x), we say that x is the preimage of h. That is, x is a data block whose hash value, using the function H, is h. Because H is a many-to-one mapping, for any given hash value h, there will in general be multiple preimages. A collision occurs if we have x ≠ y and H(x) = H(y). Because we are using hash functions for data integrity, collisions are clearly undesirable.
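Both terms can be made concrete with a deliberately tiny hash function. The sketch below assumes an 8-bit hash obtained by keeping only the first byte of SHA-256; this truncation is purely for demonstration and is not a function discussed in the chapter.

```python
import hashlib
from collections import defaultdict

def tiny_hash(data: bytes) -> int:
    """8-bit hash value: the first byte of SHA-256 (demo-only truncation)."""
    return hashlib.sha256(data).digest()[0]

# Group 1000 inputs by hash value. With only 256 possible outputs, the
# mapping is many-to-one, so some hash value must have several preimages.
preimages = defaultdict(list)
for i in range(1000):
    x = str(i).encode()
    preimages[tiny_hash(x)].append(x)

h, xs = max(preimages.items(), key=lambda item: len(item[1]))
x, y = xs[0], xs[1]

print(f"hash value {h} has {len(xs)} preimages among the inputs tried")
print(f"collision: {x!r} != {y!r} but H(x) == H(y) == {tiny_hash(x)}")
```

With a real m-bit hash function the same many-to-one structure exists; what changes is that finding such x and y is intended to be computationally infeasible.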

Table 11.1 Requirements for a Cryptographic Hash Function H

Requirement	Description
Variable input size	H can be applied to a block of data of any size.
Fixed output size	H produces a fixed-length output.
Efficiency	H(x) is relatively easy to compute for any given x, making both hardware and software implementations practical.
Preimage resistant (one-way property)	For any given hash value h, it is computationally infeasible to find y such that H(y) = h.
Second preimage resistant (weak collision resistant)	For any given block x, it is computationally infeasible to find y ≠ x with H(y) = H(x).
Collision resistant (strong collision resistant)	It is computationally infeasible to find any pair (x, y) with x ≠ y, such that H(x) = H(y).
Pseudorandomness	Output of H meets standard tests for pseudorandomness.

Table 11.1 lists the generally accepted requirements for a cryptographic hash function. The first three properties are requirements for the practical application of a hash function.

The fourth property, preimage resistant, is the one-way property: it is easy to generate a code given a message, but virtually impossible to generate a message given a code. This property is important if the authentication technique involves the use of a secret value (Figure 11.3c). The secret value itself is not sent. However, if the hash function is not one way, an attacker can easily discover the secret value: If the attacker can observe or intercept a transmission, the attacker obtains the message M and the hash code $h = H(S \parallel M)$. The attacker then inverts the hash function to obtain $S \parallel M = H^{-1}(h)$. Because the attacker now has both M and $S \parallel M$, it is a trivial matter to recover S.

The fifth property, second preimage resistant, guarantees that it is infeasible to find an alternative message with the same hash value as a given message. This prevents forgery when an encrypted hash code is used (Figures 11.3b and 11.4a). If this property were not true, an attacker would be capable of the following sequence: First, observe or intercept a message plus its encrypted hash code; second, generate an unencrypted hash code from the message; third, generate an alternate message with the same hash code.

A hash function that satisfies the first five properties in Table 11.1 is referred to as a weak hash function. If the sixth property, collision resistant, is also satisfied, then it is referred to as a strong hash function. A strong hash function protects against an attack in which one party generates a message for another party to sign. For example, suppose Bob writes an IOU message, sends it to Alice, and she signs it. Bob finds two messages with the same hash, one of which requires Alice to pay a small amount and one that requires a large payment. Alice signs the first message, and Bob is then able to claim that the second message is authentic.

The final requirement in Table 11.1, pseudorandomness, has not traditionally been listed as a requirement of cryptographic hash functions but is more or less implied. [JOHN05] points out that cryptographic hash functions are commonly used for key derivation and pseudorandom number generation, and that in message integrity applications, the three resistant properties depend on the output of the hash function appearing to be random. Thus, it makes sense to verify that in fact a given hash function produces pseudorandom output.

Figure 11.6 Relationship Among Hash Function Properties

Figure 11.6 shows the relationships among the three resistant properties. A function that is collision resistant is also second preimage resistant, but the reverse is not necessarily true. A function can be collision resistant but not preimage resistant and vice versa. A function can be preimage resistant but not second preimage resistant and vice versa. See [MENE97] for a discussion.

Table 11.2 Hash Function Resistance Properties Required for Various Data Integrity Applications

Application	Preimage Resistant	Second Preimage Resistant	Collision Resistant
Hash + digital signature	yes	yes	yes*
Intrusion detection and virus detection	-	-	-
Hash + symmetric encryption	-	-	-
One-way password file	yes	-	-
MAC	yes	yes	yes*

*Resistance required if attacker is able to mount a chosen message attack

Table 11.2 shows the resistant properties required for various hash function applications.

Attacks on Hash Functions

Brute-Force Attacks

Does not depend on the specific algorithm, only depends on bit length

In the case of a hash function, attack depends only on the bit length of the hash value

Method is to pick values at random and try each one until a collision occurs

Cryptanalysis

An attack based on weaknesses in a particular cryptographic algorithm

Seek to exploit some property of the algorithm to perform some attack other than an exhaustive search

As with encryption algorithms, there are two categories of attacks on hash functions: brute-force attacks and cryptanalysis. A brute-force attack does not depend on the specific algorithm but depends only on bit length. In the case of a hash function, a brute-force attack depends only on the bit length of the hash value. A cryptanalysis, in contrast, is an attack based on weaknesses in a particular cryptographic algorithm.

Collision Resistant Attacks (1 of 2)

For a collision resistant attack, an adversary wishes to find two messages or data blocks that yield the same hash value

The effort required is explained by a mathematical result referred to as the birthday paradox

Yuval proposed the following strategy to exploit the birthday paradox in a collision resistant attack:

The source (A) is prepared to sign a legitimate message x by appending the appropriate m-bit hash code and encrypting that hash code with A’s private key

Opponent generates $2^{m/2}$ variations x’ of x, all with essentially the same meaning, and stores the messages and their hash values

Opponent prepares a fraudulent message y for which A’s signature is desired

For a collision resistant attack, an adversary wishes to find two messages or data blocks, x and y, that yield the same hash value: H(x) = H(y). This turns out to require considerably less effort than a preimage or second preimage attack. The effort required is explained by a mathematical result referred to as the birthday paradox. In essence, if we choose random variables from a uniform distribution in the range 0 through N - 1, then the probability that a repeated element is encountered exceeds 0.5 after roughly $\sqrt{N}$ choices have been made (more precisely, about $1.18\sqrt{N}$). Thus, for an m-bit hash value, if we pick data blocks at random, we can expect to find two data blocks with the same hash value within about $\sqrt{2^m} = 2^{m/2}$ attempts. The mathematical derivation of this result is found in Appendix E.
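The birthday bound can be checked numerically. The sketch below computes the exact probability of at least one repeat among k uniform choices from N values, and the smallest k for which that probability exceeds 0.5; the function names are illustrative. It shows that the threshold is on the order of √N, close to 1.18√N.

```python
import math

def collision_probability(n_values: int, k_choices: int) -> float:
    """Probability of at least one repeat among k uniform choices from n values."""
    p_unique = 1.0
    for i in range(k_choices):
        p_unique *= (n_values - i) / n_values
    return 1.0 - p_unique

def choices_for_half(n_values: int) -> int:
    """Smallest number of choices for which a repeat is more likely than not."""
    k = 0
    p_unique = 1.0
    while 1.0 - p_unique <= 0.5:
        p_unique *= (n_values - k) / n_values
        k += 1
    return k

print(choices_for_half(365))                 # 23, the classic birthday result
for m in (16, 20, 24):
    n = 2 ** m
    k = choices_for_half(n)
    print(m, k, round(k / math.sqrt(n), 3))  # the ratio stays near 1.18
```

For an m-bit hash value this means the work to find a collision grows like $2^{m/2}$ rather than $2^m$.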

Yuval proposed the following strategy to exploit the birthday paradox in a collision resistant attack [YUVA79].

1. The source, A, is prepared to sign a legitimate message x by appending the appropriate m-bit hash code and encrypting that hash code with A’s private key (Figure 11.4a).

2. The opponent generates $2^{m/2}$ variations x’ of x, all of which convey essentially the same meaning, and stores the messages and their hash values.

3. The opponent prepares a fraudulent message y for which A’s signature is desired.

4. The opponent generates minor variations y’ of y, all of which convey essentially the same meaning. For each y’, the opponent computes H(y’), checks for matches with any of the H(x’) values, and continues until a match is found. That is, the process continues until a y’ is generated with a hash value equal to the hash value of one of the x’ values.

5. The opponent offers the valid variation to A for signature. This signature can then be attached to the fraudulent variation for transmission to the intended recipient. Because the two variations have the same hash code, they will produce the same signature; the opponent is assured of success even though the encryption key is not known.

Thus, if a 64-bit hash code is used, the level of effort required is only on the order of $2^{32}$ [see Appendix E, Equation (E.7)].
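Steps 2 through 4 of the strategy can be simulated against a deliberately short hash code. The sketch assumes a 16-bit hash obtained by truncating SHA-256, and it generates variations by appending different runs of spaces and tabs, a simple stand-in for the meaning-preserving edits discussed in the text. The names `toy_hash`, `variant`, and `yuval_attack`, and the sample messages, are illustrative.

```python
import hashlib
from itertools import count

M_BITS = 16  # toy hash length (assumption); real hash codes are far longer

def toy_hash(text: str) -> int:
    """m-bit hash code: SHA-256 truncated to its leading M_BITS bits."""
    digest = hashlib.sha256(text.encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - M_BITS)

def variant(text: str, i: int) -> str:
    """Variation number i: encode i as trailing spaces and tabs."""
    suffix = format(i, "b").replace("0", " ").replace("1", "\t")
    return text + suffix

def yuval_attack(legit: str, fraud: str) -> tuple[str, str, int]:
    # Step 2: store 2^(m/2) variations of the legitimate message by hash value.
    table = {}
    for i in range(2 ** (M_BITS // 2)):
        x_var = variant(legit, i)
        table[toy_hash(x_var)] = x_var
    # Step 4: vary the fraudulent message until its hash matches a stored one.
    for j in count():
        y_var = variant(fraud, j)
        match = table.get(toy_hash(y_var))
        if match is not None:
            return match, y_var, j + 1

x_var, y_var, tries = yuval_attack("I owe Bob $10.", "I owe Bob $10,000.")
print(toy_hash(x_var) == toy_hash(y_var), "after", tries, "fraudulent variations")
```

With m = 16 the table holds 256 entries and a match is expected after a few hundred fraudulent variations, in line with the $2^{m/2}$ estimate; the same procedure against a 64-bit hash code would need on the order of $2^{32}$ operations.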

The generation of many variations that convey the same meaning is not difficult. For example, the opponent could insert a number of “space-space-backspace” character pairs between words throughout the document. Variations could then be generated by substituting “space-backspace-space” in selected instances. Alternatively, the opponent could simply reword the message but retain the meaning. Figure 11.7 provides an example.

To summarize, for a hash code of length m, the level of effort required, as we have seen, is proportional to the following.

Preimage resistant: $2^m$

Second preimage resistant: $2^m$

Collision resistant: $2^{m/2}$

Collision Resistant Attacks (2 of 2)

Opponent generates minor variations y’ of y, all of which convey essentially the same meaning. For each y’, the opponent computes H(y’), checks for matches with any of the H(x’) values, and continues until a match is found. That is, the process continues until a y’ is generated with a hash value equal to the hash value of one of the x’ values

The opponent offers the valid variation to A for signature which can then be attached to the fraudulent variation for transmission to the intended recipient

Because the two variations have the same hash code, they will produce the same signature and the opponent is assured of success even though the encryption key is not known