Cryptography

vars
Chapter8PPT.pptx

Cryptography and Network Security: Principles and Practice

Eighth Edition

Chapter 8

Random Bit Generation and Stream Ciphers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Slide in this Presentation Contain Hyperlinks. JAWS users should be able to get a list of links by using INSERT+F7

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Lecture slides prepared for “Cryptography and Network Security”, 8/e, by William Stallings, Chapter 8 – “Random Bit Generation and Stream Ciphers”.

An important cryptographic function is the generation of random bit streams. Random bits streams are used in a wide variety of contexts, including key generation and encryption. In essence, there are two fundamentally different strategies for generating random bits or random numbers. One strategy, which until recently dominated in cryptographic applications, computes bits deterministically using an algorithm. This class of random bit generators is known as pseudorandom number generators (PRNGs) or deterministic random bit generators (DRBGs). The other strategy is to produce bits non-deterministically using some physical source that produces some sort of random output. This latter class of random bit generators is known as true random number generators (TRNGs) or non-deterministic random bit generators (NRBGs).

The chapter begins with an analysis of the basic principles of PRNGs. Next, we look at some common PRNGs, including PRNGs based on the use of a symmetric block cipher. The chapter then moves on to the topic of symmetric stream ciphers, which are based on the use of a PRNG.

The remainder of the chapter is devoted to TRNGs. We look first at the basic principles and structure of TRNGs, and then examine a specific product, the Intel Digital Random Number Generator.

Throughout this chapter, reference is made to four important NIST documents:

■ SP 800-90A (Recommendation for Random Number Generation Using Deterministic Random Bit Generators, June 2015): Specifies mechanisms for the generation of random bits using deterministic methods.

■ SP 800-90B (Recommendation for the Entropy Sources Used for Random Bit Generation, January 2018): Covers design principles and requirements for entropy sources (ES), the devices from which we get unpredictable randomness and NRNGs.

■ SP 800-90C (Recommendation for Random Bit Generator (RBG)

Constructions, April 2016): Discusses how to combine the entropy sources in 90B with the DRNG’s from 90A to provide large quantities of unpredictable bits for cryptographic applications.

■ SP 800-22 (A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications, April 2010) discusses the selection and testing of NRBGs and DRBGs.

These specifications have heavily influenced the implementation of random bit generators in industry both in the U.S. and worldwide.

1

Learning Objectives

Explain the concepts of randomness and unpredictability with respect to random numbers.

Understand the differences among true random number generators, pseudorandom number generators, and pseudorandom functions.

Present an overview of requirements for pseudorandom number generators.

Explain how a block cipher can be used to construct a pseudorandom number generator.

Present an overview of stream ciphers and RC4.

Explain the significance of skew.

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Random Numbers

A number of network security algorithms and protocols based on cryptography make use of random binary numbers:

Key distribution and reciprocal authentication schemes

Session key generation

Generation of keys for the R S A public-key encryption algorithm

Generation of a bit stream for symmetric stream encryption

There are two distinct requirements for a sequence of random numbers:

Randomness

Unpredictability

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A number of network security algorithms and protocols based on cryptography make use of random binary numbers. For example,

• Key distribution and reciprocal (mutual) authentication schemes, such as those discussed in Chapters 14 and 15. In such schemes, two communicating parties cooperate by exchanging messages to distribute keys and/or authenticate each other. In many cases, nonces are used for handshaking to prevent replay attacks. The use of random numbers for the nonces frustrates an opponent’s efforts to determine or guess the nonce, in order to repeat an obsolete transaction.

• Session key generation. We will see a number of protocols in this book where a secret key for symmetric encryption is generated for use for a particular transaction (or session) and is valid for a short period of time. This key is generally called a session key.

• Generation of keys for the RSA public-key encryption algorithm (described in Chapter 9).

• Generation of a bit stream for symmetric stream encryption (described in this chapter).

These applications give rise to two distinct and not necessarily compatible requirements for a sequence of random numbers: randomness and unpredictability.

3

Randomness

The generation of a sequence of allegedly random numbers being random in some well-defined statistical sense has been a concern

Two criteria are used to validate that a sequence of numbers is random:

Uniform distribution

The frequency of occurrence of ones and zeros should be approximately equal

Independence

No one subsequence in the sequence can be inferred from the others

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Traditionally, the concern in the generation of a sequence of allegedly

random numbers has been that the sequence of numbers be random in some well-defined statistical sense. The following two criteria are used to validate that a sequence of numbers is random:

• Uniform distribution: The distribution of bits in the sequence should be uniform; that is, the frequency of occurrence of ones and zeros should be approximately equal.

• Independence: No one subsequence in the sequence can be inferred from the others.

Although there are well-defined tests for determining that a sequence of bits matches a particular distribution, such as the uniform distribution, there is no such test to “prove” independence. Rather, a number of tests can be applied to demonstrate if a sequence does not exhibit independence. The general strategy is to apply a number of such tests until the confidence that independence exists is sufficiently strong. That is, if each of a number of tests fails to show that a sequence of bits is not independent, then we can have a high level of confidence that the sequence is in fact independent.

4

Unpredictability

The requirement is not just that the sequence of numbers be statistically random, but that the successive members of the sequence are unpredictable

With “true” random sequences each number is statistically independent of other numbers in the sequence and therefore unpredictable

True random numbers have their limitations, such as inefficiency, so it is more common to implement algorithms that generate sequences of numbers that appear to be random

Care must be taken that an opponent not be able to predict future elements of the sequence on the basis of earlier elements

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

In applications such as reciprocal authentication, session key

generation, and stream ciphers, the requirement is not just that the sequence of numbers be statistically random but that the successive members of the sequence are unpredictable. With “true” random sequences, each number is statistically independent of other numbers in the sequence and therefore unpredictable. Although true random numbers are used in some applications, they have their limitations, such as inefficiency, as is discussed shortly. Thus, it is more common to implement algorithms that generate sequences of numbers that appear to be random. In this latter case, care must be taken that an opponent not be able to predict future elements of the sequence on the basis of earlier elements.

5

Pseudorandom Numbers

Cryptographic applications typically make use of algorithmic techniques for random number generation

These algorithms are deterministic and therefore produce sequences of numbers that are not statistically random

If the algorithm is good, the resulting sequences will pass many tests of randomness and are referred to as pseudorandom numbers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Cryptographic applications typically make use of algorithmic techniques for random number generation. These algorithms are deterministic and therefore produce sequences of numbers that are not statistically random. However, if the algorithm is good, the resulting sequences will pass many tests of randomness. Such numbers are referred to as pseudorandom numbers.

You may be somewhat uneasy about the concept of using numbers generated

by a deterministic algorithm as if they were random numbers. Despite what might be called philosophical objections to such a practice, it generally works. That is, under most circumstances, pseudorandom numbers will perform as well as if they were random for a given use. The phrase “as well as” is unfortunately subjective, but the use of pseudorandom numbers is widely accepted. The same principle applies in statistical applications, in which a statistician takes a sample of a population and assumes that the results will be approximately the same as if the whole population were measured.

6

Figure 8.1 Random and Pseudorandom Number Generators

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.1 contrasts a true random number generator (TRNG) with two forms

of pseudorandom number generators.

Figure 8.1 shows two different forms of PRNGs, based on application.

7

True Random Number Generator (T R N G)

Takes as input a source that is effectively random

The source is referred to as an entropy source and is drawn from the physical environment of the computer

Includes things such as keystroke timing patterns, disk electrical activity, mouse movements, and instantaneous values of the system clock

The source, or combination of sources, serve as input to an algorithm that produces random binary output

The T R N G may simply involve conversion of an analog source to a binary output

The T R N G may involve additional processing to overcome any bias in the source

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A TRNG takes as input a source that is effectively

random; the source is often referred to as an entropy source.

In essence, the entropy source is drawn from the physical

environment of the computer and could include things such as keystroke timing patterns, disk electrical activity, mouse movements, and instantaneous values of the system clock.

The source, or combination of sources, serve as input to an algorithm

that produces random binary output. The TRNG may simply involve conversion of an analog source to a binary output. The TRNG may involve additional processing to overcome any bias in the source; this is discussed in Section 8.6.

8

Pseudorandom Number Generator (P R N G) (1 of 2)

Takes as input a fixed value, called the seed, and produces a sequence of output bits using a deterministic algorithm

Quite often the seed is generated by a T R N G

The output bit stream is determined solely by the input value or values, so an adversary who knows the algorithm and the seed can reproduce the entire bit stream

Other than the number of bits produced there is no difference between a P R N G and a P R F

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

In contrast, a PRNG takes as input a fixed value, called the seed, and produces a sequence of output bits using a deterministic algorithm. Quite often, the seed is generated by a TRNG. Typically, as shown, there is some feedback path by which some of the results of the algorithm are fed back as input as additional output bits are produced. The important thing to note is that the output bit stream is determined solely by the input value or values, so that an adversary who knows the algorithm and the seed can reproduce the entire bit stream.

• Pseudorandom number generator: An algorithm that is used to produce an open-ended sequence of bits is referred to as a PRNG. A common application for an open-ended sequence of bits is as input to a symmetric stream cipher, as discussed in Section 8.4.

• Pseudorandom function (PRF): A PRF is used to produced a pseudorandom string of bits of some fixed length. Examples are symmetric encryption keys and nonces. Typically, the PRF takes as input a seed plus some context specific values, such as a user ID or an application ID. A number of examples of PRFs will be seen throughout this book, notably in Chapters 19 and 20.

Other than the number of bits produced, there is no difference between a PRNG and a PRF. The same algorithms can be used in both applications. Both require a seed and both must exhibit randomness and unpredictability. Further, a PRNG application may also employ context-specific input. In what follows, we make no distinction between these two applications.

9

Pseudorandom Number Generator (P R N G) (2 of 2)

Pseudorandom number generator

An algorithm that is used to produce an open-ended sequence of bits

Input to a symmetric stream cipher is a common application for an open-ended sequence of bits

Pseudorandom function (P R F)

Used to produce a pseudorandom string of bits of some fixed length

Examples are symmetric encryption keys and nonces

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

In contrast, a PRNG takes as input a fixed value, called the seed, and produces a sequence of output bits using a deterministic algorithm. Quite often, the seed is generated by a TRNG. Typically, as shown, there is some feedback path by which some of the results of the algorithm are fed back as input as additional output bits are produced. The important thing to note is that the output bit stream is determined solely by the input value or values, so that an adversary who knows the algorithm and the seed can reproduce the entire bit stream.

• Pseudorandom number generator: An algorithm that is used to produce an open-ended sequence of bits is referred to as a PRNG. A common application for an open-ended sequence of bits is as input to a symmetric stream cipher, as discussed in Section 8.4.

• Pseudorandom function (PRF): A PRF is used to produced a pseudorandom string of bits of some fixed length. Examples are symmetric encryption keys and nonces. Typically, the PRF takes as input a seed plus some context specific values, such as a user ID or an application ID. A number of examples of PRFs will be seen throughout this book, notably in Chapters 19 and 20.

Other than the number of bits produced, there is no difference between a PRNG and a PRF. The same algorithms can be used in both applications. Both require a seed and both must exhibit randomness and unpredictability. Further, a PRNG application may also employ context-specific input. In what follows, we make no distinction between these two applications.

10

P R N G Requirements

The basic requirement when a P R N G or P R F is used for a cryptographic application is that an adversary who does not know the seed is unable to determine the pseudorandom string

The requirement for secrecy of the output of a P R N G or P R F leads to specific requirements in the areas of:

Randomness

Unpredictability

Characteristics of the seed

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

When a PRNG or PRF is used for a cryptographic application, then the basic

requirement is that an adversary who does not know the seed is unable to determine the pseudorandom string. For example, if the pseudorandom bit stream is used in a stream cipher, then knowledge of the pseudorandom bit stream would enable the adversary to recover the plaintext from the ciphertext. Similarly, we wish to protect the output value of a PRF. In this latter case, consider the following scenario. A 128-bit seed, together with some context-specific values, are used to generate a 128-bit secret key that is subsequently used for symmetric encryption. Under normal circumstances, a 128-bit key is safe from a brute-force attack. However, if the PRF does not generate effectively random 128-bit output values, it may be possible for an adversary to narrow the possibilities and successfully use a brute force attack.

This general requirement for secrecy of the output of a PRNG or PRF leads to

specific requirements in the areas of randomness, unpredictability, and the characteristics of the seed. We now look at these in turn.

11

Randomness

The generated bit stream needs to appear random even though it is deterministic

There is no single test that can determine if a P R N G generates numbers that have the characteristic of randomness

If the P R N G exhibits randomness on the basis of multiple tests, then it can be assumed to satisfy the randomness requirement

N I S T S P 800-22 specifies that the tests should seek to establish three characteristics:

Uniformity

Scalability

Consistency

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

In terms of randomness, the requirement for a PRNG is that the generated

bit stream appear random even though it is deterministic. There is no single

test that can determine if a PRNG generates numbers that have the characteristic of randomness. The best that can be done is to apply a sequence of tests to the PRNG. If the PRNG exhibits randomness on the basis of multiple tests, then it can be assumed to satisfy the randomness requirement. NIST SP 800-22 specifies that the tests should seek to establish the following three characteristics.

• Uniformity: At any point in the generation of a sequence of random or pseudorandom bits, the occurrence of a zero or one is equally likely, i.e., the probability of each is exactly 1/2. The expected number of zeros (or ones) is n /2, where n = the sequence length.

• Scalability: Any test applicable to a sequence can also be applied to subsequences extracted at random. If a sequence is random, then any such extracted subsequence should also be random. Hence, any extracted subsequence should pass any test for randomness.

• Consistency: The behavior of a generator must be consistent across starting values (seeds). It is inadequate to test a PRNG based on the output from a single seed or an TRNG on the basis of an output produced from a single physical output.

12

Randomness Tests (1 of 2)

S P 800-22 lists 15 separate tests of randomness

Three tests

Frequency test

The most basic test and must be included in any test suite

Purpose is to determine whether the number of ones and zeros in a sequence is approximately the same as would be expected for a truly random sequence

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

SP 800-22 lists 15 separate tests of randomness. An understanding of these

tests requires a basic knowledge of statistical analysis, so we don’t attempt a

technical description here. Instead, to give some flavor for the tests, we list three of the tests and the purpose of each test, as follows.

• Frequency test: This is the most basic test and must be included in any test

suite. The purpose of this test is to determine whether the number of ones and

zeros in a sequence is approximately the same as would be expected for a truly random sequence.

• Runs test: The focus of this test is the total number of runs in the sequence,

where a run is an uninterrupted sequence of identical bits bounded before

and after with a bit of the opposite value. The purpose of the runs test is to

determine whether the number of runs of ones and zeros of various lengths is

as expected for a random sequence.

• Maurer’s universal statistical test: The focus of this test is the number of bits between matching patterns (a measure that is related to the length of a compressed sequence). The purpose of the test is to detect whether or not the

sequence can be significantly compressed without loss of information. A significantly compressible sequence is considered to be non-random.

13

Randomness Tests (2 of 2)

Runs test

Focus of this test is the total number of runs in the sequence, where a run is an uninterrupted sequence of identical bits bounded before and after with a bit of the opposite value

Purpose is to determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence

Maurer’s universal statistical test

Focus is the number of bits between matching patterns

Purpose is to detect whether or not the sequence can be significantly compressed without loss of information. A significantly compressible sequence is considered to be non-random

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

SP 800-22 lists 15 separate tests of randomness. An understanding of these

tests requires a basic knowledge of statistical analysis, so we don’t attempt a

technical description here. Instead, to give some flavor for the tests, we list three of the tests and the purpose of each test, as follows.

• Frequency test: This is the most basic test and must be included in any test

suite. The purpose of this test is to determine whether the number of ones and

zeros in a sequence is approximately the same as would be expected for a truly random sequence.

• Runs test: The focus of this test is the total number of runs in the sequence,

where a run is an uninterrupted sequence of identical bits bounded before

and after with a bit of the opposite value. The purpose of the runs test is to

determine whether the number of runs of ones and zeros of various lengths is

as expected for a random sequence.

• Maurer’s universal statistical test: The focus of this test is the number of bits between matching patterns (a measure that is related to the length of a compressed sequence). The purpose of the test is to detect whether or not the

sequence can be significantly compressed without loss of information. A significantly compressible sequence is considered to be non-random.

14

Unpredictability

A stream of pseudorandom numbers should exhibit two forms of unpredictability:

Forward unpredictability

If the seed is unknown, the next output bit in the sequence should be unpredictable in spite of any knowledge of previous bits in the sequence

Backward unpredictability

It should not be feasible to determine the seed from knowledge of any generated values

No correlation between a seed and any value generated from that seed should be evident

Each element of the sequence should appear to be the outcome of an independent random event whose probability is 1/2

The same set of tests for randomness also provides a test of unpredictability

A random sequence will have no correlation with a fixed value (the seed)

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A stream of pseudorandom numbers should exhibit two forms

of unpredictability:

• Forward unpredictability : If the seed is unknown, the next output bit in the

sequence should be unpredictable in spite of any knowledge of previous bits in

the sequence.

• Backward unpredictability : It should also not be feasible to determine the

seed from knowledge of any generated values. No correlation between a seed

and any value generated from that seed should be evident; each element of the sequence should appear to be the outcome of an independent random event whose probability is 1/2.

The same set of tests for randomness also provide a test of unpredictability. If the generated bit stream appears random, then it is not possible to predict some bit or bit sequence from knowledge of any previous bits. Similarly, if the bit sequence appears random, then there is no feasible way to deduce the seed based on the bit sequence. That is, a random sequence will have no correlation with a fixed value (the seed).

15

Seed Requirements

The seed that serves as input to the P R N G must be secure and unpredictable

The seed itself must be a random or pseudorandom number

Typically the seed is generated by T R N G

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

For cryptographic applications, the seed that serves as input to

the PRNG must be secure. Because the PRNG is a deterministic algorithm, if the

adversary can deduce the seed, then the output can also be determined. Therefore, the seed must be unpredictable. In fact, the seed itself must be a random or pseudorandom number.

16

Figure 8.2 Generation of Seed Input to P R N G

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Typically, the seed is generated by a TRNG, as shown in Figure 8.2. This is the

scheme recommended by SP800-90. The reader may wonder, if a TRNG is available, why it is necessary to use a PRNG. If the application is a stream cipher, then a TRNG is not practical. The sender would need to generate a keystream of bits as long as the plaintext and then transmit the keystream and the ciphertext securely to the receiver. If a PRNG is used, the sender need only find a way to deliver the stream cipher key, which is typically 128 or 256 bits, to the receiver in a secure fashion.

Even in the case of a PRF application, in which only a limited number of bits

is generated, it is generally desirable to use a TRNG to provide the seed to the

PRF and use the PRF output rather than use the TRNG directly. As is explained in Section 8.6, a TRNG may produce a binary string with some bias. The PRF would have the effect of conditioning the output of the TRNG so as to eliminate that bias.

Finally, the mechanism used to generate true random numbers may not be

able to generate bits at a rate sufficient to keep up with the application requiring the random bits.

17

Algorithm Design

Algorithms fall into two categories:

Purpose-built algorithms

Algorithms designed specifically and solely for the purpose of generating pseudorandom bit streams

Algorithms based on existing cryptographic algorithms

Have the effect of randomizing input data

Three broad categories of cryptographic algorithms are commonly used to create P R N Gs:

Symmetric block ciphers

Asymmetric ciphers

Hash functions and message authentication codes

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Cryptographic PRNGs have been the subject of much research over the years,

and a wide variety of algorithms have been developed. These fall roughly into two categories.

• Purpose-built algorithms: These are algorithms designed specifically and

solely for the purpose of generating pseudorandom bit streams. Some of these

algorithms are used for a variety of PRNG applications; several of these are

described in the next section. Others are designed specifically for use in a

stream cipher.

• Algorithms based on existing cryptographic algorithms: Cryptographic algorithms have the effect of randomizing input data. Indeed, this is a requirement of such algorithms. For example, if a symmetric block cipher produced ciphertext that had certain regular patterns in it, it would aid in the process of cryptanalysis. Thus, cryptographic algorithms can serve as the core of PRNGs.

SP 800-90A recommends three categories of such algorithms:

—Symmetric block ciphers: This approach is discussed in Section 8.3.

—Hash functions and message authentication codes: These approaches are examined in Chapter 12.

Any of these approaches can yield a cryptographically strong PRNG.

A purpose-built algorithm may be provided by an operating system for general use. For applications that already use certain cryptographic algorithms for encryption or authentication, it makes sense to reuse the same code for the PRNG. Thus, all of these approaches are in common use.

18

8.2 – Pseudorandom Number Generators

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Linear Congruential Generator

An algorithm first proposed by Lehmer that is parameterized with four numbers:

m the modulus m > 0

a the multiplier 0 < a< m

c the increment 0≤ c < m

X0 the starting value, or seed 0 ≤ X0 < m

The sequence of random numbers {Xn} is obtained via the following iterative equation:

Xn+1 = (aXn + c) mod m

If m , a , c , and X0 are integers, then this technique will produce a sequence of integers with each integer in the range 0 ≤ Xn < m

The selection of values for a , c , and m is critical in developing a good random number generator

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

20

A widely used technique for pseudorandom number generation is an algorithm first

proposed by Lehmer [LEHM51], which is known as the linear congruential method.

The algorithm is parameterized with four numbers, as follows:

m the modulus m > 0

a the multiplier 0 < a< m

c the increment 0≤ c < m

X0 the starting value, or seed 0 ≤ X0 < m

The sequence of random numbers {Xn} is obtained via the following iterative equation:

Xn+1 = (aXn + c) mod m

If m , a , c , and X0 are integers, then this technique will produce a sequence of integers with each integer in the range 0 ≤ Xn < m .

The selection of values for a , c , and m is critical in developing a good random number generator.

We would like m to be very large, so that there is the potential for producing

a long series of distinct random numbers. A common criterion is that m be nearly equal to the maximum representable nonnegative integer for a given computer. Thus, a value of m near to or equal to 231 is typically chosen.

Blum Blum Shub (B B S) Generator

Has perhaps the strongest public proof of its cryptographic strength of any purpose-built algorithm

Referred to as a cryptographically secure pseudorandom bit generator (C S P R B G)

A C S P R B G is defined as one that passes the next-bit-test if there is not a polynomial-time algorithm that, on input of the first k bits of an output sequence, can predict the (k + 1)st bit with probability significantly greater than 1/2

The security of B B S is based on the difficulty of factoring n

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

21

A popular approach to generating secure pseudorandom numbers is known as

the Blum, Blum, Shub (BBS) generator (see Figure 8.3), named for its developers [BLUM86]. It has perhaps the strongest public proof of its cryptographic strength of any purpose-built algorithm.

The BBS is referred to as a cryptographically secure pseudorandom bit generator (CSPRBG). A CSPRBG is defined as one that passes the next-bit test , which, in turn, is defined as follows [MENE97]: A pseudorandom bit generator is said to pass the next-bit test if there is not a polynomial-time algorithm that, on input of the first k bits of an output sequence, can predict the (k + 1)st bit with probability significantly greater than 1/2. In other words, given the first k bits of the sequence, there is not a practical algorithm that can even allow you to state that the next bit will be 1 (or 0) with probability greater than 1/2. For all practical purposes, the sequence is unpredictable. The security of BBS is based on the difficulty of factoring n. That is, given n, we need to determine its two prime factors p and q .

Figure 8.3 Blum Blum Shub Block Diagram

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A popular approach to generating secure pseudorandom numbers is known as

the Blum, Blum, Shub (BBS) generator (see Figure 8.3), named for its developers

[BLUM86].

22

Table 8.1 Example Operation of B B S Generator

i Xi Bi
0 20749 Blank
1 143135 1
2 177671 1
3 97048 0
4 89992 0
5 174051 1
6 80649 1
7 45663 1
8 69442 0
9 186894 0
10 177046 0
i Xi Bi
11 137922 0
12 123175 1
13 8630 0
14 114386 0
15 14863 1
16 133015 1
17 106065 1
18 45870 0
19 137171 1
20 48060 0

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Table 8.1, shows an example of BBS operation.

23

8.3 Pseudorandom Number Generation Using a Block Cipher

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

P R N G Using Block Cipher Modes of Operation

Two approaches that use a block cipher to build a P N R G have gained widespread acceptance:

C T R mode

Recommended in N I S T S P 800-90, A N S I standard X.82, and R F C 4086

O F B mode

Recommended in X9.82 and R F C 4086

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

25

Two approaches that use a block cipher to build a PNRG have gained widespread acceptance: the CTR mode and the OFB mode. The CTR mode is recommended in NIST SP 800-90A, in the ANSI standard X9.82 (Random Number Generation), and in RFC 4086 (Randomness Requirements for Security, June 2005). The OFB mode is recommended in X9.82 and RFC 4086.

Figure 8.4 P R N G Mechanisms Based on Block Ciphers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.4 illustrates the two methods. In each case, the seed consists of two

parts: the encryption key value and a value V that will be updated after each block of pseudorandom numbers is generated. Thus, for AES-128, the seed consists of a 128-bit key and a 128-bit V value. In the CTR case, the value of V is incremented by 1 after each encryption. In the case of OFB, the value of V is updated to equal the value of the preceding PRNG block. In both cases, pseudorandom bits are produced one block at a time (e.g., for AES, PRNG bits are generated 128 bits at a time).

26

Table 8.2 Example Results for P R N G Using O F B

Output Block Fraction of One Bits Fraction of Bits that Match with Preceding Block
1786f4c7ff6e291dbdfdd90ec3453176 0.57
5e17b22b14677a4d66890f87565eae64 0.51 0.52
fd18284ac82251dfb3aa62c326cd46cc 0.47 0.54
c8e545198a758ef5dd86b41946389bd5 0.50 0.44
fe7bae0e23019542962e2c52d215a2e3 0.47 0.48
14fdf5ec99469598ae0379472803accd 0.49 0.52
6aeca972e5a3ef17bd1a1b775fc8b929 0.57 0.48
f7e97badf359d128f00d9b4ae323db64 0.55 0.45

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

For the OFB PRNG, Table 8.2 shows the first eight output blocks (1024 bits)

with two rough measures of security. The second column shows the fraction of one bits in each 128-bit block. This corresponds to one of the NIST tests. The results indicate that the output is split roughly equally between zero and one bits. The third column shows the fraction of bits that match between adjacent blocks. If this number differs substantially from 0.5, that suggests a correlation between blocks, which could be a security weakness. The results suggest no correlation.

27

Table 8.3 Example Results for P R N G Using C T R

Output Block Fraction of One Bits Fraction of Bits that Match with Preceding Block
1786f4c7ff6e291dbdfdd90ec3453176 0.57
60809669a3e092a01b463472fdcae420 0.41 0.41
d4e6e170b46b0573eedf88ee39bff33d 0.59 0.45
5f8fcfc5deca18ea246785d7fadc76f8 0.59 0.52
90e63ed27bb07868c753545bdd57ee28 0.53 0.52
0125856fdf4a17f747c7833695c52235 0.50 0.47
f4be2d179b0f2548fd748c8fc7c81990 0.51 0.48
1151fc48f90eebac658a3911515c3c66 0.47 0.45

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Table 8.3 shows the results using the same key and V values for CTR mode.

Again, the results are favorable.

28

Table 8.4 C T R_D R B G Parameters

Blank 3DES AES-128 AES-192 AES-256
outlen 64 128 128 128
keylen 168 128 192 256
seedlen 232 256 320 384
reseed_interval ≤232 ≤248 ≤248 ≤248

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

We now look more closely at the details of the PRNG defined in NIST SP 800-90A based on the CTR mode of operation. The PRNG is referred to as CTRDRBG (counter mode–deterministic random bit generator). CTR_DRBG is widely implemented and is part of the hardware random number generator implemented on all recent Intel processor chips (discussed in Section 8.6).

The DRBG assumes that an entropy source is available to provide random bits. Typically, the entropy source will be a TRNG based on some physical source. Other sources are possible if they meet the required entropy measure of the application. Entropy is an information theoretic concept that measures unpredict- ability, or randomness; see Appendix B for details. The encryption algorithm used in the DRBG may be 3DES with three keys or AES with a key size of 128, 192, or 256 bits.

Four parameters are associated with the algorithm:

■ Output block length (outlen): Length of the output block of the encryption algorithm.

■ Key length (keylen): Length of the encryption key.

■ Seed length (seedlen): The seed is a string of bits that is used as input to a DRBG mechanism. The seed will determine a portion of the internal state of the DRBG, and its entropy must be sufficient to support the security strength of the DRBG. seedlen = outlen + keylen.

■ Reseed interval (reseed_interval): Length of the encryption key. It is the maximum number of output blocks generated before updating the algorithm with a new seed.

Table 8.4 lists the values specified in SP 800-90A for these parameters.

29

Figure 8.5 C T R_D R B G Functions

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

30

Figure 8.5 shows the two principal functions that comprise CTR_DRBG.

We first consider how CTR_DRBG is initialized, using the initialize and update function (Figure 8.5a).

Once values of Key and V are obtained, the DRBG enters the generate phase and is able to generate pseudorandom bits, one output block at a time (Figure 8.5b). The encryption function is iterated to generate the number of pseudo- random bits desired. Each iteration uses the same encryption key. The counter value V is incremented by 1 for each iteration.

To enhance security, the number of bits generated by any PRNG should be limited. CTR_DRGB uses the parameter reseed_interval to set that limit.

8.4 Stream Ciphers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.6 Generic Structure of a Typical Stream Cipher

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Stream ciphers can be viewed a pseudorandom equivalent of a one-time pad. The one-time pad uses a long random key, of length equal to the plaintext message. A stream cipher uses a short secret key and a pseudorandomly generated stream of bits, computationally indistinguishable from a stream of random digits. Traditionally, block ciphers have been more widely used, in a greater range of applications. This is primarily due to the ability of block ciphers to easily be used in a variety of ways using different modes of operation. In addition, block ciphers can be used as stream ciphers via modes of operation such as Counter, OFB, and CBC.

In recent years, there has been a resurgence of interest in the use of stream ciphers [BIRY04]. Stream ciphers are useful when there is a need to encrypt large amounts of fast streaming data. And stream ciphers are well suited to use in devices with very limited memory and processing power, called constrained devices. Examples include small wireless sensors as part of an Internet of Things (IoT) and radio frequency identification (RFID) tags.

Figure 8.6 shows the structure of a typical stream cipher. There are three internal elements. There is a secret state i (i.e., memory) that evolves with time during encryption and decryption; the initial state is designated as 0. A state transition function f, at each bit generation time, computes a new state value from the old state value. An output function g produces the stream of bits used for encryption and decryption, known as the keystream zi. A secret key K provides input to the stream cipher, and is used to initialize the state. K may also serve as an input parameter to f. Some stream ciphers also include an initialization vector IV that is used, along with K, to initialize the state. As is the case for block ciphers, the IV for a stream cipher need not be secret. However, it should be unpredictable and unique.

32

Stream Cipher Design Considerations (1 of 2)

The encryption sequence should have a large period

A pseudorandom number generator uses a function that produces a deterministic stream of bits that eventually repeats; the longer the period of repeat the more difficult it will be to do cryptanalysis

The keystream should approximate the properties of a true random number stream as close as possible

There should be an approximately equal number of 1s and 0s

If the keystream is treated as a stream of bytes, then all of the 256 possible byte values should appear approximately equally often

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

33

The stream cipher is similar to the one-time pad discussed in Chapter 3. The

difference is that a one-time pad uses a genuine random number stream, whereas a stream cipher uses a pseudorandom number stream.

[KUMA97] lists the following important design considerations for a stream cipher.

1. The encryption sequence should have a large period. A pseudorandom number generator uses a function that produces a deterministic stream of bits that eventually repeats. The longer the period of repeat the more difficult it will

be to do cryptanalysis. This is essentially the same consideration that was discussed with reference to the Vigenère cipher, namely that the longer the keyword the more difficult the cryptanalysis.

2. The keystream should approximate the properties of a true random number

stream as close as possible. For example, there should be an approximately

equal number of 1s and 0s. If the keystream is treated as a stream of bytes,

then all of the 256 possible byte values should appear approximately equally

often. The more random-appearing the keystream is, the more randomized the

ciphertext is, making cryptanalysis more difficult.

3. Note from Figure 8.6 that the output of the pseudorandom number generator is conditioned on the value of the input key. To guard against brute-force attacks, the key needs to be sufficiently long. The same considerations that apply to block ciphers are valid here. Thus, with current technology, a key

length of at least 128 bits is desirable.

With a properly designed pseudorandom number generator, a stream cipher

can be as secure as a block cipher of comparable key length. A potential advantage of a stream cipher is that stream ciphers that do not use block ciphers as a building block are typically faster and use far less code than do block ciphers. The example in this chapter, RC4, can be implemented in just a few lines of code. In recent years, this advantage has diminished with the introduction of AES, which is quite efficient in software. Furthermore, hardware acceleration techniques are now available for AES. For example, the Intel AES Instruction Set has machine instructions for one round of encryption and decryption and key generation. Using the hardware instructions

results in speedups of about an order of magnitude compared to pure

software implementations [XU10].

One advantage of a block cipher is that you can reuse keys. In contrast, if two

plaintexts are encrypted with the same key using a stream cipher, then cryptanalysis is often quite simple [DAWS96]. If the two ciphertext streams are XORed together, the result is the XOR of the original plaintexts. If the plaintexts are text strings, credit card numbers, or other byte streams with known properties, then cryptanalysis may be successful.

For applications that require encryption/decryption of a stream of data, such as

over a data communications channel or a browser/Web link, a stream cipher might be the better alternative. For applications that deal with blocks of data, such as file transfer, e-mail, and database, block ciphers may be more appropriate. However, either type of cipher can be used in virtually any application.

A stream cipher can be constructed with any cryptographically strong PRNG,

such as the ones discussed in Sections 8.2 and 8.3. In the next section, we look at a stream cipher that uses a PRNG designed specifically for the stream cipher.

Stream Cipher Design Considerations (2 of 2)

A key length of at least 128 bits is desirable

The output of the pseudorandom number generator is conditioned on the value of the input key

The same considerations that apply to block ciphers are valid

With a properly designed pseudorandom number generator a stream cipher can be as secure as a block cipher of comparable key length

A potential advantage is that stream ciphers that do not use block ciphers as a building block are typically faster and use far less code than block ciphers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

34

The stream cipher is similar to the one-time pad discussed in Chapter 3. The

difference is that a one-time pad uses a genuine random number stream, whereas a stream cipher uses a pseudorandom number stream.

[KUMA97] lists the following important design considerations for a stream cipher.

1. The encryption sequence should have a large period. A pseudorandom number generator uses a function that produces a deterministic stream of bits that eventually repeats. The longer the period of repeat the more difficult it will

be to do cryptanalysis. This is essentially the same consideration that was discussed with reference to the Vigenère cipher, namely that the longer the keyword the more difficult the cryptanalysis.

2. The keystream should approximate the properties of a true random number

stream as close as possible. For example, there should be an approximately

equal number of 1s and 0s. If the keystream is treated as a stream of bytes,

then all of the 256 possible byte values should appear approximately equally

often. The more random-appearing the keystream is, the more randomized the

ciphertext is, making cryptanalysis more difficult.

3. Note from Figure 8.6 that the output of the pseudorandom number generator is conditioned on the value of the input key. To guard against brute-force attacks, the key needs to be sufficiently long. The same considerations that apply to block ciphers are valid here. Thus, with current technology, a key

length of at least 128 bits is desirable.

With a properly designed pseudorandom number generator, a stream cipher

can be as secure as a block cipher of comparable key length. A potential advantage of a stream cipher is that stream ciphers that do not use block ciphers as a building block are typically faster and use far less code than do block ciphers. The example in this chapter, RC4, can be implemented in just a few lines of code. In recent years, this advantage has diminished with the introduction of AES, which is quite efficient in software. Furthermore, hardware acceleration techniques are now available for AES. For example, the Intel AES Instruction Set has machine instructions for one round of encryption and decryption and key generation. Using the hardware instructions

results in speedups of about an order of magnitude compared to pure

software implementations [XU10].

One advantage of a block cipher is that you can reuse keys. In contrast, if two

plaintexts are encrypted with the same key using a stream cipher, then cryptanalysis is often quite simple [DAWS96]. If the two ciphertext streams are XORed together, the result is the XOR of the original plaintexts. If the plaintexts are text strings, credit card numbers, or other byte streams with known properties, then cryptanalysis may be successful.

For applications that require encryption/decryption of a stream of data, such as

over a data communications channel or a browser/Web link, a stream cipher might be the better alternative. For applications that deal with blocks of data, such as file transfer, e-mail, and database, block ciphers may be more appropriate. However, either type of cipher can be used in virtually any application.

A stream cipher can be constructed with any cryptographically strong PRNG,

such as the ones discussed in Sections 8.2 and 8.3. In the next section, we look at a stream cipher that uses a PRNG designed specifically for the stream cipher.

8.5 RC4

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

R C4

Designed in 1987 by Ron Rivest for R S A Security

Variable key size stream cipher with byte-oriented operations

Based on the use of a random permutation

Eight to sixteen machine operations are required per output byte and the cipher can be expected to run very quickly in software

R C4 is used in the WiFi Protected Access (W P A) protocol that are part of the I E E E 802.11 wireless L A N standard

It is optional for use in Secure Shell (S S H) and Kerberos

R C4 was kept as a trade secret by R S A Security until September 1994 when the R C4 algorithm was anonymously posted on the Internet on the Cypherpunks anonymous remailers list

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

36

RC4 is a stream cipher designed in 1987 by Ron Rivest for RSA Security. It is a variable key size stream cipher with byte-oriented operations. The algorithm is based on the use of a random permutation. Analysis shows that the period of the cipher is overwhelmingly likely to be greater than 10100 [ROBS95a]. Eight to sixteen machine operations are required per output byte, and the cipher can be expected to run very quickly in software. RC4 is used in the WiFi Protected Access (WPA) protocol that are part of the IEEE 802.11 wireless LAN standard.

It is optional for use in Secure Shell (SSH) and Kerberos. RC4 was kept as a trade secret by RSA Security. In September 1994, the RC4 algorithm was anonymously posted on the Internet on the Cypherpunks anonymous remailers list.

The RC4 algorithm is remarkably simple and quite easy to explain. A variable-length key of from 1 to 256 bytes (8 to 2048 bits) is used to initialize a 256-byte state vector S, with elements S[0], S[1], c , S[255]. At all times, S contains a permutation of all 8-bit numbers from 0 through 255. For encryption and decryption, a byte k is generated from S by selecting one of the 255 entries in a systematic fashion. As each value of k is generated, the entries in S are once again permuted.

Figure 8.7 R C4

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

37

Figure 8.7 illustrates the RC4 logic.

Strength of R C4

A fundamental vulnerability was revealed in the R C4 key scheduling algorithm that reduces the amount of effort to discover the key

Recent cryptanalysis results exploit biases in the R C4 keystream to recover repeatedly encrypted plaintexts

As a result of the discovered weaknesses the I E T F issued R F C 7465 prohibiting the use of R C4 in T L S

In its latest T L S guidelines, N I S T also prohibited the use of R C4 for government use

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

38

More recently, [PAUL07] revealed a more fundamental vulnerability in the RC4 key scheduling algorithm that reduces the amount of effort to discover the key. Recent cryptanalysis results [ALFA13] exploit biases in the RC4 keystream to recover repeatedly encrypted plaintexts. As a result of the discovered weaknesses, particularly those reported in [ALFA13], the IETF issued RFC 7465 prohibiting the use of RC4 in TLS (Prohibiting RC4 Cipher Suites, February 2015). In its latest TLS guidelines, NIST also prohibited the use of RC4 for government use (SP 800-52, Guidelines for the Selection, Configuration, and Use of Transport Layer Security (TLS) Implementations, September 2013).

8.6 Stream Ciphers Using Feedback Shift Registers

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Stream Ciphers Using Feedback Shift Registers (1 of 2)

With the increasing use of highly constrained devices there has been increasing interest in developing new stream ciphers that take up minimal memory, are highly efficient, and have minimal power consumption requirements

Most of the recently developed stream ciphers are based on the use of feedback shift registers (F S R s)

F S R s exhibit the desired performance behavior, are well-suited to compact hardware implementation, and there are well-developed theoretical results on the statistical properties of the bit sequences they produce

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

With the increasing use of highly constrained devices, such as those used in the IoT, there has been increasing interest in developing new stream ciphers that take up minimal memory, are highly efficient, and have minimal power consumption requirements. Most of the recently developed stream ciphers are based on the use of feedback shift registers (FSRs). Feedback shift registers exhibit the desired performance behavior, are well-suited to compact hardware implementation, and there are well-developed theoretical results on the statistical properties of the bit sequences they produce.

An FSR consists of a sequence of 1-bit memory cells. Each cell has an output line, which indicates the value currently stored, and an input line. At discrete time instants, known as clock times, the value in each storage device is replaced by the value indicated by its input line. The effect is as follows: The rightmost (least significant) bit is shifted out as the output bit for this clock cycle. The other bits are shifted one bit position to the right. The new leftmost (most significant) bit is calculated as a function of the other bits in the FSR.

40

Stream Ciphers Using Feedback Shift Registers (2 of 2)

An F S R consists of a sequence of 1-bit memory cells

Each cell has an output line, which indicates the value currently stored, and an input line

At discrete time instants, known as clock times, the value in each storage device is replaced by the value indicated by its input line

The effect is as follows: The rightmost (least significant) bit is shifted out as the output bit for this clock cycle; the other bits are shifted one bit position to the right; the new leftmost (most significant) bit is calculated as a function of the other bits in the F S R

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

With the increasing use of highly constrained devices, such as those used in the IoT, there has been increasing interest in developing new stream ciphers that take up minimal memory, are highly efficient, and have minimal power consumption requirements. Most of the recently developed stream ciphers are based on the use of feedback shift registers (FSRs). Feedback shift registers exhibit the desired performance behavior, are well-suited to compact hardware implementation, and there are well-developed theoretical results on the statistical properties of the bit sequences they produce.

An FSR consists of a sequence of 1-bit memory cells. Each cell has an output line, which indicates the value currently stored, and an input line. At discrete time instants, known as clock times, the value in each storage device is replaced by the value indicated by its input line. The effect is as follows: The rightmost (least significant) bit is shifted out as the output bit for this clock cycle. The other bits are shifted one bit position to the right. The new leftmost (most significant) bit is calculated as a function of the other bits in the FSR.

41

Figure 8.8 Binary Linear Feedback Shift Register Sequence Generator

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.8 Binary Linear Feedback Shift Register Sequence Generator

42

Figure 8.9 4-Bit Linear Feedback Shift Register

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

The shift register technique has several important advantages. The sequences generated by an LFSR can be nearly random with long periods. In addition, LFSRs are easy to implement in hardware and can run at high speeds.

It can be shown that the output of an n-bit LFSR is periodic with maximum period N = 2n - 1. The all-zeros sequence occurs only if either the initial contents of the LFSR are all zero or the coefficients in Equation (8.1) are all zero (no feedback). A feedback configuration can always be found that gives a period of N; the resulting sequences are called maximal-length sequences, or m-sequences.

Figure 8.9b shows the generation of an m-sequence for the LFSR of Figure 8.9a. The LFSR implements Equation (8.2) with an initial state of 1000 (B3 = 1, B2 = 0, B1 = 0, B0 = 0). Figure 8.9b shows the step-by-step operation as the LFSR is clocked one bit at a time. Each row of the table shows the values currently stored in the four shift register elements. In addition, the row shows the value that appears at the output of the exclusive-OR circuit. Finally, the row shows the value of the output bit, which is just B0. Note that the output repeats after 15 bits. That is, the period of the sequence, or the length of the m-sequence, is 15 = 24 - 1. This same periodic m-sequence is generated regardless of the initial state of the LFSR (except for 0000). With each different initial state, the m-sequence begins at a different point in its cycle, but it is the same sequence.

43

Figure 8.10 1/(1 + X + X3)

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

For any given size of LFSR, a number of different unique m-sequences can be generated by using different values for the A in Equation (8.1).

An equivalent definition of an LFSR configuration is a characteristic polynomial. The characteristic polynomial P (X ) that corresponds to Equation (8.1) has the form in Figure 8.10.

44

Figure 8.11 A Nonlinear Feedback Shift Register

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

The term linear, in the context of LFSR, means that the coefficients Ai in Equations 8.1 and 8.3 are constants; in particular these are Boolean constants (0 or 1). For an NFSR, the coefficients may be variables. An example is Figure 8.11.

45

Grain-128a (1 of 2)

Grain is a family of hardware-efficient stream ciphers

Grain was accepted as part of the eSTREAM effort to approve a number of new stream ciphers

The eSTREAM specification, called Grain v1, defines two stream ciphers, one with an 80-bit key and a 64-bit initialization vector (I V), and one with a 128-bit key and 80-bit I V

Grain has since been revised and expanded to include authentication, referred to as Grain-128a

The eSTREAM final report states that Grain has pushed the state of the art in terms of compact implementation

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Grain is a family of hardware-efficient stream ciphers. Grain was accepted as part of the eSTREAM effort to approve a number of new stream ciphers (described in Chapter 23). The eSTREAM specification, called Grain v1, defines two stream ciphers, one with an 80-bit key and a 64-bit initialization vector (IV), and one with a 128-bit key and 80-bit IV. Grain has since been revised and expanded to include authentication, referred to as Grain-128a [AGRE11, HELL06]. The eSTREAM final report [BABB08] states that Grain has pushed the state of the art in terms of compact implementation.

Grain-128a consists of two shift registers, one with linear feedback and the second with nonlinear feedback, and a filter function. The registers are couple by very lightweight, but judiciously chosen Boolean functions. The LFSR guarantees a minimum period for the keystream, and it also provides balancedness in the output. The NFSR, together with a nonlinear filter, introduces nonlinearity to the cipher. The input to the NFSR is masked with the output of the LFSR so that the state of the NFSR is balanced.

46

Grain-128a (2 of 2)

Grain-128a consists of two shift registers, one with linear feedback and the second with nonlinear feedback, and a filter function

The registers are couple by very lightweight, but judiciously chosen Boolean functions

The L F S R guarantees a minimum period for the keystream, and it also provides balancedness in the output.

The N F S R, together with a nonlinear filter, introduces nonlinearity to the cipher

The input to the N F S R is masked with the output of the L F S R so that the state of the N F S R is balanced

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Grain is a family of hardware-efficient stream ciphers. Grain was accepted as part of the eSTREAM effort to approve a number of new stream ciphers (described in Chapter 23). The eSTREAM specification, called Grain v1, defines two stream ciphers, one with an 80-bit key and a 64-bit initialization vector (IV), and one with a 128-bit key and 80-bit IV. Grain has since been revised and expanded to include authentication, referred to as Grain-128a [AGRE11, HELL06]. The eSTREAM final report [BABB08] states that Grain has pushed the state of the art in terms of compact implementation.

Grain-128a consists of two shift registers, one with linear feedback and the second with nonlinear feedback, and a filter function. The registers are couple by very lightweight, but judiciously chosen Boolean functions. The LFSR guarantees a minimum period for the keystream, and it also provides balancedness in the output. The NFSR, together with a nonlinear filter, introduces nonlinearity to the cipher. The input to the NFSR is masked with the output of the LFSR so that the state of the NFSR is balanced.

47

Figure 8.12 Grain-128a Stream Cipher

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.12a shows the structure of Grain-128a for producing a stream of output bits to be used for encrypting a stream of plaintext by a simple bitwise XOR operation. Grain-128a uses a convention of numbering the bits in the registers increasing from left to right and doing a left shift, with the leftmost bit as output.

48

8.7 True Random Number Generators

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Entropy Sources

A true random number generator (T R N G) uses a nondeterministic source to produce randomness

Most operate by measuring unpredictable natural processes such as pulse detectors of ionizing radiation events, gas discharge tubes, and leaky capacitors

Intel has developed a commercially available chip that samples thermal noise by amplifying the voltage measured across undriven resistors

LavaRnd is an open source project for creating truly random numbers using inexpensive cameras, open source code, and inexpensive hardware

The system uses a saturated C C D in a light-tight can as a chaotic source to produce the seed; software processes the result into truly random numbers in a variety of formats

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A true random number generator (TRNG) uses a nondeterministic source to

produce randomness. Most operate by measuring unpredictable natural processes, such as pulse detectors of ionizing radiation events, gas discharge tubes, and leaky capacitors. Intel has developed a commercially available chip that samples thermal noise by sampling the output of a coupled pair of inverters. LavaRnd is an open source project for creating truly random numbers using inexpensive cameras, open source code, and inexpensive hardware. The system uses a saturated CCD in a light-tight can as a chaotic source to produce the seed. Software processes the result into truly random numbers in a variety of formats.

50

Possible Sources of Randomness

R F C 4086 lists the following possible sources of randomness that can be used on a computer to generate true random sequences:

Sound/video input

The input from a sound digitizer with no source plugged in or from a camera with the lens cap on is essentially thermal noise

If the system has enough gain to detect anything, such input can provide reasonable high quality random bits

Disk drives

Have small random fluctuations in their rotational speed due to chaotic air turbulence

The addition of low-level disk seek-time instrumentation produces a series of measurements that contain this randomness

There is also an online service (random.org) which can deliver random sequences securely over the Internet

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

RFC 4086 lists the following possible sources of randomness that, with care,

easily can be used on a computer to generate true random sequences.

• Sound/video input: Many computers are built with inputs that digitize some

real-world analog source, such as sound from a microphone or video input

from a camera. The “input” from a sound digitizer with no source plugged

in or from a camera with the lens cap on is essentially thermal noise. If the

system has enough gain to detect anything, such input can provide reasonably

high quality random bits.

• Disk drives: Disk drives have small random fluctuations in their rotational

speed due to chaotic air turbulence [JAKO98]. The addition of low-level disk

seek-time instrumentation produces a series of measurements that contain

this randomness. Such data is usually highly correlated, so significant processing is needed. Nevertheless, experimentation a decade ago showed that, with such processing, even slow disk drives on the slower computers of that day could easily produce 100 bits a minute or more of excellent random data.

There is also an online service (random.org), which can deliver random

sequences securely over the Internet.

51

Table 8.5 Comparison of P R N Gs and T R N Gs

Blank Pseudorandom Number Generators True Random Number Generators
Efficiency Very efficient Generally inefficient
Determinism Deterministic Nondeterministic
Periodicity Periodic Aperiodic

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Table 8.5 summarizes the principal differences between PRNGs and TRNGs.

PRNGs are efficient, meaning they can produce many numbers in a short time, and deterministic, meaning that a given sequence of numbers can be reproduced at a later date if the starting point in the sequence is known. Efficiency is a nice characteristic if your application needs many numbers, and determinism is handy if you need to replay the same sequence of numbers again at a later stage. PRNGs are typically also periodic, which means that the sequence will eventually repeat itself. While periodicity is hardly ever a desirable characteristic, modern PRNGs have a period that is so long that it can be ignored for most practical purposes.

TRNGs are generally rather inefficient compared to PRNGs, taking considerably longer time to produce numbers. This presents a difficulty in many applications. For example, cryptography system in banking or national security might need to generate millions of random bits per second. TRNGs are also nondeterministic, meaning that a given sequence of numbers cannot be reproduced, although the same sequence may of course occur several times by chance. TRNGs have no period.

52

Conditioning

A T R N G may produce an output that is biased in some way (such as having more ones than zeros or vice versa)

Biased

N I S T S P 800-90B defines a random process as biased with respect to an assumed discrete set of potential outcomes if some of those outcomes have a greater probability of occurring than do others

Entropy rate

N I S T 800-90B defines entropy rate as the rate at which a digitized noise source provides entropy

Is a measure of the randomness or unpredictability of a bit string

Will be a value between 0 (no entropy) and 1 (full entropy)

Conditioning algorithms/deskewing algorithms

Methods of modifying a bit stream to further randomize the bits

Typically conditioning is done by using a cryptographic algorithm to scramble the random bits so as to eliminate bias and increase entropy

The two most common approaches are the use of a hash function or a symmetric block cipher

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A TRNG may produce an output that is biased in some way, such as having more ones than zeros or vice versa. More generally, NIST SP 800-90B defines a random process as biased with respect to an assumed discrete set of potential outcomes (i.e., possible output values) if some of those outcomes have a greater probability of occurring than do others. For example, a physical source such as electronic noise may contain a superposition of regular structures, such as waves or other periodic phenomena, which may appear to be random, yet are determined to be non-random using statistical tests.

In addition to bias, another concept used by SP 800-98B is that of entropy rate. SP 800-90B defines entropy rate as the rate at which a digitized noise source (or entropy source) provides entropy; it is computed as the assessed amount of entropy provided by a bit string output from the source, divided by the total number of bits in the bit string (yielding assessed bits of entropy per output bit). This will be a value between 0 (no entropy) and 1 (full entropy). Entropy rate is a measure of the randomness or unpredictability of a bit string. Another way of expressing it is that the entropy rate is k /n for a random source of length n bits and min-entropy k. Min-entropy is a measure of the number of random bits and is explained in Appendix B. In essence, a block of bits or a bit stream that is unbiased, and in which each bit and each group of bits is independent of all other bits and groups of bits will have an entropy rate of 1.

For hardware sources of random bits, the recommended approach is to assume that there may be bias and/or an entropy rate of less than 1 and to apply techniques to further “randomize” the bits. Various methods of modifying a bit stream for this purpose have been developed. These are referred to as conditioning algorithms or deskewing algorithms.

Typically, conditioning is done by using a cryptographic algorithm to “scramble”

the random bits so as to eliminate bias and increase entropy. The two most common approaches are the use of a hash function or a symmetric block cipher.

53

Hash Function

A hash function produces an n-bit output from an input of arbitrary length

A simple way to use a hash function for conditioning is as follows:

Blocks of m input bits, with m ≥ n, are passed through the hash function and the n output bits are used as random bits

To generate a stream of random bits, successive input blocks pass through the hash function to produce successive hashed output blocks

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

A hash function produces an n-bit output from an input of arbitrary length. A simple way to use a hash function for conditioning is as follows. Blocks of m input bits, with m >n , are passed through the hash function and the n output bits are used as random bits. To generate a stream of random bits, successive input blocks pass through the hash function to produce successive hashed output blocks.

Operating systems typically provide a built-in mechanism for generating random numbers. For example, Linux uses four entropy sources: mouse and keyboard, activity, disk I/O operations, and specific interrupts. Bits are generated from these four sources and combined in a pooled buffer. When random bits are needed, the appropriate number of bits are read from the buffer and passed through the SHA-1 hash function.

A more complex approach is the hash derivation function specified in

SP800-90A. Hash_df can be defined as follows:

Parameters:

input_string : The string to be hashed.

outlen : Output length.

no_of_bits_to_return : The number of bits to be returned by Hash_df. The maximum length (max_number_of_bits ) is implementation dependent, but shall be less than or equal to (255 * outlen ). no_of_bits_to_return is represented as a 32-bit integer.

requested_bits : The result of performing the Hash_df.

54

Figure 8.13 N R B G Model

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.13 provides a general model for a nondeterministic random bit generator. A hardware noise source produces a true random output. This is digitized to produce true, or nondeterministic, source of bits. This bit source then passes through a conditioning module to mitigate bias and maximize entropy.

Figure 8.13 also shows a health-testing module, which is used on the outputs

of both the digitizer and conditioner. In essence, health testing is used to validate

that the noise source is working as expected and that the conditioning module is

produced output with the desired characteristics. Both forms of health testing are

recommended by SP 800-90B.

55

Health Tests on the Noise Source (1 of 4)

The nature of the health testing of the noise source depends strongly on the technology used to produce noise

In general, the assumption can be made that the digitized output of the noise source will exhibit some bias

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

The nature of the health testing of the noise source depends strongly on the technology used to produce noise. In general, we can assume that the digitized output of the noise source will exhibit some bias. Thus,

the traditional statistical tests, such as those defined in SP 800-22 and discussed in Section 8.1, are not useful for monitoring the noise source, because the noise source is likely to always fail. Rather, the tests on the noise source need to be tailored to the expected statistical behavior of the correctly operating noise source. The goal is not to determine if the source is unbiased, which it isn’t, but if it is operating as expected.

56

Health Tests on the Noise Source (2 of 4)

Thus, traditional statistical tests are not useful for monitoring the noise source, because the noise source is likely to always fail

The tests on the noise source need to be tailored to the expected statistical behavior of the correctly operating noise source

The goal is not to determine if the source is unbiased, but if it is operating as expected

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

The nature of the health testing of the noise source depends strongly on the technology used to produce noise. In general, we can assume that the digitized output of the noise source will exhibit some bias. Thus,

the traditional statistical tests, such as those defined in SP 800-22 and discussed in Section 8.1, are not useful for monitoring the noise source, because the noise source is likely to always fail. Rather, the tests on the noise source need to be tailored to the expected statistical behavior of the correctly operating noise source. The goal is not to determine if the source is unbiased, which it isn’t, but if it is operating as expected.

57

Health Tests on the Noise Source (3 of 4)

S P 800-90B specifies that continuous tests be done on digitized samples obtained from the noise source

The purpose is to test for variability and to determine if the noise source is producing at the expected entropy rate

S P 800-90B mandates the use of two tests

Repetition Count Test

Designed to quickly detect a catastrophic failure that causes the noise source to become “stuck” on a single output value for a long time

Involves looking for consecutive identical samples

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

SP 800-90B specifies that continuous tests be done on digitized samples

obtained from the noise source (point A in Figure 8.13). The purpose is to test for variability. More specifically, the purpose is to determine if the noise source is producing at the expected entropy rate. SP 800-90B mandates the use of two tests: the Repetition Count Test and the Adaptive Proportion Test.

The Repetition Count Test is designed to quickly detect a catastrophic failure

that causes the noise source to become “stuck” on a single output value for a long time. For this test, it is assumed that a given noise source is assessed to have a given min-entropy value of H . The entropy is expressed as the amount of entropy per sample, where a sample could be a single bit or some block of bits of length n . With an assessed value of H , it is straightforward to calculate the probability that a sequence of C consecutive samples will yield identical sample values.

The Repetition Count Test involves looking for consecutive identical samples. If the count reaches some cutoff value C, then an error condition is raised. To determine the value of C used in the test, the test must be configured with a parameter W, which is the acceptable false-positive probability associated with an alarm triggered by C repeated sample values. To avoid false positives, W should be set at some very small number greater than 0. Given W, we can now determine the value of C.

The Repetition Count Test starts by recording a sample value and then counting the number of repetitions of the same value. If the counter reaches the cutoff value C, an error is reported. If a sample value is encountered that differs from the preceding sample, then the counter is reset to 1 and the algorithm starts over.

The Adaptive Proportion Test is designed to detect a large loss of entropy,

such as might occur as a result of some physical failure or environmental change affecting the noise source. The test continuously measures the local frequency of occurrence of some sample value in a sequence of noise source samples to determine if the sample occurs too frequently.

The test starts by recording a sample value and then observes N successive sample values. If the initial sample value is observed at least C times, then an error condition is reported. SP 800-90B recommends that a probability of a false positive of W = 2-30 be used for the test and provides guidance on the selection of values for N and C.

SP 800-90B specifies that health tests should also be applied to the output of the conditioning component (point B in Figure 8.13), but does not indicate which tests to use. The purpose of the health tests on the conditioning component is to assure that the output behaves as a true random bit stream. Thus, it is reasonable to use the tests for randomness defined in SP 800-22, and described in Section 8.1.

58

Health Tests on the Noise Source (4 of 4)

Adaptive Proportion Test

Designed to detect a large loss of entropy, such as might occur as a result of some physical failure or environmental change affecting the noise source

The test continuously measures the local frequency of occurrence of some sample value in a sequence of noise source samples to determine if the sample occurs too frequently

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

SP 800-90B specifies that continuous tests be done on digitized samples

obtained from the noise source (point A in Figure 8.13). The purpose is to test for variability. More specifically, the purpose is to determine if the noise source is producing at the expected entropy rate. SP 800-90B mandates the use of two tests: the Repetition Count Test and the Adaptive Proportion Test.

The Repetition Count Test is designed to quickly detect a catastrophic failure

that causes the noise source to become “stuck” on a single output value for a long time. For this test, it is assumed that a given noise source is assessed to have a given min-entropy value of H . The entropy is expressed as the amount of entropy per sample, where a sample could be a single bit or some block of bits of length n . With an assessed value of H , it is straightforward to calculate the probability that a sequence of C consecutive samples will yield identical sample values.

The Repetition Count Test involves looking for consecutive identical samples. If the count reaches some cutoff value C, then an error condition is raised. To determine the value of C used in the test, the test must be configured with a parameter W, which is the acceptable false-positive probability associated with an alarm triggered by C repeated sample values. To avoid false positives, W should be set at some very small number greater than 0. Given W, we can now determine the value of C.

The Repetition Count Test starts by recording a sample value and then counting the number of repetitions of the same value. If the counter reaches the cutoff value C, an error is reported. If a sample value is encountered that differs from the preceding sample, then the counter is reset to 1 and the algorithm starts over.

The Adaptive Proportion Test is designed to detect a large loss of entropy,

such as might occur as a result of some physical failure or environmental change affecting the noise source. The test continuously measures the local frequency of occurrence of some sample value in a sequence of noise source samples to determine if the sample occurs too frequently.

The test starts by recording a sample value and then observes N successive sample values. If the initial sample value is observed at least C times, then an error condition is reported. SP 800-90B recommends that a probability of a false positive of W = 2-30 be used for the test and provides guidance on the selection of values for N and C.

SP 800-90B specifies that health tests should also be applied to the output of the conditioning component (point B in Figure 8.13), but does not indicate which tests to use. The purpose of the health tests on the conditioning component is to assure that the output behaves as a true random bit stream. Thus, it is reasonable to use the tests for randomness defined in SP 800-22, and described in Section 8.1.

59

Health Tests on the Conditioning Function

S P 800-90B specifies that health tests should also be applied to the output of the conditioning component, but does not indicate which tests to use

The purpose of the health tests on the conditioning component is to assure that the output behaves as a true random bit stream

It is reasonable to use the tests for randomness defined in S P 800-22

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

SP 800-90B specifies that health tests should also be applied to the output of the conditioning component (point B in Figure 8.13), but does not indicate which tests to use. The purpose of the health tests on the conditioning component is to assure that the output behaves as a true random bit stream. Thus, it is reasonable to use the tests for randomness defined in SP 800-22, and described in Section 8.1.

60

Intel Digital Random Number Generator

T R N Gs have traditionally been used only for key generation and other applications where only a small number of random bits were required

This is because T R N Gs have generally been inefficient with a low bit rate of random bit production

The first commercially available T R N G that achieves bit production rates comparable with that of P R N Gs is the Intel digital random number generator offered on new multicore chips since May 2012

It is implemented entirely in hardware

The entire D R N G is on the same multicore chip as the processors

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

As was mentioned, TRNGs have traditionally been used only for key generation and other applications where only a small number of random bits were required. This is because TRNGs have generally been inefficient, with a low bit rate of random bit production.

The first commercially available TRNG that achieves bit production rates

comparable with that of PRNGs is the Intel digital random number generator

(DRNG) [TAYL11, MECH14], offered on new multicore chips since May 2012.

Two notable aspects of the DRNG:

1. It is implemented entirely in hardware. This provides greater security than a facility that includes a software component. A hardware-only implementation should also be able to achieve greater computation speed than a software module.

2. The entire DRNG is on the same multicore chip as the processors. This eliminates the I/O delays found in other hardware random number generators.

61

Figure 8.14 Intel Processor Chip with Random Number Generator

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.14 shows the overall structure of the DRNG. The first stage of the DRNG generates random numbers from thermal noise. The heart of the stage consists of two inverters (NOT gates), with the output of each inverter connected to the input of the other. Such an arrangement has two stable states, with one inverter having an output of logical 1 and the other having an output of logical 0. The circuit is then configured so that both inverters are forced to have the same indeterminate state (both inputs and both outputs at logical 1) by clock pulses. Random thermal noise within the inverters soon jostles the two inverters into a mutually stable state. Additional

circuitry is intended to compensate for any biases or correlations. This stage is

capable, with current hardware, of generating random bits at a rate of 4 Gbps.

The output of the first stage is generated 512 bits at a time. To assure that

the bit stream does not have skew or bias, a conditioner randomizes its input using a cryptographic function. In this case, the function is referred to

as CBC-MAC or CMAC, as specified in NIST SP 800-38B. In essence, CMAC

encrypts its input using the cipher block chaining (CBC) mode (Figure 8.4) and

outputs the final block. We examine CMAC in detail in Chapter 12. The output of this stage is generated 256 bits at a time and is intended to exhibit true randomness with no skew or bias.

While the hardware’s circuitry generates random numbers from thermal noise much more quickly than its predecessors, it is still not fast enough for some of today’s computing requirements. To enable the DRNG to generate random numbers as quickly as a software DRBG, and also maintain the high quality of the random numbers, a third stage is added. This stage uses the 256-bit random numbers to seed a cryptographically secure DRBG that creates 128-bit numbers. From one 256-bit seed, the DRBG can output many pseudorandom numbers, exceeding the 3-Gbps rate of the entropy source. An upper bound of 511 128-bit samples can be generated per seed. The algorithm used for this stage is CTR_DRBG, described in Section 8.3.

The output of the DRNG is available to each of the cores on the chip via the

RDRAND instruction. RDRAND retrieves a 16-, 32-, or 64-bit random value and makes it available in a software-accessible register.

Preliminary data from a pre-production sample on a system with a third

generation Intel® Core™ family processor produced the following performance

[INTE12]: up to 70 million RDRAND invocations per second, and a random data production rate of over 4 Gbps.

62

Figure 8.15 Intel D R N G Logical Structure

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Figure 8.15 provides a simplified view of the logical flow of the Intel DRBG. As was described, the heart of the hardware entropy source is a pair of inverters that feed each other. Two transistors, driven by the same clock, force the inputs and outputs of both inverters to the logical 1 state. Because this is an unstable state, thermal noise will cause the configuration to settle randomly into a stable state with either Node A at logical 1 and Node B at logical 0, or the reverse. Thus the module generates random bits at the clock rate.

The output of the entropy source is collected 512 bits at a time and used to

feed to two CBC hardware implementations using AES encryption. Each implementation takes two blocks of 128 bits of “plaintext” and encrypts using the CBC mode. The output of the second encryption is retained. For both CBC modules, an all-zeros key is used initially. Subsequently, the output of the PRNG stage is fed back to become the key for the conditioner stage.

The output of the conditioner stage consists of 256 bits. This block is provided

as input to the update function of the PRNG stage. The update function is initialized with the all-zeros key and the counter value 0. The function is iterated twice to produce a 256-block, which is then XORed with the input from the conditioner stage. The results are used as the 128-bit key and the 128-bit seed for the generate function. The generate function produces pseudorandom bits in 128-bit blocks.

63

Summary

Explain the concepts of randomness and unpredictability with respect to random numbers

Present an overview of requirements for pseudorandom number generators

Explain the significance of skew

Present an overview of stream ciphers and R C4

Understand the differences among true random number generators, pseudorandom number generators, and pseudorandom functions

Explain how a block cipher can be used to construct a pseudorandom number generator

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

Chapter 8 summary.

64

Copyright

This work is protected by United States copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the World Wide Web) will destroy the integrity of the work and is not permitted. The work and materials from it should never be made available to students except by instructors using the accompanying text in their classes. All recipients of this work are expected to abide by these restrictions and to honor the intended pedagogical purposes and the needs of other instructors who rely on these materials.

Copyright © 2020 Pearson Education, Inc. All Rights Reserved.

65

.MsftOfcThm_Text1_Fill { fill:#000000; } .MsftOfcThm_MainDark1_Stroke { stroke:#000000; }