Chapter-3-DataLinkLayer.pdf

The Data Link Layer Chapter 3

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

• Data Link Layer Design Issues (in PART-1)

• Error Detection and Correction (in PART-1)

• Elementary Data Link Protocols (in PART-2)

• Sliding Window Protocols (in PART-2)

• Example Data Link Protocols (in PART-2)

Revised: August 2011

The Data Link Layer

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Responsible for delivering frames of

information over a single link

• Handles transmission errors and

regulates the flow of data Physical

Link

Network

Transport

Application

Data Link Layer Design Issues

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

• Frames »

• Possible services »

• Framing methods »

• Error control »

• Flow control »

Frames

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Link layer accepts packets from the network layer, and

encapsulates them into frames that it sends using the

physical layer; reception is the opposite process

Actual data path

Virtual data path

Network

Link

Physical

Possible Services

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Acknowledged connection-oriented service

• Dedicated connection (data channel) is set up (there are specific

protocols for connection set-up and disconnection) before data is

transmitted from transmitter (tx) to receiver (rx); rare.

• Each frame sent by tx is numbered and the rx replies with an ack

when it receives a frame.

Unacknowledged connectionless service (e.g., Ethernet)

• No connection established beforehand.

• No ack sent by rx.

• Frame is sent with no connection / error recovery

Acknowledged connectionless service

• Frame is sent with retransmissions if needed.

• Example is 802.11

Framing Methods

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Key Question: How does the rx know where is the

start of a certain frame and where does it end?

• Byte count »

• Flag bytes with byte stuffing »

• Flag bits with bit stuffing »

• Physical layer coding violations

− Use non-data symbol to indicate frame

Framing – Byte count

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Frame begins with a count of the number of bytes in it

(length field). Note that the number of bytes in a frame

includes the length field.

• Simple, but difficult to resynchronize after an error

Error

case

Expected

case

Framing – Byte stuffing

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Special flag bytes delimit frames; occurrences of flags in

the data must be stuffed (escaped).

• Two consecutive FLAG bytes demarcate the end of a frame

and start of the next frame.

• Longer, but easy to resynchronize after error (look for next

occurrence of two successive FLAG bytes.

Stuffing

examples

Frame

format

Escape byte

stuffed.

Framing – Bit stuffing

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Stuffing done at the bit level:

• Each frame begins and ends with a special bit pattern 0 1 1 1 1 1

1 0 (or 0x7E in hex, the key is 6 consecutive1‟s).

• What if six consecutive 1‟s appear in the main data

stream?

− At tx, after five 1‟s in the data, a 0 is added (stuffed)

− At rx, a 0 after five 1‟s is deleted

Transmitted bits

with stuffing

Data bits

Framing – Practice

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

In practice, combinations of the previously discussed

approaches for used.

For example, in Ethernet and 802.11, a frame begins

with a well defined pattern known as a preamble. The

preamble is followed by a „length (count) field‟ which is

used to locate the end of the frame.

• In 802.11, the preamble can be as long as 72 bits.

Error Control

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error control repairs frames that are received in error

• Requires errors to be detected at the receiver

• Typical strategy: retransmit a frame which has not

been acknowledged.

• Timer protects against lost acknowledgements (this

is the case when the rx does send an ack, but the

ack is lost).

Detecting errors and retransmissions are next topics.

Flow Control

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Prevents a fast sender from out-pacing a slow receiver

(buffer overflow at the rx)

• Receiver gives feedback on the data it can accept

• Rare in the link layer as NICs (network interface

card) run at “wire speed”

− Receiver can take data as fast as it can be sent

Flow control is a topic in the Link and Transport layers.

Error Detection and Correction

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error codes add structured redundancy to data so errors can be detected, and possibly also corrected.

Error correction codes:

• Hamming codes (we will only cover this) »

• Binary convolutional codes »

• Reed-Solomon and Low-Density Parity Check codes − Mathematically complex, widely used in real systems

Error detection codes:

• Parity »

• Checksums »

• Cyclic redundancy codes (CRC check) »

Error Models

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

• Isolated single bit errors – the probability that any bit is in error is independent of the probability that other bits in the frame are in

error; e.g, if the frame is [1 0 0 1 1 0 1], and if the bit error

probability is 0.001, the probability that the 7th bit is in error is still

0.001, even if the first 6 bits are in error. (you can generalize it to

isolated 2 bit errors, 3 bit errors, etc.)

• Burst errors (more common) – the probability that any bit is in error is dependent on the location of the first bit which is in error

and the burst length, B; e.g, if the frame is [1 0 0 1 1 0 1], an

error burst of length B = 7 means that if the first bit is in error, all

seven bits in the frame are in error.

• Error detection codes are characterized by how well they can detect isolated and bursty bit errors.

Error Bounds – Hamming distance

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Code turns data of n bits into codewords of n+k bits (i.e., no. of

redundant bits added = k)

The number of 1‟s in a codeword (bit string) is called the Hamming

Weight of the codeword.

Hamming Distance (d) between two codewords (bit strings) is the

number of bit flips required to turn one codeword into the other one

(or, in other words, the number of positions in which the two

codewords differ).

− For example, the Hamming Distance between [01100] and [10101] is 3. You

can find this by performing a bitwise XOR (exclusive OR) and then adding up

the result.

− 0 (XOR) 0 = 0, 1 (XOR) 1 = 0, 0 (XOR) 1 = 1, 1 (XOR) 0 = 1

0 1 1 0 0

1 0 1 0 1

XOR 1 1 0 0 1 sum = 3

Error Bounds – Hamming distance

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

A set of valid codewords is called a dictionary. The Hamming

Distance of a dictionary is the minimum of the Hamming Distances

between any two codewords in the dictionary.

− For example, suppose the dictionary has four codewords {01100, 10101,

11111, 00001}. Verify that the Hamming Distance of this dictionary is 2 (you

have to compute 6 Hamming Distances - between codewords (1,2), (1,3),

(1,4), (2,3), (2,4), and (3,4) - and then find their minimum).

• Example with a dictionary of 4 codewords of 10 bits

(n=2, k=8) each, called a (10,2) code (in general,

(n+k,n) code). Recall that the original number of data

bits = k, which turns into n+k after coding. − Dictionary: {0000000000, 0000011111, 1111100000, 1111111111}

− Hamming Distance is d=5 (take any pair of codewords, perform a

bitwise XOR, and add up – the answer will be at least 5).

Error Bounds – Hamming distance

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Bounds for a code with distance:

• 2d+1 – can correct d errors

− If you want to correct up to „d‟ (say d = 5) number of errors,

the Hamming Distance between any two codewords in the

dictionary must be at least 11.

• d+1 – can detect d errors

− If you want to detect up to „d‟ (say d = 5) number of errors,

the Hamming Distance between any two codewords in the

dictionary must be at least 6. This is a less stringent

requirement than maintaining a Hamming distance of 11 for

error correction.

• Error correction is a heavier task than error

detection. Detection is the first step, then you can

correct.

Error Bounds – Hamming distance

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

An alternate way of stating the contents in the previous

slide is as follows:

• If the minimum Hamming Distance between any two

codewords in a dictionary = d, then:

− you can correct up to number of errors. For our

example, d = 5, therefore you can correct up to 2 errors.

− you can detect up to „d-1‟ number of errors. For our

example, d = 5, therefore you can detect up to 4 errors.

 

  

 

2

1d

Hamming Distance and Error Detection

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Why can a code with distance d+1 detect up to d errors?

• Both tx and rx know the dictionary of valid codewords. In our

previous example, the dictionary is {0000000000, 0000011111,

1111100000, 1111111111}. The Hamming Distance for this

dictionary is d = 5.

• Rx detects an error when it receives an invalid codeword (e.g.

1100000000).

• If no. of errors  5, then one valid codeword may be turned into

another valid codeword and there is no way to detect that an error

has occurred.

− For example, with 5 strategically placed bit flips, codeword 1 (0000000000 )

can be converted into codeword 2 (0000011111). And then, you don‟t know

whether codeword 2 was sent and received with no errors, or codeword 1 was

sent which was received with 5 errors.

Hamming Distance and Error Correction

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Why can a code with distance 2d+1 correct up to d

errors?

• Because errors are corrected by mapping a received invalid codeword to

the nearest valid codeword, i.e., the one that can be reached with the

fewest bit flips (errors). − In other words, if the invalid codeword can be reached from valid codeword 1 with „m‟ bit

flips or from valid codeword 2 with „n‟ bit flips (m < n), we map the invalid codeword back

to valid codeword 1. That is, the probability that the channel made „m‟ (say 4) errors is

higher than it made „n‟ (say 5) errors. The error is therefore „corrected‟ in a probabilistic

sense.

− If the channel flips each bit independently (meaning, what it does to the current bit has no

bearing to what it has done with the previous bits) with a probability equal to 0.001, then

the probability that it has flipped specific 4 bits out of 10 bits (say) = (0.001)4 (1-0.001)6 

10-12 and the probability that it has flipped specific 5 bits out of 10 = (0.001)5 (1-0.001)5 

10-15. Which one is more likely to have happened?

• If there are more than bit flips, the received codeword may be

closer to another valid codeword than the codeword that was sent. − For example, sending 0000000000 with 2 flips might give 1100000000 which is closest (in

terms of Hamming Distance) to 0000000000, correcting the error. But with 3 flips,

1110000000 might be received, which is closest to 1111100000, which is an error.

 

  

 

2

1d

Hamming Distance and Error Correction

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

The strength of a dictionary (i.e., its error correction

capability) is dictated by the two closest codewords in the

dictionary. Suppose, those two codewords are i and j.

Codeword i Codeword j

with „d‟ bit flips, you end here with „d‟ bit flips, you end here

For error correction, the magenta and yellow (representing

invalid codewords) must still be separated by at least one

Hamming Distance. If they coincide, you won‟t know where

the invalid codeword came from. Therefore, we need a

Hamming Distance of 2d+1 to correct up to d number of bit

errors.

d d1

Odd and Even Parity

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

A bit string has odd parity if the number of 1‟s in the

string is odd. Conversely, the bit string has even parity if

the number of 1‟s in the string is even (0 is an even

number).

− 01100, 000, 11001001 are all even parity strings.

− 1000011, 1, 00010 are all odd parity strings.

− ASCII representations of characters use a parity bit. For

example, the ASCII representation of the character „a‟ is

01100001 (a byte), the rightmost bit being the parity bit. Also,

the ASCII representation of the character „A‟ is 01000001, the

rightmost bit being the parity bit. Similarly, the representations

for „m‟ and „M‟ are 01101101 and 01001101 respectively. So,

what parity (even or odd?) scheme does ASCII implement for

characters?

− The complete ASCII table can be found here:

http://www.ascii-code.com/

Parity Check and Parity Bit (1)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Suppose we are transmitting blocks of n bits and for every block of n

bits, we append a parity bit at the end, such that the encoded

codeword has even parity (could also be odd parity).

If the rx knows that an even parity encoding scheme is employed by

the tx, it can detect any odd number (1,3,5,7, …) of errors (by parity

mismatch) and request retransmission. But any even number of

errors will go undetected (no parity mismatch in this case).

− Example: before encoding [10110] (here n=5), after encoding

[101101], the last bit is the parity bit which ensures even parity of the

codeword. You should experiment with arbitrarily placed 1, 3, or 5

errors (could also affect the parity bit positions) and verify even

parity mismatch. Note that the parity bit would have been 0 if an odd

parity scheme were adopted.

− Example: before encoding [10111], after encoding [101110] for even

parity or [101111] for odd parity.

− You cannot correct for any error, not even one, because you cannot

localize the error.

Parity Check and Parity Bit (2)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Suppose the length of all codewords after adding a parity bit at the end

is „n‟. Then, the probability that an error phenomenon goes undetected

= probability that there are 2 errors + probability that there are 4 errors

+ probability that there are 6 errors + …

Assuming that the channel flips each bit independently of the past (or

future), the probability that there are two (any two) bit flips = ,

where p = probability that the channel flips any bit (bit error probability)

and (n choose 2) is the number of ways you can choose 2

bits out of n. This probability form is known as binomial probability.

Similarly, the probability that there are four bit flips = .

Suppose n = 7 and p = 0.001. Then, the probability that an error

phenomenon (2 errors, or 4 errors, or 6 errors) goes undetected is

The probability that an error phenomenon is detected is:

22 )1(

2

 

  

 n pp

n

)!2(!2

!

2  

  

n

nn

44 )1(

4

 

  

 n pp

n

5062442 101.2)001.01(001.0

6

6 )001.01(001.0

4

6 )001.01(001.0

2

6 

 

  

 

  

 

  

0.99998)001.01(001.0 6

6 )001.01(001.0

4

6 )001.01(001.0

2

6 1

062442 

  

 

  

 

  

 

  

 

Multiple Parity Check Bits (1)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Of course, we could add multiple parity check bits to improve error

detection/correction performance. Suppose our data is 4 bits [d1, d2, d3,

d4] = [1 1 1 0], and we choose to add three check bits at the end [c1,c2,c3].

The encoded codeword is [d1,d2,d3,d4,c1,c2,c3] = [1 1 1 0 0 0 1], where the

check bits are computed as follows ( denotes modulo 2 addition or XOR

operation). This is not the only way, but the check bits must „cover‟ all

data bits. Note that the rx knows how the check bits are computed at tx.

− c1 = d1  d3  d4 = 1  1  0 = 0  0 = 0 (since 0  0 = 0, 1  1 = 0 and 0  1 = 1)

− c2 = d1  d2  d4 = 1  1  0 = 0

− c3 = d1  d2  d3 = 1  1  1 = 0  1 = 1

With this scheme, you can now correct one error. Suppose the rx receives [1 0 1 0

0 0 1] - notice that bit 2 has been flipped. When the rx verifies the parities, c1 will

work out fine, c2 will show an error ( either d1 or d2 or d4 is in error, which further

reduces to either d2 or d4 is in error since c1 worked out fine), and c3 will show an

error ( either d1 or d2 or d3 is in error, which further reduces to d2 must be in

error). Can you argue similarly when bit 1 is flipped?

Multiple Parity Check Bits (2)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

A formal prescription for localizing the errored bit involves computing the

syndrome from the received codeword. It will help if we look at the tx

operation in terms of a matrix multiplication (except additions are modulo 2):

    321421431,4,3,2,14,3,2,1

,,

0111000

1010100

1100010

1110001

ddddddddddddddddd 

   

   

4 x 4

identity

matrix 1‟s in 1st,

3rd and 4th

rows

1‟s in 1st,

2nd and 4th

rows

1‟s in 1st,

2nd and 3rd

rows

GENERATOR

MATRIX

G = [I44, P43]

CODEWORD, AFTER ENCODINGDATA VECTOR

Multiple Parity Check Bits (3)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Here‟s another example: data vectors are of length 5 [d1, d2, d3, d4, d5] and

there are two check bits [c1,c2] computed as follows:

− c1 = d1  d3  d5

− c2 = d2  d4

    425315,4,3,2,154,3,2,1

,,

0110000

1001000

0100100

1000010

0100001

, ddddddddddddddd 

     

     

5 x 5

identity

matrix

1‟s in 1st,

3rd and 5th

rows

1‟s in 2nd and

4th rows

GENERATOR

MATRIX

G = [I55, P52]

Multiple Parity Check Bits (4)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Observe that the general rule for computing the generator matrix for data

vectors of length „n‟ is:

− Write down an n  n identity matrix, I

− Write down the n  k matrix P, where k = no. of check bits. This matrix dictates

which data bits are „covered‟ by which check bits.

− Then, G = [Inn, Pnk] i.e., P is concatenated to I horizontally. The size of matrix G is

therefore n  (n+k).

− Since the data vector d is of size 1  n, the encoded data vector is given by d  G

(with modulo 2 addition), and is of size 1  (n+k).

The receiver first computes the parity check matrix C from the generator

matrix G (which it knows) as follows: C = [PT, Ikk], where P T denotes the

transpose of P and is of size k  n. Note that C is of size k  (n+k).

Next, it computes the product s = C  rT, where r is the 1  (n+k) row

vector received at the receiver and rT denotes the transpose of r. It is

clear that the size of s is k  1. This vector s is known as the syndrome

vector. If there are no errors, we will have s = 0 (zero vector). If there‟s an

error (one bit), we can figure out its location from the syndrome vector.

Multiple Parity Check Bits (5)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Let‟s do an example. The generator matrix in slide 26 is (recall that n =

length of data vector = 4 and k = no. of check bits = 3):

The parity check matrix is therefore:

  

  



   

   



   

   

0111

1011

1101

011

101

110

111

0111000

1010100

1100010

1110001

T PPG

  

  

1000111

0101011

0011101

C

PT Identity

matrix

Identity

matrix

Multiple Parity Check Bits (6)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Let‟s revisit slide 25. The encoded codeword is d = [1 1 1 0 0 0 1].

Case 1 (no error): In this case, the receiver receives r = [1 1 1 0 0 0 1].

The syndrome vector is all zero:

  

  

  

  







        

        

  

  

0

0

0

1000111

0000011

0000101

1

0

0

0

1

1

1

1000111

0101011

0011101

s

Instead of performing 1  0  1  0  0  0  0 = 0, you could perform a

regular addition, giving you 2, divide that by 2, and output the remainder

(which is 0) as the final answer.

Similarly, instead of performing 1  1  1  0  1  1  0 = 1 (say), you

could perform a regular addition, giving you 5, divide that by 2, and output the

remainder (which is 1) as the final answer.

Dividing by 2 and outputting the remainder is „modulo 2 addition‟.

Multiple Parity Check Bits (7)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

The encoded codeword is d = [1 1 1 0 0 0 1].

Case 2 (second bit in error): In this case, the receiver receives r = [1 0 1 0

0 0 1]. The syndrome vector is:

Next, the rx looks for that column in the parity check matrix C which is

identical to the syndrome vector. In this case, the second column of C is

identical to the syndrome vector, and therefore, it concludes that the error

is in the second bit of r and simply flips it back to what was sent. That this

logic is always valid can be proven mathematically, but we will not do that.

  

  

  

  







        

        

  

  

1

1

0

1000101

0000001

0000101

1

0

0

0

1

0

1

1000111

0101011

0011101

s

Multiple Parity Check Bits (8)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

The encoded codeword is d = [1 1 1 0 0 0 1].

Case 3 (seventh bit in error): In this case, the receiver receives r = [1 1 1

0 0 0 0]. The syndrome vector is:

This syndrome vector appears in the seventh column of the parity check

matrix, and therefore, the error is in the seventh bit. Again, error

localization successfully done.

  

  

  

  







        

        

  

  

1

0

0

0000111

0000011

0000101

0

0

0

0

1

1

1

1000111

0101011

0011101

s

2-D Parity Check Bits (1)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Data need not be one dimensional (vector). Images are 2-D, and 2-D

parity check schemes can be used. Alternately, you can even reshape a

data vector (say of length 16) into a 44 matrix and use a 2-D parity

check scheme. The recipe is as follows:

• Add single parity bit to each column.

• Add a final “parity” column

1 0 0 1 0 0

0 1 0 0 0 1

1 0 0 1 0 0

1 1 0 1 1 0

1 0 0 1 1 1

Bottom row consists of

(even) parity check bits for

each column

Last column consists

of (even) parity check

bits for each row

(even) parity bit for row

and column parity bits

2-D Parity Check Bits (2)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

1 0 0 1 0 0

0 0 0 1 0 1

1 0 0 1 0 0

1 0 0 0 1 0

1 0 0 1 1 1

1 0 0 1 0 0

0 0 0 0 0 1

1 0 0 1 0 0

1 0 0 1 1 0

1 0 0 1 1 1

1 0 0 1 0 0

0 0 0 1 0 1

1 0 0 1 0 0

1 0 0 1 1 0

1 0 0 1 1 1

1 0 0 1 0 0

0 0 0 0 0 1

1 0 0 1 0 0

1 1 0 1 1 0

1 0 0 1 1 1

Arrows indicate failed check bits

Two errorsOne error

Three

errors Four errors (undetectable)

1, 2, or 3 errors can

always be detected; Not

all patterns involving > 4

errors can be detected

2-D Parity Check Bits (3)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Can 2-D codes correct any error? Consider an 1- bit error pattern.

1 0 0 1 0

0 1 0 0 1

0 0 1 1 0

1 1 0 1 1

0 0 1 1 0

0 0 0 1 0

0 1 0 0 1

0 0 1 1 0

1 1 0 1 1

0 0 1 1 0

parity mismatch

parity mismatchmust be in error

Error Correction – Hamming Code (1)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

A Hamming code prescribes a certain way of inserting

multiple parity check bits (the locations and how to

compute them). The check bits are generally interleaved

with the data bits (as opposed to appending at the end).

Interleaving with the data bits affords better protection

against burst errors.

Several popular Hamming Codes are (7,4), (15,11),

(31,26) and (63,57). The notation (15,11) means that 4

check bits are provided for data vectors of length 11 so

that the encoded codeword is of length 15.

Example – (7,4) Hamming Code

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error Detection – Interleaved Parity Bits

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

As mentioned before, interleaving of N parity bits can

detect burst errors up to length N. Column 1 is

transmitted first, then the second, the third, etc.

Error Detection – Checksums (1)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Several Internet protocols use check bits to detect

errors in the header (or in the header and data).

A checksum is calculated for header contents and

included in a special field.

Checksum recalculated at every router, so algorithm

selected for ease of implementation in software

Let header consist of L number of 16-bit words,

[b0, b1, b2, ..., bL-1]

The algorithm appends a 16-bit checksum bL

[b0, b1, b2, ..., bL-1, bL]

Error Detection – Checksums (2)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

The checksum bL is calculated as follows:

Treating each 16-bit word as an integer, find

x = (b0 + b1 + b2+ ...+ bL-1) modulo 2 16-1

The checksum is then given by:

bL = - x modulo 2 16 -1

Thus, the headers must satisfy the following pattern:

0 = (b0 + b1 + b2+ ...+ bL-1 + bL ) modulo 2 16 -1

The checksum calculation is carried out in software using

one‟s complement arithmetic. N-bit burst errors can be

detected.

Error Detection – Checksums (3)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Using Modulo Arithmetic (can also be done using

binary arithmetic and 1‟s complement) − Assume 4-bit words

− Use modulo 24-1 arithmetic

− Suppose the field to protect has 2 words b0 and b1

− We will add the checksum b2

− Data word b0 = 1100 = 12 (in decimal)

− Data word b1 = 1010 = 10 (in decimal)

− b0+b1 = (12+10) modulo (2 4-1) = 22 modulo15 = 7

− The checksum b2 = -7 modulo 15 = 8 = 1000 (in binary)

− If there‟s no errors, we will always have b0+b1+b2 modulo 15 = 0

Error Detection – CRCs (1)

CRC stands for „Cyclic Redundancy Check‟. Also known as

polynomial codes.

Most data communications standards use polynomial

codes for error detection.

Polynomial codes also basis for powerful error-correction

methods.

Polynomial arithmetic used (modulo 2) : subtraction is the

same as addition, i.e, 0-1 = 0+1 = 1 and 1-1 = 1+1 = 0

Implemented using shift-register circuits.

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error Detection – CRCs (2)

Binary vectors can be mapped to polynomials.

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

[dk-1 , dk-2 ,…, d2 , d1 , d0]  dk-1x k-1 + dk-2x

k-2 + … + d2x 2 + d1x + d0

e.g., [1 1 1]  x2 + x + 1 and [1 1 0 1]  x3 + x2 + 1 .

Note that the degree of polynomial = length of vector – 1.

(x + 1) (x2 + x + 1) = x(x2 + x + 1) + 1(x2 + x + 1)

= (x3 + x2 + x) + (x2 + x + 1)

= x3 + 1

Addition (or, subtraction):

Multiplication:

(x7 + x6 + 1) + (x6 + x5) = x7 + x6 + x6 + x5 + 1

= x7 + (1+1)x6 + x5 + 1

= x7 + x5 + 1 since 1+1=0 (using modulo 2)

Error Detection – CRCs (3)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Decimal Division:

Polynomial Division:

Error Detection Example – CRCs (4)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Generator polynomial: g(x)= x3 + x + 1, degree =  = 3.

Data (k = 4): [1,1,0,0]  d(x) = x3 + x2

Step 1: Perform x d(x), which for our example, is x3 d(x) = x6 + x5

(binary = 1 1 0 0 0 0 0). This step is equivalent to appending „‟ (in

this case, 3) number of zeros to the data vector.

Step 2: Divide the polynomial obtained in Step 1 by g(x), such that

the degree of the remainder polynomial is less than the degree of

g(x). Let the remainder be called r(x). In our case, r(x) = x.

Step 3: Add (or subtract) r(x) to the polynomial obtained in Step 1,

to obtain the polynomial b(x) = x6 + x5 + x. b(x) is the polynomial

representation of the encoded data vector (which is transmitted),

the binary representation of which is [1 1 0 0 0 1 0]. This step

ensures that the transmitted codeword is divisible by g(x). Verify!!

Error Detection Example – CRCs (5)

Your book has another example on how the same sequence

of steps can be performed without using polynomials. I don‟t

like this approach.

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Start by adding four

0‟s to the data vector.

Why 4? Because

generator polynomial

is g(x) = x4 + x + 1, so

degree = 4

Add/subtract the

remainder to/from the

dividend

Error Detection – CRCs (6)

So what is the strategy at the rx? By adding/subtracting the

remainder to/from the dividend, we have ensured that the

transmitted codeword/polynomial b(x) will be divisible by

g(x). Alternately, we can view b(x) as: b(x) = q(x)g(x), where

q(x) = quotient and g(x) = divisor.

The rx does just that. It takes the received vector/polynomial

and divides it by g(x). If the remainder turns out to be zero, it

determines that no error has occurred. But will this strategy

always work?

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error Detection – CRCs (7)

Suppose the received polynomial is z(x) = b(x) + e(x),

where e(x) is the error polynomial, which unfortunately turns

out to be a multiple of g(x), i.e., e(x) = q(x)g(x). Then:

z(x) = b(x) + e(x) = q(x)g(x) + q(x)g(x) = [q(x) + q(x)] g(x),

which is clearly divisible by g(x). Therefore, the receiver will

miss errors which are multiples of g(x).

We will work out an example, but first, note that “channel

flipping bits”  “channel adding an error vector”.

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

[1 0 1] Channel

[0 1 1] [1 0 1] [0 1 1] +

[1 1 0] = error vector First two bits flipped by channel

Error Detection – CRCs (8)

Suppose g(x)= x3 + x + 1 and d(x) = x3 + x2 as in slide 45. Then, the

transmitted polynomial is b(x) = x6 + x5 + x (binary is b = [1 1 0 0 0 1 0])

Suppose q(x) = x + 1  e(x) = q(x)g(x) = x4 + x3 + x2 + 1 (binary is e = [0

0 1 1 1 0 1]). We have chosen the error vector/polynomial such that it is

divisible by g(x).

In this case, the rx receives the vector:

b + e = [1 1 0 0 0 1 0] + [0 0 1 1 1 0 1] = [1 1 1 1 1 1]

or alternately, the polynomial b(x) + e(x): x6 + x5 + x4 + x3 + x2 + x + 1

You should verify that the above polynomial is divisible by g(x), the

quotient turns out to be x3 + x2 + 1  Rx fooled, error not detected.

Now repeat the above exercise with e(x) = x + 1 (not divisible by g(x)). Is

this error detected?

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Error Detection – CRCs (9)

CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

Here are some standard generator polynomials:

• CRC-8:

• CRC-16:

• CCITT-16 (Bluetooth):

• CRC-32 (Ethernet, 802.11):

x8 + x2 + x + 1

x16 + x15 + x2 + 1

= (x + 1)(x15 + x + 1)

x16 + x12 + x5 + 1

x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1

Reliable Data Transfer

Basics

 Suppose you want to transmit 1000 bits over a 100Mbits/sec Ethernet link.

 This implies that the transmission time/delay is given by 1000/(100  106 ) = 10-5 sec.

 Also, suppose the transmitter and receiver are separated by a distance of 200 m and the transmission medium is copper (the rate at which waves can travel in copper is roughly 2  108 m/s)

 This implies that the propagation time/delay is given by 200/(2  108 ) = 10-6 sec

Basics

 Purpose of reliable data transfer is to ensure that data is delivered accurately to the upper layer despite errors that might occur during transmission.

 Error correction codes are not strong enough to correct all errors. Plus, it won‟t work if a frame is lost completely.

 Need mechanisms to recover from errors that are not correctable and from lost frames.

 Collectively called ARQ (Automatic Repeat Request) protocols.

Basics

 Essential components of any ARQ protocol:

- Receiver acknowledgements : a small control frame sent by the receiver to let the transmitter know that it has received an earlier frame. An ACK packet can be sent individually or piggybacked on a data frame that the receiver might be sending to the transmitter.

- Timeouts : if the transmitter does not receive an ACK after a reasonable amount of time after sending a frame, it retransmits the frame.

Basics

 Three ARQ protocols:

- Stop-and-Wait

- Go Back N

- Selective repeat

 We will study only the Stop-and-Wait and Go Back N ARQ protocols in class.

 Acknowledgements can be either positive (“OK, got that”) or negative (“Please repeat that”). Can also be positive but implicit (“Send me the next one”)

 Positive acknowledgements – ACK

 Negative acknowledgements - NAK

Stop-and-Wait

 We will use timelines to illustrate the operation of the protocol.

frame

ACK

time sender receiver

timeout

• Everything works correctly: the ACK is received before the timeout period.

• Even if other frames are ready at the transmitter, it will not send those before hearing about the fate of the last frame.

Stop-and-Wait

 “The case of the lost data frame”.

If a data frame is lost, the transmitter waits for the first timeout period (no ACK can come since the receiver isn‟t even aware of the frame) and retransmits the frame.

frame

ACK

time sender receiver

timeout x

frame

timeout

Stop-and-Wait

 “The case of the lost ACK frame”.

If the ACK frame is lost, the transmitter waits for the first timeout period and retransmits the frame.

frame

ACK

time sender receiver

timeout x

frame

timeout

ACK

Stop-and-Wait

 “The case of the inadequate timer”.

The timer times out before the ACK comes in. The transmitter retransmits the frame, and the receiver again sends an ACK.

frame

ACK

time sender receiver

timeout

frame

timeout

ACK

Stop-and-Wait

 Consider the last two cases where the receiver gets two copies of the data frame (due to lost ACK or delayed ACK).

 How does the receiver know that the second frame is a copy of a previously transmitted frame or a new frame? If it cannot make that decision, it will pass on both the frames to the IP layer.

 Soln: Add a 1-bit sequence number (0 or 1) in a sequence number field, for both the data frame and the ACK frame.

Sequence numbers included by the transmitter and receiver are incremented (when incremented) modulo 2.

Stop-and-Wait  SNT = seq. number at the transmitter

 SNR = seq. number at the receiver (implicit ACK, sends the seq. number of the frame it is expecting)

SNT / SNR state transition diagram

Request for frame 1 arrives at transmitter

[SNT,SNR] = [0,0] [SNT,SNR] = [0,1]

[SNT,SNR] = [1,0] [SNT,SNR] = [1,1]

Frame 0 arrives error free at receiver

Frame 1 arrives error free at receiver

Request for frame 0 arrives at transmitter

Stop-and-Wait  Ex 1: Using a 1-bit sequence number (perfect

scenario)

Frame 0

Send 0

time sender receiver

timeout

Frame 1

timeout

Send 1

Frame 0

Send 1

Frame 1

Send 0

these two frame 0‟s are different

these two frame 1‟s are different

timeout

timeout

Stop-and-Wait

 Ex 2: Using a 1-bit sequence number

Frame 0

Send 1

time sender receiver

timeout

Frame 0

timeout

Send 1 x

Frame 1

Send 0

copy of previous frame

new frame

first frame

Stop-and-Wait

 Ex 3: Using a 1-bit sequence number

Frame 0

Send 1

time sender receiver

timeout

Frame 0

timeout

Send 1 x

Frame 0

Send 1

copy of original frame

second copy of original frame

original frame

x

Frame 1

Send 0

new frame

Stop-and-Wait

 Ex 4: Using a 1-bit sequence number

What action will the transmitter take if the channel changes “Send 1” to “Send 0” ?

receiver

Frame 0

Send 0

time sender

timeout

Frame 1 timeout

Send 1

Send 1

Frame 0

Send 0 timeout

Stop-and-Wait

 How much time is spent from the instant the first bit enters the channel to the instant the last ACK bit is received by the transmitter?

(1) AB = prop. delay (2) BC = data frame trans. delay (3) CD = proc. delay

(4) DE = ACK frame trans. delay (5) EF = prop. delay (6) FG = proc. delay

transmitter

receiver

1st bit in last bit in last ACK bit in

A B C D E F G

Transmitter/Forward channel idle (think (in)efficiency!!)

Stop-and-Wait

 Expression for Total Delay (TD)

TD = (2  proc. delay) + (2  prop. delay) + LD/R + LA/R

where R = link data rate, LD = no. of bits in data frame, LA = no. of bits in ACK frame, prop. delay = propagation delay, proc. delay = processing delay.

 Expression for Effective Information Transmission Rate (EITR)

EITR = number of information bits delivered / TD

= (LD – LO) / TD

where LO = number of overhead bits in data frame

Stop-and-Wait

 Expression for Transmission Efficiency (TE)

 - LO term in numerator  loss in efficiency due to addition of overhead bits

 LA term in denominator  loss in efficiency due to necessity for ACK frame

 2 (proc. delay + prop. delay) R  reflects delay bandwidth product

 Processing delays usually negligible.

 Stop-and-Wait extremely inefficient if propagation delays are large (why??).

  RLL LL

AD

OD

delayproc.delayprop. 

 

2 TE = EITR / R

Stop-and-Wait

 Expression for Transmission Efficiency (TE) in the presence of errors, TEerror - Let Pf be the probability that a frame has errors.

- On average, a frame needs to be transmitted 1/(1-Pf) number of times.

- On average, effective total delay to transmit 1 frame:

Eff. TD = TD / (1-Pf)

- EITR in the presence of errors is given by:

EITRerror = (LD – LO) / Eff. TD = (LD – LO) (1-Pf) / TD

- TEerror = EITRerror / R = TE  (1-Pf)

where TE is the transmission efficiency without any errors, as derived in slide 17.

Basics of Go-Back-N protocol

 Falls under the category of „sliding window‟ protocols.

 The receiving DLC (data link control) operates in essentially the same way as Stop-and-wait ARQ. It accepts packets only in the correct order and sends request numbers back to the transmitting DLC.

The ACK sent is an implicit ACK (SNR = 2  send me packet with sequence no. 2, implicitly acknowledging received all packets up to sequence no. 1)

 In contrast to Stop-and-Wait, the transmitting DLC (data link control) can send several successive packets without waiting for the next packet to be requested. The packets are numbered sequentially.

Basics of Go-Back-N protocol

 The go-back number, N, is a parameter that determines how many successive packets can be sent in the absence of a request for a new packet.

 Specifically, the transmitting DLC is not allowed to

send packet (i + N - 1) before packet (i - 1) has been

acknowledged, or, packet i has been requested.

 The transmitter therefore maintains a window of the frames it can transmit. The window is of length N.

- when the transmitter receives a packet with

SNR = i, its window is [i, i + N - 1].

- when the transmitter receives a packet with

SNR = i + 1, its window slides upward to [i + 1, i + N].

Example of Sliding Window Opn. at Transmitter (no error)  Generally, data frames are sent in both directions; so, the

receiver ACK frames are piggybacked. For simplicity, we assume that the receiver has no data frames and therefore sends only ACK frames.

 We assume that SNT and SNR (shown in blue) can grow without bound.

76543210

[0, 3] [1, 4] [2, 5] [3, 6]

1 2 3

sender

receiver

………

window

4Go Back 4

Ex. (1) of Sliding Window Opn. at Transmitter (forward error)  Frame 0 is received in error.

 The receiver keeps on requesting frame 0. Frames 1, 2 and 3 are discarded, even though they have been received correctly.

 The transmitter exhausts its first window, times out and goes back to frame 0 since it is yet to receive a request for frame 1.

3210

[0, 3] [1, 4]

0 0 0

sender

receiver

………

window

0

3210

[0, 3]

1

4

Go Back 4

T.O

Ex. (2) of Sliding Window Opn. at Transmitter (forward error)  Frame 2 is received in error. The receiver keeps on requesting

frame 2. Frames 3, 4 and 5 are discarded, even though they have been received correctly.

 The transmitter exhausts its window, times out and goes back to frame 2 since it is yet to receive a request for frame 3.

Go Back 4

3210

[0, 3] [2, 5]

1 2 2

sender

………

window

2

32

[2, 5]

3

44

2

[1, 4]

5

2

5 6

[3, 6]

receiver

T.O

Ex. (3) of Sliding Window Opn. at Transmitter (lost ACK)  ACK for frame 0 is lost. However, if the sender receives an ACK

for frame 1 (requesting for frame 2), it can deduce that both frame 0 and frame 1 must have been received properly, as otherwise, it would not be asking for frame 2. Consequently, when the request for frame 2 arrives, the sender can slide its window upward to [2, 5]. Note that the lower end of the window moves from 0 to 2.

Go Back 4

3210

[0, 3]

1 2 3

sender

………

window

4

4

[2, 5]

5

[3, 6]

receiver

x

6

Do we have an idea for an improvement?

 In the previous example (slide 24), the receiver has to discard (by definition of a Go Back N protocol) frames 3, 4 and 5, even though they have been received correctly.

 Wouldn‟t it be better if the receiver could buffer those frames so that the transmitter could only selectively repeat frame 0 (in general, those frames that are not received correctly)?

Questions

 What is the maximum number of frames the transmitter needs to store in its buffers in a go back N protocol?

 How should you choose N?

- very high N  (a) excessive retransmissions may result even if a single frame within a window is received in error (b) need very high buffer capacity.

- very low N  link underutilization (N = 1 is effectively stop-and-wait)

- N should be chosen high enough so that the link is properly utilized.

Are retransmissions triggered by errors only?

 We consider data transfer in both directions.

 What if some data frames from B  A are really long?

 The sequence numbers sent by B are piggybacked on the data frames.

06543210

6300

SNA

SNB

why is B still asking for frame 0?

A

B

Go Back 7

A Lower Bound on N

(N-1)  frame transmission time  round trip time (RTT)

Ideally, you would like the ACK for frame 1 come in right when the sender is just finishing transmission of frame N

…… sender

receiver

Transmission time for (N-1)

packets

RTT

1 2 N3

time when last bit of frame 1 leaves tx

time when last bit of frame 1 arrives at rx

time when ACK for frame 1 is received at tx

Questions

 If the transmitter wants to reuse sequence numbers - i.e., the sequence numbers are computed modulo M (e.g, for M = 4, the frames are numbered 0, 1, 2, 3, 0, 1, 2, 3, etc.):

what should M be w.r.t N ?

 Key fact: M  N + 1

 The proof is involved, we will not go over it.

 Instead, we will provide an example to illustrate the contradictions faced by the receiver if M  N

 Typically, M = 8 (3 bit sequence numbers) is used in the standard DLC protocols. An alternative M = 128 (7 bit sequence numbers) is also provided.

Problems with M  N

 Assume that data frames are sent in one direction. The receiver responds with only ACK frames. Receiver sequence numbers shown in “blue”. Also, let M=N=4

3210

[0, 3]

1 0

sender

receiver

………

window [0, 3]

x x2 x x3

3210

can this frame be properly identified by the receiver?

T.O

Effect of very high M (and N)  Suppose N = M-1 and M >> 1

 If one forward frame is in error, it could potentially involve retransmission of N frames (if the sending DLC waits to exhaust its window of N frames before retransmitting)

 Waiting to go back and start retransmission when a forward frame error occurs is not a good idea (in terms of link utilization and delay).

 However, waiting to start retransmission is good if

(a) the propagation delay is large, or

(b) the receiver has particularly long data frame(s), or

(c) if there is an error in the ACK frame or receiver data frame with the ACK piggybacked

Effect of very high M (and N)

 A possible solution: Use a timer with each packet. If a packet is not acknowledged within a fixed time-out period after transmission, it is retransmitted. How do we calculate this time-out period? (next slide)

 Another possible solution: The receiver sends back a short supervisory frame upon receiving a frame in error (assuming that it typically sends ACK‟s piggybacked on data frames). This allows the sending DLC to go back much sooner in the event of a forward error, thereby avoiding excessive retransmissions.

Calculation of time-out period for one data frame  We assume that data frames are transmitted in both directions.

 We also assume that Y‟s frame transmission starts just before the first bit from X‟s data frame comes in.

(1) AB = propagation delay (2) BC & CD = transmission delay for B‟s data frames (where‟s X‟s transmission delay?) (3) DE = propagation delay (4) EF = processing delay (where‟s Y‟s processing delay?)

X

Y

A B C D E F

the ACK for X‟s data frame goes in this frame sent by Y

time-out period

Performance of Go Back N

 We will analyze the performance of the Go back N using a simplistic assumption.

 We assume that the window size and ACK delivery times are such that the channel is full all the time.

 Let any data frame (say the green frame) be received successfully on the kth attempt  we will have (k-1) blocks of N frames, followed by a successful green frame.

(k-1) = 2 blocks of N = 3 frames

3rd time successful (k = 3)

Performance of Go Back N

 Let LO be the number of overhead bits in a data frame of length LD bits.

 Let Pf be the probability that a frame has errors and tf be the transmission time (= LD/R) for each frame, where R is the link transmission rate in bps.

 You expect to transmit a frame 1/(1 - Pf) number of times. Of these, the last one is successful, and the previous attempts, which is one less than 1/(1 - Pf), resulted in N frames (see previous slide).

 Expected Total Delay (ETD) for delivering a frame is:



  



  

 

  

  

 

 

f

fD

f

DD

P

P N

R

L

P N

R

L

R

L

1 11

1

1 ETD

Performance of Go Back N

 Expression for Effective Information Transmission Rate in the presence of frame error (EITRerror) :

EITRerror = number of information bits delivered / ETD

= (LD – LO) / ETD

 Expression for Transmission Efficiency (TE) in the presence of errors, TEerror

f

OD

D

f

PN

LL

L

P

R )1(1 

  

 

 

-1EITR TE errorerror

Go Back N ARQ / Flow Control

 Till now, we have discussed how Go Back N can be used for reliable data transfer. Now, we will discuss how the same protocol can provide some flow control.

 In a Go Back N protocol, at most N frames are outstanding in the network at any given time, ignoring errors and retransmissions. As congestion builds in the network and delay increases, acknowledgements are delayed and the source is effectively slowed down.

 Also, if the destination wishes to receive packets less rapidly, it can simply delay sending its ACK‟s (see next slide). But is it always a good idea?

Go Back N ARQ / Flow Control

 Example: Receiver sends ACK‟s only after the transmitter‟s window (length „2‟ in our example) is exhausted.

 Let tf be the transmission time for each frame.

 2 frames are sent over tcycle seconds  the effective

frame transmission rate = 2/tcycle frames/sec.

What if the ACK is unduly delayed due to congestion (no errors) or lost (no congestion)

tcycle

Go Back N ARQ / Flow Control

 In practice, the receiving DLC sends a special supervisory frame, called the Receive-not-Ready (RNR) frame.

 The RNR frame provides some flow control by informing the transmitter that the receiver is unable to accept any more frames. There‟s also a Receive-Ready (RR) frame to indicate the willingness of the receiver to accept more frames.

 The RNR frame is sent in addition to the ACK frames. Essentially, therefore, flow control and acknowledgement functions are decoupled.