PA4-Cryptography.docx

PA 4 - Cryptography

Part 1 Due: Friday 11/08/2019, 7:00 pm

Part 2 Due: Friday 11/15/2019, 7:00 pm

In this programming assignment, you will build a few famous encryption/decryption algorithms that can hide the content of messages, from a cipher used by Julius Caesar to the famous Enigma from WWII.

The goal is to practice the programming constructs you are already familiar with, and add the concepts of arrays, strings and passing array variables to functions.

Summary

For an introduction to cryptography, check out the Background section. This also contains a detailed description of the three ciphers you need to implement: the substitution cipher, the Caesar cipher and a version of the Enigma cipher. The details of the assignment will be explained in the upcoming General Instructions .

Your deliverable consists of working versions of the functions listed below, which need to go in the file ciphers.c. Note that we don’t provide you with exact function definitions, but a description of what they should look like:

index_of_char( , , )

arguments: input string, shift, character

returns: index

30 pt

char_at_index( , , )

arguments: input string, shift, index

returns: character

30 pt

cipher_substitution( , , , )

arguments: input string, output string, key, encrypt/decrypt flag

40 pt

cipher_caesar( , , , )

arguments: input string, output string, key, encrypt/decrypt flag

40 pt

cipher_enigma( , , , )

arguments: input string, output string, key, encrypt/decrypt flag

60 pt

crack_caesar( , , )

(Extra)

arguments: input string, output string, codeword string

returns: 1/0

25 pt

To submit the work, please follow the Submission Instructions .

General Instructions

Before starting the assignment, it is important to read the background section on Cryptography to become familiar with the basics of encryption/decryption and ciphers.

The goal of this assignment is to implement three famous ciphers. You will be writing a separate function for each of them. The general format of these functions will be similar and has the following arguments:

· An input message, basically a collection of symbols.

· An output message, also a collection of symbols.

· The key for the cipher. This will differ depending on the type of cipher.

· A flag (i.e., an input that can only have two possible values) that specifies whether the relationship between the input and output message is an encryption or a decryption. So for a particular cipher, you will only be writing one function, which will do an encrypt or a decrypt, as specified by this flag.

For this entire PA, you may assume we will only test valid inputs. This means you do not need to check this in your functions. The next section will talk more about the format of the input and output messages.

You are not allowed to use the <string.h> library (or any other string library). The goal of this PA is for you to learn how to work with strings directly.

Information about the Strings

For this assignment, we will limit ourselves to only 27 symbols for the messages: the 26 upper case letters plus the blank space (the character ' '). So a possible message could be "THE DIE HAS BEEN CAST", but not "You too Brutus?". Note that we do not allow any punctuation (other than the space).[footnoteRef:0] [0: This limitation is there to simplify the problem by limiting the possible character set. It also has historical relevance; the Enigma, for example, only worked with upper case letters.]

In terms of our C-programs, messages are implemented as strings. These strings are arrays of characters, with the last one always the ‘\0’ termination character.[footnoteRef:1] In fact, we will be using strings in two ways in this assignment: [1: This termination character is useful to mark the end of the message in our character array. We can just look through the string until we find ‘\0’.]

1. The input and output messages will be strings. They are composed solely of the allowed symbols (so the upper case letters plus the blank space). Their length is variable, but we will assume fewer than 255 characters. As they also need to be able to hold the final ‘\0’ termination character, you will need a character array of 256 elements. The presence of the ‘\0’ character at the end will be useful to determine the actual length of the message.

2. Some of the ciphers also require a code string. This will hold all the distinct symbols that can be encrypted, in some scrambled order. Since we limit ourselves to upper case letters and the blank space as possible symbols, this means the code string holds 27 symbols plus the termination character. Note that for the Caesar cipher (as will be explained further in the section on the Caesar Cipher ), we do not encrypt the spaces; in this case the code string only has to hold the 26 letters plus the termination character.

Important note: When you are creating output messages, you will be writing individual characters to a char array. Make sure you explicitly add the ‘\0’ termination character as well.

Organizing your Code

For this assignment, your main() function will NOT be in the same file as the ciphers you are implementing. Instead, all the functions you are asked to write, plus any helper functions you may create, have to go in ciphers.c. Your main() should be in another file: encrypt.c. There should be no main() function in ciphers.c. The Starter Code also reflects this organization already. We will only test the functions in ciphers.c. You will not be asked to submit your encrypt.c.

This kind of organization is common in software design. It also corresponds to how libraries are created. We are essentially asking you to create a basic cryptography library. All the functions are contained in files that are separate from the main program, so that they can be reused by others. In this case, we will call the functions you wrote in our test and grading programs.

To support this organization, you also need to create a .h file. This is a header file that is included in the two .c files, so that common definitions can be shared. The .h file contains constants we need (they are already there as part of the starter code; do not modify them). You will also have to add the function prototype declarations of your functions to this file.

Also, for this entire assignment (parts 1 and 2), you may assume the inputs provided to the functions are valid. You do not need to test for edge cases.

Starter Code

To get the starter code for this assignment, execute the following command in your home directory:

$ tar xvfk $STARTERCODE/PA4.tar

This will create a new subdirectory called “PA4” with all the starter code. In addition to the Makefile, you will find three files.

ciphers.c

In this file, you will write all the functions we will ask you to implement. The upcoming sections will provide the detailed specifications for those functions. Note that the file contains the line #include “ciphers.h”, which includes the ciphers.h header file. Do not delete this line. It is fine to include additional header files.

ciphers.h

In this file, you need to put the function prototype declarations of the functions you wrote in ciphers.c. Also, there are two constant definitions already provided for you: ENCRYPT and DECRYPT. We use these for our flag to indicate if the function will be doing an encrypt or a decrypt operation. Do not modify or delete these constant definitions.

encrypt.c

This is where you will write your main() function. We will not specify what your main needs to do, as we will not test it. However, you will probably want your main to call the different functions you wrote in ciphers.c to test them. There is already sample code in there, just to illustrate some basics of getting user inputs.[footnoteRef:2] It also shows how to call some of the cipher functions. You still need to add your own code to actually test your ciphers. [2: The format string in the scanf() function, “ %[^\n]s”, reads a string until a newline is encountered. It lets us read in strings that contain blank spaces (which is not possible with just “%s”). You do not need to understand this syntax, for this PA you needn’t worry about the specifics of how this format string works. Simply use it to read a string entered by a user that includes spaces. ]

For this assignment, we will not test your main(). We are only interested in the correct behavior of the functions we will write in ciphers.c.

Compiling Your Code

To compile your code, run

$ gcc encrypt.c ciphers.c -o crypto

This command runs the gcc compiler on the two input .c files and creates an output executable named crypto (the name that follows the -o flag). It illustrates how you can compile a C-program in the case where our code is split across multiple files.

To run the program, we just run our executable crypto:

$ ./crypto

If you don’t want to keep on typing the compile command over and over again, note that if you press the up-arrow on your keyboard when inside a terminal window, you can revisit past commands you have executed. This is a handy shortcut to keep in mind.

Testing Your Code

Demo Programs

We have provided you with two demo programs to help you in solidifying your understanding of the specifications of the ciphers. To compile these demo programs, run:

$ make demo1

$ make demo2

To execute them:

$ ./demo1

$ ./demo2

The first demo program shows you how the two functions in Task 1 - Finding a Character in a Code String work. The second demo program lets you encrypt a message with a substitution cipher, a caesar cipher and an enigma cipher.

Run these demo programs and then verify that the outputs correspond to how you understand the algorithms should work. Without understanding an algorithm, it is impossible to program it. These two programs do not use any of your functions. They are standalone implementations we created to show you correct outputs for encryption (and the reverse, decryption). In addition to helping you understand the goals of the code you are writing, they allow you to generate test data that you may want to use to test your own implementations.

Testing the Functionality of Your Code

To grade your work, we will compile your functions in your ciphers.c with our own main() in our version of encrypt.c. For this to work, your function specification has to be correct. We created test programs which allow you to verify that this is indeed the case. They will compile your functions in ciphers.c with a test main() we wrote.[footnoteRef:3] [3: If you try these programs before you write any of your code, you will get errors as you haven’t defined your functions yet.]

To compile a test program (for example, test1), run:

$ make test1

To execute it:

$ ./test1

Below is a list of the test programs and the corresponding functions they check:

test1

char_at_index() and index_of_char()

test2

cipher_substition

test3

cipher_caesar

test4

cipher_enigma

If you get compilation errors running these tests, your code will not pass our grading scripts. The errors could be due to mistakes in your functions or incompatible function specifications. If your code compiles correctly with your own main() in encrypt.c, but does not with the test programs, it is probably the latter.

You will notice that these tests only check a very simple example. They are not meant to help you verify if your code does the right thing! You should do much more extensive testing on your own. The test programs are mainly there to make sure you code is compatible with our grading scripts.

You actually have to thoroughly test the functionality of your own code . This means that you have to write your own main() in encrypt.c that calls the functions you need to implement and tests them for a wide variety of inputs. The ability to test your own code is an important part of software design. The demo programs (see Demo Programs ) will help you generate some test data. Make sure you verify your code works for different inputs.

Assignment Details - Part 1

Task 1a - Finding a Character in a Code String

Before building your encryption algorithms, we will ask you to create two helper functions. These will come in very handy later. Both functions work on code strings. These are strings that just have all the possible symbols in some order (see Information about the Strings ). For example, in the encrypt.c file in the starter code, the string “alphabet” is a code string.

char_at_index

The first function you need to implement is char_at_index. We will specify exactly what it needs to do, but we won’t give you the actual function prototype, as we want you to figure this out yourself. It is a function with three arguments:

char_at_index( arg1 , arg2 , arg3)

1. The first argument, arg1, is the code string (basically an array of characters). It is not modified by this function.

2. The second argument, arg2, is an integer that indicates how much the array needs to be shifted to the left.

3. The third argument, arg3, is an integer, which is an index in the shifted array.

4. The function returns the character that can be found in the shifted array at this index.

The idea is the following: We pass this function a string of unique symbols (we will use only 5 symbols here for illustration purposes). The function returns the character at a particular index (position) in this string, if the string were to be shifted first. This shifting is a “wrap-around” shift to the left, as shown below (so characters that are shifted out on the left appear on the right).

Original code string

Code string shifted by 2

So, for the example shown above,

char_at_index(“ABCDE”, 2, 3)

returns the character ‘A’ (indexes in C start at 0).

We don’t know the length of the code string a priori. For example, when using this function when implementing your ciphers, it might be called with strings of 26 or 27 characters (the upper-case letters, without or with the space). However, note that you can figure out the length of the string by knowing that it ends with the ‘\0’ character. You may assume the string is not empty.

Note that the string you are passing it should not be modified. The shift is basically something you should imagine being applied to the input string prior to looking for the character at a certain index. You don’t want to actually move all the items around! There is a better way to do this, that does not involve an actual shift. Before you start coding, take the time to think about how you can implement this function efficiently. It is important to come up with a good plan first.

Also, when you implement this function, make sure it works for both positive values of this shift (so shifting the array to the left) and negative values (which corresponds to shifting the array to the right). The shift values can be any number (as long as it is an integer, since it is a parameter of type int).

To get a better understanding of this helper function and the index_of_char below, you can also run the demo1 program, as explained in Demo Programs .

index_of_char

The second function, index_of_char, is related to the previous one. It is essentially the complement of it. Again, it is a function with three arguments (as before, what you see below is a description of the function, not the actual function prototype):

index_of_char( arg1 , arg2 , arg3)

1. The first argument, arg1, is the code string. The string contains unique characters; i.e., no character is repeated. It is not modified by this function.

2. The second argument, arg2, is an integer that indicates how much the array needs to be shifted to the left.

3. The third argument, arg3, is a character that needs to be found. You may assume the character is present in the array and is present exactly once.

4. The function returns an integer, which is the index of this character in the shifted array.

So, for the same example from before,

index_of_char(“ABCDE”, 2, ‘D’)

returns the value 1 (remember indexes start at 0 in C).

Again, when implementing this function, you want to think about whether you actually need to shift all symbols around. This function also needs to work for both positive and negative integer values of the shift. As with char_at_index, we don’t know the length of the string a priori (but it is not an empty string). You may also assume the character that needs to be found is actually present in the string (and only once).

Task 1b - Substitution Cipher

The first actual cipher you will implement is the substitution cipher. The basic idea is that each character maps exactly to one other character. For example, all A’s are replaced by T’s, all B’s become G’s, etc. For a more detailed explanation, check out the Substitution Cipher background section.

The way we specify how to do the substitution is through a code string. This string will have 27 symbols: the 26 upper case letters and the space. The interpretation of the code string is as follows: the first symbol indicates what to map an A to, the second what to map a B to, etc. So the code string “TGL ….” basically says that A’s becomes T’s, B’s become G’s, etc. when encrypting (and the reverse when decrypting). Note that the blank space character is just treated as another symbol in this cipher (as also explained in the Substitution Cipher background section).

The goal is to implement a function that can encrypt/decrypt with a substitution cipher. The same function should be able to do both, with a flag specifying whether the input needs to be encrypted or decrypted. The functionality should be as follows:

cipher_substitution(arg1, arg2, arg3, arg4)

1. The first argument, arg1, is a string containing the data that needs to be encrypted or decrypted. It is not modified by this function.

2. The second argument, arg2, is a string that will contain the result of the encryption or decryption.

3. The third argument, arg3, is the code string used for our substitution cipher. So this has 27 entries (plus the ‘\0’ character). It is not modified by this function.

4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or DECRYPT, which are the two constants that we defined in ciphers.h (see Starter Code ).

5. The function returns nothing.

For example (and again using just 5 characters in the code rather than the full 27 for simplicity here), assume we want to encrypt some data (the 4th argument of the function is ENCRYPT). If the data to be encrypted is “DDAB” and the code is “BDEAC”, the result of the encryption is “AABD”.

Also check out encrypt.c in the starter code. It illustrates how to call this function in your code.

Hint: Consider calling the two functions you implemented before ( Task 1 - Finding a Character in a Code String ) in your implementation of the substitution cipher. This was the reason why we asked you to create those. Also, you may want to create the string “ABCD ...Z “ (the alphabet plus a blank space) in your function to help you as well.

Assignment Details - Part 2

Task 2a - Caesar Cipher

In this problem, you have to implement a function that can encrypt/decrypt using a caesar cipher. Check out the Caesar Cipher background section for more details on how it works. Essentially, each letter of the alphabet is replaced by one that comes a certain number of places after it (this number is the cipher’s key). Note that blank spaces simply remain blank spaces when using this cipher.

The function you need to implement has the following functionality:

cipher_caesar(arg1, arg2, arg3, arg4)

1. The first argument, arg1, is a string containing the data that needs to be encrypted or decrypted. It is not modified by this function.

2. The second argument, arg2, is a string that will contain the result of the encryption or decryption.

3. The third argument, arg3, is the key of the Caesar cipher (an integer). A positive value of x means that during encryption a letter is replaced by one coming x positions after it in the alphabet. This argument can be positive or negative.

4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or DECRYPT, which are the two constants that we defined (see Starter Code ).

5. The function returns nothing.

For example, when calling this function with ENCRYPT and a key of 1, the data “HELLO” will be encrypted as “IFMMP”.

As discussed in the background section, a caesar cipher is a special case of a substitution cipher. So you could base your solution on the function for that cipher. However, is that the best approach? You also have the two other functions you implemented in Finding a Character in a Code String . It is worthwhile thinking a little bit about how we can implement our Caesar cipher efficiently. This will also prepare you for the next problem in this PA: implementing the Enigma cipher.

Hint: You may again want to build a helper string, but now “ABC ..Z” (without the blank space).

Task 2b - Enigma

For this problem, you need to implement a simplified version of the famous Enigma cipher that was used by the axis powers in WWII. The cracking of the Enigma cipher also is a seminal moment in the history of computer science. You can learn a bit more about this in the History section on Enigma.

For a detailed explanation of the workings of this cipher, first read our write-up in the Enigma Operation for this PA section. This will provide you with all the details you need. Make sure you thoroughly understand the Enigma operation before you go on to the coding part. Note that the Enigma treats spaces just like other characters (so we are again working with a set of 27 symbols, as with the substitution cipher).

The function you need to implement has the following functionality:

cipher_enigma(arg1, arg2, arg3, arg4)

1. The first argument, arg1, is a string containing the data that needs to be encrypted or decrypted. It is not modified by this function.

2. The second argument, arg2, is a string that will contain the result of the encryption or decryption.

3. The third argument, arg3, is an integer number, specifying the key of the Enigma cipher. The two right-most decimal digits (what are called the least significant digits) of this number represent the initial shift (to right!) of the inner rotor (what we called key_part_1 in Enigma Operation for this PA ). The next two digits give the initial shift (to the right) of the middle rotor (what we called key_part_2). For example, the number 1304 means that the initial shift of the inner rotor is 4 positions and that of the middle rotor is 13 positions[footnoteRef:4]. [4: If you want to test it with the number like 0312, write it as 312 rather than with the leading 0. The reason is that a number with a leading 0 is interpreted as an octal (i.e., base-8) number, rather than a decimal (i.e., base-10) number.]

4. For the fourth argument, arg4, which is an integer, we either pass ENCRYPT or DECRYPT, which are the two constants that we defined.

5. The function returns nothing.

The default positions for all the rotors (i.e., without any initial shift) and the organization of the symbols on each of the rotors is as shown in Enigma Operation for this PA . This information is also already included in the ciphers.c starter code, so you don’t have to copy it from this document.

When testing your Enigma, make sure it also works for messages longer than 27 symbols (to capture the effects of the rotors moving after encrypting each symbol).

Task 3 - Extra Challenge: Cracking the Code (Optional)

The Caesar cipher is not really secure, as you can imagine. You could simply examine each possible shift and pick the one that results in a sensical message. For this extra challenge, it is your job to write a program that can crack a message encrypted with a Caesar cipher and an unknown key.

We are going to assume the following scenario. We suspect people are planning a coup against our friend Julius and we are intercepting their messages. Ironically, they are using Caesar’s own cipher, but with a key that is unknown to us. However, we are pretty sure they will mention the word “CAESAR” somewhere in their message (they are planning a coup against him after all). So we can just go try to figure out what shift to use by realizing that the correct decryption will result in a message that has the word CAESAR in it somewhere. We will assume there is no ambiguity -- basically that there is no word that with key1 maps to the same encrypted text as the word CAESAR maps to when encrypted with key2 (this assumption turns out to be reasonable).

So the goal now is to build a decryption function for a Caesar cipher when we don’t know the key. However, we specify a keyword we suspect will appear in the decrypted text. In the example above, this keyword was CAESAR, but you need to build a function for which we can specify any keyword (up to 255 characters max, as our messages are never longer than that anyway).

Note that we need to find an exact match for the keyword. It can not be part of another word. Instead, we need to find the keyword as a standalone word (we are not interested in messages about CAESARION). If the keyword was found, the function should decrypt the message and return a value of 1 (signifying success). If the keyword could not be found, the function should return a value of 0 (signifying failure).

The function you need to implement does decryption only. It has the following functionality:

crack_caesar(arg1, arg2, arg3)

1. The first argument, arg1, is a string containing the data that needs to be decrypted.

2. The second argument, arg2, is a string that will contain the result of the decryption. If the keyword was not found, it does not matter what arg2 is.

3. The third argument, arg3, is a string that specifies the keyword to look for.

4. The function returns 1 if the decryption was successful; otherwise it returns 0.

For example, we could ask your program to crack this message:

CVKJ KRBV TRVJRI J KFZCVK GRGVI

With the keyword: CAESAR

Submission Instructions

To submit Part 1 of your assignment, run this bundle script:

$ bundle41

To submit Part 2 of your assignment, run this bundle script:

$ bundle42

Both will submit only your ciphers.c file. This is the only file we will need. As mentioned before, we will be testing your functions with our own main(). The reason to have two bundle scripts is so that your submission for Part 2 does not overwrite that of Part 1. For Part 1, we will only test the functions that were required there. We will ignore any ones that are for Part 2, so it is fine to have unfinished, non-working code of Part 2 already in there as long as the program compiles.

All the functions you are asked to write, go in the same file. So, at the end of Part 2, your work for both Part 1 and 2 will be there. The reason is to allow reuse of helper functions between the two parts.

It is important to re-emphasize that ciphers.c cannot contain a main() function (see Organizing your Code ). Make sure you have verified that your code compiles and runs correctly with the test programs (see Testing the Functionality of Your Code ).

Background

Cryptography

Encryption is the process of encoding a message such that only authorized people can interpret it. You can think of it as scrambling the content. Encryption algorithms are also called ciphers. They convert the original message, which is also referred to as plaintext, into an encrypted message, also called the ciphertext.

The trick, of course, is to not only make the message unreadable, but also to be able to reverse this process. This is called decrypting a message or decryption. Good encryption hides a message in such a way that it is difficult to unscramble it without some extra information, called a key. So even if your secret encrypted message were to fall into the wrong hands, they could do nothing with it, as they wouldn’t be able to unscramble it without this key. Instead only someone with the key is able to decrypt it. At least that is the goal. Breaking an encryption, or cracking the code, is basically the ability to find out the message without knowing the key, or equivalently, discovering the key somehow based on the encrypted message. Good encryption algorithms are those that are hard to break.

The history of encryption is fascinating. Over the centuries, people have proposed many ciphers, which others have consequently tried to crack to show their vulnerabilities. In this assignment, you will implement a few ciphers that are important because of their historical significance, but which nowadays are considered pretty vulnerable. Nevertheless, they provide a good introduction to some of the basic elements of encryption, which are present in most modern ciphers as well.

Also, the different ways in which people try to crack ciphers are really interesting and creative. These are called attacks. It is like playing detective and eking out information from an array of different sources to try to crack a code. For example, people have used the fact that not all letters occur with the same frequency in typical messages, have looked at repetitions in encrypted messages, or have measured the electrical power it would take to run a certain algorithm, and much more. Data security is an active and fascinating research area to this day.

Caesar Cipher

One of the first known military uses of codes was by Julius Caesar in 50 - 60 BC. It is reported he would scramble his messages by replacing each letter in the alphabet by one that would follow three positions later. This is illustrated in the figure below. The message “ALEA IACTA EST” would thus be encrypted as “DOHD LDFWD HVW”. In an era when most people couldn’t read (or would assume this was simply another language), this was probably reasonably effective at the time.

In general, a code where we shift all the letters in the alphabet by a fixed value (the key) is nowadays referred to as a Caesar cipher . Another famous example of this is known as ROT13 . It is a Caesar cipher with a shift of 13. An interesting property is that, because there are 26 letters in the alphabet, applying ROT13 again on the ciphertext turns it back into the original message, so encryption and decryption are basically the same.

For our assignment, we will only consider upper-case letters, such that there are only 26 symbols to consider. The one extra symbol we add is the blank space (to bring our total to 27). We will do this for all ciphers in our assignment. Note that typically (or at least historically) for Caesar ciphers only the letters are shifted, as also indicated in the figure. The blank space in between the words is simply preserved. We have included it in the table to be consistent with our discussion of substitution ciphers and enigma, which will follow shortly. It is also worth pointing out that this ‘shift’ is a wrap-around shift: letters that are ‘pushed out’ on the left, are re-inserted on the right. A way to visualize this is by imagining all the letters are in a wheel, or rotor, and we are turning that wheel. This interpretation will be particularly useful when we will talk about the Enigma cipher.

For this PA, you will need to implement a Caesar cipher with a variable shift. Note that substituting a letter with one coming x positions after it in the alphabet, is essentially a shift of x to the left in the figure above. This x is the key of the cipher. So the earlier figure corresponds to a key of value +3. It is possible to have negative values of the key as well (it just means the letters are shifted to the right). Blank spaces need to be preserved in your implementation (as also shown in the figure).

Substitution Cipher

A substitution cipher can be seen as a more general case of a Caesar cipher. The first historical ciphers were typically substitution ciphers. The idea is shown in the figure below (this is just an example):

Each letter is mapped to another one according to a fixed mapping. Note that the blank space character is treated as just another character. For our example, “ALEA IACTA EST” would be encrypted as “MFYMRNM PMRYUP”.

To specify the cipher, one needs to give the entire code mapping. In our PA, we will do this by providing the bottom row of the table (as a code string). So for this example, the code would be “MG AYSHCNTBFLWEXIZUPJVDKOQR”. In our cryptography nomenclature, this code string thus serves as the encryption key.

If we have the code table, decryption is also straightforward. Note that it is important that the code mapping is one-to-one, meaning that each letter is only appears once in the code string. Otherwise, it would be impossible to reverse the operation. For example, if A’s, B’s, C’s, etc. are all mapped to G, our encrypted word would just always be a bunch of G’s and thus impossible to decrypt.

As also mentioned for the Caesar cipher, we will limit ourselves to upper-case letters, plus the blank space character. We will do this for all ciphers in our assignment. Unlike the Caesar cipher, the substitution cipher treats the blank space as just one of the 27 characters (so can can be mapped to a letter).

Enigma

History

The Enigma was an encryption device invented by a German engineer after WWI. It vaguely resembled a typewriter, but with a complex mechanism of rotors on the inside. During WWII, it was used extensively by the German military to encrypt military orders. Eventually, due to initial efforts of Polish mathematicians and then a team of codebreakers at England’s Bletchley Park, the Enigma code was cracked by the allies. This is widely considered to have played a crucial role in the outcome of the war. In fact, the belief of the German command that the Enigma was unbreakable heightened the impact of the vulnerability, as they were unwilling to accept their messages were being intercepted.

The breaking of the Enigma was also significant as it involved some of the first uses of what we would today consider as computers. An electro- mechanical computer, called the Bombe, was the one used to eventually break the daily settings of the Enigma cipher. It was designed by Alan Turing , who is widely considered the father of computer science[footnoteRef:5]. A related device, the Colossus (pictured on the right; image from Wikipedia), is considered to be the first programmable electronic digital computer. It was used to break the Lorentz cipher, another encryption used by the Germans during the war. [5: At UCSD, the Alan Turing Memorial Scholarship was established in his honor. The story of breaking the Enigma cipher was also the basis of the 2014 movie The Imitation Game (which is more an interpretation than an accurate telling of history, but at least recognizes the significant role of Alan Turing).]

Going back to the Enigma, the efforts involved in breaking the cipher have a fascinating history as well. To encrypt messages, a secret key was used that changed on a daily basis. The goal of the codebreaker team was to crack this key as soon as they could, from which point on messages could be decoded, before the key changed again the next day. A major breakthrough came when a physical Enigma machine was captured from a German U-boat. However, decrypting efforts still required knowing the plaintext of some encrypted messages. These were often obtained through operational vulnerabilities. For example, the British were able to intercept the daily reports that a German guard station in the North African desert would radio out. The Germans encrypted these reports as they knew they could be intercepted. However, as nothing was happening in that stretch of desert, the report would always be “Keine besonderen Ereignisse” (basically "nothing to report"). Since the codebreakers knew what the message had to decrypt to, it helped them in finding the daily key. All these pieces together eventually led to the allies getting a significant edge on the axis powers.

Enigma Operation for this PA

There were actually a number of different versions of the Enigma, as features were added over the years to increase its strength. They all relied on a set of interconnect rotors. In this assignment, you will be implementing a version of the Enigma cipher with three rotors, as shown on the right (here # is used to represent the blank space character). For more information on how the full Enigma machine worked, you can check out this video .

Encryption: For each symbol in the message follow the following procedure:

1. Let’s call the symbol we want to encrypt . Find on the inner rotor and look on the outer rotor for the character that is aligned with it.

2. Find on the middle rotor and look on the outer rotor for the character that is aligned with it. The encrypted symbol is .

The blank space is treated just like the other symbols; this is similar to what we did for a substitution cipher. So we have to consider 27 symbols.

As an example, consider the rotors as shown below. Assume we want to encrypt the letter P.

1. We look for P on the inner rotor and find it at index 9. At that same index on the outer rotor is T.

2. Then look for T on the middle rotor. We find it there at index 3. At that same index on the outer rotor is H. The encrypted symbol is this H.

Outer rotor:

Middle rotor:

Inner rotor:

After encrypting each symbol, the rotors are moved such that the code for the next symbol is different, significantly enhancing the strength of the cipher.

1. After encrypting a symbol, the inner rotor is turned one position clockwise (to the right). This corresponds to our wrap-around shift from before, but now to the right. So P would now be encrypted as Y.

2. If the inner rotor has completed one full rotation of 27 symbols (basically it is back at its starting position), the middle rotor is also rotated one position to the right at the same time.

Another way to understand this operation is to realize it is reminiscent of a traditional car odometer.

Decryption: For each symbol in the message follow the following procedure:

1. Let’s call the symbol we want to decrypt . Find on the outer rotor and look on the middle rotor for the character that is aligned with it.

2. Find on the outer rotor and look on the inner rotor for the character that is aligned with it. The decrypted symbol is .

If you look closely, you will notice that the decryption procedure is (unsurprisingly) essentially the inverse of the encryption.

After decrypting one symbol, the rotors also need to be turned. The procedure is exactly the same as for encryption (not the reverse!).

Key: The Enigma cipher also uses a secret key such that the process does not remain identical. As mentioned in the History section, this key was changed every day. The way the key works in our algorithm is that it specifies the initial positions of the inner rotor and the middle rotor. So for the key, we need to provide two numbers:

1. key_part_1: The shift (to the right) of the inner rotor at the start of the encryption or decryption of the message.

2. key_part_2: The shift (to the right) of the middle rotor at the start of the encryption or decryption of the message.

Example

Let’s walk through a complete example of the algorithm. We will be using the same rotor setup as before (which is repeated below). This is also the one you will be using in your assignment (and which is provided in the starter code).

Now, let’s say we want to encrypt the following message: ECEFIFTEEN ROCKS

Let’s assume we pick a key of { key_part_1 = 4, key_part_2 = 13 }. Before we start encrypting our message, we therefore move the rotors to the following positions, according to this key:

Following the encryption procedure we explained earlier, we can see that the first symbol of our message, E, will be encrypted as as B.

The next step in the procedure is to shift the inner rotor by one position:

Then we need to use this new position of the rotors to encrypt the next symbol, i.e., the C. It will be mapped to an R.

Again, we shift the inner rotor by one position.

The next E will be encrypted into a I. And so on ...

When you continue this process, you will find that our encrypted message becomes:

BRIMRTQMCKUTVF U

Verify this for yourself. Also keep in mind that once the inner rotor has made a complete rotation, you also have to move the middle rotor by one position to the right. This will not occur in this example message, but as soon as you want to encrypt a message of more than 27 symbols, this will become an issue.

Finally, also walk yourself through the decryption process and verify its operation as well.