20-CS-110-001 Introduction to Computer Science Fall 2010

## Lab Number 3

XOR, RSA in Cryptography

Due: October 19, 2010
Submit two matlab programs via blackboard. See submit page for instructions.

Rationale:
 This will be a serious first attempt to do something useful in Matlab: you will write several small programs that encrypt and decrypt messages using common cryptographic operations such as xor, and modulo arithmetic. You will see the need for conversion between numbers and characters as people prefer strings of characters and computers prefer numbers.

Problem 1: simple xor encryption
 Background: A successful cipher must be able to encrypt and decrypt. The decryption operation just undoes what the encryption operation did. That is, decryption relies on the existence of an inverse to encryption. This should not be mysterious as nearly all mathematical operations have inverse operations. As examples, sin-1(x) is the inverse of sin(x) since sin-1(sin(x)) = x and integration is the inverse of differentiation. The simplest cryptographic operation with an inverse is xor which has itself as its inverse. You can experiment with this using the applet below: Enter a number from 0 to 255 in the textfield whose row is integer: and whose column is number 1. Choose another number from 0 to 255 and enter it in the two textfields of the row labeled integer: at columns labeled number 2 and number 3. The result column shows the result of xoring the xor of number 1 with number 2 with number 3. When number 2 and number 3 are the same, the result is the same as number 1. Otherwise, it may not be. The binary: and character: rows show the bit patterns and ASCII character interpretations of the numbers involved so you can see this better. Matlab's xor operation evaluates to 0 or 1, depending on whether its arguments evaluate to 0 or 1. Since Matlab thinks a character such as 'A' is a single element, we cannot effectively xor characters or even numbers as shown in the applet above. But xor does operate on vectors so we will have to convert numbers and characters to their bit vector counterparts in order to use xor then convert the results back to characters. For this purpose, Matlab provides the function dec2bin, which produces a string of 0s and 1s that is the binary representation of its numeric input, and the function bin2dec, which produces the number corresponding to an input binary number that is expressed as a string of 1s and 0s. Suppose plaintext_char is a variable whose value is a single character of plaintext that is going to be encypted by xoring with a key byte. If we execute ``` dec2bin(plaintext_char) ``` the result will appear to a human as, for example, the string ``` 1000001 ``` in case the value of plaintext_char is the character A. Since a string of 1s and 0s is a vector of numbers whose values are set according to the ASCII conversion table as 48 for character 0 and 49 for character 1, the above is represented internally as the vector ``` [49 48 48 48 48 48 49] ``` But xor needs this to be a binary vector. To get that we must subtract the value of the character 0 (which is 48) from each element of the vector before using the xor. For example, doing this: ``` dec2bin(plaintext_char) - '0' ``` displays the vector ``` [1 0 0 0 0 0 1] ``` where each 1 is the number 1 and each 0 is the number 0. There is one more requirement of Matlab's xor: that its two arguments must agree in dimension. In this homework we will be operating on blocks of 8 bits (single character bytes). But most characters have binary representations that require only 6 or 7 bits. Therefore, we need to pad their binary vectors with leading 0s when necessary. The function pad8 does this. A description of this code is provided here. Matlab probably has a builtin function for doing this as well but I do not know what it is - I suggest downloading this to whatever directory your code runs from. Now, whenever we get a plaintext character to encrypt we can create a corresponding 8 bit binary vector, called plaintext_byte, using ``` plaintext_byte = pad8(dec2bin(plaintext_char)-'0'); ``` Similarly, the binary vector corresponding to the key can be obtained like this: ``` key_byte = pad8(dec2bin(key)-'0'); ``` where key is the number that is the user's key. We can xor the variables key_byte and plaintext_byte like this: ``` encrypted_byte = xor(plaintext_byte, key_byte); ``` and get another 8 bit binary vector. But encrypted_byte must be turned back into a character so it can be placed in the ciphertext stream. This is done by reversing the above process. First, we replace the 0-1 numbers with ASCII character 48-49 numbers. Do this with the following: ``` (encrypted_byte+'0') ``` So, for example, if encrypted_byte is [0 0 1 1 0 1 0 0] then (encrypted_byte+'0') is [48 48 49 49 48 49 48 48]. This needs to be converted to a string. Building on the above, use ``` char(encrypted_byte+'0') ``` the value of which is the string 00110100. Now we can use bin2dec to convert this to a number as follows: ``` bin2dec(char(encrypted_byte+'0')) ``` which gives 52 in this case. Finally, this is translated to a character with the following: ``` encrypted_char = char(bin2dec(char(encrypted_byte+'0'))); ``` which in this case is the character 4. The problem: write a Matlab program in a file called problem1a.m that prompts for a number between 0 and 255, which we call the key, and then repeated prompts for a message string. The program terminates if the message string is empty (you just hit return without entering any characters). Otherwise, the key is used to xor all bytes of the message to produce an encrypted string of characters, call it enc. Print the encrypted string using fprintf('Encrypted: %s\n',enc). Then decrypt the encrypted string by xoring the key with the characters of the encrypted string. Print the decrypted string, call it dec, using fprintf('Decrypted: %s\n',dec). The following is an example of a possible session you might have: ``` EDU>>problem1a key> 29 message> Now is the time for all good men to come to the aid Encrypted: ±ßßßßßßß... Decrypted: Now is the time for all good men to come to the aid message> For all the slurry there is a wall named Larry Encrypted: ¹ßßßßßßß... Decrypted: For all the slurry there is a wall named Larry message> EDU>> ``` Some help: The above sequence from setting values for plaintext_char to encrypted_char state what needs to be done for a single plaintext character. This needs to be done for all plaintext characters. Thus these statements must be inside a for i=1:length(message) loop and the plaintext character needing encryption must be message(i). Individual encrypted characters may be assembled into an encrypted string by means of the line enc = [enc encrypted_char];. Similarly, decrypted characters may be assembled into a decrypted string by means of the line dec = [dec decrypted_char];. The key_byte need only be computed one time - it should be done before the for loop. Decrypting is the same as encrypting except that it is the encrypted characters that are being xored instead of the plaintext_characters. Therefore you can copy the encryption code block, paste it after the block, and make a very small modification to achieve decryption. To support encryption and decryption of more than one message you need to add another loop, outside the for loop discussed above. The structure of the program is as follows: ``` key = input('key> '); s = input('message> ','s'); while length(s) != 0 % encrypt message s ... % decrypt message s ... s = input('message> ','s'); end ```

Problem 2: RSA encryption
 Background: Division of one integer by another results in a divisor and remainder. For example, 11/3 yields a divisor of 3 and a remainder of 2. We are interested in the remainder which may also be expressed in this example as 11 mod 3. It is interesting and fortuitous that xa mod n*xb mod n    equals    xa*b mod n. Details of RSA encryption, one of the leading public key encryption schemes today, can be found here. In any public key scheme a participant X maintains two keys that work together: a public key that is used to encrypt a message intended for X by a sender and a private key that is used to decrypt the encrypted message. Public keys are known to everyone, including hackers. Private keys are known only to their owners. Both keys are made at the same time and their values depend on each other. In the case of RSA, keys are made as follows by X. Numbers p and q are chosen. Both have to be prime but that is not important for us here. The number n is computed as the product of p and q. The public key e is chosen to be a prime number, often the number 3 is used. The private key d is chosen so that e*d mod (p-1)*(q-1) equals 1, if possible. If p and q are chosen badly this key does not exist. Messages are encrypted as follows. Let a message block be 1 byte (in practice it might be 128 bytes or more). An encrypted block c corresponding to message block m is computed using this: ``` c = me mod n ``` Decryption to plaintext block p is accomplished with this: ``` p = cd mod n ``` Thus, p = me*d mod n which is m1 so p is the original message block decrypted. The problem: Repeat problem 1 using the RSA cipher instead of the xor scheme except, this time, instead of asking for a single key, the user is prompted for p, q, and e like this: ``` keyparms = input('key params as [p q e] - try [11 19 7]> '); ``` and these three quantities are saved as ``` p = keyparms(1); q = keyparms(2); e = keyparms(3); ``` A subproblem: This looks pretty easy - for encryption just do something like this: ``` enc = []; for i=1:length(message) enc = [enc mod(pow(message(i),e),n)]; end ``` This does not work because message(i) could be 127 or higher and e could be a number like 11 so before you know it, message(i) raised to the power e can be larger than any integer that Matlab can handle. But there is a very simple solution to this problem which comes from the fact that ``` X*Y mod n = ((X mod n) * (Y mod n)) mod n ``` So, for example, to calculate 2566387 mod 78837 do this: Compute 256632 mod 78837 which is 64108. Compute 256334 mod 78837 which is 641082 mod 78837 which is 62854. Compute 256338 mod 78837 which is 628542 mod 78837 which is 24409. Compute 2563316 mod 78837 which is 244092 mod 78837 which is 28072. Compute 2563332 mod 78837 which is 280722 mod 78837 which is 61369. Compute 2563364 mod 78837 which is 613692 mod 78837 which is 31834. Now notice that 2566387 = 2566364+16+4+2+1 = 2566364*2566316*256634*256632*25663*1 so... Compute 256632*25663 mod 78837 which is 64108*25663 mod 78837 which is 33088. Compute 256634*256633 mod 78837 which is 62854*33088 mod 78837 which is 71929. Compute 2566316*256637 mod 78837 which is 28072*71929 mod 78837 which is 17644. Compute 2566364*2566323 mod 78837 which is 31834*17644 mod 78837 which is 44308. So, 2566387 mod 78837 is 44308. Therefore, the encryption algorithm will look as follows: ``` enc = []; for i=1:length(message) enc = [enc modpow(message(i),e,n)]; end ``` where modpow is a function that takes a message segment message(i), an encryption key e, and a modulus n as input. That is, modpow(base,expon,modulus) computes the following: ``` baseexpon mod modulus ``` where all the numbers involved can be quite large. We have to build this function following what was done in the example above. Here is an English language description of an algorithm which we can work from. ``` Start with a result that is initialized to 1 Repeat the following as long as expon is greater than 0 If expon is an odd number do the following Set result to the product of its previous value and the base Set result to the remainder of the result divided by the modulus Set base to itself squared Set base to the remainder of the base divided by the modulus Set expon to the integer value of itself divided by 2 The output is the value of result ``` To see why this works see this demo which only leaves out taking the mod.