CS588: Midterm

University of Virginia, Department of Computer Science
CS588: Cryptology - Principles and Applications, Fall 2001

CS588 Midterm Discussion
24 October 2001
Averages: Question 1: 15.2, 2: 18.2, 3: 20.5, 4: 17.9
1. Mr. Jefferson's Imperfect Cipher

Recall the Jefferson Wheel Cipher from Lecture 1:

There are 36 wheels, each containing the 26 letters of the alphabet in scrambled order. The wheels are placed on a spindle in an order determined by the key. A message is transmitted in 36-letter blocks by rotating the wheels to display the message horizontally, and then choosing a random row of the wheels as the ciphertext. For this question, we will assume that instead of choosing a random row, the sender chooses a row based on an offset that is part of the key known to both the sender and receiver.
a. (8) Argue convincingly that the Jefferson Wheel is a perfect cipher if only one letter of ciphertext is transmitted (assume the attacker knows the wheels, but not the order they are placed on the spindle or the chosen row offsets).

Answer: Getting one ciphertext letter doesn't give any help in determining the probabilities of the plaintext letter. For all possible plaintext letters and all possible wheel orders (any choice for the first wheel), there is some offset that produces the given ciphertext letter.
Note that this is not true if offset 0 is not allowed (which would be reasonable for a real Jefferson Wheel Cipher, since otherwise a row would appear as plaintext). If the offset were not allowed to be 0, we would know given ciphertext letter X that the probability that the plaintext letter was X is 0, hence the ciphertext letter affects the probabilities of the plaintext letter.
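The counting argument above can be checked by brute force. The following is a minimal Python sketch, assuming randomly scrambled wheels (the seed and the helper names `make_wheel`, `encrypt_letter` and `count_keys` are illustrative, not from the course materials) and assuming every wheel choice and offset is equally likely. It counts, for each candidate plaintext letter, how many (first wheel, offset) keys produce a given ciphertext letter.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def make_wheel(rng):
    """Return one hypothetical wheel: the 26 letters in scrambled order."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return "".join(letters)

def encrypt_letter(wheel, plaintext, offset):
    """Rotate to the plaintext letter, then read the letter `offset` rows away."""
    return wheel[(wheel.index(plaintext) + offset) % 26]

def count_keys(wheels, ciphertext, offsets):
    """For each plaintext letter, count (wheel, offset) keys yielding `ciphertext`."""
    return {
        p: sum(
            1
            for wheel in wheels
            for offset in offsets
            if encrypt_letter(wheel, p, offset) == ciphertext
        )
        for p in ALPHABET
    }

rng = random.Random(588)  # arbitrary seed, chosen only for repeatability
wheels = [make_wheel(rng) for _ in range(36)]

# Offsets 0..25 allowed: every plaintext letter is explained by the same number of keys.
with_zero = count_keys(wheels, "X", range(26))
# Offset 0 forbidden: plaintext X can no longer produce ciphertext X.
without_zero = count_keys(wheels, "X", range(1, 26))

print(set(with_zero.values()))
print(without_zero["X"], without_zero["A"])
```

With offset 0 allowed, each wheel contributes exactly one offset per plaintext letter, so the counts are identical and the ciphertext letter carries no information. Forbidding offset 0 removes exactly the keys that map X to X.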
I was disappointed in how many students attempted to show this using the Perfect Cipher Keyspace Theorem (Lecture 2): If a cipher is perfect, there must be at least as many keys (l) as there are possible messages (n). The implication is in the other direction! You cannot prove a cipher is perfect by showing there are as many keys as possible messages. You can prove a cipher is not perfect by showing there are not as many keys as there are possible messages. (See Lecture 2, Slide 29 for an example.)
b. (10) Is the Jefferson Wheel a perfect cipher if two letters of ciphertext are transmitted? (Use convincing and clear arguments to support your answer. A well-supported wrong answer is better than a poorly argued correct one. As in part a, assume the attacker knows the wheels, but not the order they are placed on the spindle or the chosen row offsets.)
Answer: No.
Consider a pathological example: all wheels are ABCDEFGHIJKLMNOPQRSTUVWXYZ, the ciphertext is AM. The first letter of the plaintext could be anything, because we don't know the offset. Suppose it is T. This determines the offset (7). The second letter of the plaintext must be F.
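The pathological case can be enumerated directly. This is a small Python sketch under the same assumption as the example (every wheel is the unscrambled alphabet); the helper names are hypothetical. For ciphertext AM it lists the two-letter plaintexts that remain possible.

```python
import string

ALPHABET = string.ascii_uppercase
IDENTITY_WHEEL = ALPHABET  # the pathological wheel: letters in alphabetical order

def offset_for(wheel, plaintext, ciphertext):
    """Row offset that turns `plaintext` into `ciphertext` on this wheel."""
    return (wheel.index(ciphertext) - wheel.index(plaintext)) % 26

def decrypt_letter(wheel, ciphertext, offset):
    """Undo the row offset on one wheel."""
    return wheel[(wheel.index(ciphertext) - offset) % 26]

def possible_plaintexts(ciphertext):
    """All two-letter plaintexts consistent with a two-letter ciphertext."""
    pairs = []
    for first in ALPHABET:
        offset = offset_for(IDENTITY_WHEEL, first, ciphertext[0])
        second = decrypt_letter(IDENTITY_WHEEL, ciphertext[1], offset)
        pairs.append(first + second)
    return pairs

pairs = possible_plaintexts("AM")
print(len(pairs), "of", 26 * 26, "digrams remain possible")
print("TF" in pairs, offset_for(IDENTITY_WHEEL, "T", "A"))
```

Only 26 of the 676 digrams survive, one per guess of the first letter, so the ciphertext rules out most plaintexts.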
For a more realistic choice of wheels, this is harder to see (and the effect on the probabilities is less clear), but knowing the wheels still means some possible two-letter plaintexts are more likely than others for a given ciphertext. We could determine the probabilities by guessing all possible wheels for the first wheel, guessing all possible plaintext letters for the first letter, determining the offset that would produce the known first ciphertext letter, and then determining what the corresponding plaintext letter would be for the second letter for each of the other 35 wheels (as the possible second wheel).
If you're still not convinced this reveals something useful to the attacker, here's what happens when we try it for the wheels and ciphertext given in the Jefferson Wheel Challenge problem. An excerpt from the program I used to do this is below (you can find the entire Scheme program at http://www.cs.virginia.edu/cs588/challenges/digrams.scm):
;;;
;;; For given wheels and ciphertext, this function produces a list of the possible
;;; second plaintext letters given each possible guess for the first plaintext letter
;;; and first and second wheels.
;;;
;;; The result is of the form:
;;; 
;;; ((#\A
;;;  (0 . "*ZVKERMXVOKHOPQVOTMELHGHGADJNIICRPPG")
;;;  (1 . "O*CLBIQMLQBSCOIIINYBGJYXAICGELAGOBCI")
;;;  (2 . "KP*GLLZPIROMARWWQYNLJNTDOQGBOHGIGNRA")
;;;  ...
;;;  (35 . "UNXEMQBZPBRWTAUOZAPCYWWYBMTCIYSSPTY*"))
;;; (#\B
;;;  (0 . "*JZZWKVHOPNLYVVARMIKHSROMGYMQTDOBKBL")
;;; ...
;;;
;;; This means if the first plaintext letter is "A", and the first
;;; wheel is wheel #0 ("FIQGHJKZEPLRMWYNOSXTAUVBCD"), if the second
;;; wheel is 1, then the second plaintext letter must be "Z".  Note
;;; the first and second wheels cannot be the same, hence the "*"
;;; in the corresponding position.
;;;

(define (find-possible-letters wheels ciphertext)
  (let ((first-ciphertext (string-ref ciphertext 0))
	(second-ciphertext (string-ref ciphertext 1)))
    (map
     (lambda (plaintext-guess)
       (cons 
	plaintext-guess
	(map
	 (lambda (first-wheel-guess)
	   (cons 
	    first-wheel-guess
	    (list->string
	     (map
	      (lambda (second-wheel-guess)
		(if (= first-wheel-guess second-wheel-guess)
		    ;;; Can't reuse the first wheel
		    #\*
		    (find-plaintext-letter
		     (get-wheel wheels second-wheel-guess)
		     second-ciphertext
		     (find-offset (get-wheel wheels first-wheel-guess) 
		     plaintext-guess first-ciphertext))))
	      all-wheels))))
	 all-wheels)))
     ;;; Remove the first-ciphertext from the alphabet (no zero offsets)
     (remove-match alphabet first-ciphertext)))) 
The sorted digram counts are:
> (sort-tallys converted-tallys)
(("MA" . 77) ("LU" . 71) ("LC" . 70) ("AR" . 70) ("OE" . 70) ("PC" . 69) ("OL" . 68)
 ... (all other possible digrams elided)
 ("YG" . 35) ("BX" . 35) ("OH" . 34) ("MO" . 33) ("SR" . 33) ("SO" . 32) ("PM" . 32)
 ("KH" . 31) ("QG" . 31) ("FC" . 31) ("RH" . 29)
 ("JF" . 0) ("YF" . 0) ("RF" . 0) ("NF" . 0) ("ZF" . 0) ("LF" . 0) ("OF" . 0) ("MF" . 0)
 ("FF" . 0) ("EF" . 0) ("PF" . 0) ...

Note that all the digrams ending in F have count 0, since we assume for the Jefferson Wheel Challenge (but not for the midterm question) that the offset cannot be 0 and the second ciphertext letter was F. Of the other possible digrams, MA appears 77 times, while RH appears only 29 times. This means, without taking into account anything about the likely plaintext language, MA is about 2.65 times as likely to be the first two plaintext letters as RH, based on what we know from the wheels and ciphertext alone! This is a convincing argument that the Jefferson Wheel is not a perfect cipher if the wheels are known, even if only 2 letters are transmitted. Nevertheless, it does not necessarily provide a practical break of the cipher (although inspection of the results does reveal some interesting properties of the wheels). Before anyone gets misled, the plaintext message does not begin with MA.
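For readers who want to reproduce the shape of this experiment without the Scheme program, here is a rough Python sketch of the same tallying idea. It is not a port of digrams.scm: the wheels are randomly generated stand-ins, the ciphertext "KF" is a made-up example, and the function names are hypothetical. It loops over offsets rather than first-letter guesses, which is equivalent because each nonzero offset fixes exactly one first plaintext letter.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def make_wheels(count, seed):
    """Generate hypothetical scrambled wheels (not the challenge wheels)."""
    rng = random.Random(seed)
    wheels = []
    for _ in range(count):
        letters = list(ALPHABET)
        rng.shuffle(letters)
        wheels.append("".join(letters))
    return wheels

def tally_digrams(wheels, ciphertext, allow_zero_offset=False):
    """Count how many (first wheel, second wheel, offset) keys yield each plaintext digram."""
    c1, c2 = ciphertext[0], ciphertext[1]
    first_offset = 0 if allow_zero_offset else 1
    tallies = Counter()
    for i, first_wheel in enumerate(wheels):
        for j, second_wheel in enumerate(wheels):
            if i == j:
                continue  # a wheel cannot occupy two positions on the spindle
            for offset in range(first_offset, 26):
                p1 = first_wheel[(first_wheel.index(c1) - offset) % 26]
                p2 = second_wheel[(second_wheel.index(c2) - offset) % 26]
                tallies[p1 + p2] += 1
    return tallies

wheels = make_wheels(36, seed=2001)
tallies = tally_digrams(wheels, "KF")
print(tallies.most_common(3))
```

There are 36 x 35 x 25 = 31500 keys spread over at most 625 allowed digrams, an average of 50.4 each. Since that average is not a whole number, the counts cannot all be equal, whatever wheels are used.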
c. (7) If the attacker does not know anything about the wheels, how many letters can be transmitted using a Jefferson Wheel Cipher with perfect secrecy?
This answer is wrong, but included here for historical accuracy. Skip down for the correct answer.
Answer: 37 (no one got this, but full credit was awarded for good explanations of other answers)
Recall that the secrecy is not perfect if an attacker could determine that the probability of any message is different based on the ciphertext she has seen. Suppose (pathologically), the ciphertext is 36 A's.
This could correspond to any plaintext message. Suppose the plaintext message is 36 B's. This would be the case (for example) if the first wheel is "FIQGHJKZEPLRMWYNOSXTAUVBCD" and the offset is 23. The second and later wheels would then be constrained to have A and B in the same relative positions (but the other letters could be anywhere). Without knowing the wheels though, any plaintext message is equally likely - we can find exactly the same number of offsets and wheels that would produce it as would produce this one.
What happens if we see a 37th ciphertext letter (say another A)? The offset for the 37th letter is a new randomly chosen (as far as the attacker is concerned) offset. Hence, it could correspond to any plaintext letter. The attacker still does not have any useful information for an attack.
One way to see this is to view the Jefferson Wheel like a Vigenere or One-Time Pad. If we don't know the wheels, each wheel corresponds to an unknown letter in a Vigenere key. If we transmit as many letters as the key length (and the key is random), we still have perfect secrecy. When we transmit the next letter, we have used the same key letter twice, and the ciphertext now provides an attacker with some useful information. In a Vigenere with 36-letter key, for example, if the 1st and 37th ciphertext letters match, we know that the 1st and 37th plaintext letters also match.
With the Jefferson Wheel, however, there is an offset to contend with also. Knowing the 1st and 37th ciphertext letters does not tell us anything about the 1st and 37th plaintext letters. In the trivial case, if they are the same it could mean the plaintext letters were the same and the offset was the same, or the plaintext letters were different and the offset was different (remember, in this case we don't know anything about the wheels to begin with).
But, when we see the 38th ciphertext letter, we do know something useful. The offset is the same for the 37th and 38th letters, so if the 1st, 2nd, 37th and 38th ciphertext letters are identical, we know (findIndex returns the location of a letter on a wheel):
C[1] = wheel1 [findIndex (wheel1, P[1]) + offset1]
C[2] = wheel2 [findIndex (wheel2, P[2]) + offset1]
C[37] = wheel1 [findIndex (wheel1, P[37]) + offset2]
C[38] = wheel2 [findIndex (wheel2, P[38]) + offset2]
To simplify, suppose C[1] = C[2] = C[37] = C[38] = C.
C = wheel1 [findIndex (wheel1, P[1]) + offset1] 
  = wheel2 [findIndex (wheel2, P[2]) + offset1]
  = wheel1 [findIndex (wheel1, P[37]) + offset2] 
  = wheel2 [findIndex (wheel2, P[38]) + offset2]

   findIndex (wheel1, P[1]) + offset1 = findIndex (wheel1, P[37]) + offset2
   findIndex (wheel2, P[2]) + offset1 = findIndex (wheel2, P[38]) + offset2
Subtracting we have,
     findIndex (wheel1, P[1]) - findIndex (wheel2, P[2]) 
   = findIndex (wheel1, P[37]) - findIndex (wheel2, P[38])
We still don't know the wheels, but the constraints on the wheels mean certain choices for P[1], P[2], P[37] and P[38] will be more likely than others.
7 Nov: Here's the correct answer:
That answer is wrong. It's only perfect for 1 letter, even if you don't know the wheels. Matthew Mah noticed this problem:
I claim that the Jefferson Wheel Cipher is imperfect for two letters or
more.  To do this, we will use the following definition of perfect
cipher from Lecture 2, slide 30:

Given any plaintext, the probability that it matches any particular
ciphertext is the same.

Now we examine the case where we have a plaintext AB. Assume, for
contradiction, that the Jefferson Wheel Cipher is perfect. AB can be
mapped to any ciphertext using the Jefferson Wheel Cipher. Then AB can
be mapped to ciphertext AC. A -> A => the offset for the round of 36
letters is 0. B -> C => the offset for the round of 36 letters is not
0. So the offset is both 0 and not 0, =><=. So the Jefferson Wheel
Cipher cannot be perfect for two letters.  Note that this proof does not
require any knowledge of the letter arrangement on the wheels. It only
assumes that each wheel has 26 distinct letters.

To summarize, it is not possible to get every 2-letter ciphertext from a
two letter plaintext, so the cipher cannot be perfect.
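The argument quoted above can be spot-checked in a few lines. This Python sketch assumes nothing about the wheels beyond each holding 26 distinct letters; the random wheels, seed and function names are illustrative. It confirms that plaintext AB never encrypts to AC, whichever two wheels and whichever offset are used.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def random_wheel(rng):
    """A hypothetical wheel: any arrangement of the 26 distinct letters."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return "".join(letters)

def reachable_ciphertexts(wheel1, wheel2, plaintext):
    """All two-letter ciphertexts `plaintext` can map to, over every shared offset."""
    result = set()
    for offset in range(26):
        c1 = wheel1[(wheel1.index(plaintext[0]) + offset) % 26]
        c2 = wheel2[(wheel2.index(plaintext[1]) + offset) % 26]
        result.add(c1 + c2)
    return result

rng = random.Random(1789)  # arbitrary seed
trials = [(random_wheel(rng), random_wheel(rng)) for _ in range(200)]
ac_ever_reached = any("AC" in reachable_ciphertexts(w1, w2, "AB") for w1, w2 in trials)
print(ac_ever_reached)
```

Ciphertexts beginning with A force offset 0, which leaves AB itself as the only possibility, so AC is unreachable for every pair of wheels.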
2. Number Theory and RSA
Lisa Euler, great-great-great-great grandchild of a well known Swiss mathematician, has just invented a remarkable quantum computing device that is able to compute Euler's totient function for any number of any size.
a. (8) Show how she can use the device to crack RSA encrypted documents if she knows the public key (exponent e and modulus n) that was used.

Answer: Since Lisa knows n, her device can calculate φ(n). Since e is also known, she can find the multiplicative inverse of e mod φ(n). Using the extended Euclidean algorithm, she finds d such that ed ≡ 1 mod φ(n). The decryption of an RSA encrypted document proceeds as normal, by raising the ciphertext to the d power and reducing mod n.
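As an illustration of the attack, here is a toy Python sketch. The primes 61 and 53, the exponent 17 and the message 65 are illustrative values only, and `totient_oracle` is a brute-force stand-in for Lisa's device that is feasible only for tiny moduli. The three-argument `pow` with exponent -1 (Python 3.8 or later) plays the role of the extended Euclidean algorithm.

```python
from math import gcd

def totient_oracle(n):
    """Stand-in for Lisa's device: brute-force phi(n), usable only for toy n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def crack_rsa(ciphertext, e, n):
    """Recover the plaintext from the public key (e, n) and a totient oracle."""
    phi = totient_oracle(n)
    d = pow(e, -1, phi)          # d with e*d = 1 (mod phi(n))
    return pow(ciphertext, d, n)  # ordinary RSA decryption

n = 61 * 53   # toy modulus; the attacker sees only n, never the primes
e = 17
message = 65
ciphertext = pow(message, e, n)
recovered = crack_rsa(ciphertext, e, n)
print(recovered)
```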
Alice and Cathy are really good friends. They decide that instead of each having to go through the work of finding large primes, they will write a program that finds two large primes p and q, but does not reveal them to anyone. Note that neither Alice nor Cathy knows p or q.
The program then picks e_A for Alice, and finds d_A such that e_A d_A ≡ 1 mod (p - 1)(q - 1) and reveals d_A to Alice only. Similarly, it then picks e_C for Cathy (different from e_A), and finds d_C such that e_C d_C ≡ 1 mod (p - 1)(q - 1) and reveals d_C to Cathy only.
Alice publishes (e_A, n) as her RSA public key. Cathy publishes (e_C, n) as her RSA public key.
b. (5) Bob wants to send a message M to Alice and Cathy, such that neither can read it independently, but they can both read it if they work together. What should Bob send? (Note: no more than one modular exponentiation should be required.)

Answer: The straightforward way Bob can do this is to encrypt a message M first with Alice's public key, then with Cathy's public key resulting in (because of the limitations of HTML formatting we use e_A for Alice's public exponent and e_C for Cathy's public exponent):
(M^{e_A} mod n)^{e_C} mod n
Since Alice and Cathy use the same modulus n, this can be simplified to:
M^{e_A e_C} mod n
Hence Bob can first calculate e_AC = e_A e_C and find M^{e_AC} mod n using a single modular exponentiation.
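The simplification can be confirmed numerically. In this Python sketch the modulus and both exponents are toy values chosen for illustration, and the function names are hypothetical. It compares the two-step encryption with the single exponentiation for every possible message.

```python
def two_step_encrypt(m, e_a, e_c, n):
    """Encrypt with Alice's exponent, then with Cathy's (two exponentiations)."""
    return pow(pow(m, e_a, n), e_c, n)

def one_step_encrypt(m, e_a, e_c, n):
    """Encrypt once with the combined exponent e_A * e_C."""
    return pow(m, e_a * e_c, n)

n = 61 * 53        # toy shared modulus
e_a, e_c = 17, 7   # toy public exponents for Alice and Cathy

all_match = all(
    two_step_encrypt(m, e_a, e_c, n) == one_step_encrypt(m, e_a, e_c, n)
    for m in range(n)
)
print(all_match)
```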
c. (5) Show that Alice and Cathy together can decrypt Bob's encrypted message to obtain M.

Answer: Bob sends C = M^{e_A e_C} mod n to Alice and Cathy. They each use their private keys to decrypt, giving:
((M^{e_A e_C} mod n)^{d_A} mod n)^{d_C} mod n, and since the mod can be moved out,
= M^{e_A e_C d_A d_C} mod n, rearranging,
= M^{e_A d_A e_C d_C} mod n, rearranging,
= (M^{e_A d_A} mod n)^{e_C d_C} mod n, and since e_A d_A ≡ 1 mod φ(n),
= (M^{k φ(n) + 1} mod n)^{e_C d_C} mod n, which is,
= (M M^{k φ(n)} mod n)^{e_C d_C} mod n, and by Euler's Theorem,
= (M mod n)^{e_C d_C} mod n, and removing the mod again,
= M^{e_C d_C} mod n, and since e_C d_C ≡ 1 mod φ(n),
= M^{k' φ(n) + 1} mod n, which is,
= M M^{k' φ(n)} mod n, and by Euler's Theorem,
= M mod n, and since M < n,
= M
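The derivation can be exercised end to end with toy numbers. This Python sketch assumes illustrative primes 61 and 53 and exponents 17 and 7 (none of which come from the exam), and the function names are hypothetical. It shows that either decryption order recovers M, while one private key alone does not.

```python
p, q = 61, 53                # toy primes, known only to the key-generation program
n = p * q
phi = (p - 1) * (q - 1)

e_a, e_c = 17, 7             # toy public exponents
d_a = pow(e_a, -1, phi)      # revealed to Alice only
d_c = pow(e_c, -1, phi)      # revealed to Cathy only

def bob_encrypt(m):
    """Bob's single modular exponentiation with the combined exponent."""
    return pow(m, e_a * e_c, n)

def alice_step(x):
    """Alice applies her private exponent."""
    return pow(x, d_a, n)

def cathy_step(x):
    """Cathy applies her private exponent."""
    return pow(x, d_c, n)

message = 65
ciphertext = bob_encrypt(message)
print(cathy_step(alice_step(ciphertext)), alice_step(cathy_step(ciphertext)))
print(alice_step(ciphertext) == message)
```

After Alice's step alone, what remains is M^{e_C} mod n, which is still encrypted under Cathy's key.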
d. [hard question] (7) After the poker incident, Alice and Cathy are no longer such good friends. Alice wants to determine Cathy's private key d_C. Is it any easier for her to do this than it would be for Bob?

Answer: Alice and Bob both know the public information: e_A, e_C, and n. Neither knows d_C, p, q, or φ(n). Alice does know her own decryption key d_A, and the relationship e_A d_A ≡ 1 mod φ(n). She can therefore compute e_A d_A - 1, which is equal to k φ(n), and try factoring this to find k and φ(n). This is not necessarily easier than factoring n directly, though k is likely to be smaller than either p or q. Products of the factors found can easily be checked for whether they produce a "working" φ(n). (This was sufficient to get full credit for this question.)
It turns out that with a little work (complexity O(length_of_n³)), Alice can find the factors of n directly using her own e_A and d_A. This is called the Common modulus attack (see MBC pg. 169) and is due to Simmons.
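One well-known way to factor n from a matching (e, d) pair is to hunt for a nontrivial square root of 1 modulo n. The Python sketch below uses that probabilistic method with toy parameters. Whether it is exactly the procedure described on the cited page is not confirmed here, and the function name and values are illustrative.

```python
import random
from math import gcd

def factor_from_key_pair(n, e, d, rng=None):
    """Factor n = p*q given one working RSA key pair (e, d) for modulus n."""
    rng = rng or random.Random(0)
    k = e * d - 1            # a multiple of phi(n), so g^k = 1 (mod n) for coprime g
    r, t = k, 0
    while r % 2 == 0:        # write k = 2^t * r with r odd
        r //= 2
        t += 1
    while True:
        g = rng.randrange(2, n - 1)
        if gcd(g, n) > 1:    # lucky guess: g already shares a factor with n
            p = gcd(g, n)
            return p, n // p
        x = pow(g, r, n)
        for _ in range(t):
            y = pow(x, 2, n)
            if y == 1 and x != 1 and x != n - 1:
                p = gcd(x - 1, n)   # x is a nontrivial square root of 1
                return p, n // p
            x = y
        # This g revealed nothing; try another (each try succeeds with probability >= 1/2).

n = 61 * 53                       # toy modulus shared by Alice and Cathy
e_a = 17
d_a = pow(e_a, -1, 60 * 52)       # Alice's private exponent
e_c = 7                           # Cathy's public exponent

p, q = factor_from_key_pair(n, e_a, d_a)
d_c = pow(e_c, -1, (p - 1) * (q - 1))   # Alice can now derive Cathy's private key
print(sorted((p, q)), d_c)
```

Once the factors are known, computing d_C takes one modular inverse, so Alice is in a far better position than Bob.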
A related strategy, sometimes also called the common modulus attack, can recover the plaintext from messages sent identically to multiple parties who use the same modulus. In this case, the plaintext is recovered from nothing more than the public encryption exponents and the ciphertexts. It does not yield the factorization of the modulus.
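A sketch of this plaintext-recovery variant, under the usual assumptions that the two public exponents are relatively prime and the ciphertexts are invertible modulo n. All numeric values and function names are illustrative. It uses the extended Euclidean algorithm to find a and b with a e_A + b e_C = 1, then combines the two ciphertexts.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def recover_plaintext(c_a, c_c, e_a, e_c, n):
    """Recover M from M^e_A mod n and M^e_C mod n using only public values."""
    g, a, b = extended_gcd(e_a, e_c)
    if g != 1:
        raise ValueError("public exponents must be relatively prime")
    # One of a, b is negative; pow() with a negative exponent and a modulus
    # (Python 3.8+) inverts the ciphertext modulo n first.
    return (pow(c_a, a, n) * pow(c_c, b, n)) % n

n = 61 * 53          # toy shared modulus
e_a, e_c = 17, 7     # toy public exponents
message = 65
c_a = pow(message, e_a, n)   # the same message sent to Alice...
c_c = pow(message, e_c, n)   # ...and to Cathy

recovered = recover_plaintext(c_a, c_c, e_a, e_c, n)
print(recovered)
```

The computation never touches p, q or either private exponent, which matches the remark that this variant does not yield the factorization.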

3. Debt Collection
Cathy is having difficulty collecting the poker winnings from Alice and Bob. She decides to enlist the services of WHACKERS.com ("We have already collected kilobucks, everytime really secretly") to "persuade" Alice and Bob to pay their debts.
Because WHACKERS is worried some unscrupulous individuals may use their services for nefarious purposes, they follow a strict policy of only offering their services to customers who present credentials to prove they are members of the Better Bettors Bureau.
Cathy visits her local BBB office, shows her photo ID and affirms her honesty, pays her dues, gives the BBB official a new public key KU_C (for which only Cathy knows the corresponding private key KR_C) and receives this certificate from the BBB:
E_{KR_BBB} [ "The bearer of this certificate is a member in good standing of the Better Bettors Bureau and the owner of the private key corresponding to " || KU_C ]
Note that the certificate doesn't reveal Cathy's identity, just that she is a member of the BBB (this relies on the BBB not keeping a record that maps Cathy's true identity to the certificate it generates for her).
Cathy uses an anonymous routing protocol to send a message to WHACKERS and arrange a meeting with one of their representatives at the local Starbucks.
a. (7) Cathy presents her certificate to the WHACKERS representative. How would he check the certificate is valid?
Answer: The WHACKERS representative should calculate E_{KU_BBB} (certificate), using the BBB's public key (he does not know KR_BBB), and verify that the decrypted certificate matches "The bearer of this certificate is a member in good standing of the Better Bettors Bureau and the owner of the private key corresponding to " || KU_C (for some KU_C).
b. (8) After a few unfortunate incidents, WHACKERS has become concerned that potential customers who are not BBB members are presenting stolen credentials. Nothing in the certificate validation distinguishes the legitimate credential holder from someone else who copied the bits. Suggest an improvement to the credentials verification protocol (that can be done without any changes to the certificate itself).
Answer: The best answers used a challenge-response protocol. The WHACKERS representative should:

Calculate a secret random number, R
Give the potential customer E_{KU_C} (R)
Ask the customer to decrypt and give back R.
Only the real owner of the certificate should know KR_C to be able to calculate R.
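The three steps above might look like the following in code. This is a toy Python sketch using textbook RSA with a tiny modulus as a stand-in for (KU_C, KR_C). The key values and function names are assumptions for illustration, and a real system would need a padded encryption scheme and much larger keys.

```python
import secrets

# Toy RSA key pair standing in for Cathy's keys; far too small for real use.
N = 61 * 53
E = 17                        # public exponent, part of KU_C in the certificate
D = pow(E, -1, 60 * 52)       # private exponent KR_C, known only to Cathy

def make_challenge():
    """Representative: pick a secret random R and encrypt it under KU_C."""
    r = secrets.randbelow(N - 2) + 2
    return r, pow(r, E, N)

def respond(challenge, private_exponent):
    """Customer: decrypt the challenge with the private key."""
    return pow(challenge, private_exponent, N)

def verify(r, response):
    """Representative: accept only if the customer returned R."""
    return response == r

r, challenge = make_challenge()
accepted = verify(r, respond(challenge, D))
print(accepted)
```

Someone who merely copied the certificate bits has KU_C but not KR_C, so cannot compute the response.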
A common solution was to modify the certificate somehow, like by adding an extra encryption step. Unfortunately this doesn't solve the problem, since anyone can steal this NEW certificate and pretend to be a real customer. It doesn't matter that the crook doesn't know how to decrypt the certificate, just that the WHACKERS representative thinks it is real.
c. (10) After the visit from the WHACKERS representative, Alice and Cathy agree to settle their differences by tossing a coin. If Cathy calls the coin toss correctly, Alice will pay double her debt; if not, Alice will pay nothing. They must do the coin toss over the telephone. Describe a protocol they can use to fairly determine the result (i.e., have exactly a 50% chance that Alice wins). Alice and Cathy both agree H satisfies the properties of a cryptographic hash function but neither trusts any public or symmetric key encryption system.
Answer: Here's a solution based on a Bit Commitment protocol:

Cathy generates 2 random strings, R1 and R2, and her guess, b ("heads" or "tails").
Cathy sends H(R1 || R2 || b) and R1.
Alice picks "heads" or "tails" and sends that to Cathy.
Cathy sends R2 and b to Alice.
Alice verifies H(R1 || R2 || b) is the same as before.
Common errors involved using only one random string, instead of R1 and R2. There are two possibilities in this case:

Cathy keeps R secret and sends H(R || b) - Cathy can cheat by lying about R after the fact. She can say that R is whatever she wants and can change her value of b.
Cathy sends R and H(R || b) - Alice can cheat by calculating H(R || "heads") and H(R || "tails") and determining which value equals H(R || b).
By keeping part of the random string secret (R2) and part of the string public (R1), both of these problems can be avoided.
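The five-step protocol above can be walked through in a short Python sketch. SHA-256 stands in for H, the 16-byte length of R1 and R2 is an arbitrary choice (a fixed length keeps the concatenation unambiguous), and the function names are hypothetical.

```python
import hashlib
import secrets

def commit(r1, r2, guess):
    """H(R1 || R2 || b), with SHA-256 standing in for the agreed hash H."""
    return hashlib.sha256(r1 + r2 + guess.encode()).hexdigest()

def verify(commitment, r1, r2, guess):
    """Alice's check in step 5: recompute the hash from the revealed values."""
    return commit(r1, r2, guess) == commitment

# Step 1: Cathy picks two random strings and her guess.
r1 = secrets.token_bytes(16)
r2 = secrets.token_bytes(16)
guess = "heads"

# Step 2: Cathy sends the commitment and R1; R2 and the guess stay secret.
commitment = commit(r1, r2, guess)

# Step 3: Alice picks the coin value and sends it to Cathy.
toss = secrets.choice(["heads", "tails"])

# Step 4: Cathy reveals R2 and her guess.
# Step 5: Alice verifies the reveal against the earlier commitment.
reveal_ok = verify(commitment, r1, r2, guess)
cathy_wins = reveal_ok and guess == toss
print(reveal_ok, cathy_wins)
```

If Cathy tried to switch her guess after hearing the toss, the recomputed hash would not match the commitment unless she could find a collision in H.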

4. Truths and Falsehoods
The following statements were taken from Microsoft's recently released "Safe Internet: Microsoft Privacy & Security Fundamentals" web site (http://www.microsoft.com/privacy/safeinternet/) and their "The Ten Immutable Laws of Security" page (http://www.microsoft.com/technet/treeview/default.asp?url=/technet/columns/security/10imlaws.asp).
For each of these statements, indicate whether the statement is true, misleading or an outright falsehood. Support your answer with 1-3 sentences. An incorrect, but well-supported answer is worth more credit than a correct, poorly-supported answer. Some parts of the statements are bold to focus your attention.

Note: We show what we believe is the "best" answer here, but it was possible to get full credit on each of these subparts with a good justification of any other answer (and possible to get no credit by having the "best" answer with inadequate justification).
a. (7) Unless your e-mail application provides encryption features such as the use of digital signatures, your messages are about as private as a letter sent in an unsealed envelope.
Outright Falsehood. Digital signatures (like physical signatures) provide authenticity, not privacy. Anyone can read a signed message by decrypting using the signer's public key. (Various full credit answers answered True and explained that a sealed envelope doesn't really protect privacy either, since of course, any interceptor can open it. I believe it's giving Microsoft too much credit to interpret this statement as meaning an unsigned message is like an unsealed envelope because you can't trust it came from the claimed sender, but a digitally signed message is like a sealed envelope because you can do a DNA test on the seal saliva to confirm it matches the purported sender's.)
b. (6) For those messages that require strong security, numerous encryption options are available. If you use Microsoft® Outlook® or Outlook Express, you can turn on the application's security options. Even though most communications don't require this level of security, when in doubt, encrypt it.
Misleading or Outright Falsehood. Sending out ciphertext unnecessarily is risky. The more ciphertext an attacker has, the better the likelihood she can break the cipher. Historically, many of the WWII ciphers were broken when careless operators sent something encrypted that was not top secret and got lazy and made mistakes with the encryption process (of course laziness and mistakes are not a concern when the encryption is done by software). From a pragmatic standpoint, most modern ciphers are strong enough that the amount of ciphertext revealed even in all your emails would be unlikely to help any realistic attack. Perhaps a greater concern is recklessly encrypting (and worse, signing) messages that an attacker can control, for example, if you were to sign and encrypt messages you forwarded. A clever attacker could construct a message such that if you signed the message (or a hash of the message) it would reveal useful properties about your private key.
c. (6) The key to strong passwords lies in their length and unfamiliarity. Passwords with eight characters are harder to crack than passwords with four or six characters. Passwords that mix letters and numbers or punctuation symbols (hyphens, for instance) are even more difficult.
Misleading. The reason I think it's misleading is that an eight-character, repeated-use password cannot really be considered strong. L0phtCrack (now LC3) can break 90% of passwords over eight characters required to include mixed case, letters, numbers and symbols within 48 hours (on a Pentium II/300). It's true that longer passwords are harder to crack, but it's misleading to claim they are strong.
d. (6) Many operating systems and cryptographic software products give you an option to store cryptographic keys on the computer. The advantage is convenience - you don't have to handle the key - but it comes at the cost of security. The keys are usually obfuscated (that is, hidden), and some of the obfuscation methods are quite good. But in the end, no matter how well-hidden the key is, if it's on the machine it can be found.
True. If the whole key can be produced by the legitimate program from information on the machine, then it can also be discovered by a rogue program using the same information. Splitting the key in different locations (using XOR), or encrypting it using another key (which also has to be stored somewhere) can make it harder to find, but not impossible. Given that an attacker can probably examine the binary of the legitimate program and find out how it produces the key, it is not so hard for the attacker to do the same thing. This is why programs that try to keep keys secret from users, like Adobe eBook, are inevitably broken (see Inside eBook Security, Dr. Dobb's Journal November 2001), and why their vendors try to protect their keys by putting programmers in jail.

David Evans
evans@virginia.edu