University of Virginia, Department of Computer Science CS588: Cryptography, Spring 2005

### Probability

1. (10) There are 9 possible x86 opcodes that behave indistinguishably from the unconditional jump: JMP and 8 of the conditional jump instructions. Given that an opcode exhibiting the jump behavior is guessed, what is the conditional probability that the guessed opcode is JMP?
Conditional Probability: Prob [ F | E ] = Prob [ E intersect F ] / Prob [ E ].

F = the opcode is an unconditional jump

E = the opcode causes an infinite loop

Prob [ E ] = 9/256

Prob [ E intersect F ] = 1/256

Prob [ F | E ] = (1/256) / (9/256) = 1/9.
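The arithmetic can be checked with exact fractions. This is only a small sketch; the constant names are illustrative, and the 256-opcode space assumes one-byte opcodes as in the answer above.

```python
from fractions import Fraction

OPCODES = 256          # assumption: one-byte opcode space
LOOPING_OPCODES = 9    # JMP plus 8 conditional jumps, from the question

prob_e = Fraction(LOOPING_OPCODES, OPCODES)   # Prob [ E ]
prob_e_and_f = Fraction(1, OPCODES)           # Prob [ E intersect F ]
prob_f_given_e = prob_e_and_f / prob_e        # Prob [ F | E ]

print(prob_f_given_e)
```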

2. (20) The MicroVM described in the FEEB paper is injected on the stack using an attack technique that is not capable of pushing a NULL byte on the stack. This would be necessary if the intended byte at a given location matches the mask byte at that location. The paper claims,
As long as the masks are randomly distributed, two or fewer will be sufficient over 99.8% of the time, so we can nearly always inject the worm once 128 key bytes have been acquired.
Explain how you would estimate the probability that the 125-byte MicroVM can fit into 128 bytes of known key space. A clear explanation would include equations you could use to produce the probability estimate. You can assume all instructions are 1-byte long (not true for x86 code).
We can estimate it by assuming the code fits if 0 of the first 125 bytes have the NULL mask (that is, a mask byte equal to the intended byte, which would require injecting a NULL), 1 of the first 126 bytes has the NULL mask, 2 of the first 127 do, or 3 of the 128 do. Note that the 1 out of 126 case does not include 0 out of 126, since if there were 0 out of 126 we already counted that in the 0 out of 125 case. Similarly for 2 out of 127 and 3 out of 128.

Using the binomial distribution:

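Prob [ MicroVM fits in 128 bytes ] = Prob [0 NULLs in 125] + Prob [1 NULL in 126] + Prob [2 NULLs in 127] + Prob [3 NULLs in 128]
= (125 C 0) p^0 (1 - p)^125 + (126 C 1) p^1 (1 - p)^125
+ (127 C 2) p^2 (1 - p)^125 + (128 C 3) p^3 (1 - p)^125

where p is the probability that a given position has the NULL mask (1/256 if the mask bytes are uniformly random).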
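As a rough numerical check, the sketch below evaluates the four-term sum, assuming p = 1/256 (uniformly random mask bytes). Because a term such as (126 C 1) also counts the layout whose single NULL falls at position 126, which the 0-in-125 term already covers, the sum is a slight overestimate. For comparison the sketch also computes the probability of at most 3 NULLs among all 128 positions, which is one simplified reading of "fits". The function and constant names are illustrative.

```python
from math import comb

P_NULL = 1 / 256   # assumption: mask bytes are uniformly random
CODE_BYTES = 125   # size of the MicroVM
KEY_BYTES = 128    # known key bytes available
SPARE = KEY_BYTES - CODE_BYTES


def four_term_estimate(p=P_NULL):
    """Sum of (125+k C k) p^k (1-p)^125 for k = 0..3, as in the answer."""
    return sum(
        comb(CODE_BYTES + k, k) * p**k * (1 - p) ** CODE_BYTES
        for k in range(SPARE + 1)
    )


def at_most_spare_nulls(p=P_NULL):
    """Probability that at most 3 of the 128 positions have the NULL mask."""
    return sum(
        comb(KEY_BYTES, k) * p**k * (1 - p) ** (KEY_BYTES - k)
        for k in range(SPARE + 1)
    )


if __name__ == "__main__":
    print("four-term estimate:", four_term_estimate())
    print("at most 3 NULLs in 128:", at_most_spare_nulls())
```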

### Client Authentication

3. (20) The Fu et al. paper suggests authenticating web clients using a cookie containing:
exp=t
data=s
digest=MAC_k (exp=t&data=s)
Suppose the MAC_k(m) function used is:
Sign (k, SHA1 (m))
where Sign is a secure cryptographic signature using key k. Is the protocol vulnerable to an attacker who does not know the server key k but is able to compromise the weak collision resistance property SHA1 should have? If so, explain how such an attacker would be able to create a bogus cookie. Otherwise, explain why the protocol is still secure even when the attacker can find hash collisions.
Yes. An attacker who can violate weak collision resistance first interacts normally with the server to obtain a valid cookie: exp=t, data=s, Sign (k, SHA1 (exp=t&data=s)). Then, the attacker finds a weak collision: values t' and s' such that SHA1 (exp=t'&data=s') = SHA1 (exp=t&data=s). This requires a particularly strong collision break, since the collision can use any value for t' and s' but cannot change the fixed strings exp= and &data=. If the attacker is able to find such a collision, the original signature will appear to be valid for the new message m', since SHA1(m) = SHA1(m'). This means the attacker can trick the server into accepting the bogus new contents of the cookie.
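The attack can be illustrated with a toy sketch. Everything here is an assumption made for demonstration: SHA-1 is truncated to 16 bits so that a second message with the same hash can be found by brute force, HMAC-SHA256 stands in for the unspecified Sign function, and the key and field values are hypothetical.

```python
import hashlib
import hmac
from itertools import count

SERVER_KEY = b"hypothetical-server-key"  # assumption: known only to the server


def weak_hash(message):
    """SHA-1 truncated to 16 bits: a stand-in for a hash with broken
    weak collision resistance."""
    return hashlib.sha1(message).digest()[:2]


def sign(key, digest):
    """Stand-in for Sign(k, h); here HMAC-SHA256 over the hash value."""
    return hmac.new(key, digest, hashlib.sha256).digest()


def make_cookie(exp, data):
    """Server side: build the cookie message and its digest field."""
    message = f"exp={exp}&data={data}".encode()
    return message, sign(SERVER_KEY, weak_hash(message))


def verify(message, tag):
    """Server side: accept the cookie if the digest matches."""
    return hmac.compare_digest(sign(SERVER_KEY, weak_hash(message)), tag)


def forge(message, new_exp):
    """Attacker side: search for data=s' so that the new message hashes
    to the same value. The key is never used."""
    target = weak_hash(message)
    for n in count():
        candidate = f"exp={new_exp}&data=guest{n}".encode()
        if weak_hash(candidate) == target:
            return candidate


if __name__ == "__main__":
    original, tag = make_cookie("1000", "alice")
    forged = forge(original, "9999999999")
    print(forged, verify(forged, tag))
```

The forged cookie reuses the server's original digest unchanged, which is why the fixed `exp=` and `&data=` strings are the only constraint on the attacker's search.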
4. (20) When you registered for the CS588 forum, the phpBB application sent you an email like this:

```
Welcome to UVa Computer Science Forums Forums
...
Your account is currently inactive. You cannot use it until you visit the following link:
http://www.cs.virginia.edu/forums/profile.php?mode=activate&u=60&act_key=0f027e468
...
```

The value 0f027e468 is an "activation key" that is used to activate your account. This way, phpBB obtains evidence that the email address is valid before activating your account. It should be difficult for someone to guess the activation key associated with a particular email address, even if they know the activation key associated with other addresses. Make a reasonable guess how the activation key is calculated and checked.
The activation key must depend on a secret. Many answers said it could be a hash of the userid or email, but since these are not secret, such an activation key would provide no strong authentication. An attacker could create an account with a chosen email address, and activate that account even without access to the email account that receives the activation key.

A good guess for how to generate the activation key would be to use a one-way function with the email address and a server secret as input:

act_key = H (email || secret)
The server only needs to keep one secret and can use the same secret for all users. Since H is a one-way function, an attacker who knows both the email and corresponding act_key cannot (easily) determine the secret.
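A minimal sketch of this guess, assuming SHA-256 for H, a hypothetical server secret, and a 9-hex-digit key to match the example email. None of these choices come from phpBB.

```python
import hashlib
import hmac

# Assumption: a long random value kept outside the database.
SERVER_SECRET = b"hypothetical-long-random-server-secret"
KEY_HEX_DIGITS = 9  # matches the length of the key in the example email


def activation_key(email, secret=SERVER_SECRET):
    """act_key = H (email || secret), truncated to KEY_HEX_DIGITS."""
    digest = hashlib.sha256(email.encode("utf-8") + secret).hexdigest()
    return digest[:KEY_HEX_DIGITS]


def check_activation(email, presented_key, secret=SERVER_SECRET):
    """Recompute the key from the email; nothing per-user is stored."""
    expected = activation_key(email, secret)
    return hmac.compare_digest(expected.encode(), presented_key.encode())


if __name__ == "__main__":
    key = activation_key("alice@example.com")
    print(key, check_activation("alice@example.com", key))
```

Checking recomputes the key from the email and the secret, so the server has no per-user activation state to protect.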

Some noticed that the activation key seems to have very little entropy. From this example, it might be just 9 hex digits, which is only 36 bits. An attacker who knows the function phpBB uses to generate the act_key (this is easy, since anyone can download the phpBB source code) but not the server secret can fairly easily obtain a few activation keys for legitimate email addresses and then do a brute-force search for a secret. However, the secret is not limited to 36 bits; it can be arbitrarily long. So, even though the attacker can easily find a secret that works for a given email address and activation key, the attacker cannot easily determine the correct secret that will work for all email addresses.

In fact, what phpBB actually does is not so sensible. Here is the register code:

```
// temporarily store the activation code.
$actcode = substr (md5 ($email . rand (0, 100000)), 0, 6);

executeQuery ("INSERT INTO Users (UserID, Email, Password) VALUES
               ('$userid' , '$email', '$actcode')");

//sends email for user to confirm
$msg = "\n\nTo confirm your account, visit\n   " . $siteurl
       . $baseurl
       . "activate.php?ac=" . $actcode . "\n";
```
It concatenates the email address with a random number between 0 and 100000, computes the MD5 hash, and selects the first 6 characters. Then, it stores this result in the database (in the Password column). Since it is storing the result in the database, there is no point in using the hash: if rand generates good entropy, it would be better to just store the result of rand and use that as the activation code.
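One consequence of this construction is that, for a known email address, the activation code can only take one of 100001 values. The sketch below enumerates them, assuming PHP's `.` operator appends the decimal form of the integer and that `md5` returns lowercase hex. The function name and email address are illustrative.

```python
import hashlib

RAND_MAX_VALUE = 100000  # upper bound passed to rand() in the register code
CODE_LENGTH = 6          # third argument to substr() in the register code


def candidate_codes(email):
    """Every activation code the register code could produce for email."""
    return {
        hashlib.md5((email + str(n)).encode()).hexdigest()[:CODE_LENGTH]
        for n in range(RAND_MAX_VALUE + 1)
    }


if __name__ == "__main__":
    codes = candidate_codes("victim@example.com")
    print(len(codes), "possible activation codes")
```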

To check the activation code when the user clicks on the link, it does:

```
$ac = $_GET["ac"];

$result = executeQuery ("SELECT Name, Email FROM Users WHERE password='$ac'");
$rows = mysql_num_rows ($result);

if ($rows == 0) {
   //error ("No account to activate for activation code: $ac");
   invalid();
} else {
   $name = mysql_result ($result, 0, 0);
   $email = mysql_result ($result, 0, 1);
   if ($name != '') {
      //error ("Account for $email is already activated for user $user");
      invalid();
   } else {
```
This is a very risky way of doing things: if the attacker can put a quote in ac, the attacker can inject code into the SQL query, for example by making $ac = xxx' OR 1=1 OR 'a'='b. (The magic quotes feature in most PHP installations prevents this attack in its most obvious forms.)
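The difference between building the query by string concatenation and binding the value as a parameter can be sketched with an in-memory SQLite database. This is an illustration only: the table layout and rows are hypothetical, and the original code uses PHP with MySQL rather than Python with SQLite.

```python
import sqlite3

PAYLOAD = "xxx' OR 1=1 OR 'a'='b"  # the example value of ac from the text


def demo_database():
    """Build a small in-memory Users table with two inactive accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Users (UserID TEXT, Name TEXT, Email TEXT, Password TEXT)"
    )
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?, ?, ?)",
        [
            ("u1", "", "alice@example.com", "0f027e"),
            ("u2", "", "bob@example.com", "a91c3d"),
        ],
    )
    return conn


def find_account_unsafe(conn, ac):
    """Mirrors the quoted check: ac is pasted directly into the query."""
    query = "SELECT Name, Email FROM Users WHERE Password='" + ac + "'"
    return conn.execute(query).fetchall()


def find_account_safe(conn, ac):
    """Binds ac as a parameter, so quotes in it are treated as data."""
    return conn.execute(
        "SELECT Name, Email FROM Users WHERE Password=?", (ac,)
    ).fetchall()


if __name__ == "__main__":
    conn = demo_database()
    print("unsafe:", find_account_unsafe(conn, PAYLOAD))
    print("safe:  ", find_account_safe(conn, PAYLOAD))
```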

The guessed solution above is better than what phpBB uses, since it does not require storing the activation codes in the database. Storing them is risky: if the database is compromised in some other way that reveals the stored activation codes, the attacker can generate arbitrary accounts. By using a one-way function instead, we only need to keep the server secret secret (and it does not need to be stored in the database).

### Perfect Cipher

Section 2.1 of the FEEB paper mentions that the ISR implementation proposed by Kc et al. describes two possible encryption routines. One is the XOR-pad that the paper focuses on; the other is a 32-bit transposition cipher. This is identical to the 8-bit transposition cipher you have seen in PS1 and PS3, except that it operates on 32-bit blocks instead.

5. (10) Dana E. Quarles claims that the 32-bit transposition cipher is perfect, so there is no way to break this version of ISR. Her proof argument is:

• A 32-bit transposition cipher has 32! (> 2^117) possible keys
• The number of possible messages is 2^32
• According to the perfect cipher keyspace theorem, the cipher is perfect since the number of messages < number of keys
Explain what is wrong with DEQ's argument.
Having number of messages <= number of keys is a necessary but not sufficient condition for a perfect cipher. It is obvious that the transposition cipher is not perfect: a transposition only rearranges bits, so given ciphertext 00000000000000000000000000000000 we know the plaintext does not contain any 1s.
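The leak can be seen in a small sketch of a bit transposition. The key representation here (ciphertext bit i is plaintext bit key[i]) is an assumption and may differ from the convention used in PS1 and PS3, but the number of 1 bits is preserved under any permutation.

```python
import random

BLOCK_BITS = 32


def random_key(rng):
    """A key is a permutation of the 32 bit positions."""
    key = list(range(BLOCK_BITS))
    rng.shuffle(key)
    return key


def encrypt(bits, key):
    """Ciphertext bit i is plaintext bit key[i] (assumed convention)."""
    return [bits[key[i]] for i in range(BLOCK_BITS)]


if __name__ == "__main__":
    rng = random.Random(588)
    key = random_key(rng)
    plaintext = [rng.randint(0, 1) for _ in range(BLOCK_BITS)]
    ciphertext = encrypt(plaintext, key)
    # The number of 1 bits is the same on both sides.
    print(sum(plaintext), sum(ciphertext))
```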
6. (20) Suppose an attacker intercepts the following two blocks encrypted with the 32-bit transposition cipher:
```
     0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
     0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

Pos: 1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31
       2   4   6   8  10  12  14  16  18  20  22  24  26  28  30  32
```
Explain everything the attacker can learn about the corresponding two plaintext blocks.
The attacker can learn a great deal:
• The first block contains 3 ones, the second block contains 4 ones.
• Two of the 1 bits are in the same positions (5 and 11) in both blocks.
• Whichever bits are transposed to positions 16, 24 and 30 are different in blocks 1 and 2; all other bits are the same.
Without intercepting the two blocks, there are 2^64 possible values for the first 2 blocks. After intercepting these blocks, the attacker can narrow down the number of possible values to
```
   32 C 3        number of ways to place the 3 one bits in the first block
*  3 C 1         number of ways to choose which of the one bits is flipped
*  29 C 2        number of ways to place the 2 new one bits in the second block
= 6041280
```
Thus, after intercepting these 2 blocks, 99.99999999996725% of all possible plaintexts can be eliminated!
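The observations and the count above can be reproduced from the two intercepted blocks. This is a sketch with illustrative function names; positions are numbered from 1 as in the question.

```python
from math import comb

# The two intercepted blocks, copied from the question.
BLOCK1 = "0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0".split()
BLOCK2 = "0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0".split()


def one_positions(block):
    """1-indexed positions of the 1 bits in a block."""
    return [i + 1 for i, bit in enumerate(block) if bit == "1"]


def differing_positions(a, b):
    """1-indexed positions where the two blocks differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def candidate_plaintext_pairs(a, b):
    """Number of plaintext pairs consistent with the two ciphertexts."""
    n = len(a)
    ones_a = set(one_positions(a))
    ones_b = set(one_positions(b))
    shared = len(ones_a & ones_b)
    flipped_off = len(ones_a) - shared   # 1 bits in block 1 that become 0
    flipped_on = len(ones_b) - shared    # new 1 bits in block 2
    return (
        comb(n, len(ones_a))
        * comb(len(ones_a), flipped_off)
        * comb(n - len(ones_a), flipped_on)
    )


if __name__ == "__main__":
    count = candidate_plaintext_pairs(BLOCK1, BLOCK2)
    print(one_positions(BLOCK1), one_positions(BLOCK2))
    print(differing_positions(BLOCK1, BLOCK2))
    print(count, "candidates; fraction eliminated:", 1 - count / 2**64)
```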
7. (Bonus Challenge: don't spend time on this until you have answered the other questions) Suggest a strategy for attacking an ISR-protected server where the 32-bit transposition encryption technique is used instead of XOR masking.
This is a tough one. No one suggested anything that I think is really feasible. But, it definitely requires more thought than you had time for during the exam. So, it is now Challenge Problem #4. A good answer is worth 100 bonus points.