CS 3330: Quiz 10: explanations

This page does not represent the most current semester of this course; it is present merely as an archive.

Quiz 10 contained several questions about different memory access patterns and how they impact cache performance. This writeup is my effort to explain those questions more fully.

Randomly distributed addresses:

Hash-based data structures (and sometimes link-based ones too, depending on the memory allocator used) tend to have random-looking but deterministic addresses. Because of their prevalence, they are worth discussing in the context of caches.

Direct-mapped:

If I access random bytes in a direct-mapped cache with n lines, each pair of accesses will conflict (map to the same set) 1/n of the time. Ignoring the case of two addresses within the same block, the lines that will hit the second time around are those that were loaded by exactly one address the first time around.
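
To make the bit arithmetic concrete, here is a small Python sketch (using the 32-line, 4-byte-line, 8-bit-address parameters of the example below; the function name is mine) of how an address splits into tag, set index, and byte offset:

def split_address(addr):
    """Split an 8-bit address for a direct-mapped cache with
    32 lines of 4 bytes: 2 offset bits, 5 set-index bits, 1 tag bit."""
    offset = addr & 0x3              # byte within the 4-byte line
    set_index = (addr >> 2) & 0x1f   # which of the 32 lines
    tag = addr >> 7                  # whatever is left (here just 1 bit)
    return tag, set_index, offset

print(split_address(0x84))   # (1, 1, 0): tag 1, set 00001, offset 0

That last line is why 84 appears in set 00001 in the sample run below.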

Consider the following example. We have 32 lines of 4 bytes in the cache and access addresses scattered across 0x00 through 0xff. A random number generator gives the following sample run:

set 00000:  [00]
set 00001:  [05, 84, 07]
set 00010:  []
set 00011:  [0e, 8d]
set 00100:  [93]
set 00101:  [17, 14]
set 00110:  [9b]
set 00111:  [1e, 1f]
set 01000:  []
set 01001:  [a4]
set 01010:  []
set 01011:  [ac]
set 01100:  []
set 01101:  [b5, 34, b6]
set 01110:  [3b]
set 01111:  []
set 10000:  [c2]
set 10001:  [47]
set 10010:  []
set 10011:  [cf, 4e]
set 10100:  []
set 10101:  []
set 10110:  [5a, 58]
set 10111:  [5f]
set 11000:  []
set 11001:  []
set 11010:  [68]
set 11011:  [ec, ec]
set 11100:  [f3]
set 11101:  [77]
set 11110:  [7b, 78]
set 11111:  []

Note that the number of collisions is fairly small: 12 of the 32 addresses ended up alone on their own cache line.

Many addresses that do have multiple addresses per line will still be hits: 05, 17, 14, 1e, 1f, b5, 5a, 58, 7b, 78, and both accesses of ec all hit the second time around because of shared cache lines in this example. Note that the chance of a shared cache line depends on the number of tag bits: in this case, that's just 1 (8 bits of address − 2 bits for byte-within-line − 5 for set index = 1 bit left), so even if we had a huge number of accesses (way more than capacity) we'd still expect a 50% cache hit rate in this example. Of course, adding even a few bits to the range of addresses would significantly reduce this benefit.
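
If you want to check that 50% figure, here is a minimal simulation sketch (not the code that produced the run above; the parameters and names are my own): it warms a direct-mapped cache with a stream of random byte addresses, replays the same stream, and reports the hit rate of the replay.

import random

LINES, LINE_BYTES, ADDR_BITS = 32, 4, 8

def replay_hit_rate(n_accesses, trials=1000):
    """Warm a direct-mapped cache with random addresses, then replay
    the same addresses and report the fraction that hit."""
    total = 0
    for _ in range(trials):
        addrs = [random.randrange(1 << ADDR_BITS) for _ in range(n_accesses)]
        cache = {}   # set index -> tag currently stored there
        def access(a):
            s, t = (a // LINE_BYTES) % LINES, a // (LINE_BYTES * LINES)
            hit = cache.get(s) == t
            cache[s] = t
            return hit
        for a in addrs:                          # first pass: load the cache
            access(a)
        total += sum(access(a) for a in addrs)   # second pass: count hits
    return total / (trials * n_accesses)

print(replay_hit_rate(1024))   # way more accesses than capacity: about 0.5

With only one tag bit, whatever tag was last written to a set matches the next access to that set half the time, which is why the rate stays near 0.5 no matter how long the stream is.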

Fully-associative:

If I access random bytes in a fully-associative cache with n lines, I might have a few bytes end up on the same line as one another, but that case aside, each access will be on its own cache line. Fully-associative caches basically just have a capacity concern: if the number of distinct cache lines accessed is no more than the number of lines the cache holds, everything will hit the next time around; if not, we'll only get hits where the first reads of the second pass happen to use the same cache lines as the last reads of the first pass.

Let's consider the duplicate cache lines. For our direct-mapped example of 32 lines of 4 bytes we had only a single tag bit, but a fully-associative cache doesn't use a set index, so we have 6-bit tags instead (8 bits of address − 2 bits for byte-within-line). With 64 distinct blocks, any given pair of random addresses has only a 1-in-64 (about 1.6%) chance of sharing a cache line, so duplicates are rare.
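
A quick sketch of that arithmetic (again Python, names mine): we can both check the pairwise chance empirically and estimate how many distinct lines 32 random accesses occupy.

import random

BLOCKS = 64   # 8-bit addresses, 4-byte lines: 6-bit block numbers

# empirical check: how often do two random bytes share a block?
trials = 100_000
same = sum(random.randrange(256) // 4 == random.randrange(256) // 4
           for _ in range(trials))
print(same / trials, "vs expected", 1 / BLOCKS)   # about 0.016

# expected number of distinct blocks among 32 random accesses
distinct = BLOCKS * (1 - (1 - 1 / BLOCKS) ** 32)
print(distinct)   # about 25, so 32 random accesses fit in 32 lines

Since roughly 25 distinct blocks fit comfortably in 32 lines, a second pass over those 32 random addresses would hit essentially every time, matching the capacity argument above.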

Evenly-spaced addresses:

Evenly-spaced but not close-together addresses are, in many ways, the worst case for a direct-mapped cache. If the spacing is a large power of two (a multiple of the number of sets times the line size), the addresses differ only in their high-order bits; those bits change only the tag, not the set index, so every access maps to the same set and conflicts with the access before it.
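
Here is a small sketch of why (same hypothetical 32-set, 4-byte-line geometry, but with wider addresses so the spacing has room to matter): a stride equal to the cache's total size makes every address land in the same set.

LINES, LINE_BYTES = 32, 4
STRIDE = LINES * LINE_BYTES   # 128 bytes: one full cache's worth

for i in range(8):
    addr = i * STRIDE
    set_index = (addr // LINE_BYTES) % LINES
    print(hex(addr), "-> set", set_index)   # every address maps to set 0

Each access evicts the previous one from set 0, so replaying the sequence hits 0% of the time even though the other 31 sets sit empty.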

To a fully-associative cache, evenly-spaced but not close-together addresses are basically indistinguishable from randomly-distributed not-close-together addresses: if the cache has the capacity, they fit fine; if not, they do not.
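
To see that capacity-only behavior, here is a sketch of a fully-associative cache replaying an evenly-spaced sequence (the LRU replacement policy and helper names are my assumptions, not something the quiz specified):

from collections import OrderedDict

def lru_replay_hit_rate(addrs, lines=32, line_bytes=4):
    """Fully-associative LRU cache: warm it with addrs, then replay
    addrs and report the fraction of hits on the second pass."""
    cache = OrderedDict()   # block number -> None, oldest first
    def access(a):
        block = a // line_bytes
        if block in cache:
            cache.move_to_end(block)    # mark most recently used
            return True
        cache[block] = None
        if len(cache) > lines:
            cache.popitem(last=False)   # evict least recently used
        return False
    for a in addrs:
        access(a)
    return sum(access(a) for a in addrs) / len(addrs)

print(lru_replay_hit_rate([i * 128 for i in range(32)]))   # fits: 1.0
print(lru_replay_hit_rate([i * 128 for i in range(33)]))   # one over: 0.0

One extra line over capacity flips the replay from all hits to all misses under LRU, which is exactly the "fits fine or doesn't" behavior described above.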

Copyright © 2015 by Luther Tychonievich. All rights reserved.
Last updated 2015-01-13 13:52 -0500