Q14
  superscalar (difference between book and quiz)
    means we can run 2+ instructions in the same cycle
    fetch 2+ instructions
  handling mispredictions
    detect
    decide what to squash
    decide when to squash
  from 5 to 10 stages
spatial locality vs temporal
  temporal = use same address repeatedly, close "in time"
  spatial = use addresses similar to (but not the same as) the ones we just used
  loops
    inner loop
      spatial: if the loop variable is the last index
        ... for(i...) a[j][i]
        or if there is one index and we don't multiply the inner loop var
        for(i...) a[j*N+i]
      temporal: if the inner loop variable is not an index
        for(k...) for(i...) a[k][j]
        for(j...) for(i...) a[k][j]
        for(i...) a[k*N + j]
      for(j) for(i) a[i][j]
  block size and temporal
write back/through/allocate
  dirty bit (write-back)
    0 when we load into cache from RAM
    becomes 1 when we write to that cache line
    if 1, then when we evict the line we send it as a write toward RAM
sets (direct-mapped, set-associative, fully-associative)
LRU and WB
how we fetch
  move data
  set tag from address and valid bit to 1
S2015
  Q7,12
  Q17
  Q1-...
    00 0 0 - cold miss, 0[0, _*]
    00 0 8 - hit
    08 0 0 - cold miss, 0[0*, 8]
    08 2 8 - cold miss, 0[0*, 8] 2[8, _*]
    00 0 0 - hit, 0[0, 8*] 2[8, _*]
    10 0 0 - cold miss, 0[0*, 10] 2[8, _*]
    00 0 8 - hit, 0[0, 10*] 2[8, _*]
    08 0 0 - conflict miss, 0[0*, 8] 2[8, _*]
    08 2 8 - hit
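
The trace above can be replayed with a small simulator. This is only a sketch under assumptions, not the course's reference solution. It reads the three columns of each trace line as tag, set index, and offset, and reads `0[0*, 8]` as "set 0 holds tags 0 and 8, with * marking the LRU way", which implies 2 ways per set. The names `Cache`, `Line`, `cache_access`, and `replay_notes_trace` are hypothetical, `NUM_SETS = 4` is arbitrary (the trace only touches sets 0 and 2), and the dirty-bit handling assumes write-back with write-allocate.

```c
#include <stdbool.h>

#define NUM_SETS 4  /* assumption: the trace only shows sets 0 and 2 */
#define NUM_WAYS 2  /* the trace shows two slots per set */

typedef struct {
    bool valid;          /* set to 1 when the line is filled */
    bool dirty;          /* 0 on load from RAM, 1 after a write to the line */
    unsigned tag;
    unsigned last_used;  /* timestamp of the last access, used for LRU */
} Line;

typedef struct {
    Line lines[NUM_SETS][NUM_WAYS];
    unsigned clock;       /* counts accesses */
    unsigned writebacks;  /* dirty lines sent toward RAM on eviction */
} Cache;

/* Returns true on a hit. On a miss, fills an invalid way if there is one,
 * otherwise evicts the least recently used way in the set. */
bool cache_access(Cache *c, unsigned tag, unsigned set, bool is_write)
{
    Line *ways = c->lines[set];
    c->clock++;

    for (int w = 0; w < NUM_WAYS; w++) {
        if (ways[w].valid && ways[w].tag == tag) {
            ways[w].last_used = c->clock;
            if (is_write)
                ways[w].dirty = true;
            return true;
        }
    }

    /* Miss: pick the first invalid way, else the way with the oldest use. */
    int victim = 0;
    for (int w = 0; w < NUM_WAYS; w++) {
        if (!ways[w].valid) { victim = w; break; }
        if (ways[w].last_used < ways[victim].last_used)
            victim = w;
    }

    /* Write-back: a dirty victim is written toward RAM when evicted. */
    if (ways[victim].valid && ways[victim].dirty)
        c->writebacks++;

    ways[victim].valid = true;
    ways[victim].dirty = is_write;  /* write-allocate: load, then write */
    ways[victim].tag = tag;
    ways[victim].last_used = c->clock;
    return false;
}

/* Replays the nine accesses from the notes (all treated as reads, which is
 * an assumption) and returns the number of hits. */
int replay_notes_trace(void)
{
    static const unsigned tags[] = {0x00, 0x00, 0x08, 0x08, 0x00,
                                    0x10, 0x00, 0x08, 0x08};
    static const unsigned sets[] = {0, 0, 0, 2, 0, 0, 0, 0, 2};
    Cache c = {0};
    int hits = 0;

    for (int i = 0; i < 9; i++)
        if (cache_access(&c, tags[i], sets[i], false))
            hits++;
    return hits;
}
```

The offset column is ignored because it only selects a byte within the line and never changes hit or miss. Calling `replay_notes_trace()` counts hits for the accesses listed in the notes. To see the dirty bit in action, pass `true` for `is_write` and check `writebacks` after an eviction.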