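write-back vs. write-through cache
address separation
  - Q24.2: PA (physical address) size
  - Q26.1: virtual address size
memory segment
kernel vs. user mode
kinds of exceptions
exception/signal handlers
setjmp/longjmp (difference between)
volatility of storage
  - volatile storage loses its data on (sustained) power loss
pipeline diagrams
  - with Bubble/Stall
  - with registers and mispredictions

optimizing code

  original loop (b is read column by column, so it has poor locality):
    for i
      for j
        a[i,j] = b[j,i]

  blocked (tiled) version, 16 x 16 blocks:
    for ii by 16
      for jj by 16
        for i from ii to ii+15
          for j from jj to jj+15
            a[i,j] = b[j,i]

  choosing the block size (one block of a and one block of b must both fit in L1):
    suppose sizeof(a[0]) = 12   (bytes per element)
    suppose sizeof(b[0]) = 12
    suppose L1 cache is 64KB
    12 * blocksize^2 * 2 < L1 size
    blocksize <= floor(sqrt(64KB / 24)) = floor(sqrt(65536 / 24)) = 52

Multiple accumulators allow instruction-level parallelism
  - functional unit: compare issue time / capacity (throughput bound) against latency;
    extra accumulators help only while latency is the larger of the two
  - must have access to operands fast enough to exercise the parallelism

  single accumulator (each add waits for the previous one):
    sum = 0
    for i from 1 to 100000
      sum += i

  four accumulators (four independent dependency chains):
    sum1 = 0
    sum2 = 0
    sum3 = 0
    sum4 = 0
    for i from 1 to 100000 by 4
      sum1 += i
      sum2 += i+1
      sum3 += i+2
      sum4 += i+3
    sum = sum1+sum2+sum3+sum4

C notes
  if(x)  is equivalent to  if(x != 0)
  for(i = 0; ; )      -- ANSI C (C89); i declared before the loop
  for(int i = 0; ; )  -- not ANSI C (declaration in the for is allowed from C99 on)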
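
The blocked transpose and the block-size bound from the notes can be sketched in C. This is only an illustration: the matrix dimension N, the type elem_t (three 32-bit fields, to get the 12-byte element the notes assume), and the helper names max_block and transpose_tiled are all made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 100  /* hypothetical matrix dimension, not from the notes */

/* Hypothetical 12-byte element, matching sizeof(a[0]) = 12 in the notes. */
typedef struct { int32_t x, y, z; } elem_t;

/* Largest block size B with 2 * elem_size * B * B < cache_bytes,
 * i.e. one B x B block of a plus one B x B block of b fits in the cache. */
static size_t max_block(size_t cache_bytes, size_t elem_size) {
    size_t b = 0;
    while (2 * elem_size * (b + 1) * (b + 1) < cache_bytes)
        b++;
    return b;
}

/* a = transpose(b), walking the matrices one block x block tile at a time.
 * The "&& i < N" / "&& j < N" guards handle N not being a multiple of block. */
static void transpose_tiled(elem_t a[N][N], elem_t b[N][N], size_t block) {
    assert(block > 0);
    for (size_t ii = 0; ii < N; ii += block)
        for (size_t jj = 0; jj < N; jj += block)
            for (size_t i = ii; i < ii + block && i < N; i++)
                for (size_t j = jj; j < jj + block && j < N; j++)
                    a[i][j] = b[j][i];
}
```

Calling max_block(64 * 1024, 12) reproduces the bound worked out in the notes. In practice a smaller block (such as the 16 used in the notes) leaves room for other data in the cache, and the best value is usually found by measurement.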
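
The single-accumulator and four-accumulator sums from the notes might look like this in C. The function names sum_one_acc and sum_four_acc are invented for the example, and the cleanup loop is an addition so that n need not be a multiple of 4.

```c
#include <stdint.h>

/* One accumulator: every addition depends on the result of the previous one,
 * so the loop is limited by the latency of the add. */
static uint64_t sum_one_acc(uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* Four accumulators: four independent dependency chains that the hardware
 * can overlap. The second loop picks up the 0 to 3 leftover values. */
static uint64_t sum_four_acc(uint64_t n) {
    uint64_t sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
    uint64_t i = 1;
    for (; i + 3 <= n; i += 4) {
        sum1 += i;
        sum2 += i + 1;
        sum3 += i + 2;
        sum4 += i + 3;
    }
    for (; i <= n; i++)  /* cleanup for n not divisible by 4 */
        sum1 += i;
    return sum1 + sum2 + sum3 + sum4;
}
```

Both functions return the same value for any n. Whether the second is actually faster depends on the compiler and the processor: an optimizing compiler may unroll, vectorize, or even replace either loop with a closed form, so any comparison should be measured rather than assumed.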