special cases: 1-byte instructions:
ModRM byte:
%reg (mod = 11)
(%bx/%bp, %si/%di)
(%bx/%si/%di)
offset(%bx/%bp, %si/%di) (8- or 16-bit offset)
x86 has multiple modes
maintains compatibility
e.g.: modern x86 processor can work like 8086
different mode for 32-bit/64-bit
same basic encoding; some sizes change
8086 addressing modes made registers different
32-bit mode got rid of this (mostly)
problem: not enough spare bits in ModRM byte
solution: if "r/m" bits = 100 (4, normally ESP), extra "SIB" byte:
scale: 00 is 1, 01 is 2, 10 is 4, 11 is 8
(%baseReg,%indexReg,scale)
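A minimal sketch of how the ModRM and SIB bytes pack their fields, assuming the usual 2-bit/3-bit/3-bit layout. The helper names `modrm` and `sib` are made up for illustration; they are not from any assembler library.

```python
# Hypothetical helpers: pack ModRM and SIB fields into single bytes.
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the 2-bit mod, 3-bit reg and 3-bit r/m fields."""
    assert 0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8
    return (mod << 6) | (reg << 3) | rm


def sib(scale: int, index: int, base: int) -> int:
    """Pack the 2-bit scale code, 3-bit index and 3-bit base fields."""
    assert 0 <= index < 8 and 0 <= base < 8
    return (SCALE_BITS[scale] << 6) | (index << 3) | base


# r/m = 100 means "SIB byte follows"; mod = 10 means a 32-bit offset follows.
print(hex(modrm(0b10, 0b001, 0b100)))  # 0x8c
# (%rax,%rbx,2): scale 2, index 011 (%rbx), base 000 (%rax)
print(hex(sib(2, 0b011, 0b000)))       # 0x58
```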
/1 == ModRM byte with reg=1
btsl $7, 4(%rax) / BTS DWORD PTR [RAX + 4], 7
0F BA /5 ib: BTS r/m32, imm8 (Op/En: MI)
0F BA + ModRM byte with reg=5 + immediate byte
0F BA 0b01_101_000 04 07 = 0F BA 68 04 07
0F BA 0b10_101_000 04 00 00 00 07 = 0F BA A8 04 00 00 00 07
x86-64 adds 8 more registers: more bits for reg #?
didn’t change encoding for existing instructions, so…
instruction prefix "REX"
also selects 64-bit version of instruction
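A small sketch of how a REX prefix byte is put together, assuming the layout 0100 followed by four flag bits. The bit names w, r, s, b follow these notes (Intel's manuals call the s bit X). The functions `rex` and `split_reg` are hypothetical helpers.

```python
def rex(w: int = 0, r: int = 0, s: int = 0, b: int = 0) -> int:
    """Build a REX prefix: constant 0100, then the w, r, s, b bits."""
    for bit in (w, r, s, b):
        assert bit in (0, 1)
    return 0b0100_0000 | (w << 3) | (r << 2) | (s << 1) | b


def split_reg(number: int) -> tuple[int, int]:
    """Split a register number 0-15 into (extra REX bit, 3-bit field)."""
    assert 0 <= number < 16
    return number >> 3, number & 0b111


# %r13 is register 13 = 0b1101: the top bit goes in REX, 101 in the old field.
print(split_reg(13))   # (1, 5)
print(hex(rex(b=1)))   # 0x41
print(hex(rex(w=1)))   # 0x48, 64-bit operand size, no extended registers
```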
add %eax, %ecx (Intel: ADD ecx, eax)
01 (opcode) c1 (MOD: 11 / REG: 000 (eax) / R/M: 001 (ecx))
exercise 2a: add %eax, %r10d (Intel: ADD r10d, eax) = ???
01 + MOD-REG-R/M byte
REX prefix:
0100
exercise 2b: add %rax, %rcx (Intel: ADD rcx, rax) = ???
01 + MOD-REG-R/M byte
REX prefix:
0100
LOCK: synchronization between processors
REPNE/REPNZ/REP/REPE/REPZ: turns instruction into loop
pushq %rax encoded as 50
5-bit opcode 01010 plus 3-bit register number 000
pushq %r13 encoded as 41 55
41: REX prefix 0100 (constant), w:0, r:0, s:0, b:1
w is 0 because push is never 32-bit in 64-bit mode
55: 5-bit opcode 01010; 3-bit reg # 101
b + reg # = 1101 = 13
addl 0x12345678(%rax,%rbx,2), %ecx
03: opcode, add r/m32 into r32
8c: ModRM: mod = 10; reg = 001, r/m: 100
reg 001 = %ecx (table)
58: SIB: scale = 01, index = 011, base = 000
index 011 = %rbx; base 000 = %rax
78 56 34 12: 32-bit constant 0x12345678
addq 0x12345678(%r10,%r11,2), %rax
4b: REX prefix 0100+w:1, r:0, s:1, b:1
03: opcode — add r/m64 to r64 (with REX.w)
84: ModRM: mod = 10; reg = 000, r/m: 100
reg 0000 = %rax
5a: SIB: scale = 01, index = 011, base = 010
index = 1011 (11), base = 1010 (10)
78 56 34 12: 32-bit constant 0x12345678
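The two add examples can be reproduced with a short encoder. This is a sketch that handles only the form `add disp32(base,index,scale), reg` (opcode 03, mod = 10, SIB byte present). The function name and the convention of passing registers as numbers 0-15 are assumptions made for illustration.

```python
import struct

SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def encode_add_mem_to_reg(disp, base, index, scale, dest, wide=False):
    """Encode add disp32(base,index,scale), dest; registers are numbers 0-15."""
    assert index != 4, "%rsp cannot be an index register"
    out = bytearray()
    # High bit of each register number goes into REX (w, r, s, b order).
    rex_bits = (int(wide) << 3) | ((dest >> 3) << 2) | ((index >> 3) << 1) | (base >> 3)
    if rex_bits:
        out.append(0b0100_0000 | rex_bits)
    out.append(0x03)  # add r/m into r
    # mod = 10 (32-bit offset), reg = dest, r/m = 100 (SIB follows)
    out.append((0b10 << 6) | ((dest & 7) << 3) | 0b100)
    out.append((SCALE_BITS[scale] << 6) | ((index & 7) << 3) | (base & 7))
    out += struct.pack("<i", disp)  # little-endian 32-bit constant
    return bytes(out)


# addl 0x12345678(%rax,%rbx,2), %ecx   (rax=0, rbx=3, ecx=1)
print(encode_add_mem_to_reg(0x12345678, 0, 3, 2, 1).hex(" "))
# addq 0x12345678(%r10,%r11,2), %rax   (r10=10, r11=11, rax=0)
print(encode_add_mem_to_reg(0x12345678, 10, 11, 2, 0, wide=True).hex(" "))
```

The constant is stored little-endian, so 0x12345678 appears in the output as 78 56 34 12.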
movq %fs:0x10,%r13
64: FS segment override
4c: REX: w: 1 (64-bit), r: 1, s: 0, b: 0
8b: opcode for MOV memory to register
2c: ModRM: mod = 00, reg = 101, r/m: 100
reg = 1101 [%r13]; r/m = 100 (SIB follows)
25: SIB: scale = 00; index = (0)100; base = (0)101
10 00 00 00: 4-byte constant 0x10
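Going the other direction, a sketch that splits the bytes of this instruction back into fields. It assumes the byte sequence 64 4c 8b 2c 25 10 00 00 00, where 4c is 0100 followed by w=1, r=1, s=0, b=0. The helper `fields` is a hypothetical name.

```python
def fields(byte: int) -> tuple[int, int, int]:
    """Split a ModRM or SIB byte into its 2-bit, 3-bit and 3-bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


code = bytes.fromhex("64 4c 8b 2c 25 10 00 00 00")
prefix, rex, opcode, modrm, sib = code[:5]

# REX is 0100 followed by the w, r, s, b bits.
w, r, s, b = [(rex >> shift) & 1 for shift in (3, 2, 1, 0)]
mod, reg, rm = fields(modrm)
scale, index, base = fields(sib)

reg_number = (r << 3) | reg                 # REX.r supplies the 4th bit
disp = int.from_bytes(code[5:9], "little")  # 4-byte little-endian constant

print(reg_number)          # 13, i.e. %r13
print(mod, rm)             # 0 4: r/m = 100 means a SIB byte follows
print(scale, index, base)  # 0 4 5
print(hex(disp))           # 0x10
```

With mod = 00, index 100 means no index register and base 101 means no base register, which is why only the 4-byte constant is left as the address.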
movq 0x12345678ab(%rax), %rax
movq 0x12345678ab, %rax okay
mov opcode for %rax only
movq $0x12345678ab, %rbx
movq $0x12345678ab, %rax okay
pushl %eax
66)
movq (%rax, %rsp), %rax
%rsp as index register
movq (%rsp, %rax), %rax okay

| assembly | encoding |
|---|---|
| mov label, %al or Intel: mov AL, [label] | 8a 04 25 label addr |
| jmp *label or Intel: jmp [label] | ff 24 25 label addr |
| mov label(%rip), %al or Intel NASM: mov AL, [REL label] | 8a 05 label addr - %rip |
| jmp *label(%rip) or Intel NASM: jmp [REL label] | ff 25 label addr - %rip |
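A sketch of how the 4 address bytes differ between the absolute and %rip-relative forms. All addresses here are made-up values for illustration. The one fact relied on is that %rip, for this purpose, is the address of the next instruction.

```python
import struct


def absolute_bytes(label_addr: int) -> bytes:
    """Absolute form stores the label's address itself (32 bits)."""
    return struct.pack("<I", label_addr)


def rip_relative_bytes(label_addr: int, instr_addr: int, instr_len: int) -> bytes:
    """Relative form stores label address minus address of the next instruction."""
    next_instr = instr_addr + instr_len
    return struct.pack("<i", label_addr - next_instr)


LABEL = 0x404000  # hypothetical address of the label
LENGTH = 6        # 8a 05 + 4 offset bytes

# Same instruction placed at two different (hypothetical) addresses:
for where in (0x401000, 0x402000):
    print(hex(where),
          "abs:", absolute_bytes(LABEL).hex(" "),
          "rel:", rip_relative_bytes(LABEL, where, LENGTH).hex(" "))
```

The absolute bytes stay the same wherever the instruction is placed. The relative bytes change with the instruction's own address.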
suppose we’re inserting “evil” code
at changing addresses in executable’s memory
which of the following do we want absolute encoding for?
(i.e. for which would absolute encoding be easier than relative?)