x8664-encoding

x86 instruction encoding

x86-64 encoding is quite complicated
reason we don’t teach it in CS 2130
mostly complicated because of history

the 8086

1978 Intel processor
4 general purpose 16-bit registers: AX, BX, CX, DX
4 special 16-bit registers: SI, DI, BP, SP

8086 instruction encoding: simple

special cases: 1-byte instructions:
- anything with no arguments
- push ax, push bx, push cx, … (dedicated opcodes)
- pop ax, …

8086 instruction encoding: two-arg

1-byte opcode
sometimes ModRM byte:
- 2-bit “mod” and
- 3-bit register number (source or dest, depends on opcode) and
- 3-bit “r/m” (register or memory)
“mod” + “r/m” specify one of:
- %reg (mod = 11)
- (%bx/%bp, %si/%di)
- (%bx/%si/%di)
- offset(%bx/%bp,%si/%di) (8- or 16-bit offset)
non-intuitive table
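
The three ModRM fields are plain bit slices of one byte. A minimal sketch of pulling them apart (the helper name `split_modrm` is made up for illustration):

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11   # top 2 bits
    reg = (byte >> 3) & 0b111  # middle 3 bits: register number
    rm = byte & 0b111          # low 3 bits: register or memory operand
    return mod, reg, rm


# 0xC1 = 0b11_000_001: mod 11 means the r/m field names a register
print(split_modrm(0xC1))  # (3, 0, 1)
```

What a given mod/r/m pair means still has to be looked up in the table for the current mode.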

16-bit ModRM table

8086 evolution

Intel 8086: 1978, 16-bit registers
Intel (80)386: 1985, 32-bit registers
AMD K8: 2003, 64-bit registers

x86 modes

x86 has multiple modes
maintains compatibility
e.g.: modern x86 processor can work like 8086
- called “real mode”
different mode for 32-bit/64-bit
same basic encoding; some sizes change

32-bit ModRM table

32-bit addition: SIB bytes

8086 addressing modes made registers different
32-bit mode got rid of this (mostly)
problem: not enough spare bits in ModRM byte
solution: if “r/m” bits = 100 (4, normally ESP), extra “SIB” byte:
- 2-bit scale: 00 is 1, 01 is 2, 10 is 4, 11 is 8
- 3-bit index: index register number
- 3-bit base: base register number
(%baseReg,%indexReg,scale)
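
The SIB byte has the same 2/3/3 split as ModRM. A small sketch of decoding it (the name `split_sib` is hypothetical; special cases such as "no index" are not handled here):

```python
def split_sib(byte):
    """Split a SIB byte into (scale factor, index reg number, base reg number)."""
    scale = 1 << ((byte >> 6) & 0b11)  # 00 -> 1, 01 -> 2, 10 -> 4, 11 -> 8
    index = (byte >> 3) & 0b111
    base = byte & 0b111
    return scale, index, base


# 0x58 = 0b01_011_000: scale 2, index register 3, base register 0
print(split_sib(0x58))  # (2, 3, 0)
```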

intel manual: SIB table

basic 32-bit encoding

dashed: not always present
opcodes: 1-3 bytes
- some 5-bit opcodes, with 3-bit register field
  - (alternate view: 8-bit opcode with fixed register)
- sometimes Reg part of ModRM used as additional part of opcode
  - in Intel manual: /1 == ModRM byte with reg=1
displacement, immediate: 1, 2, or 4 bytes
- or, rarely, 8 bytes

exercise 1

exercise 1 solution

btsl $7, 4(%rax) / BTS DWORD PTR [RAX + 4], 7
from ISA reference entry:
- 0F BA /5 ib – BTS r/m32, imm8 — MI
- 0F BA + ModRM byte with reg=5 + immediate byte
- MI: operand 1 in ModRM byte r/m field; operand 2 in immediate byte
from table:
- [EAX]+disp8: mod 01, R/M 000; [EAX]+disp32: mod 10 R/M 000;
0F BA 0b01_101_000 04 07 = 0F BA 68 04 07
0F BA 0b10_101_000 04 00 00 00 07 = 0F BA A8 04 00 00 00 07
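
The two answers can be reproduced mechanically. This is only a sketch for this one instruction form with %rax (register number 0) as the base; the function names are invented for illustration:

```python
def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_bts_rax(disp, imm, wide_disp=False):
    """Encode btsl $imm, disp(%rax) as 0F BA /5 ib."""
    if wide_disp:
        mod, disp_bytes = 0b10, disp.to_bytes(4, "little", signed=True)
    else:
        mod, disp_bytes = 0b01, disp.to_bytes(1, "little", signed=True)
    # reg field = 5 is the "/5" opcode extension; r/m = 000 is (%rax)
    return bytes([0x0F, 0xBA, modrm(mod, 5, 0b000)]) + disp_bytes + bytes([imm])


print(encode_bts_rax(4, 7).hex(" "))                  # 0f ba 68 04 07
print(encode_bts_rax(4, 7, wide_disp=True).hex(" "))  # 0f ba a8 04 00 00 00 07
```

Both encodings mean the same thing; an assembler would normally pick the shorter one.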

what about 64-bit?

adds 8 more registers — more bits for reg #?
didn’t change encoding for existing instructions, so…
instruction prefix “REX”
- 32-bit x86 already had many prefixes
also selects 64-bit version of instruction

REX prefix

64-bit REX exercise (1)

add %eax, %ecx (Intel: ADD ecx, eax)
- 01 (opcode) c1 (MOD: 11 / REG: 000 (eax) / R/M: 001 (ecx))
exercise 2a: add %eax, %r10d (Intel: ADD r10d, eax) = ???
- REX prefix + 01 + MOD-REG-R/M byte

REX prefix:
- 0100
- w (is 64-bit values?)
- r (extra bit for Reg field)
- s (extra bit for SIB index reg; called X in the Intel manual)
- b (extra bit for R/M or SIB base field)
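
As a sketch, the prefix byte is the constant 0100 followed by the four flag bits in the order listed above. The helper `rex` and its parameter names are assumptions that follow the slide's naming, not an official API:

```python
def rex(w=0, r=0, s=0, b=0):
    """Build a REX prefix byte: 0100 w r s b."""
    return 0b0100_0000 | (w << 3) | (r << 2) | (s << 1) | b


print(hex(rex()))     # 0x40: prefix with no bits set
print(hex(rex(w=1)))  # 0x48: only the 64-bit operand bit
print(hex(rex(b=1)))  # 0x41: only the R/M (or SIB base) extension bit
```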

64-bit REX exercise (2)

add %eax, %ecx (Intel: ADD ecx, eax)
- 01 (opcode) c1 (MOD: 11 / REG: 000 (eax) / R/M: 001 (ecx))
exercise 2b: add %rax, %rcx (Intel: ADD rcx, rax) = ???
- REX prefix + 01 + MOD-REG-R/M byte

overall encoding

instruction prefixes

REX (64-bit and/or extra register bits)
VEX (SSE/AVX instructions; other new instrs.)
operand/address-size change (64/32 to 16 or vice-versa)
LOCK — synchronization between processors
REPNE/REPNZ/REP/REPE/REPZ — turns instruction into loop
segment overrides

x86 encoding example (1)

pushq %rax encoded as 50
- 5-bit opcode 01010 plus 3-bit register number 000
pushq %r13 encoded as 41 55
- 41: REX prefix 0100 (constant), w:0, r:0, s:0, b:1
- w = 0: push defaults to 64-bit in 64-bit mode (there is no 32-bit push), so no w bit is needed
- 55: 5-bit opcode 01010; 3-bit reg # 101
- 4-bit reg # 1101 = 13
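
The two push encodings follow one rule: low 3 bits of the register number go in the opcode, and the 4th bit goes in REX b. A minimal sketch (the function name is made up):

```python
def encode_push64(reg_num):
    """Encode pushq for 64-bit register number 0..15."""
    opcode = 0b01010_000 | (reg_num & 0b111)  # 5-bit opcode + low 3 register bits
    if reg_num >= 8:
        return bytes([0x41, opcode])  # REX with only b set supplies the 4th bit
    return bytes([opcode])


print(encode_push64(0).hex(" "))   # 50     (pushq %rax)
print(encode_push64(13).hex(" "))  # 41 55  (pushq %r13)
```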

x86 encoding example (2)

addl 0x12345678(%rax,%rbx,2), %ecx
03: opcode — add r/m32 into r32
8c: ModRM: mod = 10; reg = 001, r/m: 100
- reg = 001 = %ecx (table)
- SIB byte + 32-bit displacement (table)
58: SIB: scale = 01, index = 011, base = 000
- index 011 = %rbx; base 000 = %rax;
78 56 34 12: 32-bit constant 0x12345678 (little endian)

x86 encoding example (3)

addq 0x12345678(%r10,%r11,2), %rax
4b: REX prefix 0100+w:1, r:0, s:1, b:1
03: opcode — add r/m64 to r64 (with REX.w)
84: ModRM: mod = 10; reg = 000, r/m: 100
- reg = 0000 = %rax
- SIB byte + 32-bit displacement (table)
5a: SIB: scale = 01, index = 011, base = 010
- with REX: index = 1011 (11), base = 1010 (10)
78 56 34 12: 32-bit constant 0x12345678 (little endian)
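
Examples (2) and (3) differ only in the REX prefix and in the low 3 bits of the base register. A sketch that covers just this one shape, add disp32(base,index,scale), dest, always using mod = 10 and a SIB byte; the function and its parameters are hypothetical, and it does not reject invalid choices such as %rsp as index:

```python
import struct

SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def encode_add_load(disp, base, index, scale, dest, wide):
    """Encode add disp32(base,index,scale), dest with register numbers 0..15."""
    out = bytearray()
    # REX: 0100, then w, r (dest bit 3), s (index bit 3), b (base bit 3)
    rex = 0x40 | (int(wide) << 3) | ((dest >> 3) << 2) | ((index >> 3) << 1) | (base >> 3)
    if rex != 0x40:
        out.append(rex)  # prefix only needed when some bit is set
    out.append(0x03)  # opcode: add r/m into register
    out.append((0b10 << 6) | ((dest & 7) << 3) | 0b100)  # ModRM: disp32 + SIB follows
    out.append((SCALE_BITS[scale] << 6) | ((index & 7) << 3) | (base & 7))  # SIB
    out += struct.pack("<i", disp)  # little-endian signed 32-bit displacement
    return bytes(out)


# addl 0x12345678(%rax,%rbx,2), %ecx
print(encode_add_load(0x12345678, base=0, index=3, scale=2, dest=1, wide=False).hex(" "))
# addq 0x12345678(%r10,%r11,2), %rax
print(encode_add_load(0x12345678, base=10, index=11, scale=2, dest=0, wide=True).hex(" "))
```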

x86 encoding example (4)

movq %fs:0x10,%r13
64: FS segment override
4c: REX: w: 1 (64-bit), r: 1, s: 0, b: 0
8b: opcode for MOV memory to register
2c: ModRM: mod = 00, reg = 101, r/m: 100
- with REX: reg = 1101 [%r13]; r/m = 100 (SIB follows)
25: SIB: scale = 00; index = (0)100; base = (0)101
- table: index 100 = no index register; base 101 with mod 00 = no base register, 32-bit displacement only
10 00 00 00: 4-byte constant 0x10
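
To check the field values, the bytes can be picked apart in code. This sketch assumes exactly the layout of this one instruction (segment override, REX, 1-byte opcode, ModRM, SIB, disp32) and takes the REX byte to be 4c (w and r set); it is not a general decoder:

```python
def decode_fixed_layout(code):
    """Decode bytes laid out as: 64, REX, opcode, ModRM, SIB, disp32."""
    seg, rex, opcode, modrm, sib = code[:5]
    return {
        "fs_override": seg == 0x64,
        "w": (rex >> 3) & 1,
        "reg": (((rex >> 2) & 1) << 3) | ((modrm >> 3) & 7),  # REX r + ModRM reg
        "mod": modrm >> 6,
        "rm": modrm & 7,
        "index": (((rex >> 1) & 1) << 3) | ((sib >> 3) & 7),  # REX s + SIB index
        "base": ((rex & 1) << 3) | (sib & 7),                 # REX b + SIB base
        "disp": int.from_bytes(code[5:9], "little", signed=True),
    }


fields = decode_fixed_layout(bytes.fromhex("64 4c 8b 2c 25 10 00 00 00"))
print(fields["reg"], fields["rm"], fields["index"], fields["base"], fields["disp"])
# 13 4 4 5 16
```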

x86-64 impossibilities

illegal: movq 0x12345678ab(%rax), %rax
- maximum 32-bit displacement
- movq 0x12345678ab, %rax okay
  - extra mov opcode for %rax only
illegal: addq $0x12345678ab, %rbx
- maximum 32-bit constant
- movq $0x12345678ab, %rbx okay
  - extra mov opcode with 64-bit constant (often written movabsq)
illegal: pushl %eax
- no 32-bit push/pop in 64-bit mode
- but 16-bit allowed (operand size prefix byte 66)
illegal: movq (%rax, %rsp), %rax
- cannot use %rsp as index register
- movq (%rsp, %rax), %rax okay
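
The 32-bit limits above come down to a range check, since 32-bit displacements (and most 32-bit constants) are sign-extended to 64 bits. A sketch with a made-up helper name:

```python
def fits_signed_32(value):
    """True if value can be stored as a sign-extended 32-bit field."""
    return -(1 << 31) <= value < (1 << 31)


print(fits_signed_32(0x12345678))    # True
print(fits_signed_32(0x12345678AB))  # False: needs more than 32 bits
print(fits_signed_32(-8))            # True: negative offsets are fine
```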

position dependence

two ways of encoding addresses in x86-64 assembly:
- address in little endian (typically 32 bits, which limits executable size)
- (usually 32-bit) difference between address and %rip (next instruction address)

assembly	encoding
`movb label, %al` or Intel: `mov AL, [label]`	`8a 04 25` label addr
`jmp *label` or Intel: `jmp [label]`	`ff 24 25` label addr
`movb label(%rip), %al` or Intel NASM: `mov AL, [REL label]`	`8a 05` label addr - %rip
`jmp *label(%rip)` or Intel NASM: `jmp [REL label]`	`ff 25` label addr - %rip
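
A sketch of how the two kinds of address field could be computed. The addresses below are invented for illustration, and %rip is taken to be the address of the next instruction, so the instruction's own length matters:

```python
import struct


def absolute_field(label_addr):
    """4-byte little-endian absolute address (must fit in signed 32 bits)."""
    return struct.pack("<i", label_addr)


def rip_relative_field(label_addr, instr_addr, instr_len):
    """4-byte little-endian difference between label and next instruction."""
    next_instr = instr_addr + instr_len  # value of %rip while this instruction runs
    return struct.pack("<i", label_addr - next_instr)


# hypothetical layout: a 6-byte mov (8a 05 + disp32) at 0x401000, label at 0x404010
code = bytes([0x8A, 0x05]) + rip_relative_field(0x404010, 0x401000, 6)
print(code.hex(" "))  # 8a 05 0a 30 00 00
print((bytes([0x8A, 0x04, 0x25]) + absolute_field(0x404010)).hex(" "))  # 8a 04 25 10 40 40 00
```

Moving the instruction and the label together leaves the relative field unchanged, while the absolute field has to be rewritten.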

position-independence: which to use?

suppose we’re inserting “evil” code
at changing addresses in executable’s memory
which of the following do we want absolute encoding for?
(i.e., for which would absolute encoding be easier than relative?)
- A. address of a jump from evil code to function at fixed location in executable
- B. address of a jump in a loop in the “evil” code
- C. address of a string in the ‘‘evil’’ code
- D. address of a string in the executable