
Zephyr Code-Generation Interfaces

Jack W. Davidson     Steve C. Losen     Norman Ramsey
Department of Computer Science
University of Virginia
Charlottesville, VA
zephyr-investigators@virginia.edu

Introduction

VPO-based compilers and RTL

Compilers based on VPO are typically divided into three parts: a front end, a code expander, and an optimizer. The front end enforces all the rules of the source language, and for correct programs, it typically produces some sort of intermediate form, which is likely to be mostly machine-independent. The code expander (or just "expander") translates this intermediate form into a very naive program for a particular target machine. The optimizer, VPO, improves the quality of the program and emits assembly language.

This document presents the three interfaces that code expanders use to communicate with VPO. VPO's view of a program could be characterized as "assembly language with pseudo-registers and functions," but these interfaces present a machine-independent form of assembly language.

The RTL Creation Interface presents a collection of functions used to create register transfer lists (RTLs). An RTL is a machine-independent representation of a machine instruction.

The Assembly-Language Interface presents the usual machinery of sections, symbols, labels, relocatable addresses, and directives for emitting initialized data. This interface actually stands on its own, and it has been used in compilers not based on VPO.

The VPO Code-Generation Interface controls the show. It provides access to the assembler as well as directives for defining and calling functions. Machine instructions are emitted (as RTLs) through this interface.

VPO requires that every instruction passed to it satisfy the "VPO machine invariant," i.e., every RTL can be represented as a single instruction on the target machine. The primary job of the expander is to establish this invariant as simply as possible.

The three interfaces alone aren't enough to build a compiler; you also have to know some facts about the target machine. These facts include not only a list of instructions (so you know which RTLs satisfy the VPO invariant), but also information about calling conventions, temporaries for register allocation, and so on. This information can be found in the Processor Supplement for the target machine.

Note that retargeting a compiler requires not only an instance of VPO that is specialized to the target machine (and a Processor Supplement documenting that instance), but also a code expander that is specialized to that machine. Although the RTL format is machine-independent, the information stored in that format is not. This property is a source of some confusion; people believe that they can write one expander, get their program into RTL form, and then have VPO generate code for any target machine. This is not so. RTL is not a universal intermediate language; you need a code expander for each machine to create RTLs satisfying the invariant for that machine.

Run-time Errors

If an interface says something is a checked run-time error, that means that the compiler is guaranteed to halt if you do such a thing; it may even produce an error message more useful than "core dumped." An unchecked run-time error means the compiler's behavior is undefined. It might silently produce bad code, or corrupt memory and dump core, or even turn into an angry warthog.

Several procedures in these interfaces take a variable number of arguments, terminated by NULL. Failure to null-terminate such an argument list is always an unchecked run-time error.
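
The hazard is easy to see in a small C sketch. The function total_length below is hypothetical and is not part of any Zephyr interface; it only illustrates how a callee walks a NULL-terminated argument list, assuming every argument is a string.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Zephyr interfaces.
 * Returns the combined length of all strings in the argument list.
 * The list must end with a null pointer: the callee has no other
 * way to discover where the arguments stop. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;
    const char *s;

    va_start(ap, first);
    /* Stop at the first null pointer; each later argument is read as char *. */
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

A call would look like total_length("r", "[", "1", "]", (char *)NULL). The cast on the terminator is deliberate: NULL may be defined as a plain integer constant, and an integer passed through a variable argument list is not guaranteed to be read back as a null pointer. If the terminator is omitted, the loop keeps reading past the real arguments, which is undefined behavior in C and matches the "unchecked" classification above.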

Memory Management

The VPO code-generation interface provides procedures for initialization and finalization. You must initialize VPO before you can start creating RTLs. After VPO is initialized, memory for RTLs is controlled on a per-function basis. RTLs created outside any function definition persist until VPO is finalized. RTLs created inside a function definition (between VPOi_functionDefine and VPOi_functionEnd) disappear at the end of that function definition.
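
One way to picture this lifetime rule is as two allocation pools, one released at the end of each function and one released at finalization. The C sketch below is only an illustration under assumed names (every sketch_ identifier is hypothetical); it does not show how VPO manages memory internally, and it calls none of the VPO interface procedures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration of per-function lifetimes; not VPO's code. */
union header {
    union header *next;   /* link to the previously allocated block */
    long double align;    /* forces generous alignment for the payload */
};

struct pool { union header *head; size_t live; };

static struct pool global_pool;    /* released only at finalization */
static struct pool function_pool;  /* released at the end of each function */
static int in_function;            /* nonzero inside a function definition */

/* Allocate from whichever pool matches the current context. */
void *sketch_alloc(size_t size)
{
    struct pool *p = in_function ? &function_pool : &global_pool;
    union header *h = malloc(sizeof *h + size);
    if (h == NULL)
        abort();               /* out of memory: halt rather than continue */
    h->next = p->head;
    p->head = h;
    p->live++;
    return h + 1;              /* payload starts just after the header */
}

/* Free every block in a pool and reset its count. */
static void release(struct pool *p)
{
    while (p->head != NULL) {
        union header *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live = 0;
}

void sketch_function_begin(void) { in_function = 1; }

void sketch_function_end(void)
{
    release(&function_pool);   /* everything created inside the function dies */
    in_function = 0;
}

void sketch_finalize(void)
{
    release(&function_pool);
    release(&global_pool);     /* objects created outside functions die here */
}

size_t sketch_live_global(void)   { return global_pool.live; }
size_t sketch_live_function(void) { return function_pool.live; }
```

The practical consequence for an expander is the same as in the sketch: a pointer obtained inside a function definition must not be kept and used after that definition ends, whereas one obtained outside any function remains valid until finalization.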
