Legion objects are named with a three-level hierarchy, as depicted in Figure 23. At the top level, objects are identified by user-defined text strings called context names. These user-level context names are mapped by a directory service called context space to system-level, unique, location-independent binary names called Legion object identifiers (LOIDs). For direct object-to-object communication LOIDs must be bound to low-level addresses that are meaningful within the context of the transport protocol that will be used for message passing. These low-level addresses are called object addresses and the process by which LOIDs are mapped to object addresses is called the Legion binding process.
LOIDs are the system-level naming mechanism: every Legion object is automatically assigned a unique LOID that allows the system to find and communicate with each object. The LOID is a variable length binary structure (typically more than 128 bytes), which contains a type field and several variable size fields. The basic data structure consists of a sequence of binary string fields; a LOID can contain up to 2^16-1 fields, each of which may contain up to 2^16-1 bytes of arbitrary binary information. The LOID encodes the number of fields it contains and the size of each field and contains a type identifier, a four byte string used to describe the meaning of the LOID contents (for example, to determine the semantics of certain content fields).
Figure 24, below, shows the layout of a LOID: the four byte type identifier is first, followed by a two byte unsigned integer indicating the number of fields, then the fields themselves. Each field is effectively a two byte unsigned integer indicating the field length, followed by the bytes that make up the field. Of course, the implementation of various LOID data structures may differ from this model, but the implied information content will be preserved.
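The layout just described can be sketched as a small encoder and decoder. This is a minimal illustration of the abstract model, not Legion's implementation: the function names are hypothetical, and big-endian byte order for the two byte integers is an assumption, since the text does not specify one.

```python
import struct
from typing import List, Tuple

MAX_U16 = 0xFFFF  # 2^16-1, the limit on field count and on field length


def encode_loid(type_id: bytes, fields: List[bytes]) -> bytes:
    """Serialize a LOID: 4-byte type, 2-byte field count, then each field
    as a 2-byte length followed by the field bytes.
    Big-endian integers are an assumption of this sketch."""
    if len(type_id) != 4:
        raise ValueError("type identifier must be exactly four bytes")
    if len(fields) > MAX_U16:
        raise ValueError("too many fields")
    parts = [type_id, struct.pack(">H", len(fields))]
    for field in fields:
        if len(field) > MAX_U16:
            raise ValueError("field too long")
        parts.append(struct.pack(">H", len(field)))
        parts.append(field)
    return b"".join(parts)


def decode_loid(data: bytes) -> Tuple[bytes, List[bytes]]:
    """Parse the layout produced by encode_loid."""
    if len(data) < 6:
        raise ValueError("truncated LOID header")
    type_id = data[:4]
    (count,) = struct.unpack_from(">H", data, 4)
    offset = 6
    fields = []
    for _ in range(count):
        if offset + 2 > len(data):
            raise ValueError("truncated field length")
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ValueError("truncated field body")
        fields.append(data[offset:offset + length])
        offset += length
    return type_id, fields
```

A zero-length field is legal in this encoding: it occupies only its two byte length prefix.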
Within the abstract LOID data type, four of the variable size fields are reserved for specific system purposes. The first three reserved fields play a key role in the LOID to object address binding mechanism. Field 0 contains a Legion domain identifier, which can be used to support the dynamic connection of separate existing Legion systems. Field 1 is a class identifier, a string of bits uniquely identifying the named object's class. Field 2 is an instance number that distinguishes the named object from other instances of its class within the same Legion domain. LOIDs containing an instance number field of length zero are defined to refer to class objects.
The fourth field of the LOID (field 3) is reserved for security purposes. Specifically, this field contains a public key for encrypted communication with the named object. The format of the LOID is left unspecified beyond these four reserved fields. New LOID types can be constructed to contain additional security information, location hints, and other information in the additional available fields.
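As a rough sketch of how the four reserved fields might be read from an already parsed LOID, the following names each field by its index and applies the rule that a zero-length instance number denotes a class object. The names `ReservedFields`, `reserved_fields`, and `names_class_object` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Sequence


class ReservedFields(NamedTuple):
    domain: bytes      # field 0: Legion domain identifier
    class_id: bytes    # field 1: class identifier
    instance: bytes    # field 2: instance number (empty for class objects)
    public_key: bytes  # field 3: public key for encrypted communication


def reserved_fields(fields: Sequence[bytes]) -> ReservedFields:
    """Return the four reserved fields; any further fields are ignored here."""
    if len(fields) < 4:
        raise ValueError("a LOID must carry at least the four reserved fields")
    return ReservedFields(*fields[:4])


def names_class_object(fields: Sequence[bytes]) -> bool:
    """A LOID whose instance number field has length zero names a class object."""
    return len(reserved_fields(fields).instance) == 0
```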
Figure 25 shows this layout. The type field, 1, indicates the object's type, and the first variable field, .01, indicates its domain; together, these first two fields indicate the object's format. The next variable field, .07, indicates the object's class, the one after it, .01, gives its instance number, and the last field is the object's public key. Additional fields might add security and location information.
Whereas LOIDs provide the basic system-level naming abstraction, users require a more natural naming mechanism, one that allows them to assign meaningful, human-readable names to their objects. Legion supports the notion of context space (directed graphs of context objects that cooperate to translate user-defined names into LOIDs) to fill this role. An object that represents a processing resource might be assigned a context name corresponding to that host's standard DNS name, for example. An object that represents a file might be assigned a descriptive context name based on the file contents. Context space is discussed in greater detail in the Basic User Manual.
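The idea of a directed graph of contexts that translates names into LOIDs can be illustrated with a toy lookup. Everything here is an assumption made for the sketch: the `Context` class, the `lookup` function, and the use of "/" as a path separator are hypothetical, and sub-contexts are held directly as Python objects rather than being named by their own LOIDs.

```python
from typing import Dict, Union


class Context:
    """A toy context: maps names to LOIDs (bytes) or to other contexts."""

    def __init__(self) -> None:
        self.entries: Dict[str, Union[bytes, "Context"]] = {}

    def bind(self, name: str, target: Union[bytes, "Context"]) -> None:
        self.entries[name] = target


def lookup(root: Context, path: str) -> bytes:
    """Walk the graph one path component at a time and return the LOID found."""
    node: Union[bytes, Context] = root
    for part in (p for p in path.split("/") if p):
        if not isinstance(node, Context):
            raise KeyError(f"{part!r} follows a name that is not a context")
        node = node.entries[part]  # raises KeyError if the name is unbound
    if isinstance(node, Context):
        # Simplification: this sketch does not assign LOIDs to contexts.
        raise KeyError("path names a context, not an object")
    return node
```

Because the graph is directed rather than a tree, the same object may be reachable under several names, and a context may be bound inside one of its own descendants.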
Legion uses standard network protocols and communication facilities of host operating systems to support communication between Legion objects. However, LOIDs are meaningful only at the Legion level, not within existing protocols such as TCP/IP. Consequently, Legion must provide a mechanism by which LOIDs can be mapped to names that are meaningful to underlying protocols and communication facilities. These low-level names are called object addresses, or OAs. An OA is a list of object address elements and an address semantic field, which describes how to use the list. An OA element contains two basic parts: a 32-bit address type field and the address itself. The address type field indicates the type of address that is contained in the address field, whose size and format vary depending on the address type. For an IP address type, the element contains a 32-bit IP address and a 16-bit port number; every Legion object is linked with a Unix-sockets-based data delivery layer called the Modular Message Passing System (MMPS) that communicates with the data delivery layers of other objects using these OA types. (See section 7.3 for more information on Legion message passing.)
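An OA element carrying an IP address and port can be sketched as a fixed 10-byte record. This is illustrative only: the numeric value of `ADDR_TYPE_IP`, the function names, and the big-endian wire order are all assumptions of the sketch, not values defined by Legion.

```python
import ipaddress
import struct
from typing import Tuple

ADDR_TYPE_IP = 1  # hypothetical type code; the real value is not given in the text

# 32-bit address type, 32-bit IPv4 address, 16-bit port (big-endian assumed)
_IP_ELEMENT = struct.Struct(">I4sH")


def pack_ip_element(host: str, port: int) -> bytes:
    """Build an OA element holding an IPv4 address and a port number."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    raw_ip = ipaddress.IPv4Address(host).packed
    return _IP_ELEMENT.pack(ADDR_TYPE_IP, raw_ip, port)


def unpack_ip_element(data: bytes) -> Tuple[str, int]:
    """Parse an element produced by pack_ip_element."""
    addr_type, raw_ip, port = _IP_ELEMENT.unpack(data)
    if addr_type != ADDR_TYPE_IP:
        raise ValueError(f"unsupported address type {addr_type}")
    return str(ipaddress.IPv4Address(raw_ip)), port
```

Dispatching on the address type first is what would let other address formats, with different sizes, share the same element structure.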
The address semantic field is intended to encapsulate various forms of multicast and replicated communication. For example, the field could specify that all addresses in the list should be selected, that one of the addresses should be chosen at random, that k of the N addresses in the list should be used, etc. The composition and meaning of the full set of options that will be defined by Legion have not yet been identified, but provisions for extending the list with user-definable address semantics will likely be made.
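The three example semantics mentioned above (all addresses, one at random, k of N) could be modeled roughly as follows. Since the text says the actual option set has not yet been defined, the `Semantic` enum and `select_addresses` function are purely hypothetical.

```python
import random
from enum import Enum
from typing import List, Optional, Sequence


class Semantic(Enum):
    ALL = "all"          # use every address in the list
    ANY_ONE = "any_one"  # pick one address at random
    K_OF_N = "k_of_n"    # pick k distinct addresses out of the N listed


def select_addresses(addresses: Sequence[str],
                     semantic: Semantic,
                     k: int = 1,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Choose which addresses in an OA list to use under a given semantic."""
    pool = list(addresses)
    if not pool:
        raise ValueError("address list is empty")
    chooser = rng or random.Random()
    if semantic is Semantic.ALL:
        return pool
    if semantic is Semantic.ANY_ONE:
        return [chooser.choice(pool)]
    if semantic is Semantic.K_OF_N:
        if not 0 < k <= len(pool):
            raise ValueError("k must be between 1 and the number of addresses")
        return chooser.sample(pool, k)
    raise ValueError(f"unknown semantic {semantic!r}")
```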
Associations between LOIDs and OAs are called bindings, and are implemented as three-tuples. A binding consists of a LOID, an OA list, and a field that specifies the time at which the binding becomes invalid (this field may also be set to some value that indicates that the binding will never become explicitly invalid). Bindings are first-class entities that can be passed around the system and cached within objects.
Note that the third field, the binding invalidation time, is strictly an optimization hint. A binding may still be used after the timeout appears to have expired at a client; if the binding is in fact no longer valid, the result is simply a communication timeout followed by rebinding. On the other hand, a client could use the timeout information to schedule rebinding in advance in order to avoid communication delays. Thus, the fact that there is no globally accurate notion of time does not affect correctness, just performance. Please see section 12.0 for further discussion of the binding process.
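The role of the invalidation time as a hint can be sketched with a small binding cache. This is not Legion's binding mechanism: `Binding`, `BindingCache`, `send_with_rebind`, and the `resolve` and `send` callables are hypothetical stand-ins, with `resolve` taking the place of the Legion binding process and `None` standing for a binding that never becomes explicitly invalid.

```python
import time
from typing import Callable, Dict, NamedTuple, Optional, Tuple


class Binding(NamedTuple):
    loid: bytes
    oa: Tuple[str, ...]          # the object address list
    expires_at: Optional[float]  # None: never explicitly invalid


class BindingCache:
    def __init__(self, resolve: Callable[[bytes], Binding],
                 clock: Callable[[], float] = time.time) -> None:
        self._resolve = resolve  # hypothetical stand-in for the binding process
        self._clock = clock
        self._cache: Dict[bytes, Binding] = {}

    def get(self, loid: bytes) -> Binding:
        """Return the cached binding even if its timeout has passed locally."""
        binding = self._cache.get(loid)
        return binding if binding is not None else self.rebind(loid)

    def rebind(self, loid: bytes) -> Binding:
        binding = self._resolve(loid)
        self._cache[loid] = binding
        return binding

    def is_stale(self, loid: bytes) -> bool:
        """Hint only: lets a client schedule rebinding ahead of time."""
        binding = self._cache.get(loid)
        if binding is None:
            return True
        return binding.expires_at is not None and self._clock() >= binding.expires_at


def send_with_rebind(cache: BindingCache, loid: bytes,
                     send: Callable[[Tuple[str, ...]], str]) -> str:
    """Try the cached binding first; rebind and retry once on a timeout."""
    try:
        return send(cache.get(loid).oa)
    except TimeoutError:
        return send(cache.rebind(loid).oa)
```

Correctness here rests on the timeout-and-retry path alone; `is_stale` only affects how often that slower path is taken, which mirrors the point that clock disagreement costs performance rather than correctness.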