If a man will begin with certainties, he shall end in doubts;
but if he will be content to begin with doubts, he shall end in certainties.

-- Francis Bacon

Chapter 4

Fundamental Observations

We present four fundamental observations regarding Multi-Representation Modelling (MRM). These fundamental observations are the first of their kind relating to MRM; they support a framework for addressing MRM issues.

Often, characteristics of models make joint execution difficult. One model may be at a lower resolution because its entities are very abstract, whereas another may be at a higher resolution because its entities are very refined. Assumptions about objects, events, interactions and environment may be different. The fundamental processes in the model may have different algorithms because of differences in resolution. The models may progress with different systems of simulation time: discrete-event, time-stepped or continuous. Also, the time-steps at which the models progress may be vastly different.

Often, current approaches to MRM either place too many restrictions on the models or introduce new problems. For example, selective viewing is too restrictive because it requires that all representations, relationships and interactions be expressed at the highest resolution level. Aggregation-disaggregation introduces many problems, as we see in §4.1.

In this chapter, we explore problems in current approaches, and present and substantiate four fundamental observations about MRM. The fundamental observations we present here are exactly that, observations. Although they are presented informally, we present strong arguments for their existence. We arrived at these observations after analysing the causes of ineffectiveness in many models. Our observations are fundamental because any general solution to the MRM problem must take them into account. They address the general ineffectiveness of joint execution of multiple models, the necessity of maintaining consistency among concurrent representations of the same entity, the dependence among concurrent interactions and temporal consistency. These observations focus the problem of joint execution to the core problem of how to maintain consistency in the multiple representation levels of a single entity. Our framework, UNIFY, is based on these fundamental observations.

4.1 Problems with Aggregation-Disaggregation

Aggregation-disaggregation, a common approach to MRM, ensures that entities interact with one another at the same representation level by forcing one entity to be transformed to the level of the other. Typically, if a Low Resolution Entity (LRE) interacts with a High Resolution Entity (HRE), the LRE is disaggregated, i.e., decomposed into its constituents. LRE-LRE interactions would be at the LRE level. A disaggregated LRE may be aggregated so that it can interact subsequently at the LRE level. Aggregation-disaggregation causes simulations to incur considerable resource costs, thus violating R3. Problems such as chain disaggregation, network flooding and transition latency put unacceptable burdens on the resources needed to run a simulation. Moreover, aggregation-disaggregation can cause mapping inconsistencies between levels, thus violating R2 [Nat95] [NRC97]. Finally, in most variants of aggregation-disaggregation, the multiple models do not execute truly jointly since the system transitions among models as required. In the following sub-sections, we discuss problems with aggregation-disaggregation.

4.1.1 Mapping Inconsistency

Mapping inconsistency occurs when an entity undergoes a sequence of transitions between representation levels resulting in a state it could not have achieved in the simulated time spanned by that sequence. Any scheme in which entities transition between representation levels (e.g., aggregation-disaggregation) must translate attributes between levels consistently. The translation should not lead to incorrect or unintended changes in the attributes. Poor translation strategies cause discontinuities or "jumps" in the state of entities. In Figure 7, when entity L is aggregated to interact with an LRE, the positions of its constituent HREs are lost. Subsequently, when L is disaggregated to interact with an HRE, a standard algorithm or doctrine reconstructs the positions of the HREs [Clark94] [France93] [Davis93]. However, the reconstructed positions may result in "jumps" in the constituents of L. In general, mapping inconsistencies arise if the translation strategies utilise outdated, inaccurate or insufficient attribute information.
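The loss of information described above can be made concrete with a minimal sketch. It assumes, purely for illustration, that aggregation keeps only the centroid of the constituent positions and that the reconstruction doctrine is a line formation with fixed spacing; the function names, the formation and the coordinates are hypothetical and are not taken from the cited systems.

```python
Position = tuple[float, float]


def aggregate(positions: list[Position]) -> Position:
    """Collapse constituent positions to their centroid; individual offsets are lost."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n, sum(y for _, y in positions) / n)


def disaggregate(centroid: Position, count: int, spacing: float) -> list[Position]:
    """Rebuild positions with an assumed doctrine: a line formation centred on the centroid."""
    cx, cy = centroid
    start = cx - spacing * (count - 1) / 2
    return [(start + i * spacing, cy) for i in range(count)]


def jumps(before: list[Position], after: list[Position]) -> list[float]:
    """Distance each constituent appears to move although no simulated time has passed."""
    return [((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
            for (bx, by), (ax, ay) in zip(before, after)]


# Hypothetical positions of the four HREs that make up entity L.
original = [(0.0, 0.0), (10.0, 5.0), (20.0, -5.0), (30.0, 0.0)]
centroid = aggregate(original)
rebuilt = disaggregate(centroid, len(original), spacing=10.0)
displacement = jumps(original, rebuilt)
```

Any non-zero entry in `displacement` is a "jump": the constituent changes position across an aggregate-disaggregate round trip that spans no simulated time.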

4.1.2 Chain Disaggregation

Chain disaggregation occurs when a number of entities are forced to disaggregate because a disaggregate-level entity interacts with an aggregate-level entity. Consider an HRE H interacting with an LRE L. Typically, L would be disaggregated to interact with H at the disaggregate level. However, other LREs interacting with L may have to disaggregate, possibly leading to further disaggregations. Figure 8 illustrates the problem. The interaction between H and L can force all LREs to disaggregate in order to be able to interact at the same level. The forced disaggregation caused by the initial contact is called chain disaggregation or spreading disaggregation [Allen96] [Cald95b] [Petty95] [Stober95]. Chain disaggregation causes the number of simulated entities to increase rapidly. The increased cost of simulating these entities translates to increased load on processors and the network.
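The spreading behaviour can be sketched as a breadth-first traversal of the graph of interacting LREs. This is an illustrative model only: the interaction graph, the entity names and the assumption of four constituents per LRE are hypothetical.

```python
from collections import deque


def chain_disaggregation(interactions: dict[str, set[str]], first: str) -> list[str]:
    """Return the LREs forced to disaggregate, in order, once `first` disaggregates."""
    forced = [first]
    seen = {first}
    queue = deque([first])
    while queue:
        current = queue.popleft()
        # Every LRE interacting with a disaggregated LRE must disaggregate too.
        for neighbour in sorted(interactions.get(current, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                forced.append(neighbour)
                queue.append(neighbour)
    return forced


# Hypothetical scenario: H contacts L1; L1 interacts with L2, L2 with L3; L4 is isolated.
lre_interactions = {
    "L1": {"L2"},
    "L2": {"L1", "L3"},
    "L3": {"L2"},
    "L4": set(),
}
forced = chain_disaggregation(lre_interactions, "L1")

entities_per_lre = 4  # assumed number of constituents per LRE
simulated_before = len(lre_interactions)
simulated_after = (len(lre_interactions) - len(forced)) + len(forced) * entities_per_lre
```

In this sketch a single contact raises the number of simulated entities from `simulated_before` to `simulated_after`, which is the growth in processor and network load that the text describes.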

4.1.3 Transition Latency

Aggregation and disaggregation incur time overheads while performing the various steps involved when entities transition between levels. Examples of these steps are set-up, generation of disaggregate values from aggregate values and initiation of protocols to adjust disaggregate values for specific situations. Transition latency, the time taken to effect an aggregation or disaggregation, can be unacceptably high if these steps are complex [Robkin92]. High transition latencies are incompatible with real-time constraints, for example, in human-in-the-loop simulations, because they may cause perceptual or conceptual inconsistencies. A perceptual inconsistency is caused when an entity does not change position during a transition period and then suddenly undergoes a large displacement at the end of that period. A conceptual inconsistency may be caused when an entity takes so long to disaggregate in order to comply with a request made by another entity that the request becomes obsolete.

4.1.4 Thrashing

When an entity undergoes rapid and repeated transitions from one level to another, it thrashes. For example, an LRE, L, may disaggregate on commencing interactions with an HRE, H. When H moves out of range, L may revert to the aggregate level. However, H's varying proximity to L may cause L to change levels frequently, thus incurring the overheads associated with making a level change and raising the costs of simulation and consistency maintenance. Thrashing depends on the policy that triggers a change of level. Thrashing must be addressed by any MRM approach. High transition latencies compound the problems caused by thrashing because they cause some entities to spend considerable amounts of time just changing levels.
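The dependence of thrashing on the triggering policy can be illustrated with a minimal sketch. It assumes a simple policy in which L disaggregates whenever H is within a fixed range and re-aggregates otherwise; the policy, the range and the distance samples are hypothetical.

```python
def count_transitions(distances: list[float], trigger_range: float) -> int:
    """Count level changes of L under an assumed range-threshold policy."""
    level = "aggregate"
    transitions = 0
    for distance in distances:
        wanted = "disaggregate" if distance <= trigger_range else "aggregate"
        if wanted != level:
            level = wanted
            transitions += 1
    return transitions


# H hovers around the trigger range: almost every sample forces a level change.
hovering = [12, 9, 11, 8, 12, 9, 11, 10, 13]
# H approaches steadily: L changes level once.
approaching = [12, 11, 9, 8, 7]

hovering_transitions = count_transitions(hovering, trigger_range=10)
approaching_transitions = count_transitions(approaching, trigger_range=10)
```

Under this policy the hovering trajectory produces many more transitions than the steady approach, and each transition incurs the latency discussed in §4.1.3.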

4.1.5 Network Flooding

The network is projected to be a bottleneck in distributed simulations, especially when models consist of large numbers of entities [Pullen95] [Reddy95] [Hofer95]. Network resources may be strained by aggregation and disaggregation. Each entity created during disaggregation could be a sender/receiver of messages, thus increasing network traffic. Also, aggregation and disaggregation typically require the exchange of many control messages -- an overhead that must be incurred every time a change of level occurs. These messages can reduce the effective throughput of the network. Frequent changes of level and large numbers of entities may put an unacceptable burden on the network.

4.1.6 Cross-Level Interactions

In many systems, some interactions may span multiple representation levels. For example, two entities at different representation levels could engage in combat indirectly (as in long-range artillery fire). Disaggregation is not triggered because of the indirect nature of the engagement (see footnote 1). Therefore, the sender and receiver of the interaction are at different representation levels. We refer to such interactions as cross-level interactions. Since the participants in cross-level interactions are entities at different representation levels, it is difficult to reconcile the effects of such interactions. Cross-level interactions occur when requirement R1 is not satisfied.

4.1.7 Summary of Problems with Aggregation-Disaggregation

Often, problems with aggregation-disaggregation occur because designers make convenient rather than correct decisions about the joint execution of multiple models. Examples of such decisions are: permitting cross-level interactions, permitting interactions only within a playbox and pseudo-disaggregating. When a multi-model grows in terms of the number of its constituent models, the kinds of interactions that entities may receive, or the different scenarios under which the models execute, such decisions can lead to ineffective joint execution. For example, cross-level interactions are difficult to reconcile, playboxes lead to thrashing and pseudo-disaggregation leads to a condition where entities must be able to disaggregate all entities in the model.

An approach for joint execution of multiple models based on correct decisions is necessary. Such an approach will avoid the pitfalls of merely convenient decisions, and satisfy three basic requirements for MRM: multi-representation interaction, multi-representation consistency and cost-effectiveness. This approach must be based on fundamental characteristics of joint execution. In §4.2, we present four fundamental observations about MRM. These observations highlight fundamental characteristics of joint execution. In Chapter 9, we show how our framework for MRM, UNIFY, satisfies the three basic requirements for MRM and avoids the pitfalls of other approaches.

4.2 Fundamental Observations

After analysing the causes for ineffectiveness in a number of multi-models, we made four fundamental observations about the joint execution of multiple models. These observations focus on entity interactions, effects of concurrent interactions, dependencies among concurrent interactions and time-step differentials. The fundamental observations influence our choice of the techniques that are part of UNIFY: Multiple Representation Entities, Attribute Dependency Graphs and a taxonomy of interactions.

4.2.1 Fundamental Observation 1

Two entities must interact at a representation level common to both so that the semantics of their interactions are meaningful to both. Therefore, the objects and processes corresponding to each entity must be modelled at all the representation levels at which the entity can interact. When entities interact at common representation levels, they avoid cross-level interactions.

FO-1: For effective joint execution, objects or processes should be modelled at representation levels at which they can interact.

Consider the joint execution of two models with entities, EA and EB, at different representation levels LA and LB respectively, as shown in Figure 9. Essentially, FO-1 states that for most applications, in order to interact with each other, either EA must be represented at LB or EB must be represented at LA. In other words, for effective joint execution, a combination of vertical and horizontal links must be followed.

To see why this observation is true, consider a military training simulation. Here, EA may be a division of tanks being modelled in a low-resolution simulation while EB may be a single, self-contained (manned) tank simulator. Typically, division-level engagements are simulated by equations that take the relative strengths of the engaging parties into account; actual firing of weapons and destruction of individual tanks are not simulated. In contrast, individual tank engagements are simulated on the basis of actions taken by the parties involved in the engagement (e.g., the human crew of the tank). These involve simulation of detailed actions such as sighting, target acquisition, firing, detonation and damage assessment.

In general, models at different representation levels are designed for different purposes and, consequently, have different foci. What is relevant at one level may not be relevant at another, and therefore may not be modelled there. The crew members inside an individual tank simulator expect to see individual targets through their sensors. Presenting them with an aggregated view of a tank division will be ineffective (if visual fidelity of the engagement is an effectiveness criterion).

Similar incompatibilities arise in other dimensions of resolution such as time and space. Time-steps vary from nanoseconds to minutes. When two models with disparate time-steps are executed jointly, the one with the smaller time-step may interpret a lack of response from the other as inaction when in fact, the other will report its action only at the end of its larger time-step. Likewise, terrain representation may vary between models. A simple mathematical mapping function may suffice to translate terrain coordinates between systems. However, sometimes such functions do not exist or are inadequate (e.g., when one model executes in two-dimensional space while the other executes in three-dimensional space). Further, the difference in resolution (e.g., meters versus kilometers) can lead to inconsistencies similar to those observed with time-step differentials.

A technique used to resolve these incompatibilities is to provide bridges between representation levels. In the two-level case of Figure 9, a bridge is a diagonal link. Such bridges are useful only in special cases; they are not general techniques for effective joint execution of multiple models. Pseudo-disaggregation can be such a bridge. For example, a perceiver of an aggregate entity could apply a local translation function to obtain a disaggregated view of the aggregate entity. This technique works well as long as perception is the only interaction -- it fails if the perceiver also engages the perceived in combat since the perceived units do not respond to events (e.g., attack, defend, retreat). To achieve a completely realistic engagement, the perceived units must respond as if they were being modelled as individual entities themselves. Thus, while bridges may suffice for joint execution in some cases, in general, entities must be modelled at the appropriate representation levels to achieve the required effectiveness.

Interactions may occur at any level at any time. In order to satisfy FO-1, entities must either (i) maintain representations at all levels at all times, or (ii) dynamically transition to the appropriate level as required. We take the first approach. The second approach, aggregation-disaggregation, has high associated overheads, as noted in §4.1.

4.2.2 Fundamental Observation 2

The high cost of dynamic transitions between representation levels can be reduced by reducing (i) the cost associated with a single transition, and (ii) the number of transitions. The cost associated with a single transition is application-specific. Here, we focus on reducing the number of transitions. Limiting the propagation of transitions, for example, by controlling chain disaggregation, results in significant reductions in overhead. Ideally, a transition should be restricted to a single entity and not propagate at all. Restricting transitions implies that entities must be able to resolve concurrent interactions (i.e., interactions occurring within simulated periods that overlap) at multiple levels. Resolving concurrent interactions means that the effects of these interactions must be combined without compromising effectiveness.

FO-2: The effects of concurrent interactions at multiple representation levels must be combined consistently.

In Figure 10, entity EC must resolve concurrent interactions with entities EB and ED in order to limit the propagation of the transition. Concurrent interactions could be serialized, i.e., processed sequentially and atomically. This approach fails in the context of real-time interactions which must appear to take effect concurrently. Serializing the interactions removes the appearance of concurrence.

Alternatively, interactions could be processed in parallel and their results combined. Although apparently reasonable, this approach has several pitfalls as well. The subtleties of these pitfalls are best explained by an example. Consider the following scenario (Figure 11): LRE1 and LRE2 are two platoons of tanks, engaged in battle. At the same time, LRE2 is engaged by two individual tanks -- HRE1 and HRE2. The battle between LRE1 and LRE2 is simulated at the aggregate level while the battle between LRE2, HRE1 and HRE2 is simulated at the disaggregate level. During a particular time-step, LRE1 inflicts 50% attrition on LRE2. The 50% attrition may be interpreted as the destruction of two of the four tanks in LRE2. During the same time-step, HRE1 and HRE2 destroy two tanks in LRE2 (see footnote 2). How should these two results be combined? Depending on the amount of overlap in the two interactions, the final result could be a reduction in LRE2's strength by 50% (complete overlap), 75% (partial overlap) or 100% (no overlap). For the most part, this choice must be made arbitrarily and the result assumed to be realistic. Unfortunately, apparently reasonable choices may lead to an unfair fight. The no-overlap choice does not account for the case where LRE1, HRE1 and HRE2 may have fired at the same tanks in LRE2, whereas the complete overlap choice penalises any co-ordination between LRE1, HRE1 and HRE2 in picking targets from LRE2. As another example, consider a time-step during which LRE2 expends 75% of its ammunition fighting LRE1. HRE1 and HRE2 also engage LRE2 during this time-step, causing LRE2 to expend 40% of its ammunition. At the end of the time-step, LRE2 will have expended 115% of its ammunition!
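The arithmetic of the two examples above can be reproduced with a short sketch. The function name and the `overlap` parameter (the number of tanks counted as destroyed by both interactions) are introduced here for illustration only; the figures are those of the scenario.

```python
def combined_losses(tanks: int, aggregate_attrition: float,
                    disaggregate_kills: int, overlap: int) -> int:
    """Tanks lost when aggregate attrition and individual kills are merged.

    `overlap` is the number of tanks that both interactions count as destroyed.
    """
    aggregate_kills = round(tanks * aggregate_attrition)
    if not 0 <= overlap <= min(aggregate_kills, disaggregate_kills):
        raise ValueError("overlap cannot exceed either interaction's kills")
    return min(tanks, aggregate_kills + disaggregate_kills - overlap)


# LRE2 has four tanks; LRE1 inflicts 50% attrition; HRE1 and HRE2 destroy two tanks.
loss_percent = {
    "complete overlap": 100 * combined_losses(4, 0.5, 2, overlap=2) // 4,
    "partial overlap": 100 * combined_losses(4, 0.5, 2, overlap=1) // 4,
    "no overlap": 100 * combined_losses(4, 0.5, 2, overlap=0) // 4,
}

# Ammunition: each interaction is resolved as if it were the only one in the time-step.
ammunition_expended_percent = 75 + 40
```

Nothing in either interaction determines `overlap`, which is why the choice among 50%, 75% and 100% is arbitrary, and the ammunition total exceeds 100% because neither computation knows about the other.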

The problems above occur because the effects of an interaction are computed assuming that the interaction is isolated, i.e., it is the only interaction that occurs in a time-step. For some concurrent interactions, assuming they occur in isolation causes their combined effects to be computed incorrectly, leading to ineffective joint execution.

4.2.3 Fundamental Observation 3

Often, consistency problems arise during joint execution because a key property of interactions is ignored when the interactions are isolated. That property is interaction dependence -- an interaction's existence or effects depend on another interaction. Consider the more detailed view of Figure 11 shown in Figure 12. In a time-step of duration τ, LRE2 interacts with LRE1, reducing the ammunition of a constituent tank (P) by 25%. In effect, P fires at LRE1 during τ. Also, in τ, LRE2 interacts with HRE1 because P fires at HRE1. Both interactions involve the firing of a weapon by P in the same time-step. Clearly, this is physically impossible (indicated in Figure 12 by tank P having two turrets). By permitting such an outcome, the simulation permits an unfair engagement.

The problem arises because two interactions that occur at overlapping simulation times involve a common entity, thus affecting each other's outcome. The two interactions of interest, the aggregate-level interaction between LRE1 and LRE2, I1, and the disaggregate-level interaction between tank P in LRE2 and HRE1, I2, both involve tank P firing. Since P can fire only once, I1 and I2 are dependent. Therefore, the results generated by applying their effects independently are incorrect.

FO-3: Concurrent interactions may be dependent.

Interactions that overlap in (i) simulation time, and (ii) the set of interacting entities, may be dependent because they can affect the outcome of one another. For example in Figure 12, one interaction precludes the other. If two interactions that are dependent are executed independently, effectiveness will be compromised when the results of these interactions are combined.
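The two overlap conditions can be expressed as a simple test. The sketch below is illustrative only: the `Interaction` record, the `CONSTITUENTS` table (which assumes LRE2 consists of tanks P, Q, R and S) and the time values are hypothetical. The test identifies interactions that may be dependent; deciding whether they actually are requires knowledge of their semantics.

```python
from dataclasses import dataclass

# Assumed composition: LRE2 is a platoon of four tanks, one of which is P.
CONSTITUENTS = {"LRE2": {"P", "Q", "R", "S"}}


@dataclass(frozen=True)
class Interaction:
    name: str
    start: float  # simulation time at which the interaction begins
    end: float    # simulation time at which the interaction ends
    participants: frozenset


def expand(participants: frozenset) -> set:
    """Replace each aggregate by its constituents so entity sets compare across levels."""
    expanded = set()
    for entity in participants:
        expanded |= CONSTITUENTS.get(entity, {entity})
    return expanded


def may_be_dependent(a: Interaction, b: Interaction) -> bool:
    """True if the interactions overlap in simulation time and in interacting entities."""
    overlaps_in_time = a.start < b.end and b.start < a.end
    shares_entity = bool(expand(a.participants) & expand(b.participants))
    return overlaps_in_time and shares_entity


tau = 1.0
i1 = Interaction("I1", 0.0, tau, frozenset({"LRE1", "LRE2"}))    # aggregate level
i2 = Interaction("I2", 0.0, tau, frozenset({"P", "HRE1"}))       # disaggregate level
i3 = Interaction("I3", tau, 2 * tau, frozenset({"P", "HRE1"}))   # next time-step
```

Here `may_be_dependent(i1, i2)` holds because both interactions involve tank P during the same time-step, whereas `may_be_dependent(i1, i3)` does not because the time-steps do not overlap.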

4.2.4 Fundamental Observation 4

In §4.2.3, we showed that the fundamental issue underlying consistent combination of concurrent interactions is dependence among interactions. Time-step differentials aggravate the inconsistencies that arise from dependent interactions. Two interactions can be dependent if they overlap in time. The greater this overlap, the higher the potential for inconsistency.

FO-4: Time differentials may cause inconsistencies.

We elaborate on the problem of time differentials with a simple example. Let E1 and E2 be two entities that can change an attribute v. For this discussion it does not matter whether or not E1 and E2 are entities that describe the same object or process. During their time-steps, E1 and E2 send interactions that cause v to change; the changes may depend on the previous value of v. Thus, during each time-step, each entity reads v, performs some computation and writes to v.

Let the models for E1 and E2 both execute initially with time-steps of equal duration, i.e., TS(E1) = TS(E2) = τ. Furthermore, we synchronise the executions of E1 and E2 so that all time-step boundaries for these entities occur at the same time. In Figure 13, each bar represents a time-line for one of the entities. Vertical breaks in the bar denote time-step boundaries. It is simple to ensure that E1 and E2 are temporally consistent, i.e., they have the same view of v. At the end of each time-step, we reconcile the changes to v by computing some function of the effects of E1 and E2. At the start of the next time-step, both E1 and E2 read the same value of v, no matter how we resolve the concurrent changes of the previous time-step.

Now let us assume that we neglected to synchronise the time-steps of E1 and E2. The shaded areas in Figure 14 denote times when E1 and E2 are temporally inconsistent. The inconsistency arises because E1 (which lags in terms of time-steps) continues to compute a change to v based on the value read at the start of E1's time-step, whereas E2 may have changed v at the end of E2's time-step, which occurred before the end of E1's time-step. The implications of temporal inconsistency can be different for different applications. E1 may write a new value for v at the end of its time-step, thus causing E2's computation to become "stale". E1 may discard its computation and read the new value of v; however, E1 may be forced to do so at the end of every time-step, thus rendering it redundant.

Temporal inconsistency is exacerbated if the durations of E1 and E2's time-steps are different. In Figure 15, E2's time-step duration is τ/5, whereas E1's time-step duration remains τ. At the end of each of its time-steps, E2 writes to v; therefore, for most of its time-step, E1 uses outdated values of v. The increase in temporal inconsistency can be seen by the increase in the length of the shaded regions.
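The three situations discussed so far (synchronised equal time-steps, unsynchronised equal time-steps, and unequal time-steps) can be compared with a small calculation. The sketch assumes that E1 reads v at time 0, that E2 changes v at every one of its time-step boundaries, and that E1's view is outdated from E2's first write until the end of E1's time-step. The function name, the `e2_phase` parameter and the phase value τ/4 are hypothetical.

```python
from fractions import Fraction


def stale_fraction(e1_step: Fraction, e2_step: Fraction,
                   e2_phase: Fraction = Fraction(0)) -> Fraction:
    """Fraction of E1's time-step [0, e1_step] during which E1's view of v is outdated.

    E2's time-step boundaries fall at e2_phase + k * e2_step; E1 reads v at time 0.
    """
    if not 0 <= e2_phase < e2_step:
        raise ValueError("phase must lie within one E2 time-step")
    # First E2 boundary strictly after E1's read at time 0.
    first_write = e2_phase if e2_phase > 0 else e2_step
    stale = max(Fraction(0), e1_step - first_write)
    return stale / e1_step


tau = Fraction(1)
synchronised = stale_fraction(tau, tau)                      # equal, aligned time-steps
unsynchronised = stale_fraction(tau, tau, e2_phase=tau / 4)  # equal, misaligned time-steps
finer = stale_fraction(tau, tau / 5)                         # E2 steps at tau/5
```

Under these assumptions E1 is never outdated in the synchronised case, is outdated for three quarters of its time-step with the assumed phase, and is outdated for four fifths of its time-step when E2 steps at τ/5, which matches the growth of the shaded regions described above.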

If E1 and E2 have equal time-step durations, they can be temporally consistent. However, this requirement unnecessarily forces the time-step duration of E2 to be τ, or the time-step duration of E1 to be τ/5. If a difference in E1 and E2's views of v at an observation time changes the behaviour of neither E1 nor E2, then the temporal inconsistency is tolerable. Let δv be a tolerable variance in the value of v during the time-step [t0, t5] for E1 (Figure 16). At the end of each time-step [t0, t1], [t1, t2], ..., [t4, t5] for E2, if the value of v changes by less than δv in either direction, then E1 and E2 are temporally consistent with respect to v. If during all time-steps E1 and E2 are temporally consistent, then E1 and E2 execute at compatible time-steps.
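The tolerance test can be sketched as follows. The sketch assumes one particular reading of the condition: the deviation of v is measured against the value E1 read at t0, and it must stay strictly below δv at every E2 time-step boundary t1, ..., t5. The function name and the sample values are hypothetical.

```python
def temporally_consistent(v_at_boundaries: list[float], delta_v: float) -> bool:
    """Check E1's time-step [t0, t5] against E2's writes at t1..t5.

    v_at_boundaries[0] is the value E1 read at t0; the remaining entries are the
    values of v after each of E2's time-steps. Deviation is measured from the
    value at t0 (an assumed interpretation of the tolerance).
    """
    baseline = v_at_boundaries[0]
    return all(abs(v - baseline) < delta_v for v in v_at_boundaries[1:])


# Hypothetical traces of v at t0, t1, ..., t5.
small_drift = [100, 101, 99, 102, 101, 103]
large_drift = [100, 102, 104, 106, 108, 110]

within_tolerance = temporally_consistent(small_drift, delta_v=5)
beyond_tolerance = temporally_consistent(large_drift, delta_v=5)
```

If the check holds for every time-step of E1, the two entities execute at compatible time-steps in the sense defined above, even though their time-step durations differ.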

Even if time-steps are made equal, temporal inconsistency may arise if the entities do not read the same value of v at the start of each time-step. Consider Figure 17, in which some time-steps have been labelled. Suppose E1 modifies v during the time-step between t1 and t2 without reading v beforehand. In effect, E1 executes with the value of v read in the previous time-step. That value may have been changed by E2 subsequently. Therefore, during the time-step between t1 and t2, E1 and E2 may be temporally inconsistent.

While proper design of models can remedy temporal inconsistency caused by cases such as the last one, temporal inconsistency caused by the previous cases may undermine the joint execution of multiple well-designed models. When executing legacy simulations such as AWSIM/ModSAF, Eagle/BDS-D and BBS/SIMNET jointly, time-step differentials are common. Low-resolution simulations typically use equations with coefficients derived from historical data aggregated over periods ranging from several minutes to days [Karr83] [Epst85]. Hence, time-steps of several minutes to a few hours are typical for such simulations. On the other hand, high-resolution simulations such as CCTT/SIMNET tanks execute at the millisecond time-step level [Miller95]. Resolving time-step differentials may be a very difficult problem, especially for legacy systems. FO-4 indicates that we must direct future simulation efforts towards solving this problem if we are to achieve effective multi-representation modelling.

4.3 Chapter Summary

The fundamental observations highlight the basic issues that must be addressed by any general, scalable approach to multi-representation modelling (MRM). These observations are a foundation for a successful approach to effective MRM. The fundamental observations address the issue of how models may interact, how dependent concurrent interactions may cause inconsistency and why resolving time differentials is important. These observations arise from the experience of analysing many models and determining why joint execution of these models becomes ineffective.

The key to multi-representation modelling is employing a holistic approach that is designed to solve issues of consistency. In the rest of this dissertation, we present one such approach, UNIFY, based on the fundamental observations.

1. Forcing a disaggregation could lead to chain disaggregation, and is therefore undesirable.

2. Typically, platoon-level engagements are specified in terms of percentage attrition, whereas tank-level engagements are specified in number of tanks lost.