University of Virginia, Department of Computer Science
CS655: Programming Languages
Spring 2000
Project Report
CS655 Group 4
28 April 2000
Abstract
In Java, a programmer who wants to define a group of related
classes that have similar behavior but manipulate different types cannot easily
do so. In this paper, we propose the addition of type parameters to facilitate
the construction of generic classes and interfaces. Our solution allows generic
code to impose type constraints on the actual type used, so that certain
guarantees about the actual type can be preserved. We then show how a working
implementation of our design, Java ParTy, allows safe, generic Java code without
changing the JVM bytecode or incurring any additional run-time type casting
overhead.
1 Description of the Problem
In Java, a programmer who wants to define a group of related classes that have similar behavior but manipulate different types cannot easily do so. An ideal solution would allow the programmer to define a generic abstraction of the class, and then allow the class to adapt according to the type with which it is instantiated. This solution would require only one generic class to be written for a potentially large number of different instantiations, greatly reducing the amount of code that needs to be written. In addition, the ideal solution would still be able to guarantee type safety before run time, so it would not introduce any new type errors.
Unfortunately, Java does not provide any simple and efficient method to achieve this kind of genericity. Currently, a programmer who wishes to write generic code must design classes that use parameters of type Object. An argument of any reference type can be passed as an Object and manipulated by the class, but it must be cast back to its actual type when it is retrieved. This solution provides no mechanism for restricting the possible types used in the instantiated class, risks type equivalence problems, and incurs the heavy run-time overhead of type casts.
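The Object-based idiom described above can be illustrated with a minimal sketch. The classes ObjectStack and ObjectStackDemo are hypothetical names introduced only for this illustration; they are not part of ParTy or of any library discussed in this report.

```java
// Hypothetical Object-based stack: every element is stored as Object,
// so nothing restricts the types a client may push.
class ObjectStack {
    private Object[] items = new Object[8];
    private int size = 0;

    void push(Object item) {
        if (size == items.length) {
            // Grow the backing array when it is full.
            Object[] bigger = new Object[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, size);
            items = bigger;
        }
        items[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object top = items[--size];
        items[size] = null; // drop the reference so it can be collected
        return top;
    }
}

class ObjectStackDemo {
    // Returns true if retrieving a wrongly typed element fails only at run time.
    static boolean misuseFailsAtRunTime() {
        ObjectStack stack = new ObjectStack();
        stack.push("hello");
        stack.push(Integer.valueOf(42)); // accepted: any reference type is an Object
        try {
            String s = (String) stack.pop(); // compiles, but the element is an Integer
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(misuseFailsAtRunTime());
    }
}
```

The compiler accepts both the second push and the cast, so the mistake surfaces only when the cast executes. Every correct retrieval pays for the same run-time cast.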
However, Java was intended to be extendable as the need for new programming power arises [BLM97]. We believe that Java users would benefit from the additional power of generic abstractions in code, and in our project, we have designed and implemented an extension to Java that adds support for parametric polymorphism via parameterized types.
There have already been several attempts at offering support for parameterized types in Java [BLM97][OW97][BOSW98a]. Each of these previous attempts is cleverly designed to offer complete support for all flavors of parametric polymorphism. Accommodating all possible scenarios of generic abstractions is a difficult programming task, and the current extensions handle these problems quite well.
Instead of trying to provide a better version of the same complete extension, we take a different approach. Firstly, we have examined what properties of parameterized types are most valuable to Java users. We wish to answer the question: What makes parameterized types valuable to programmers? We believe that determining the usefulness of the extension is of primary importance, and we will then use this knowledge to guide our design, rather than attempting to provide full support first, and then reflect on its usefulness after the implementation is complete.
After choosing what makes parameterized types valuable to programmers, we
will finally design and implement a limited version of these features. We
aim to add only the most useful extensions to Java, and will not add complexity
to the language unless its usefulness outweighs its complexity.
2 Related Work
In our consideration of related work, we examined both the existing support for type parameterization in other languages, and the current work done on Java to support parameterized type extensions. We chose to examine the existing support in languages other than Java in order to deduce exactly what features of type parameterization were desirable ones. The languages that already support generic types generally have a large amount of code written that takes advantage of it. From examining this code, we were better able to decide which supported features could be considered useful, and which supported features were not. The languages under consideration included C++, Ada, Sather, Eiffel, Haskell and ML.
We chose to examine the currently proposed extensions so that we could better
understand the unique challenges in adding type parameterization to the Java
programming language. Although we reviewed a large amount of work in this area,
we concentrated our efforts on examining PolyJ and GJ, two separately proposed
extensions that are among the top proposals currently being considered by Sun
Microsystems [Bra99].
2.1 Parameterized Types in Other Programming Languages
2.1.1 C++
C++ supports type parameterization through the use of templates. Templates were added to C++ in 1989 with the release of C++ 2.0, and were not a part of the original language design [Str93]. The parameter list of a template definition can include both type and value information. The type information allows a particular type to be passed into the corresponding function or class so that it can be properly instantiated for that type. The value information is the same as a function value parameter, and contains a value of a known type. Although value template parameters are supported by C++ templates, they are rarely used [CD99].
Templated classes in C++ are instantiated heterogeneously, so that a copy of each class exists for each appropriate type that could be passed as a parameter to the template. This allows C++ to appear to have dynamic type binding when it actually provides static binding, and allows C++ to provide type parameterization without sacrificing code execution speed. However, it causes an increase in code size.
Templates do not impose any restrictions on what a type must provide in order for it to be an appropriate candidate as a type parameter. Thus, a templated class that attempts to use a method that a type does not provide is rejected only when the template is instantiated with that type, not when the template itself is defined. Other languages avoid this problem by including type constraints in the declaration of the templated class that specify necessary features of the candidate type.
Templates support primitive types as well as user-defined types as type parameters. In fact, a type parameter can be another templated class itself. A class nested inside another class can share the same templated type information. Also, declarations of a templated class can implicitly include type information, based on other parameters, and can avoid requiring the programmer to explicitly declare the parameterized type.
In general, C++ templates were added to provide a maximum amount of
flexibility on top of an already existing language. Although flexible, some of
the features of templates were excluded from our design decisions because we did
not consider them necessary to genericity.
2.1.2 Haskell and ML
Haskell and ML both support the Hindley-Milner type system [Mil78], which naturally lends itself to polymorphic types and generic code. In such functional languages, we find that some polymorphic types are, in a sense, strictly more general than others. For example, the type [a] is more general than [Char]. In other words, the latter type can be derived from the former by a suitable substitution for a. With regard to this generalization ordering, Haskell's type system possesses two important properties: First, every well-typed expression is guaranteed to have a unique principal type, and second, the principal type can be inferred automatically.
An expression's or function's principal type is the least general type that, intuitively, "contains all instances of the expression." For example, the principal type of head is [a]->a; the types [b]->a, a->a, or even a are too general, whereas something like [Int]->Int is too specific. The existence of unique principal types is the hallmark feature of the Hindley-Milner type system, which forms the basis of the type systems of Haskell, ML, Miranda, and several other functional languages [Kre98].
The Hindley-Milner type system requires that every expression be properly
typed, but the languages themselves do not provide methods for providing
additional type constraints. Without extra type constraints, another user of a
polymorphic function may be able to use an unsuitable type with the function,
causing unexpected behavior from that function.
2.2 Proposals for Parameterized Types in Java
2.2.1 GJ and Pizza
GJ, or Generic Java, was designed in 1998 by Gilad Bracha of JavaSoft, Martin Odersky of the University of South Australia, David Stoutamire of JavaSoft, and Philip Wadler of Bell Labs, Lucent Technologies. It was originally based on the handling of parametric types in Pizza [OW97], but uses a simpler type system, and provides greater support for backwards compatibility. All GJ code translates into normal Java virtual machine (JVM) code. The designers chose this method over modifying Java bytecodes because it preserves the safety and security of the Java platform, and would allow for forwards and backwards compatibility with previously existing Java code.
GJ programs look much like the equivalent Java programs, except they have more type information and fewer casts. The GJ compiler erases type parameters, replaces the type variables by their bounding type (typically Object), and adds the appropriate type casting. Thus, generic code written in GJ is translated into code for type Object, and casts are used to appropriately modify the code for each desired type. This method allows for homogeneous genericity, and avoids the code size "blowup" that occurs in heterogeneous implementations like the one found in C++ [BOSW98a].
The method used by GJ to support parameterized types could just as easily be
done by programmers themselves. By allowing the GJ compiler to handle type
casting instead of requiring the programmer to do it, GJ pushes the possibility
of run-time type errors into compile time, adding to type safety. However, GJ is
really a hack that does not change the underlying structure of Java to
accommodate parameterized types in any way. Because of this, GJ code is
filled with run-time type casting, usually between the desired type and type
Object. This causes a significant decrease in code execution speed.
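The translation described above can be sketched by hand. The illustration assumes a compiler that accepts GJ-style type parameters, as current Java compilers do. Cell, ErasedCell and ErasureDemo are hypothetical names, and ErasedCell only approximates the code such a translation would produce.

```java
// Source-level view: a generic cell written with GJ-style type parameters.
class Cell<T> {
    private T value;

    Cell(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

// Hand-written approximation of the translated form: the type variable T
// has been replaced by its bound, Object.
class ErasedCell {
    private Object value;

    ErasedCell(Object value) {
        this.value = value;
    }

    Object get() {
        return value;
    }
}

class ErasureDemo {
    static String viaGeneric() {
        Cell<String> c = new Cell<String>("party");
        return c.get(); // no cast appears in the source
    }

    static String viaErased() {
        ErasedCell c = new ErasedCell("party");
        return (String) c.get(); // the cast that the translation inserts at the use site
    }
}
```

Both methods return the same string. The difference is only in who writes the cast: the programmer in viaErased, the compiler in viaGeneric.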
2.2.2 PolyJ
PolyJ was designed in 1997 by Andrew C. Myers, Joseph A. Bank and Barbara Liskov of Massachusetts Institute of Technology. PolyJ is an extension to Java that supports constrained parametric polymorphism. It allows abstractions to be parameterized so a single piece of code can implement many related abstractions.
Based on previous work done in CLU [LSAS77], PolyJ provides the where clause as the mechanism for constraining parameterized types. The where clause allows the programmer to state that a class conforms to a certain constraint without explicitly declaring the relationship when the class is defined. For example, consider a hashtable class in which it is necessary to obtain a hashcode for each object inserted. A where clause allows the programmer to restrict the set of possible types to those that provide a hashcode method. This differs from other projects, such as Pizza and the Stanford project described below, which rely on the programmer to use only those types that are appropriate for the generic class. Because the power and safety of the where clause have proven effective in PolyJ as well as in CLU, we have chosen to implement this feature in our language extension.
In order to provide good performance for parameterized code in both execution
speed and code size, PolyJ makes significant changes to the bytecodes and to the
JVM. This was found to be necessary to allow shared code while keeping constant
pool information separate among instantiations of a parameterized class. This
makes the implementation more costly and the virtual machine more complicated.
Any change to the bytecodes and bytecode verifier will require all safety and
security aspects of the system to be reevaluated. Furthermore, PolyJ code will
not be backwards compatible with any normal JVM, which may discourage the current
Java programming population from using PolyJ.
2.2.3 Agesen, Freund, and Mitchell
This project [AFM97] was headed by Ole Agesen of Sun Microsystems Lab, and Stephen N. Freund and John C. Mitchell of Stanford University. It supports only the type relations implements and extends, which are used to extend a parameterized type for the Java programming language. This mechanism differs from the where clause constraint mechanism in that it cannot express constraints that involve the type parameter itself [BLM97].
Unlike other projects that change Java bytecodes, Agesen et al. insert a
preprocess step into the load phase of the JVM. By delaying instantiation until
load time, they are able to achieve an appropriate balance between language
expressiveness, run-time efficiency and compiled code size, without changing the
language definition or making major modifications to the JVM. We anticipate that
this method will be easiest for Java programmers to adopt and use effectively,
and interfere minimally with any current or future developments in Java
implementations. Because of these benefits, we have chosen to take the same
approach in the ParTy implementation. Their implementation, however, lacks a
type constraint mechanism, which means that users may not write safe, generic
code.
3 Solution: Java ParTy
3.1 Language Design Criteria
ParTy aims to add only the most useful extensions to Java, and does not add
complexity to the language unless its usefulness outweighs its complexity. We
chose to model our design by combining the most desirable features of the Agesen
et al. load-time extensions with the power and safety of the PolyJ language
design. Through this combination, we were able to provide a powerful construct
for designing generic classes and interfaces without making any major changes to
Java bytecodes or the JVM. ParTy intends to maximize programming potential
without modifying the original Java specification in any destructive way.
3.2 Language Definition: Syntax Specification
The following subsections cover all ParTy additions to Java syntax, as
specified in the Java Language Specification [GLS96].
3.2.1 Keywords
ParTy adds one new keyword to Java, where, which introduces a where clause.
Keyword: one of
    abstract   default   if           private       throw
    boolean    do        implements   protected     throws
    break      double    import       public        transient
    byte       else      instanceof   return        try
    case       extends   int          short         void
    catch      final     interface    static        volatile
    char       finally   long         super         while
    class      float     native       switch        where
    const      for       new          synchronized
    continue   goto      package      this
(see JLS, section 3.9)
3.2.2 Type Syntax
A parameterized type consists of a class or interface type C and a parameter section <T1, T2, ..., Tn>. C must be the name of a parameterized class or interface, and the types in the parameter list <T1, T2, ..., Tn> must match the number of declared parameters of C, and each actual parameter must be a subtype of the formal parameter's bound type [GJ specification].
ParTy does not support parameterized methods outside a parameterized class, so a type variable cannot be introduced by a method alone. Therefore, we append the following to the type syntax of the Java Language Specification (JLS, section 4):
Type:
    PrimitiveType
    ReferenceType
(see JLS Section 4.1)
ReferenceType:
    ClassOrInterfaceType
    ArrayType
    TypeVariable
TypeVariable:
    Identifier
ClassOrInterfaceType:
    ClassType TypeArgumentsopt
    InterfaceType TypeArgumentsopt
ClassType:
    Identifier
    ClassType . Identifier
InterfaceType:
    Identifier
    InterfaceType . Identifier
ClassOrInterface:
    Identifier
    ClassOrInterfaceType . Identifier
TypeArguments:
    < ReferenceTypeList >
ReferenceTypeList:
    PrimitiveType
    ClassOrInterfaceType
    ReferenceTypeList , ClassOrInterfaceType
3.2.2.1 Examples
LinkedList<int> A; // primitive types are permitted
Vector<String> B;
Collection<Collection<Integer>> C; // a Collection of Integer-typed Collections
3.2.3 Class and Interface Declaration Syntax
A parameterized class declaration defines a set of types with constraints. One of these types is specified for each type parameter in the declaration when the class is instantiated.
ClassDeclaration:
    ClassModifiersopt class Identifier TypeParametersopt Superopt Interfacesopt ClassBody
InterfaceDeclaration:
    InterfaceModifiersopt interface Identifier TypeParametersopt ExtendsInterfacesopt InterfaceBody
TypeParameters:
    < TypeParameterList > Constraintsopt
TypeParameterList:
    TypeVariable
    TypeParameterList , TypeVariable
Constraints:
    where ConstraintList
ConstraintList:
    ConstraintArguments
    ConstraintList , ConstraintArguments
ConstraintArguments:
    TypeVariable Methodsopt TypeBoundopt
Methods:
    { MethodHeaders }
MethodHeaders:
    MethodHeader
    MethodHeaders , MethodHeader
(see JLS Sect. 8.4 for MethodHeader)
TypeBound:
    extends ClassType
    implements InterfaceType
    extends ClassType implements InterfaceType
3.2.3.1 Examples
class simpleClass<T> {
    // class body goes here
}
public class myConstrainedClass<Type1, Type2>
    where Type1 { boolean lessThan(Type1 t) }
    extends myBaseClass implements myInterface {
    // class body goes here
}
3.2.4 Discussion
In general, ParTy adds support for type parameters within class or interface definitions, including the use of the where clause to impose additional constraints. A detailed discussion of design decisions influencing ParTy syntax follows.
We chose not to allow value information to be coupled with type information in the parameters, opposing C++-style parameterization. In our syntax, a reference type list can produce a type variable or another type parameter list followed by a type variable. Thus, it cannot produce any type identifier other than a type variable identifier. Encapsulating value information with type information adds no new power to the generic code; any value information can just as easily be passed as a functional parameter through a constructor. Rather, this encapsulation is harmful because it blurs the distinction between type and value parameters, and we prefer this distinction to be clear.
For example, this class declaration and constructor:
class notPermitted<type, value> {
    type t;
    int v;
    public notPermitted<type, value>() {
        t = type;
        v = value;
    }
}
could have just as easily been written in ParTy as:
class permitted<type> {
    type t;
    int v;
    public permitted<type>(int value) {
        t = type;
        v = value;
    }
}
and the latter version maintains a healthy separation of type and value parameterization.
We chose to allow multiple type parameters to be passed in a single instantiation of a parameterized class. In addition, we chose to allow type parameters to themselves be of a parameterized type. Syntactically, a reference type list can itself contain another reference type list. This allows for F-bounded polymorphism [CCH+ 89], where a type argument can itself be a parameterized type. This is necessary for any non-trivial application of generic classes, and is why we included this feature. One could easily imagine a large-scale application that takes advantage of passing parameterized classes as type parameters.
This recursive definition of type parameters can cause a small lexical problem with the angle brackets used to specify multiply nested type parameters. For instance, the double angle brackets in Vector<Seq<String>> may be incorrectly identified as the right-shift symbol >>. We have omitted the extra syntactic additions required to accommodate this problem for brevity. For a study of one solution to this problem, see Bracha et al. [BOSW98b].
Parameterized declarations can declare subtype relationships. For example, SpecialStack can be declared to be a subtype of Stack:
class SpecialStack<T> extends Stack<T> { … }
Such a subtype declaration is legal if the new subtype satisfies the constraints of the supertype. Note that any instantiation of T must be the same in SpecialStack<T> and Stack<T>. For example, SpecialStack<String> is a subtype of Stack<String>, but SpecialStack<String> is not a subtype of Stack<Object>, even though String is a subtype of Object.
Formally, an instantiation is never a subtype of another instantiation that differs from it in one of the actual parameters. The reason for this restriction is easy to understand. Consider the following code:
Stack<Object> b = new Stack<String>();
b.push(new Object());
Stack<String> c = (Stack<String>) b;
String x = c.pop();
If the subtype relationship were allowed, the initial assignment to b would be legal. The push(Object ob) method call would also be legal. Furthermore, the cast would succeed since b is a Stack<String> object, and therefore the assignment to x would be executed. But this assignment would cause x to refer to an Object element, which is not type correct.
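Standard Java arrays permit exactly the kind of subtyping that is forbidden here for instantiations, and they pay for it with a run-time check on every store. The following minimal sketch shows the array analogue of the Stack example above; CovarianceDemo is a hypothetical name used only for this illustration.

```java
class CovarianceDemo {
    // Returns true if storing an Object into a String[] that is viewed as
    // an Object[] is rejected at run time.
    static boolean storeIsRejected() {
        Object[] b = new String[1]; // legal: String[] is a subtype of Object[]
        try {
            b[0] = new Object();    // compiles, but the array can only hold Strings
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected());
    }
}
```

Because instantiations of a parameterized class are never subtypes of one another, the corresponding assignment is rejected at compile time and no such run-time store check is needed.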
The type arguments can optionally be followed by any number of constraint arguments. These constraints require a type variable to have an associated method that satisfies the corresponding method header defined for it. The constraint is implemented with the where keyword, which has been added to the set of Java keywords.
The information in the where clause serves to isolate the implementation of a parameterized definition from its uses (i.e., instantiations). Thus, where clauses permit separate compilation of parameterized implementations and their uses while still enforcing type constraints, which is impossible in C++ and Modula-3. For these reasons, we have chosen to support the where clause in our implementation.
In implementing where clauses, the compiler not only checks whether a type t has a particular method m declared in the where clause, but also checks the types of the arguments of m. That is, it verifies whether a hypothetical call to m would succeed.
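The kind of check described above can be approximated with the standard Java reflection API. This is only a sketch under stated assumptions: ParTy performs the check on its own representation of types rather than through reflection, the class WhereCheck and its method hasMethod are hypothetical, and the lookup below requires an exact match on the declared parameter types, which is stricter than asking whether a call would type check.

```java
import java.lang.reflect.Method;

class WhereCheck {
    // Returns true if 'type' has a public method with the given name and
    // exactly the given parameter types, and whose return type can be
    // assigned to 'returnType'.
    static boolean hasMethod(Class<?> type, String name,
                             Class<?> returnType, Class<?>... paramTypes) {
        try {
            Method m = type.getMethod(name, paramTypes);
            return returnType.isAssignableFrom(m.getReturnType());
        } catch (NoSuchMethodException e) {
            // The candidate type does not provide the required method.
            return false;
        }
    }
}
```

For a constraint such as { int compareTo(String s) }, the call WhereCheck.hasMethod(String.class, "compareTo", int.class, String.class) succeeds, while the same check against Object.class fails because Object declares no compareTo method.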
CLU allows additional constraints to be applied to individual methods, but this ability adds substantial complexity to the design and implementation, so we omit it from ParTy. The following example shows an additional constraint of the kind that we do not support:
interface SortedList<T>
    where T { boolean lt(T t); } // ALLOWED: constraint is applied to interface
{
    …
    void output(OutputStream s)
        where T { void output(OutputStream s); } // NOT ALLOWED: additional method constraint
    {…}
}
3.3 Language Definition: Semantics Specification
[parameterized-class-instantiation]
    Premise:    Check( ClassName<Type1, Type2, …, Typen> )
    Conclusion: ClassName_Type1_Type2…_Typen Id;
[parameterized-class-check]
    Premises:   ClassModifiersopt class ClassName<Type_Var1, Type_Var2, …, Type_Varn>
                    where Type_Var1 Methodsopt TypeBoundopt,
                          Type_Var2 Methodsopt TypeBoundopt, …,
                          Type_Varn Methodsopt TypeBoundopt {ClassBody}
                Check(Type1 Methodsopt TypeBoundopt)
                Check(Type2 Methodsopt TypeBoundopt)
                …
                Check(Typen Methodsopt TypeBoundopt)
    Conclusion: ClassModifiersopt class ClassName_Type1_Type2…_Typen
                    {ClassBody(Type1/Type_Var1, Type2/Type_Var2, …, Typen/Type_Varn)} (1)
                return true
[extends-implements-check]
    Premises:   Type has Method1, Type has Method2, …, Type has Methodk
                Type extends A implements B
    Conclusion: return true
[extends-check]
    Premises:   Type has Method1, Type has Method2, …, Type has Methodk
                Type extends A
    Conclusion: return true
[implements-check]
    Premises:   Type has Method1, Type has Method2, …, Type has Methodk
                Type implements B
    Conclusion: return true
Footnotes:
(1) Create a new class, where ClassBody(Type1/Type_Var1, Type2/Type_Var2, …, Typen/Type_Varn) is produced by replacing every occurrence of Type_Var1, Type_Var2, …, Type_Varn in ClassBody with Type1, Type2, …, Typen respectively.
(2) k >= 0
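The naming and substitution steps in these rules can be sketched as plain string operations. This is an illustration only: the real instantiation works on class files rather than on source text, and the class Instantiation with its methods mangledName and substitute is hypothetical. The substitution below is purely textual, so it would also rewrite identifiers that appear inside string literals or comments.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Instantiation {
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    // Builds ClassName_Type1_..._Typen from a class name and its actual types.
    static String mangledName(String className, String... actualTypes) {
        StringBuilder sb = new StringBuilder(className);
        for (String t : actualTypes) {
            sb.append('_').append(t);
        }
        return sb.toString();
    }

    // Produces ClassBody(Type1/Type_Var1, ...): every identifier that is a
    // key in 'bindings' is replaced by the bound actual type. All variables
    // are replaced in a single pass, so a replacement is never rewritten again.
    static String substitute(String classBody, Map<String, String> bindings) {
        Matcher m = IDENTIFIER.matcher(classBody);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(classBody, last, m.start()); // text between identifiers
            String id = m.group();
            out.append(bindings.getOrDefault(id, id));
            last = m.end();
        }
        out.append(classBody, last, classBody.length());
        return out.toString();
    }
}
```

For example, mangledName("Pair", "String", "Integer") yields Pair_String_Integer, and substituting T with String in "T top; void push(T item);" yields "String top; void push(String item);".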
3.4 Implementation
Figure 1 shows a schematic diagram of the Java Virtual Machine. When a running program refers to a class that has not been loaded, compiled bytecode passes through three largely independent components of the virtual machine:
(1) The Loader obtains the bytecode for the class from either a local file or a remote site;
(2) The Verifier validates the bytecode by checking that operation codes are valid, branches are to legitimate locations, methods have structurally correct signatures, and that every instruction obeys the Java type discipline;
(3) The Linker initializes static fields of the new class and may load
related classes.
Figure 1: Java Virtual Machine
Our general strategy was to compile a parameterized class into an
extended form of the class file. In addition to all the information usually
found in a class file, the extended file includes information about type
parameters and constraints. When the virtual machine attempts to load an
instantiation of a parameterized class or interface, a loader preprocess phase
transforms the parameterized class file into the particular desired
instantiation and then declares it as if it were a normal class. For example,
the contents of the class stack<T> may be used by the loader to
heterogeneously generate both stack<Integer> and
stack<String> classes if needed.
The following subsections describe our ParTy program model and the corresponding
ParTy class file format, which extends the standard Java class file format.
3.4.1 ParTy Program Model
Figure 2 illustrates the basic mode of ParTy program translation into JVM-legal class files:
Figure 2. ParTy class file translation.
(1) ParTy programs are compiled by a ParTy compiler:
3.4.2 ParTy Class File Format
A typical Java class file is a binary representation for a compiled class that can be loaded by any Java Virtual Machine. The constant pool contains strings representing all names mentioned in the class file, including class names, field and method names, type names, and so on. One convenient aspect of the class file format is that most type information generated when a class is compiled is stored in its constant pool as strings. The only exception is type information about primitive types [AFM97], which is embedded directly into the bytecode instructions.
Therefore, by extending the special strings format in the constant pool, we have created the ParTy class file format, containing parameterized types and their constraint information (see Fig. 3).
Figure 3: ParTy Class File Format extensions for parameterized classes.
Specifically:
the Utf8 string "stack" in the parameterized class file will be changed to "stack_ParTy_String".
In the load-time preprocessing phase, a parameterized class file is
transformed into a separate class for each instantiation. The algorithm for
transforming a parameterized class file into these binary representations is
described in Section 3.4.1.
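Because most type information lives in the constant pool as strings, instantiation can be pictured as a rewrite of those strings. The sketch below is a simplification under stated assumptions: the constant pool is modelled as a list of Utf8 strings, the marker "L$T;" for a type variable T inside a descriptor is a hypothetical encoding (the actual ParTy encoding is the one shown in Figure 3), and ConstantPoolRewrite is a hypothetical name. Only the renaming of "stack" to "stack_ParTy_String" is taken from the text above.

```java
import java.util.ArrayList;
import java.util.List;

class ConstantPoolRewrite {
    // Rewrites the Utf8 entries of a parameterized class for one instantiation.
    //   className          name of the parameterized class, e.g. "stack"
    //   typeVar            name of the type variable, e.g. "T"
    //   actualInternalName JVM internal name of the actual type,
    //                      e.g. "java/lang/String"
    static List<String> instantiate(List<String> utf8Entries, String className,
                                    String typeVar, String actualInternalName) {
        // Simple name of the actual type: the part after the last '/'.
        String simpleName =
                actualInternalName.substring(actualInternalName.lastIndexOf('/') + 1);
        String from = "L$" + typeVar + ";";          // hypothetical variable marker
        String to = "L" + actualInternalName + ";";  // standard JVM type descriptor
        List<String> out = new ArrayList<String>();
        for (String entry : utf8Entries) {
            if (entry.equals(className)) {
                // The class itself is renamed for this instantiation.
                out.add(className + "_ParTy_" + simpleName);
            } else {
                // Descriptors that mention the type variable get the actual type.
                out.add(entry.replace(from, to));
            }
        }
        return out;
    }
}
```

Under these assumptions, a method descriptor "(L$T;)V" in the parameterized file becomes "(Ljava/lang/String;)V" in the instantiation for String, while entries that do not mention the type variable are copied unchanged.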
3.4.3 Discussion
| Proposal | Constraints support (language extension) | Instantiation method (implementation) |
| C++ | No constraints | Heterogeneous; instantiated at compile time |
| PolyJ | Where clauses | Homogeneous; separate type info. |
| GJ/Pizza | Implements/extends | Homogeneous; shared type info. |
| ParTy | Where clauses | Heterogeneous; instantiated at load time |
Figure 4. Summary of the language extensions and implementation techniques
Several advantages can be achieved by delaying instantiation until load time. First, the bytecode verifier and interpreter remain unaffected, and thus the type correctness and other properties of Java program execution are not compromised. Second, the ParTy class file format allows for separate compilation of a parameterized class and the programs that create instances of it. Once the parameterized class file has been generated, a check against the constraints stored in that file guarantees type correctness of an instantiation. Third, load-time instantiation provides size and speed tradeoffs. The existence of only a single compiled class file for all instantiations of a parameterized class makes binary representations compact on disk and allows efficient distribution of Java classes over a network.
The heterogeneous approach to generating typed classes used in our load-time preprocessor also has some disadvantages. Compared to a homogeneous implementation, memory demand on the system increases because a new class is generated for each instantiation. The increase is proportional to the number of differently-typed instantiations of the parameterized class. We expect that the memory increase will be small relative to the memory requirements of the system as a whole. ParTy load-time instantiation of parameterized classes also increases the time spent loading classes during program execution. Nevertheless, the heterogeneous approach enables faster execution because the resulting classes often contain fewer run-time type checks, and we chose it for this reason.
Because the code blowup caused by the heterogeneous class file production
does not occur until load time, generic ParTy code does not require extra space
when compiled. This means that compiled ParTy code is equivalent in size to the
corresponding Java code. In fact, it may be smaller, as the ParTy code does not
need to contain any extra type casting. This property makes ParTy code ideal for
transmission over a network such as the Internet, where the code size is still a
major concern. The delay in code blowup allows small, efficient, generic code to
be sent quickly without penalty.
3.4.4 Experimental Implementation
We divide the implementation work into two parts: (1) a ParTy compiler, and
(2) the load-time preprocessor. So far, a prototype of the JVM load-time
preprocessor has been implemented and tested using Sun’s JDK 1.1.5 source
release. While a full compiler for parameterized classes has not yet been
implemented, this prototype is sufficient to test the effectiveness of load-time
instantiation. Figure 5 presents a detailed illustration of the ParTy class
translation in our experimental implementation.
Fig. 5. Experimental ParTy implementation parameterized class translation
into JVM bytecode.
We developed a special prototype compiler that compiles ParTy source code into ParTy class files. This compiler analyzes the ParTy source code, rewrites its parameterized class names into a syntactic form allowed by the normal Java Language Specification, creates the necessary dummy classes for type variables, and then passes the result to a normal JDK compiler. In this way, a ParTy class file is produced.
When instantiating a ParTy class at preload time, load-time preprocessor will:
Instantiation with primitive types is another implementation issue that the
current version does not address. There are two ways to implement it: use wrapper
classes, or modify the bytecode directly. Future work can add support for
primitive type arguments.
3.4.5 Details of the Load-Time Preprocessor Implementation
Our experimental load-time preprocessor works as follows:
Note: A and B are different names.
If so, class A should be parsed twice. First, class A itself should be instantiated for other users of class A. Second, during instantiation, the parameters obtained for A should be passed to class B so that B can be instantiated using these parameters.
The following is part of the ParTy-related type checking that is to be done in the special compiler:
In the case of class <S> A extends class B{…}, no additional work is needed. In the case of class <S> A extends class <T> B{…}, the declaration is of little use. The following case is common and useful:
class A<T> extends B<T> {…}
4 Evaluation
We evaluated ParTy in terms of its expressiveness and its performance. To
measure the power of expression added in a solution that provides only limited
support for type parameterization, we attempted to model several standard
classes that are often defined generically in languages that support genericity.
To measure performance, we tested the execution speed of generic ParTy code
against Java code that attempts to provide the same flexibility through casting
between the actual type and type Object. Overall, we found that ParTy is a
successful tool for designing efficient, generic code, and that our technique
improves execution speed by up to 27% over standard Java.
4.1 Qualitative Evaluation
In testing the usefulness of ParTy code extensions, our group attempted to implement several classes that are typically defined generically in other languages that support type parameterization. We were able to easily create generic versions of the pair class, linked list class, linked list iterator class, and the ordered set class, all in ParTy. The Test1 class is a simple illustrative ParTy example that uses type parameterization.
/* Test1.pjava */
///////////////////////////////////////////////////////
// Test1.pjava is a generic list class that holds an
// array of type Type1. It provides a method to sort
// the members of the class. In addition to ParTy, Test1 would
// use the where clause to specify a constraint requiring a
// comparison operation from any type used to instantiate a
// Test1 object.
import java.lang.String;

class Test1<Type1>
{
    public Type1[] sort;

    public Test1<Type1> (Type1[] x)
    {
        int len=x.length;
        sort = new Type1[len];
        for(int i=0; i<len; i++)
            sort[i]=x[i];
    }

    void sort()
    {
        Type1 temp;
        int count; // for checking whether interchanges are done or not
        int n;     // for counting number of passes
        n=0;
        count=1;
        while (n < sort.length-1 || count != 0)
        {
            count = 0;
            for(int j=0; j<sort.length-1; j++)
            {
                if (sort[j].compareTo(sort[j+1])>0)
                {
                    temp = sort[j];
                    sort[j] = sort[j+1];
                    sort[j+1] = temp;
                    count++;
                }
            }
            n++;
        }
    }
}
/* TestC.pjava */
///////////////////////////////////////////////////////////////////
// TestC is a simple example that instantiates the generic Test1
// class with particular types, and then calls member functions
// of that class. TestC is written in ParTy.
import java.lang.String;
import java.lang.Math;

class TestC {
    public static void main(String[] args) {
        int i;
        int temp;
        Int[] a = new Int[10];
        for(i=0; i<10; i++)
        {
            Int tmp = new Int();
            int j=(int)(Math.random()*100);
            tmp.Set_Val(j);
            a[i]=tmp;
        }
        Test1<Int> y = new Test1<Int>(a);
        System.out.println("For the Int Array");
        System.out.println("Before Sorting:");
        for(i=0; i<10; i++)
        {
            temp=a[i].Get_Val();
            System.out.print(temp);
            System.out.print(" ");
        }
        System.out.println(" ");
        y.sort();
        System.out.println("After Sorting:");
        for(i=0; i<10; i++)
        {
            temp=y.sort[i].Get_Val();
            System.out.print(temp);
            System.out.print(" ");
        }
        System.out.println(" ");
        System.out.println(" ");
        System.out.println("For the String Array");
        String s[] = {"apple", "cell", "wonderful", "cool", "dare", "cable", "high",
                      "TALL", "Jump"};
        System.out.println("Before Sorting:");
        for (i=0; i<s.length; i++)
        {
            System.out.print(s[i]);
            System.out.print(" ");
        }
        System.out.println(" ");
        Test1<java0lang0String> z = new Test1<java0lang0String>(s);
        z.sort();
        System.out.println("After Sorting:");
        for(i=0; i<z.sort.length; i++)
        {
            System.out.print(z.sort[i]);
            System.out.print(" ");
        }
        System.out.println(" ");
    }
}
/* Output of the User Program */
For the Int Array
Before Sorting:
84 60 59 60 32 55 2 85 61 25
After Sorting:
2 25 32 55 59 60 60 61 84 85
For the String Array
Before Sorting:
apple cell wonderful cool dare cable high TALL Jump
After Sorting:
Jump TALL apple cable cell cool dare high wonderful
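For comparison, the same generic sort can be sketched in the standard Java generics that later Java versions provide. This is not ParTy syntax and not the authors' implementation: the class name SortedCopy is hypothetical, the bound T extends Comparable<T> stands in for the where clause that the Test1 comment describes, and the loop condition is simplified to stop after the first pass that makes no swap.

```java
import java.util.Arrays;

// Analogue of Test1 in standard Java generics (not ParTy syntax).
// The bound "T extends Comparable<T>" plays the role that the report
// assigns to a where clause: it guarantees that compareTo exists.
class SortedCopy<T extends Comparable<T>> {
    final T[] sort;

    SortedCopy(T[] x) {
        // Java generics cannot write "new T[len]", so copy the argument instead.
        sort = Arrays.copyOf(x, x.length);
    }

    // Bubble sort: repeat passes until one pass makes no swap.
    void sort() {
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int j = 0; j < sort.length - 1; j++) {
                if (sort[j].compareTo(sort[j + 1]) > 0) {
                    T temp = sort[j];
                    sort[j] = sort[j + 1];
                    sort[j + 1] = temp;
                    swapped = true;
                }
            }
        }
    }
}

class SortedCopyDemo {
    public static void main(String[] args) {
        String[] s = {"apple", "cell", "wonderful", "cool", "dare",
                      "cable", "high", "TALL", "Jump"};
        SortedCopy<String> z = new SortedCopy<String>(s);
        z.sort();
        // prints "Jump TALL apple cable cell cool dare high wonderful"
        System.out.println(String.join(" ", z.sort));
    }
}
```

The uppercase words sort first because String.compareTo compares character codes, which matches the String output listed above.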
Despite the success of ParTy in writing generic code, we found that some
complicated code could not easily be written in ParTy. Because ParTy does not
support where clause constraints for individual methods of a class, a user
cannot write code that distinguishes different restrictions for different
methods. The overall set of method constraints can be used in a class constraint
declaration, but this is overly restrictive. Furthermore, a user cannot
currently write code that instantiates a parameterized class using implicit type
information; instead, all type information in ParTy must be explicit. We omitted
these options after a careful review of genericity in other languages. These
features, although powerful, were judged not worth the complexity that they
would add to the original Java language.
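To make the two omitted features concrete, the sketch below shows what a per-method constraint and an implicit instantiation look like in the standard Java generics of later Java versions. It is offered only as a point of comparison: none of it is ParTy syntax, and the names Bag, add, size, and max are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Standard Java generics, shown only as a point of comparison; none of
// this is ParTy syntax, and the names Bag, add, size, and max are hypothetical.
class Bag<T> {
    private final List<T> items = new ArrayList<T>();

    // No constraint: any T can be stored.
    void add(T item) { items.add(item); }

    int size() { return items.size(); }

    // Constraint attached to one method only: U must be comparable.
    // The caller does not name U; the compiler infers it from the argument.
    static <U extends Comparable<U>> U max(List<U> values) {
        U best = values.get(0);
        for (U v : values) {
            if (v.compareTo(best) > 0) best = v;
        }
        return best;
    }
}

class BagDemo {
    public static void main(String[] args) {
        Bag<Object> anything = new Bag<Object>();  // Object has no compareTo
        anything.add(new Object());
        System.out.println(anything.size());       // prints "1"

        List<String> words = new ArrayList<String>();
        words.add("apple");
        words.add("wonderful");
        words.add("cell");
        System.out.println(Bag.max(words));        // prints "wonderful"
    }
}
```

With only class-level constraints, as in ParTy, the comparison requirement of max would have to be imposed on every instantiation of Bag, which is the over-restriction described above.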
4.2 Quantitative Evaluation
We performed a simple benchmark to examine the performance difference
between generic ParTy code and the corresponding Java hack that programmers must
use to simulate genericity in Java. Currently, type parameterization is
simulated by creating classes that operate on Objects. Any type derived from
Object can be passed to these classes at instantiation time, and even primitive
types can be passed by using the corresponding wrapper class appropriately.
However, the type casting overhead that the current Java method requires is
high, and the benchmark reveals its cost, relative to the performance of ParTy
code. Figure 6 shows the benchmark code in ParTy and Java, and Figure 7
reports the benchmarking results.
cast.java (benchmark code for Java):

public class cast {
    public static void main(String[] args) {
        Integer myInt = new Integer(0);
        int I;
        long st, ft;
        ListNodeWithCasting IntegerNode = …
        st = System.currentTimeMillis();
        …

nocast.pjava (benchmark code for ParTy):

public class nocast {
    public static void main(String[] args) {
        Integer myInt = new Integer(0);
        int i = 0;
        long st, ft;
        st = System.currentTimeMillis();
        …
Figure 6. Benchmark code used in quantitative evaluation.
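The casting overhead that the benchmark measures can be illustrated with a minimal sketch of the Object-based style. This is a hypothetical reconstruction, not the authors' benchmark: only the class name ListNodeWithCasting and the use of System.currentTimeMillis() appear in Figure 6, and every field, method, and loop bound below is an assumption.

```java
// Hypothetical reconstruction: only the class name ListNodeWithCasting and
// the use of System.currentTimeMillis() come from Figure 6.
class ListNodeWithCasting {
    private final Object value;            // element stored as Object
    private final ListNodeWithCasting next;

    ListNodeWithCasting(Object value, ListNodeWithCasting next) {
        this.value = value;
        this.next = next;
    }

    Object getValue() { return value; }

    ListNodeWithCasting getNext() { return next; }
}

class CastDemo {
    // Sums a list of Integers; every read needs a checked downcast.
    static long sum(ListNodeWithCasting head) {
        long total = 0;
        for (ListNodeWithCasting n = head; n != null; n = n.getNext()) {
            total += ((Integer) n.getValue()).intValue(); // run-time cast per access
        }
        return total;
    }

    public static void main(String[] args) {
        ListNodeWithCasting head = null;
        for (int i = 1; i <= 1000; i++) {
            head = new ListNodeWithCasting(Integer.valueOf(i), head);
        }
        long st = System.currentTimeMillis();
        long total = sum(head);
        long ft = System.currentTimeMillis();
        System.out.println("sum = " + total);            // prints "sum = 500500"
        System.out.println("elapsed ms = " + (ft - st));
    }
}
```

A class instantiated for Integer at load time would return Integer directly from getValue, so the cast inside the loop would disappear; that removed cast is the cost the benchmark is designed to expose.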
| Java: cast.java average execution time | ParTy: nocast.pjava average execution time |
| 4527.8 milliseconds | 3279.0 milliseconds |
Figure 7. Benchmarking results.
From our results, we can conclude that ParTy code runs in up to 27.6% less
time than the corresponding Java code. This figure does not include the extra
cost of running the ParTy load-time preprocessor: because our preprocessor is
experimental, its code has not been tuned for performance and is not suitable
for performance testing. The figure also excludes the one-time cost of running
the special ParTy compiler, which does not affect run-time performance. There
is reason to believe that even after including the extra cost of ParTy
load-time instantiation, the general execution time of ParTy code will remain
lower than that of the corresponding Java code, although this must be pursued
in further research.
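The 27.6% figure follows directly from the two averages in Figure 7, as the short calculation below shows. The class and method names are hypothetical; the two timings are the only values taken from the report.

```java
import java.util.Locale;

// Recomputes the reported improvement from the two averages in Figure 7.
class Speedup {
    // Fraction of execution time saved, as a percentage.
    static double percentTimeSaved(double baselineMs, double improvedMs) {
        return 100.0 * (baselineMs - improvedMs) / baselineMs;
    }

    public static void main(String[] args) {
        double castMs = 4527.8;    // Java with casting (Figure 7)
        double noCastMs = 3279.0;  // ParTy (Figure 7)
        System.out.println(String.format(Locale.ROOT, "%.1f%% less time",
                percentTimeSaved(castMs, noCastMs)));  // prints "27.6% less time"
        System.out.println(String.format(Locale.ROOT, "%.2fx throughput",
                castMs / noCastMs));                   // prints "1.38x throughput"
    }
}
```

The percentage is measured against the Java running time; expressed as a ratio of running times, the same data correspond to a factor of about 1.38.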
Furthermore, support for primitive type instantiation of a type parameter must be added. Although this is tedious work, it has already been proven to be possible [BLM97]. Due to time constraints, this feature was omitted from the current working version of ParTy.
Lastly, we would like to integrate the load-time preprocessor into the JVM
ClassLoader. This will allow more benchmarking for comparing the overall speed
of ParTy code running in the modified JVM to the original Java code running in
the original JVM. Also, this would allow us to present ParTy as one complete
package.
We also describe some details of our particular implementation. By using a load-time preprocessor, we are able to postpone code blowup until run time, and achieve observed preliminary performance increases as high as 27% over the corresponding Java code with type casting. Most importantly, the load-time processing avoids the need to add, change, or remove any Java bytecodes, making our ParTy implementation backwards and forwards compatible with existing Java code.
We found that the constructs that ParTy provides facilitate the programming of the kinds of classes most users want to make generic. We provide one simple example, and demonstrate that our language supports the tools necessary to build other generic classes, such as a linked list.
Finally, we discuss the work that remains in ParTy to provide a fully robust
language extension. We believe that ParTy provides a useful, simple, and
efficient method of writing generic code in Java, and hope that our work will
make Java a better language.
References
[AFM97] Agesen, Ole, Freund, Stephen N., and Mitchell, John C. Adding Type Parameterization to the Java Language. In OOPSLA '97.
[Alp99] Alphonce, Carl. Introduction to Functional Programming and ML. From http://www.ugrad.cs.ubc.ca/spider/cs312/Lectures/CS312_12.html
[Bar80] Barnes, J. G. P. An Overview of Ada. From Software Practice and Experience, v. 10, 1980, pp. 851-887.
[BLM97] Bank, Joseph A., Liskov, Barbara, and Myers, Andrew C. Parameterized Types for Java. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages, Paris, France, 1997, pp. 132-145.
[BOSW98a] Bracha, Gilad, Odersky, Martin, Stoutamire, David, and Wadler, Philip. GJ: Extending the Java programming language with type parameters. Available at the GJ web site.
[BOSW98b] Bracha, Gilad, Odersky, Martin, Stoutamire, David, and Wadler, Philip. GJ Specification. Available at the GJ web site.
[BOSW98c] Bracha, Gilad, Odersky, Martin, Stoutamire, David, and Wadler, Philip. Making the future safe for the past: Adding Genericity to the Java Programming Language. In OOPSLA '98, pp. 183-200.
[Bra99] Bracha, Gilad. JSR-000014: Add Generic Types To The Java Programming Language. From the Java website, http://java.sun.com/aboutJava/communityprocess/jsr/jsr_014_gener.html.
[CC96] Cohen, Geoff A., and Chase, Jeffrey S. Automatic Program Transformation with JOIE. Department of CS, Duke University.
[CCH+89] Canning, P., Cook, W., Hill, W., Mitchell, J., and Olthoff, W. F-bounded polymorphism for object-oriented programming. In Proceedings of the Conference on Functional Programming Languages and Computer Architecture, 1989, pp. 273-280.
[CD99] Cohoon, James P. and Davidson, Jack W. C++ Program Design: An Introduction to Programming and Object-Oriented Design, 2nd Edition. Boston: McGraw-Hill, 1999.
[DGLM95] Day, M., Gruber, R., Liskov, B., and Myers, A. Subtypes vs. Where Clauses: Constraining Parametric Polymorphism. From http://www.pmg.lcs.mit.edu/, 1995.
[GLS96] Gosling, James, Joy, Bill, and Steele, Guy. The Java Language Specification 1.0. Java Series, Sun Microsystems, 1996.
[GTR99] Gifford, David and Turbak, Franklyn, with Reistad, Brian. Applied Semantics of Programming Languages. Draft, 1999.
[Kre98] Kreutzer, Wolfgang. A Gentle Introduction to Haskell. From http://www.cosc.canterbury.ac.nz/~wolfgang/122-97/haskellTut.html.
[LSAS77] Liskov, Barbara, Snyder, A., Atkinson, R., and Schaffert, C. Abstraction Mechanisms in CLU. From Communications of the ACM, 20:8, 1977, pp. 564-576.
[LMM98] Liskov, Barbara, Mathewson, Nick, and Myers, Andrew. Overview of PolyJ. Copyright Barbara Liskov 1998.
[Mil78] Milner, Robin. A Theory of Type Polymorphism in Programming. In Journal of Computer and System Sciences, Vol. 17, 1978, pp. 348-375.
[OW97] Odersky, Martin, and Wadler, Philip. Pizza into Java: Translating Theory into Practice. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages, Paris, France, 1997, pp. 146-159.
[Str93] Stroustrup, Bjarne. A History of C++: 1979-1991. Paper and talk transcript from History of Programming Languages II conference, 1993.