
overview
To test our approach, we implemented two policies for data management. The first policy specifies a pattern for disseminating data files upon initial publication among nodes on our experimental grid based on a tiered-distribution model, where each successively lower tier is composed of a greater number of sites than the tier above it, but each of these sites also has less storage capacity than sites in the tier above it. The dissemination policy assumes that all files have been initially published at the top tier site, Tier 0. Files are then disseminated to the lower tier sites such that if a file resides in a site at a given tier, it must also reside in some site in each tier above it. Also, the policy ensures that all files in a given tier are distributed equally among all sites in that tier.
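The tiered split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Drools rules used in the experiments: the site names, the file names, the function `plan_dissemination`, the use of contiguous equal-sized chunks, and the choice of the tier directly above as the transfer source are all hypothetical.

```python
from collections import Counter


def plan_dissemination(files, tiers):
    """Plan transfers that spread `files` evenly over every tier below Tier 0.

    `tiers` is a list of lists of site names, top tier first. All files are
    assumed to start on the single Tier 0 site. Returns a list of
    (source_site, destination_site, file_name) tuples.
    """
    # Site in the tier directly above that currently holds each file.
    holder_above = {name: tiers[0][0] for name in files}
    transfers = []
    for tier in tiers[1:]:
        holder_here = {}
        for index, name in enumerate(files):
            # Contiguous, equal-sized chunks: site k receives the k-th slice.
            destination = tier[index * len(tier) // len(files)]
            # Assumption: copy from the holder in the tier directly above.
            transfers.append((holder_above[name], destination, name))
            holder_here[name] = destination
        holder_above = holder_here
    return transfers


# Hypothetical names mirroring a 1-2-4 tier layout with eight hundred files.
tiers = [["tier0"],
         ["tier1-a", "tier1-b"],
         ["tier2-a", "tier2-b", "tier2-c", "tier2-d"]]
files = [f"file{n:04d}" for n in range(800)]
transfers = plan_dissemination(files, tiers)
per_site = Counter(destination for _, destination, _ in transfers)
```

Because every tier ends up holding the full file set, any file found at a lower tier is also present at some site in each tier above it, which is the constraint the policy states.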

The second policy stipulates that three copies of each data file on our experimental grid be maintained. The application obtains the list of data items and their locations from the RLS and determines the number of copies of each item. For each file, the policy then checks whether the number of copies is below three. If so, the file is replicated to other sites, subject to two constraints: no two copies of the same file may reside on the same storage element, and the destination storage element must have space for the new copy.
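A minimal sketch of this replication check, written in Python rather than as Drools rules, might look as follows. The function `plan_replication`, the node and file names, the per-node capacities, and the tie-breaking choice of the least-loaded eligible node are assumptions made for illustration; the policy itself only states the two constraints.

```python
def plan_replication(replicas, capacity, target=3):
    """Plan transfers that bring every file up to `target` copies.

    `replicas` maps a file name to the list of nodes holding it, and
    `capacity` maps a node name to the maximum number of files it may store.
    Returns a list of (source_node, destination_node, file_name) tuples.
    """
    # Current number of files stored on each node.
    load = {node: 0 for node in capacity}
    for nodes in replicas.values():
        for node in nodes:
            load[node] += 1

    transfers = []
    for name in sorted(replicas):
        holders = list(replicas[name])
        if not holders:
            continue  # nothing to copy from
        while len(holders) < target:
            # Eligible nodes: no copy of this file yet, and room for one more.
            candidates = [node for node in sorted(capacity)
                          if node not in holders and load[node] < capacity[node]]
            if not candidates:
                break  # the policy cannot be satisfied for this file
            # Assumption: prefer the least-loaded eligible node.
            destination = min(candidates, key=lambda node: load[node])
            transfers.append((holders[0], destination, name))
            holders.append(destination)
            load[destination] += 1
    return transfers


# Hypothetical four-node example in which each node can store three files.
replicas = {"f1": ["A", "B"], "f2": ["B", "C"], "f3": ["A"]}
capacity = {"A": 3, "B": 3, "C": 3, "D": 3}
plan = plan_replication(replicas, capacity)
```

A greedy choice like this one is easy to follow but is not guaranteed to find a valid placement when the remaining free space exactly equals the number of missing copies, so it should be read as a sketch of the constraints rather than of the placement strategy.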

The policy experiments were run on a single-core i686 GNU/Linux machine with 1 GB of memory, in conjunction with a cluster of eight nodes, each of which was an i686 GNU/Linux machine with 2 GB of main memory. The first machine ran the Globus RLS server and our Drools-based Policy-Driven Data Placement Service, in addition to a GridFTP client. The data files, each 1 MB in size, were replicated or distributed among the eight nodes, each of which also ran a GridFTP server.
dissemination policy
In the dissemination policy experiment, we used seven cluster nodes, which we classified into three hierarchical tiers of one, two, and four nodes, respectively. We tested this policy first using eight hundred files and then again with eight thousand files, with each file being 1 MB in size. Each time, we assumed that the eight hundred or eight thousand files had been published at the top tier node and registered in the RLS prior to the start of the experiment. We ran our Policy-Driven Placement Service and recorded timing information to measure how long it took for the dissemination policy to be enforced and the files completely disseminated. Timing measurements were taken using the Java library System.currentTimeMillis() method, which returns the time in milliseconds since January 1, 1970 UTC.

dissemination graph

At the beginning of the experiment, our Policy-Driven Data Placement Service queries the RLS to acquire the names and locations of the files to be distributed. It then encapsulates this information as facts and inserts them into the Drools rule engine, which determines which files to move and where to move them according to the policy, which is encoded in the rule engine as rules. Next, the placement service initiates third-party GridFTP transfers to disseminate the data. In doing this, the service first copies half of the total files (four hundred or four thousand) to each of the nodes in the second tier and then copies one quarter of the total files (two hundred or two thousand) to each node in the third tier to achieve complete dissemination of the data. The above graph shows the execution times of this policy with eight thousand files: it took approximately 1.2 seconds to query the RLS and approximately 8789.64 seconds to completely disseminate the data.
replication policy
In the second policy experiment, we assumed that prior to the start of the experiment there were two copies of each data file distributed on eight cluster nodes. Further, these 1 MB files existed in eight sets of one hundred files each, and each node stored two different sets, for a total of two hundred files per node. Again, we took timing measurements to record how long it took for the policy to be completely enforced and all files replicated. As before, we used the Java library System.currentTimeMillis() method, which returns the time in milliseconds since January 1, 1970 UTC.

replication graph

As in the previous experiment, this one begins with the Policy-Driven Data Placement Service querying the RLS to obtain the names of the files and the number and locations of the replicas of each file. It again encapsulates this information as facts and inserts them into the Drools rule engine, which returns the names of the files to replicate via transfer, as well as the source and destination of each transfer. As specified in the rule-encoded policy, the destination node is chosen such that it does not already have another copy of the file in question, for the purpose of reliability, and such that it also has no more than three hundred files already stored on it, to model real constraints on actual storage systems. The end result is that each file has three replicas, each stored on a different node, and each node has three hundred files. The graph shows the results of executing the Policy-Driven Data Placement Service with this policy. It took approximately 0.28 seconds to query the RLS and approximately 805.78 seconds to completely replicate all eight hundred files.
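As a rough cross-check of the two reported totals, the short calculation below divides each elapsed time by the number of file copies made. It assumes that every copy counts as one transfer and that the reported time covers all of them; it ignores any concurrency between transfers, so the figures are averages only and were not reported in the experiments themselves. The variable and function names are hypothetical.

```python
# Dissemination with eight thousand files: four thousand files go to each of
# the two second-tier nodes, two thousand to each of the four third-tier nodes.
dissemination_copies = 2 * 4000 + 4 * 2000
dissemination_seconds = 8789.64

# Replication: one additional copy of each of the eight hundred files.
replication_copies = 800
replication_seconds = 805.78


def seconds_per_copy(total_seconds, copies):
    """Average elapsed time per copied 1 MB file (assumes no overlap)."""
    return total_seconds / copies


dissemination_average = seconds_per_copy(dissemination_seconds, dissemination_copies)
replication_average = seconds_per_copy(replication_seconds, replication_copies)
```

Under these assumptions the averages come to roughly 0.55 seconds per copy for dissemination and roughly 1.01 seconds per copy for replication.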
performance test
In addition to testing these policies, we also conducted a simple test to measure the scalability of the rule engine, which is a requirement for data-intensive high-performance scientific applications. The scalability test was conducted on a single-core i686 GNU/Linux machine with 1 GB of memory. In conducting the test we kept the number of rules constant at one thousand. We increased the number of facts by powers of ten, ranging from one to one billion. The facts were randomly chosen in such a way that in each case approximately ten per cent of the facts matched the rules, so each time the number of rule firings was equal to one-tenth the number of facts. The consequence of each rule execution consisted of a simple increment operation. We inserted the facts into the working memory, upon which the rule engine performed the matching and executed the resulting activations. We repeated this test ten times for each data point and took the average. To obtain the timing measurements, we used the Linux time command line utility. Before executing the test, we precompiled the rules in order to exclude compilation from the measurements. Thus, only fact insertion, matching, and rule execution were measured.
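The shape of this test can be illustrated with a small Python stand-in. It does not use Drools or Rete matching, and it times the loop with `time.perf_counter` rather than the Linux time utility, so its timings say nothing about the rule engine. The names `run_once` and `measure`, the integer-valued facts, and the idea that rule i matches facts whose value is i are all assumptions made for the sketch.

```python
import random
import statistics
import time

RULE_COUNT = 1000               # constant number of rules, as in the test
VALUE_RANGE = 10 * RULE_COUNT   # so about ten per cent of random facts match


def run_once(fact_count, rng):
    """Insert `fact_count` random facts and fire the rule each one matches."""
    counters = [0] * RULE_COUNT  # rule i matches facts whose value is i
    firings = 0
    started = time.perf_counter()
    for _ in range(fact_count):
        fact = rng.randrange(VALUE_RANGE)
        if fact < RULE_COUNT:
            counters[fact] += 1  # the consequence is a simple increment
            firings += 1
    elapsed = time.perf_counter() - started
    return firings, elapsed


def measure(fact_count, repeats=10, seed=0):
    """Return (mean time, standard deviation of time, mean firings)."""
    rng = random.Random(seed)
    runs = [run_once(fact_count, rng) for _ in range(repeats)]
    times = [elapsed for _, elapsed in runs]
    firings = [fired for fired, _ in runs]
    return statistics.mean(times), statistics.stdev(times), statistics.mean(firings)


# Increase the number of facts by powers of ten (kept small here).
results = {10 ** exponent: measure(10 ** exponent) for exponent in range(1, 5)}
```

Each entry of `results` holds the mean and standard deviation over ten runs for one fact count, which is the form of the data points discussed with the performance graph.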

performance graph

The above graph shows the results of these performance tests. Execution with ten facts took 4.9 seconds, increasing to approximately 7.4 seconds with one hundred thousand facts. At one million facts, the execution time increased dramatically to approximately 40.8 seconds. Finally, although the Java heap size was set to the maximum possible value, at ten million facts and beyond Drools caused an out-of-memory error and did not complete execution. The graph shows the average execution time over ten runs for each data point, along with the standard deviation. In all cases but one, the standard deviation was very small. The exception was the case with one million facts, which had a relatively large standard deviation.

current as of September 2008