11.0 Replaying & debugging applications

You may need to debug or analyze a Legion application in order to improve performance, find errors, increase fault tolerance, etc. There are two command-line tools that allow you to record and replay events associated with objects started by a particular application. The legion_record tool records the application's events in a local file, and the legion_replay tool replays those events, so that you can analyze and debug the application's objects.

Before running these utilities, however, you may want to identify which class objects the application will instantiate and assign each a common_name attribute, which makes their instances easier to identify. For example, if you want to debug application Foo, which instantiates classes Bar and FooFoo, you should run:

$ legion_update_attribute /class/Bar -a \
   "common_name('Bar')"
$ legion_update_attribute /class/FooFoo -a \
   "common_name('FooFoo')"

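When an application instantiates many classes, the same tagging step can be scripted. The following is a minimal sketch that assumes only the legion_update_attribute invocation shown above. The tag_class helper and the DRY_RUN switch are hypothetical names introduced for illustration and are not part of Legion. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: tag several class objects with a common_name in one pass.
# DRY_RUN is a hypothetical switch for this script, not a Legion option.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

# tag_class is a hypothetical helper; $1 is a class name under /class.
tag_class() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "legion_update_attribute /class/$1 -a \"common_name('$1')\""
    else
        legion_update_attribute "/class/$1" -a "common_name('$1')"
    fi
}

# The class names are the ones used in the example above.
for cls in Bar FooFoo; do
    tag_class "$cls"
done
```

Setting DRY_RUN=0 in the environment makes the script call legion_update_attribute directly, which requires a working Legion environment.
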
The legion_record command executes your application in record mode so that it can later be played back with the legion_replay tool. All objects started by the application will record all relevant event activity in a local file. The utility also monitors the application and reports if it dies.

The legion_replay command is used in tandem with the legion_record tool to debug a particular application. It starts a debugging session for an object that the application started.

11.1 Sample record and replay

Here is the output from a record run of an application called AppClient. Two objects are involved: the AppClient object asks for two numbers and passes them to the TargetObject, which divides them and prints the result. The TargetObject class has already been assigned a common_name attribute of TargetObject.

The legion_record command starts the application and puts the results in a file called DebugInfo. The application hits an error in the third calculation.

$ legion_record -uf DebugInfo AppClient
Initting Legion.
About to try and create the target object.
Enter two numbers to divide: 20 5
The answer is 4
Do you want to calc. some more (0 = no, 1 = yes)? 1
Enter two numbers to divide: 16 4
The answer is 4
Do you want to calc. some more (0 = no, 1 = yes)? 1
Enter two numbers to divide: 9 0
(Recorder detected an object death. Cleaning up.)
$ 

We can then use legion_replay to see exactly what happened. First, we run it with the -list option to see a summary of the session. The output shows the debug session name, ending status, and identification, and the two objects' session numbers and final status.

$ legion_replay -uf DebugInfo -list
Debug Session Name: 679982030
Session Status: An Object Died
______________________________
     
Session Number:         135511016
Object Identification: 1.376a2f24.06.4cb9001e.0000...
Session Status: Closed

Session Number:         134735824
Object Identification: TargetObject
Session Status: Object down/died
$  

The first object, the AppClient, is identified by its LOID, but the second, the TargetObject, is identified by its common_name attribute, TargetObject.
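
Because the listing labels each object by its common_name, the session number for a given object can be looked up mechanically. The following is a rough sketch that assumes the listing has exactly the layout shown above and that the identification contains no spaces. The find_session helper is a hypothetical name, not a Legion tool, and the sample listing is copied from the output above so that the script runs without Legion installed.

```shell
#!/bin/sh
# Sketch: look up the session number for a named object in a saved listing.
# find_session is a hypothetical helper, not part of Legion.
# $1 = object identification to look for, $2 = file holding the -list output
find_session() {
    awk -v want="$1" '
        /^Session Number:/        { num = $3 }                    # remember the latest number
        /^Object Identification:/ { if ($3 == want) print num }   # print it on a match
    ' "$2"
}

# Sample listing, copied from the output above.
listing=$(mktemp)
cat > "$listing" <<'EOF'
Debug Session Name: 679982030
Session Status: An Object Died

Session Number:         135511016
Object Identification: 1.376a2f24.06.4cb9001e.0000...
Session Status: Closed

Session Number:         134735824
Object Identification: TargetObject
Session Status: Object down/died
EOF

find_session TargetObject "$listing"
rm -f "$listing"
```

With a real recording, the listing would be saved from legion_replay -uf DebugInfo -list, and the number printed by find_session is the value to pass to the -local option.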

We then run legion_replay again, this time to debug the TargetObject. Since no debugger is specified, GNU gdb is used [1]. The output shows that the TargetObject fails when asked to divide by zero.

$ legion_replay -uf DebugInfo -local 134735824
(gdb) run
Starting program: /home/localtmp/mmm2a/OPR/Cached-TargetObject-Binary-1.16 
IN LegionPersistentBufferDir::inflate_universal
About to create buffer MayI
About to create buffer InvocationStore
About to create buffer LegionLibraryState
Warning: no tty object found in legion_tty_init

Program received signal SIGFPE, Arithmetic exception.
0x804cc96 in DoDivide__FGt4LRef1Z14LegionWorkUnit (wu={value = 0xbfffecfc, 
      baseValue = 0x8051494, flags = -88 '('})
    at /home/localtmp/mmm2a/Legion/src/ServiceObjects/Debugger/TargetObject.c:196
196             Result = Parm1 / Parm2;
(gdb) p Parm2
$1 = 0
(gdb) 

1. GNU gdb 4.17.0.4 with Linux/x86 hardware watchpoint and FPU support
Copyright 1998 Free Software Foundation, Inc.
This GDB was configured as "i386-redhat-linux"
