11.0 Replaying & debugging applications

You may need to debug or analyze a Legion application in order to improve performance, find errors, increase fault tolerance, etc. There are two command-line tools that allow you to record and replay events associated with objects started by a particular application. Use legion_record to record the application's events in a local file and legion_replay to replay those events, so that you can analyze and debug the application's objects.

The legion_record utility executes your application in record mode so that it can later be played back with legion_replay. All objects started by the application will record all relevant event activity in a local file. This utility also monitors the application and reports if it dies.

The legion_replay utility is used in tandem with legion_record to debug a particular application. It starts a debugging session for an object started by the application.

Before running these utilities, however, you may want to identify which class objects will be instantiated by the application and assign each a common_name attribute, which makes their instances easier to identify. For example, if you want to debug application Foo, which instantiates classes Bar and FooFoo, you should run:

$ legion_update_attributes /class/Bar -a \
   "common_name('Bar')"
$ legion_update_attributes /class/FooFoo -a \
   "common_name('FooFoo')"
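
When an application instantiates many classes, the same command can be generated for each one. The following is a minimal sketch, not part of Legion: name_classes is a hypothetical helper, and it assumes every class lives at /class/<Name> as in the example above. It only prints the commands so that they can be reviewed before anything is changed.

```shell
# Hypothetical helper (not a Legion utility): print one
# legion_update_attributes command per class name given as an argument.
# Assumption: each class object is found at /class/<Name>.
name_classes() {
    for class in "$@"; do
        printf "legion_update_attributes /class/%s -a \"common_name('%s')\"\n" \
            "$class" "$class"
    done
}

# Dry run: show the commands for the two classes used in the example.
name_classes Bar FooFoo
```

Once the printed commands look right, they can be run by piping the output to sh.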

11.1 Sample record and replay

Here is the output from a record run of an application called AppClient. Two objects are involved: the AppClient object asks for two numbers and passes them to the TargetObject, which divides the first by the second and prints the result. The TargetObject class has already been assigned a common_name attribute of TargetObject.

The legion_record command starts the application and puts the results in a file called DebugInfo. The application hits an error in the third calculation.

$ legion_record -uf DebugInfo AppClient
Initting Legion.
About to try and create the target object.
Enter two numbers to divide: 20 5
The answer is 4
Do you want to calc. some more (0 = no, 1 = yes)? 1
Enter two numbers to divide: 16 4
The answer is 4
Do you want to calc. some more (0 = no, 1 = yes)? 1
Enter two numbers to divide: 9 0
(Recorder detected an object death. Cleaning up.)
$

We can then use legion_replay to see exactly what happened. First, we run it with -list to see a summary of the session. The output shows the debug session name, ending status, and identification, and the two objects' session numbers and final status.

$ legion_replay -uf DebugInfo -list
Debug Session Name: 679982030
Session Status: An Object Died
______________________________
     
Session Number:         135511016
Object Identification: 1.376a2f24.06.4cb9001e.0000...
Session Status: Closed

Session Number:         134735824
Object Identification: TargetObject
Session Status: Object down/died
$
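
For sessions with many objects, it may help to pick out the failed ones automatically. The following is a minimal sketch, assuming the -list output keeps the field labels shown above (Session Number, Object Identification, Session Status); dead_sessions is a hypothetical helper, not a Legion utility. It prints the session number and identification of each object whose status is "Object down/died".

```shell
# Hypothetical helper (not a Legion utility): read legion_replay -list
# output on standard input and print "<session number> <identification>"
# for every object whose status is "Object down/died".
# Assumption: the field labels match the sample listing in this section.
dead_sessions() {
    awk -F': *' '
        $1 == "Session Number"        { num = $2 }
        $1 == "Object Identification" { id = $2 }
        $1 == "Session Status" && $2 == "Object down/died" { print num, id }
    '
}

# Sample input copied from the listing above. In practice, pipe in the
# output of: legion_replay -uf DebugInfo -list
dead_sessions <<'EOF'
Debug Session Name: 679982030
Session Status: An Object Died
______________________________

Session Number:         135511016
Object Identification: 1.376a2f24.06.4cb9001e.0000...
Session Status: Closed

Session Number:         134735824
Object Identification: TargetObject
Session Status: Object down/died
EOF
```

For the sample listing this prints the session number 134735824 followed by TargetObject, which is the object to hand to legion_replay for debugging.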

The first object, the AppClient, is identified by its LOID, but the second, the TargetObject, is identified by its common_name attribute, TargetObject.

We then run legion_replay again, this time to debug the TargetObject. Since no debugger is specified, GNU gdb is used (see footnote 1). The output shows that the TargetObject fails when asked to divide by zero.

$ legion_replay -uf DebugInfo -local 134735824
(gdb) run
Starting program: /home/localtmp/mmm2a/OPR/Cached-TargetObject-Binary-1.16 
IN LegionPersistentBufferDir::inflate_universal
About to create buffer MayI
About to create buffer InvocationStore
About to create buffer LegionLibraryState
Warning: no tty object found in legion_tty_init

Program received signal SIGFPE, Arithmetic exception.
0x804cc96 in DoDivide__FGt4LRef1Z14LegionWorkUnit
     (wu={value = 0xbfffecfc, baseValue = 0x8051494, flags = -88 '('})
    at /home/localtmp/mmm2a/Legion/src/ServiceObjects/Debugger/TargetObject.c:196
196             Result = Parm1 / Parm2;
(gdb) p Parm2
$1 = 0
(gdb)

1. GNU gdb 4.17.0.4 with Linux/x86 hardware watchpoint and FPU support. Copyright 1998 Free Software Foundation, Inc. This GDB was configured as "i386-redhat-linux".

