2.10 Program support

legion_ft_initialize

This script initializes the CheckpointStorage object, which is used by the MPI fault tolerance implementation. You must run it before you can use the legion_mpi_run fault-tolerant flags (page 90), but you only need to run it once.

legion_FTPd
     [-help] [-v] [-p <port number>] [-a] [-m <max connections>]
     [-t <transfer block size>] [-o <max outstanding invocations>] [-d]
     [-w] [-e] [-rt <RMI timeout>] [-dt <detail timeout>]

Starts the Legion FTP daemon from the command line. This daemon is a Legion-enabled server tool for accessing Legion context space via FTP. Once you've started the daemon, you can use standard FTP client software to view, store, and retrieve objects from your context space.

Once started, the daemon's operating parameters cannot be changed. To terminate it, send a SIGINT signal (e.g., press Ctrl-C).

Please be aware that if you use the -a flag, other users will be able to use your daemon to view and use your context space. Your password will also travel in the clear over the network.

The Legion FTP daemon will only work for systems running Legion version 1.7 or higher. It implements the minimum requirements of an FTP server as outlined in RFC-959 and RFC-1123.
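
For example, you might start the daemon on an explicit port with encryption enabled and a modest connection limit (the port number and limit below are arbitrary illustrations, not defaults):

$ legion_FTPd -p 4021 -m 5 -e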

The following options are supported:

-help

Print the usage information and default values for selected parameters and exit.

-v

Operate in verbose mode. All commands and responses will be echoed to stdout.

-p <port number>

Specify a listening port for incoming FTP connections.

-a

Allow machines other than the local host to connect to the daemon.

-m <max connections>

Specify the maximum number of simultaneous FTP connections.

-t <transfer block size>

Specify the block size in bytes for Legion file object invocations.

-o <max outstanding invocations>

Specify the maximum number of outstanding Legion remote method invocations. If the -d flag is also enabled, only <max outstanding invocations> objects will be queried for details simultaneously. This flag also restricts the number of simultaneous file transfer block requests/sends.

-d

Specify that object details be retrieved during directory listings.

-w

If the -d flag has been enabled, Unix-style world access rights (r/w/x) will be displayed during directory listings. The SITE CHMOD command will also be enabled, allowing modification of individual objects' world rights.

-e

Encrypt all Legion communications. If you set this option and do not enable the -a flag, you will have the highest level of data security.

-rt <RMI timeout>

Specify the Legion remote method invocation (RMI) timeout value in seconds.

-dt <detail timeout>

Specify the timeout value for looking up object details (use in conjunction with the -d flag).

legion_link 
     [-CC <compiler>] [-Fortran] 
     [-FC <compiler>] [-pvm] [-mpi] 
     [-L<library path>] [-l<library>] 
     [-v] [-o <output file>] 
     [-bfs] <object file list> 
     [-debug] [-help]

This command links together a set of object files and libraries to produce an executable program and automatically binds the produced executables to the Legion libraries. It is similar to the Unix ld command. It can be used with object files and libraries created from C, C++, or Fortran source code.

If any Fortran object files are linked, the -Fortran flag must be included. If you specify a Fortran compiler with the -FC flag, the -Fortran flag is automatically implied.
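
For example, a hypothetical link step for a mixed C++ and Fortran program might look like this (the object file and output names are illustrative):

$ legion_link -Fortran -o myprog main.o compute.o fsub.o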

The following options are available with this command:

-CC <compiler>

Select a C++ compiler to perform linkage. The default value is based on the default compiler used by the Legion system on your platform.

-Fortran

This flag must be included if any Fortran object files are linked.

-FC <compiler>

Select a Fortran compiler. This flag implies the -Fortran flag, but the two can be used together. Currently supported <compiler> options are:

fort

Compaq Fortran compiler (Alpha Linux only)

g77

Gnu g77

pgf77

Portland Group pgf77 compiler (Intel Linux)

pgf90

Portland Group pgf90 compiler (Intel Linux)

-pvm

Link the produced executable to the Legion PVM compatibility library.

-mpi

Link the produced executable to the Legion MPI compatibility library.

-L<library path>

Include the <library path> in the set of directories searched for libraries. There is no space between the "-L" and the library path.

-l<library>

Link against the specified library. There is no space between the "-l" and the library name.

-v

Provide a verbose output as the command is running.

-o <output path>

Specify the local path of the resulting program. The default is a.out.

-bfs

Link the produced executable to the Legion Basic Fortran Support (BFS) library.

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

legion_manage_job
     [-help] [-k] [-cp <priority>]
     [-fr] [-nc] [-q <context name>] <ticket number>

This tool manages a job started with the legion_nq (page 93) command-line tool. You must provide the job's ticket number. If the job has finished or failed, legion_manage_job will by default copy back output files, clean up temporary files, and mark the job's record for deletion. Use -nc to override this.

Use -fr to force the job to restart. Only the job's owner and the system administrator can perform these operations.
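
For example, to kill a job and skip the file clean up (the ticket number below is hypothetical):

$ legion_manage_job -k -nc 1234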

The following parameter is used with this command:

<ticket number>

The job ticket number (assigned when you ran legion_nq, page 93)

The following options are supported:

-help

Print this help message.

-k

Kill the job (job owner and system administrators only).

-cp <priority>

Change the job's priority.

-fr

Force the job to restart (job owner and system administrators only).

-nc

No file clean up.

-q <context path>

Specify a JobQueue object. Default is /etc/JobQueue.

legion_manage_queue
     [-help] [-purge]
     [-maxrs <slots>] [-q <context name>]

Use this command to manage a Legion JobQueue object. If you run it without flags, you will see all jobs that the object is currently managing.

If you are a system administrator, you can also use the -maxrs flag to set the maximum number of job slots and the -purge flag to kill all of the object's jobs.
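
For example, a system administrator might raise the number of run slots on a nondefault queue (both values below are illustrative):

$ legion_manage_queue -maxrs 20 -q /etc/MyJobQueue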

The following options are supported:

-help

Print this help message.

-purge

Kill all jobs/reset queue (system administrators only).

-maxrs <slots>

Set number of total run slots (system administrators only).

-q <context path>

Specify queue object. Default is /etc/JobQueue.

legion_mpi_debug 
     [-q] {[-c] <program instance context>} 
     [-help]

A utility program that allows the user to examine the state of MPI objects and print out their message queues. The <program instance context> is the context that is holding your program's instances. Normally it is /mpi/instances/<program_name> (/home/<user_name>/mpi/instances/<program_name> in a secure system) unless you specified otherwise when you started the program (i.e., if you used the -p flag when you ran legion_mpi_run). This command will return a list of all of the program's instances and what each one is doing. This is a handy way to debug a deadlock.
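
For example, to examine the instances of the vdelay program used elsewhere in this section and print their message queues:

$ legion_mpi_debug -q /mpi/instances/vdelay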

The following options are available with this command:

-q

List the contents of the queues.

-help

Print command syntax and exit.

If you use -q, the command will also return a list of the messages that are waiting to be received in the queues.

There are a few limitations: if an MPI object doesn't respond, the command will hang and not go on to query additional objects. An MPI object can only respond when it enters the MPI library; if it is in an endless computational loop, it will never reply. Output goes to stdout.

legion_mpi_probe
     [-help] [-debug] [-v[erbose]]
     [-all] [-list]
     [-in <context path name>] [-out <context path name>]
     [-IN <local file name>] [-OUT <local file name>]
     [-showscratch] [-stat <remote file name>]
     <pid context name>

Checks an MPI program started on a remote host with legion_mpi_run (page 85). This command resembles legion_probe_run (page 96) in that you can check on jobs and move input and output files between your local and execution hosts. It doesn't use a probe file, however, since you can contact your job through its pid context.
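
For example, to list the files in the working directories of all instances of a run whose pid context is /mpi/instances/vdelay (an illustrative path):

$ legion_mpi_probe -all -list /mpi/instances/vdelay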

The following parameter must be used:

<pid context name>

The context that contains the MPI instance LOIDs associated with the job.

The following options are supported:

-help

Print command syntax and exit.

-debug

Run in debug mode.

-v[erbose]

Run in verbose mode.

-all

Perform operation on all instances (default is instance 0).

-list

List all files in the remote program's working directory.

-in <context path name>

Context path of a Legion file object whose contents should be copied into the remote program's current working directory after execution begins. The remote file will have the same name as the <context path name> basename.

-out <context path name>

Context path of a Legion file object whose contents should be copied out of the remote program's current working directory after the program terminates. The source file in the remote working directory will have the same name as the <context path name> basename. This tool will not check to see if the job has finished, so it may copy out an incomplete file.

-IN <local file name>

Similar to the -in option but operates on a file in legion_mpi_probe's local execution environment (i.e., the file named in <local file name>).

-OUT <local file name>

Similar to the -out option but operates on a file in legion_mpi_probe's local execution environment (i.e., the file named in <local file name>).

-stat <remote file name>

Print the status of a particular file in the remote job's present working directory.

-showscratch

Print the archive context scratch space. If the run hasn't yet finished or if legion_mpi_run dies or is killed before the job finishes, this space will contain -OUT files that haven't yet been copied out.

legion_mpi_register 
     <class name> <binary local path name> <platform type> 
     [-help]

MPI implementations generally require that MPI executables reside in a given place on disk. Legion's implementation objects have a similar requirement. This command registers the executables of different architectures for use with Legion's MPI. It creates MPI-specific contexts, a class object, and an implementation for the program, and registers the name in Legion context space. If Legion security has been enabled, the newly created contexts will be placed in the user's /home/<user_name>/mpi context, unless the user is admin or not logged in. If Legion security has not been enabled, the newly created contexts will be placed in the /mpi context.

The command can be executed several times if you have compiled your program for several architectures.
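
For example, a hypothetical registration of a Linux binary under the class name vdelay might look like this:

$ legion_mpi_register vdelay /home/me/mpi/vdelay linux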

The following option is supported:

-help

Print command syntax and exit.

legion_mpi_run
     {-f <option file> [<flags>]}
     | {-n <number of processors> [<flags>] <program class>
       [<arg1> <arg2> ... <argn>]}

Starts an MPI program. The <program class> argument requires the full context path for the class (the class is created through legion_mpi_register: if you have not previously registered the program, you'll need to do so before you can run it). Parameters used for this command are:

-f <option file>

Allows users to start multiple MPI applications via an option file. All applications must be registered with Legion and have a common MPI_COMM_WORLD.

-n <number of processors>

Specify the number of processors on which the program will run.

Supported <flags> are:

-v

Verbose option (up to four -v flags can be specified for increased detail).

-h <host set context>

Specify a set of hosts on which the program will run. The context path should not resolve to a single host but to a context containing the names of one or more hosts. You'll need to create it by hand. In the default setting, Legion will pick a compatible host and try to run the object. If this fails Legion will try another compatible host.

-HF <local specification file>

Use a specification file in your local file space to schedule objects from the current execution (see page 89 for more information).

-hf <Legion specification file>

Use a specification file in context space to schedule objects from the current execution (see page 89 for more information).

-in <Legion input file>

Tells Legion to copy the contents of a Legion file object to a file in the remote program's current working directory before execution. The remote file will have the same name as the Legion file. Note that this file will only be passed to node 0 unless you use -a.

This flag can be used multiple times to indicate multiple input files. You can also use wild cards to specify groups of files.

 
-out <output file>

Name of an output file that should be copied from the remote program's current working directory into your context space after the program terminates. Note that this file will only be passed to node 0 unless you use the -a flag.

The new file object will have the same name as the original output file object unless -a is used, in which case the name will be <name>.<mpi object id>. This flag can be used multiple times. You can also use wild cards to specify groups of files.

Output files that are not found in the program's current working directory will be copied but will appear as empty files. If the program crashes, the files will be left in the program's current working directory.

 
-IN <local input file>

Tells Legion to copy a local file to the remote program's current working directory before execution. The remote file will have the same name as the local file. Note that this file will only be passed to node 0 unless you use -a.

This flag can be used multiple times to indicate multiple input files. You can also use wild cards to specify groups of files.

 
-OUT <output file>

Name of an output file that should be copied from the remote program's current working directory into your local directory after the program terminates. Note that this file will only be passed to node 0 unless you use the -a flag.

The new file will have the same name as the original output file unless -a is used, in which case the name will be <name>.<mpi object id>. This flag can be used multiple times. You can also use wild cards to specify groups of files.

Output files that are not found in the program's current working directory will be copied but will appear as empty files. If the program crashes, the files will be left in the program's current working directory.

 
-a

Pass the files in the -in/-out/-IN/-OUT parameters to all nodes. The default setting passes these files only to node 0.

Files in the -out/-OUT parameters will be copied to files that use the following conventions: <name>.<mpi object id>, where <name> is the <output file> specified in the -out/-OUT parameters.

 
-stdin <context path>

Map standard input to a Legion file object. By default, this file object will only be passed to node 0 unless you include -A.

-stdout <context path>

Map standard output to a Legion file object. By default, this file object will only be passed to node 0 unless you include -A.

-stderr <context path>

Map standard error to a Legion file object. By default, this file object will only be passed to node 0 unless you include -A.

-STDIN <local file>

Map standard input to a local file. By default, this file will only be passed to node 0 unless you include -A.

-STDOUT <local file>

Map standard output to a local file. By default, this file will only be passed to node 0 unless you include -A.

-STDERR <local file>

Map standard error to a local file. By default, this file will only be passed to node 0 unless you include -A.

-A

Pass the files in the -std*/-STD* parameters to all nodes. The default setting gives these files only to node 0.

-p <context path>

Specify a context path for the program's instance LOIDs. If the context does not exist it will be created. The default path is:

/mpi/instances/<program_name>

If security is enabled, the default is:

/home/<user_name>/mpi/instances/<program_name>

<host context name>

Run the first process (process zero) on this node.

-d <Unix path name>

Specify that all children change to the specified directory before they begin to run.

-S

Print statistics at exit.

-D <variable_name>=<value>

Set the environment variable named in <variable_name> to a specified value on all MPI processes after they have called MPI_Init(). This option may be repeated multiple times to set additional environment variables.

-help

Print command syntax and exit.

Assuming that you do not use an option file, the MPI application named in the legion_mpi_run command will be started with the given flags. For example, to run MPI program vdelay on two hosts you would enter:

 $ legion_mpi_run -n 2 /mpi/programs/vdelay

Use legion_ls to examine the running objects of your application. The /mpi/instances/<program_name> context is the default location of your application's instances:

$ legion_ls /mpi/instances/vdelay

If you run multiple versions of an application simultaneously, you can use the -p flag to specify a different context to hold your instances.

Specification file

A specification file is a text file that contains a list of hosts and the number of Legion objects that can be scheduled on those hosts. Objects from your application(s) will be scheduled on these hosts. You can use legion_make_schedule to produce this file or create it by hand. List one host per line, followed by an integer indicating the number of objects the host can create (the default is 1). List hosts by their Legion context names. A host can be listed multiple times in one file, in which case the integer values accumulate. E.g.:

/hosts/BootstrapHost	5
/hosts/slowMachine1
/hosts/bigPBSqueue 	100
/hosts/slowMachine2 	1

The file can be in your local directory or context space. Use -hf or -HF to use the file when running legion_mpi_run. All of the objects from this Legion MPI run will be created using vector creates.
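
For example, assuming the schedule above is saved in a local file named my_hosts, a hypothetical run might look like this:

$ legion_mpi_run -n 4 -HF my_hosts /mpi/programs/vdelay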

Wild cards

You can also use wild cards to work with groups of input/output files. The following wild cards can be used with -in/-out and -IN/-OUT:

*

Match 0 or more characters.

?

Match any one character.

[-]

Match any character listed between the brackets (use these to specify a range of characters).

\

Treat the character as a literal.

For example, if you wanted to identify done.1, done.2, done.3 ... done.9 as your inputs, you could use square brackets to identify them as a group:

$ legion_mpi_run -n 2 -IN done.[0-9] \
  /mpi/programs/mpiFoo

You can use wild cards on the command line or in an option file. They can only be used with file names, however, not with directories.

Option file

The option file contains a list of all MPI applications that you wish to execute as well as any necessary arguments, as they would appear on the command line. All applications must have been registered (with legion_mpi_register) and use a common MPI_COMM_WORLD. The file must list one binary (and any arguments) per line. Each line can also contain one or more legion_mpi_run flags (but not -f). For example:

-n 2 /mpi/programs/mpitest
-n 3 /mpi/programs/mpitest_c

This would start a run of five instances (two of mpitest and three of mpitest_c).

If you use an option file, Legion will ignore the -n flag, program name, and any arguments given on the command line. Any <flags> will be treated as defaults and applied to all processes executed by this command (unless otherwise specified in the option file).
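
For example, assuming the two lines above are saved in a local file named my_options, you could start both runs with:

$ legion_mpi_run -f my_options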

Fault tolerant mode

There is a set of special legion_mpi_run flags for running in a fault tolerant mode.

-ft

Turn on fault tolerance.

-s <checkpoint server>

Specifies the checkpoint server to use.

-R <checkpoint server>

Recovery mode.

-g <ping interval>

Ping interval. Default is 90 seconds.

-r <reconfiguration interval>

When a failure is detected (an object has not responded within the reconfiguration interval), restart the application from the last consistent checkpoint. Default value is 360 seconds.

These flags are used in specific combinations. You must use the -ft flag in all cases and you must use either -s or -R. The -g and -r flags are optional. Please see section 10.1.10 in the Basic User Manual for more information on running in fault tolerant mode.
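
For example, a hypothetical fault-tolerant run using a previously initialized checkpoint server (the server's context path is illustrative) might look like this:

$ legion_mpi_run -n 2 -ft -s /home/me/myCkptServer /mpi/programs/vdelay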

legion_native_mpi_config_host 
     [<wrapper>] 
     [-debug] [-help]

This command sets native MPI properties on a host. The following options are supported:

<wrapper>

Specify a wrapper script that locates mpirun on the host. The default specifies the legion_native_mpich_wrapper script, which is for an MPICH implementation.

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

legion_native_mpi_init 
     [<architecture>] 
     [-debug] [-help]

This command installs the legion_native_mpi_backend class in a native MPI host.

The following options are supported:

<architecture>

Specify an architecture for which an implementation for this class can be registered.

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

legion_native_mpi_register 
     <class name> <binary path> <architecture> 
     [-help]

Register a native MPI program with your Legion system. The example below registers the binary /myMPIprograms/charmm under the class name charmm for the linux architecture.

$ legion_native_mpi_register charmm \ 
    /myMPIprograms/charmm linux

You can register a program multiple times, perhaps with different architectures or different platforms. If you have not registered this program before, this will create a context in the current context path (the context's name will be the program's class name -- in this case, charmm) and register the name in Legion context space.

The following option is supported:

-help

Print command syntax and exit.

legion_native_mpi_run 
     [-v] [-a <architecture>] [-h <host context path>]
     [-IN <local input file>] [-OUT <local result file>]
     [-in <Legion input file>] [-out <Legion result file>]
     [-n <nodes>] [-t <minutes>] [-legion] 
     [-help] [-debug]
     <program context path> [<arg 1> <arg 2> ... <arg n>]

Start a native MPI program.

The following parameters are used for this command:

-h <host context path>

Specify the host on which the program will run. The default setting is the system's default placement, which is to pick a compatible host and try to run the object. If the host fails, the system will try another compatible host.

-v

Verbose option.

-a <architecture>

Specify the architecture to run on.

-IN <local input file>

Specify an input file that should be copied to the remote run from the local file system.

-OUT <local result file>

Specify a result file that should be copied back from the remote run to the local file system.

-in <Legion input file>

Specify an input file that should be copied to the remote run from the Legion context space.

-out <Legion result file>

Specify a result file that should be copied out from the remote run into Legion context space.

-n <nodes>

Specify the number of nodes for the remote MPI run.

-t <minutes>

Specify the amount of time requested for the remote MPI run. This option is only meaningful if the host selected to run the remote program enforces time limits for jobs.

-legion

Indicate whether the application makes Legion calls (see section 10.2.5 in the Basic User Manual).

-help

Print command syntax and exit.

-debug

Catch and print Legion exceptions.

<arg1> <arg2> ... <argn>

Arguments to be passed to the remote MPI program.

legion_nq
     [-help] [-w] [-v]
     [-a <arch> ] [-h <host context path>]
     [-in <context name>] [-out <context name>]
     [-IN <local file name> ] [-OUT <local file name>]
     [-f <options file> ] [-novrun] [-t <minutes>]
     [-n <nodes>] [-r <minutes>] [-p <priority>] [-q <context path>] [-d]
     <program class> [ <arg1> <arg2> ... <argn>]

The legion_nq command submits jobs to a Legion JobQueue object. When a job is submitted, the JobQueue stores all necessary information for handling this job and returns a "ticket" which can be used for future references to the job. This command can be used in place of legion_run (page 100) to start a job on a remote Legion host. You cannot use legion_probe_run (page 96) to probe a legion_nq job. You can use legion_manage_job (page 82) to manage the job after it has started.

You can use the -in/-out and -IN/-OUT options to copy input and output files between your local host, the remote host, and context space. You can also use an options file.

Use -p to set a job's priority in the priority queue. Priorities are used when the number of submitted jobs is larger than the number of jobs that are allowed to execute simultaneously. If so, the job with the highest priority will be allowed to execute when a slot becomes available.

The command will propagate tty information to the JobQueue object so that stdout and stderr output can be redirected to the legion_nq window.
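
For example, a hypothetical submission that stages in a local input file and lowers the job's priority might look like this (the file, class, and priority values are illustrative):

$ legion_nq -IN input.dat -p 7 /programs/myprog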

JobQueue

A JobQueue can be used to start, monitor, and clean up remote jobs. The JobQueue object is owned by the Legion system administrator (i.e., the admin user) and takes care of preparing, starting, and monitoring the job's progress. It assigns each submitted job a ticket and a job record, which holds job-related information, and puts the job in a priority queue. The system administrator sets a maximum number of simultaneously running jobs (with legion_manage_queue, page 83) and the JobQueue tries to start jobs as slots become available.

If the job starts successfully, its record is moved to the list of running jobs. If it fails, its record is moved to the list of failed jobs. The job will be retried after another job has started or failed, or after the number of job slots is increased. Waiting jobs can be given higher or lower priority by the job owners and/or the system administrator.

The JobQueue does not block and wait for a job to finish. When a job terminates (for whatever reason), its job proxy object (an object on the remote host which actually starts the job) notifies the JobQueue object.

The JobQueue also periodically pings running jobs, updates job statistics, kills jobs that have overrun their maximum allocated running time, and removes job records marked for deletion.

A Legion system can have multiple JobQueue objects, responsible for different parts of the system's resources. You can use legion_manage_queue to manage individual JobQueue objects.

Note that jobs run with the JobQueue must still be registered, with either the legion_register_program (page 99) or legion_register_runnable (page 99) command.

Parameter

The following parameter is used with this command:

<program class>

Specifies the program class of which an instance should be executed. This program class should have been previously created with the legion_register_program (page 99) or legion_register_runnable (page 99) command.

Options

The following options are supported:

-help

Print this help message.

-w

Display standard I/O from the remote program.

-v

Verbose mode.

-a <architecture>

Specify the preferred Legion platform type on which the remote program should run.

-h <host context path>

Specify the context path of the Legion host object on which the remote program should run.

-in <context path>

Copy the specified Legion file to the current working directory of the remote program.

-out <context path>

Copy the specified Legion file's contents out of the current working directory of the remote program into Legion context space. The remote object's path should be the same as the <context path>.

-IN <local file name>

Send the specified local file to the current working directory of the remote program.

-OUT <local file name>

Copy a file out of the current working directory of the remote program into the specified path in your local file system.

-f <options file>

Obtain options from the specified file. An options file can contain one or more legion_nq flags delimited by spaces, tabs, or newlines. An options file is especially useful if your program uses several -in/-out or -IN/-OUT flags. You can include any flags but not arbitrary command-line arguments.

-novrun

Disable use of the virtual architecture on virtual hosts. The remote program will run on the physical architecture for the selected host. This option is not usually appropriate for user tasks. WARNING: the registered implementation for the virtual architecture of the selected host should also be usable on the physical architecture of the selected host if this option is used (e.g., a script).

-t <minutes>

Specify a requested amount of time (in minutes). This option is only meaningful if the host selected to run the remote program enforces time limits for jobs. Otherwise, this option is not required.

-n <nodes>

If the remote program is a native parallel job (e.g., a program written to use the local vendor MPI implementation), this option selects the number of nodes that should be allocated for the job. This option is not meaningful if the host selected to run the remote program is a uniprocessor, or does not support multi-node allocations.

-r <minutes>

Specify the maximum amount of time (in minutes) that the queuing system will wait for the program to complete before attempting to restart it. Default is 0 (no restarts).

-p <priority>

Specify job priority for the queuing system. Values range from 0 to 9, 0 being the highest priority. The default is 5. Users can assign values from 5 to 9. Only system administrators can assign values 0 to 4 (after job submission) with the legion_manage_job (page 82) command.

-q <context path>

Specify a JobQueue object. Default is /etc/JobQueue.

-d

Deactivate -IN files and temporary contexts to save process slots.

<arg1> <arg2> ... <argn>

Provide arbitrary command-line arguments for the program.

legion_probe_run
     [-help] [-debug] [-v[erbose]]
     [-in <context path>] [-out <context path>]
     [-IN <local file path>] [-OUT <local file path>]
     [-showscratch] [-setscratch <context path>]
     [-pwd] [-hostname] [-list] 
     [-stat <remote file path>] [-delete <remote file name>]
     [-statjob] [-signal <number>] [-kill]
     {-l <probe LOID> | [-p[robe]] <local file path>}

This command checks a program started on a remote host with legion_run (page 100). It can also pass input files to or pick up output files from the remote host, check individual files associated with the job, and clean up the remote host space. It can be used with blocking or nonblocking runs, although it is especially useful with nonblocking runs.

You must have used the -p option when starting legion_run in order to use this command: this option creates a local file, called a probe file, that can be used to contact your remote job. If you do not have a probe file for your job, legion_probe_run will be unable to locate your job.

Note that if you start a nonblocking run, the remote host will hold the results for six hours after the job terminates. If you do nothing, the host will tar up the job's current working directory and move it into your context scratch space. If you run legion_probe_run at any point (whether to retrieve output files or just to check the job's status) during those six hours, the clock restarts and the remote host will wait for another six hours before tarring and moving the job's directory. If you do not use the -kill option to clean up the job's directory, it will eventually be tarred and moved.
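
For example, assuming a probe file my.probe was created with legion_run -p, the following checks the job's status and then retrieves a hypothetical output file:

$ legion_probe_run -statjob -OUT results.dat -p my.probe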

One of the following parameters must be used:

-p[robe] <local file name>

The name of the local probe file associated with the desired job.

-l <probe LOID>

The probe's LOID (may not be supported in all releases).

The following options are supported. Options may be repeated. The order of the specified options (except -v[erbose], -debug, -help, and -p[robe]) determines the order in which tasks will be performed.

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

-v[erbose]

Run command in verbose mode.

-in <context path name>

Context path of a Legion file object whose contents should be copied into the remote program's current working directory.

The remote file will have the same name as the <context path name> basename.

 
-out <context path name>

Context path of a Legion file object whose contents should be copied out of the remote program's current working directory. If the file doesn't exist, you will get an error message.

The source file in the remote working directory will have the same name as the <context path name> basename. Output files that are not found in the program's current working directory are skipped.

 
-IN <local file name>

Similar to the -in option, but operates on a file in legion_probe_run's local execution environment (i.e., the file named in <local file name>).

-OUT <local file name>

Similar to the -out option, but operates on a file in legion_probe_run's local execution environment (i.e., the file named in <local file name>).

-setscratch <context path>

Specify which part of your context space should be used as scratch space for this remote job. Default is /tmp.

-showscratch

Print the currently set context scratch space. This will be the most recently specified space.

-pwd

Print the remote job's present working directory. This is useful if you set this directory with the -d option when you ran legion_run.

-hostname

Print the remote host's DNS name.

-list

Print a list of files in the remote job's current working directory.

-stat <remote file name>

Print the status of a particular file in the remote job's current working directory.

-delete <remote file name>

Delete the specified file from the remote working directory.

-statjob

Print a summary of the remote job's status. Possible values are "Running", "Error", and "Done".

-signal <number>

Send a signal to the remote job (see kill -l for a list of signal numbers).

-kill

Kill the remote job (if it hasn't yet terminated) and clean up its remote current working directory. Please note that if you have not yet retrieved output files, they will be lost.

If you started legion_run in nonblocking mode and you do not clean up the remote directory after the job finishes, the remote host will wait six hours and then tar and move the entire directory to your context scratch space.

If you started legion_run in blocking mode and you then kill the remote job, legion_run will hang and must be killed by hand.

legion_pvm_register 
     <class path name> <binary local path name> <platform type> 
     [-help]

Registers a PVM task implementation. This setup is not necessary for tasks that will only be started from the command line (tasks that will not be "spawned"). Given the class named in <class path name>, the binary named in <binary local path name>, and the architecture type (currently Linux, Solaris, or SGI), this command creates an implementation object that can then be used by the Legion PVM Library.

Once you've registered an application, you can run it. You do not need a special command to run a registered PVM program. If necessary, you can examine Legion context space with either

$ legion_ls /pvm

to list the Tids of running PVM tasks, or

$ legion_ls /pvm/tasks

to list the registered task classes. You can also use Legion class utilities to examine the class state (e.g., legion_list_instances [page 17]).

For example, to register a Linux binary named matmult in Legion, enter:

$ legion_pvm_register /pvm/tasks/matmult matmult linux

The following option is supported:

-help

Print command syntax and exit.

legion_register_program 
     <program class> <executable path> <legion arch> 
     [-debug] [-help]

This command allows users to register an independent (i.e., not linked to the Legion libraries) executable program, specified by the <executable path> argument, and make the program available for use within the Legion system. The registered program is associated with a Legion class object, named in <program class>; if the program was not previously registered, the class object and its context path will be created. The registered program will execute only on hosts of the architecture specified in <legion arch>.

Programs registered through legion_register_program can be executed with the legion_run command (page 100). See also legion_register_runnable (below) for information about registering Legion-linked programs.
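
For example, a hypothetical registration of a local script for Linux hosts might look like this (the class and file names are illustrative):

$ legion_register_program /programs/myprog ./myscript.sh linux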

The following parameters are used with this command:

<program class>

The Legion context path name of the class with which the registered program should be associated.

<executable path>

The local file path of the executable program to register. This can be any program that could be run from the shell command prompt, including scripts and binary executables generated by any programming language.

<legion arch>

The platform type on which the program should be executed.

The following options are supported:

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

For more information please see Independent programs in the Basic User Manual.

legion_register_runnable
     <program class> <executable path> <legion arch> 
     [-debug] [-help]

The legion_register_runnable command is similar to the legion_register_program command in that it allows programs to be registered for execution through the legion_run utility. However, whereas the legion_register_program tool is used to register independent programs, legion_register_runnable is used to register Legion-linked programs: programs that are linked against the Legion libraries and export the "runnable" object interface.

The following parameters are used with this command:

<program class>

The Legion context space path of the class with which the registered program should be associated.

<executable path>

The local file path of the executable program to register. This program should be a valid Legion object implementation that has been linked with the Legion library, and that exports the Legion "runnable" interface.

<legion arch>

The platform type on which the program should be executed.

The following options are supported:

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

For more information please see Legion-linked programs in the Basic User Manual.

legion_run 
     [-help] [-debug] [-v[erbose]] [-w] [-a <arch>] [-h <host>]
     [-block] [-non[-]block] [-d[ir] <dir>] [-D <var=value>]
     [-in <contextfile>] [-out <contextfile>]
     [-IN <localfile>] [-OUT <localfile>]
     [-stdin <localfile>] [-stdout <localfile>] [-stderr <localfile>]
     [-novrun] [-p[robe] <localfile>] [-setscratch <context>]
     [-meta <option=value>]
     [-f <options file>] <program class> [<arg1> ... <argn>]

The legion_run command executes a single instance of a program associated with the program class specified in the <program class> argument. Legion will randomly select a host to execute the program (observing the restriction that only hosts with an acceptable architecture may be selected). Arbitrary command-line arguments may be specified for the remote program.

Any number of input and output files may be specified for a single execution of legion_run (i.e., the -in/-out and -IN/-OUT options can be repeated).

Please note that the -t and -n flags are deprecated as of version 1.8. Instead, use -meta minutes=<num> in place of -t and -meta nodes=<num> in place of -n.
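
For example, a hypothetical nonblocking run that stages one file in each direction and saves a probe file for later use with legion_probe_run might look like this (all names are illustrative):

$ legion_run -nonblock -IN input.dat -OUT results.dat \
    -p my.probe /programs/myprog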

The following parameters are used with this command:

<program class>

Specifies the program class of which an instance should be executed. This program class should have been previously created with legion_register_program (page 99) or legion_register_runnable (page 99).

The following optional parameters are supported:

-debug

Catch and print Legion exceptions.

-help

Print command syntax and exit.

-v[erbose]

Run command in verbose mode.

-w

Specifies that the set tty object should be used. If no tty object has been set, the flag will be ignored. If you have not created and set a tty object for your current window, you will not be able to see the command's output and an error message will appear. Please see About Legion tty objects in the Basic User Manual for more information.

-a <architecture>

Allows users to specify what kind of architecture the program should be executed on.

-h <host context path>

Specify a remote host to run the program.

-non[-]block

Specify that the legion_run command be nonblocking. That is, once it starts your program on the remote host, it will exit and free up your command line. This command's default setting is to block.

-block

Specify that the legion_run command block. This is the default setting: when you run it, the command will continue to run at the command line until the remote job is finished.

-d[ir] <remote host directory>

Specify that the program run in a specified directory on the remote host.

-D <var=value>

Set the environment variable <var> to <value> before the program runs.

-in <context path name>

Copy the specified Legion file object into the remote program's current working directory before execution begins. The remote file will have the same name as the <context path name> basename.

-out <context path name>

Copy the specified Legion file object out of the remote program's current working directory after the program terminates. The source file in the remote working directory will have the same name as the <context path name> basename. Output files are copied out regardless of the cause of program termination, so partial results will be available if the program crashes. Output files not found in the program's current working directory will be skipped.

-IN <local file name>

Similar to the -in option, but operates on a file in the local execution environment of legion_run (i.e., the file named in <local file name>).

-OUT <local file name>

Similar to the -out option, but operates on a file in the local execution environment of legion_run (i.e., the file named in <local file name>).

-stdin <local file name>

Specify a local file from which the remote job gets its standard input.

-stdout <local file name>

Specify a local file to which the remote job writes its standard output.

-stderr <local file name>

Specify a local file to which the remote job writes its standard error.

-novrun

Do not run on a virtual host. I.e., disable use of the "virtual architecture" on virtual hosts. The remote program will run on the physical architecture for the selected host. This option is usually not appropriate for user tasks. Warning: the registered implementation for the virtual architecture of the selected host should be usable on the physical architecture of the selected host (e.g., a script).

-meta <option=value>

Specify metaoptions for the run. These options may be meaningful only for certain combinations of jobs and machines. Listed below is the set of supported options and their meanings, followed by an example.

nodes=<min>[,<max>]

Number of nodes for job (default is 1).

tasks_per_node=<num>

Number of tasks per node (default is 1). Total number of tasks is product of allocated nodes and tasks_per_node.

queue=<name>

Name of a queue on host.

priority=<str>

Priority to run under.

minutes=<num>

Duration of run.
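
For example, a hypothetical run requesting two nodes for thirty minutes might look like this (the values and class name are illustrative):

$ legion_run -meta nodes=2 -meta minutes=30 /programs/myprog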

-p[robe] <local file name>

Store the remote job's tracking name in a file on your local host. You must use this option if you wish to use the legion_probe_run tool.

-setscratch <context path>

Specify a portion of your context space to be used as scratch space.

-f <option file>

Allows users to specify options for running legion_run in a separate file rather than listing them on the command line. This is useful for programs that make extensive use of the -in/-out or -IN/-OUT options. The file can contain one or more of the legion_run flags delimited by spaces, tabs, or blank lines. The program class name and any arbitrary command-line arguments may not be included in this file. No other information should be included.

<arg1> <arg2> ... <argn>

Allows users to provide arbitrary command-line arguments for the program.

The possible architectures that can be used with the -a flag are limited. At the moment they are:

Architecture    Corresponds to                         Comments

solaris         Sun workstations running Solaris 5.x
sgi[n32|n64]    SGI workstations running IRIX 6.5
linux           x86 running Red Hat 6.x Linux
alpha_linux     DEC Alphas running Red Hat 6.x Linux
alpha_DEC       DEC Alphas running OSF1 v4
rs6000          IBM RS/6000s running AIX 4.2.1
hppa_hpux       HPUX 11
t90             Cray T90s running Unicos 10.x          virtual hosts only (see page 69 in
                                                       the System Administrator Manual)
t3e             Cray T3E running Unicos/mk 2.x         virtual hosts only (see page 69 in
                                                       the System Administrator Manual)

For more information on this command please see Running a Legion application in the Basic User Manual.

legion_run_multi 
     [-help] [-debug] [-v[erbose]] [-z[ero]] [-r[estart]]
     [-e[xec] command] [-x <exceptfile>] [-a[rch] <architecture>] 
     [-d[ir] <dir>] [-D <var=value>] 
     {-n <number of processors> | -s[chedule] <schedfile>} 
     -f[ile] <specfile> [-t[ime] <timefile>] [--] <program class> 
     [<arg1> <arg2> ... <argn>] 

This command runs a previously registered serial program, named in <program class>, once for each matching set of input files, using a simple specification file, named in <specfile>, to describe the names of the expected input and output files. You must provide the number of processors and/or a schedule file.

Before you run this command, you must have registered the program name and created a specification file. The specification file format is:

 keyword	filename	pattern

You must fill in all three fields on each line, one file per line.
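
For example, assuming a specification file saved locally as my.spec, a hypothetical run of up to five simultaneous processes might look like this:

$ legion_run_multi -n 5 -f my.spec /programs/Foo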

Specification file

The first field, the keyword, provides information about the file named in the second field. Possible keywords are IN/in, OUT/out, CONSTANT/constant, and stdin/stdout/stderr. The case determines the file's location (except for std*):

constant/in/out                        Legion context space
CONSTANT/IN/OUT/stdin/stdout/stderr    local file space

The CONSTANT/constant keywords indicate identical input for multiple runs (e.g., input that is used for each run, such as a password). The std* keywords refer to local files that stdin, stdout, and stderr use. A separate stdout and stderr are created for each run. If you do not indicate a stdout file, standard output will be sent to your Legion tty object. Multiple IN/in, OUT/out, and CONSTANT/constant files may be specified, one per line.

The second field is the name of an input or output file that your program expects to read or write to in the local directory of whichever host it runs on.

The third field is the naming pattern for input and output files. You may use "*" as a pattern holder for IN/in and OUT/out files. The pattern holder allows you to specify large groups of files in different directories for input and output.

For example, suppose that you want to run program Foo. Foo needs two input files, FooFoo and MyFoo, and a password. It also produces one output file, Bar. Your specification file would look like this:

IN		FooFoo		/mylocaldir/foo*.ent
in		MyFoo		/mycontext/bar*.gif
OUT		Bar		/mylocaldir/output*.stuff
CONSTANT	secret		/etc/passwd

Legion looks in local and context space for files that fit the specified patterns and finds five files in each:

IN LOCAL SPACE           IN CONTEXT SPACE
/mylocaldir/foo1.ent     /mycontext/bar1.gif
/mylocaldir/foo2.ent     /mycontext/bar2.gif
/mylocaldir/foo3.ent     /mycontext/bar3.gif
/mylocaldir/foo4.ent     /mycontext/barA.gif
/mylocaldir/foo5.ent     /mycontext/barB.gif

Notice that the first three files in each column have common patterns (1, 2, and 3) whereas the last two do not (4 and 5, A and B). The files with mismatched patterns are discarded, leaving three sets of input files, or three jobs that can be run:

IN LOCAL SPACE           IN CONTEXT SPACE
/mylocaldir/foo1.ent     /mycontext/bar1.gif
/mylocaldir/foo2.ent     /mycontext/bar2.gif
/mylocaldir/foo3.ent     /mycontext/bar3.gif

These jobs can be identified by their pattern: 1, 2, and 3. Before going any further, Legion checks for any previously created output files. If it finds a file called /mylocaldir/output1.stuff it knows that job 1 has already been run. The remaining jobs, 2 and 3, have not. Legion then copies the remaining four input files to the two hosts that will run the jobs:

/mylocaldir/foo2.ent   copied to -->   HostA/FooFoo
/mycontext/bar2.gif    copied to -->   HostA/MyFoo
/mylocaldir/foo3.ent   copied to -->   HostB/FooFoo
/mycontext/bar3.gif    copied to -->   HostB/MyFoo

The jobs are run and two output files, each called Bar, are created on HostA and on HostB. HostA/Bar's contents are copied to /mylocaldir/output2.stuff, a file in the original host's local directory space, signalling that job 2 is finished. HostB/Bar's contents are copied to /mylocaldir/output3.stuff, signalling that job 3 is finished.

A more complex specification file might look like this:

IN		FILE1	 	/usr/localtmp/dump/pdb*.ent
in		NEXT2		/home/an4m/pictures/*.gif
OUT		THEN3		 output*
out		FILE4		./dssp*log
CONSTANT	Crypt	 	/etc/passwd
constant	Salt-File	/home/admin/user-list

The program must be written so as to read/write the files named in the second field from its current directory. In this case, the registered program must read FILE1, NEXT2, Crypt, and Salt-File, and write THEN3 and FILE4. When legion_run_multi is called on the registered program with the above specification file, it looks for matching patterns:

/usr/localtmp/dump/pdb1egs2.ent    in your local directory
/home/an4m/pictures/1egs2.gif      in your Legion context

These are sent as input files for run 1egs2, which generates two output files:

output1egs2       in your local directory
./dssp1egs2log    in your Legion context

Exception file

The exceptions file is an addendum to the specification file. It specifies additional switches for specific patterns. This allows you greater flexibility and control when running large batches of jobs. The information in the exceptions file is passed directly to legion_run or, if you use the -e flag, to whichever command you have specified (the legion_run_multi command is essentially a script that calls legion_run, or whichever command you specify, multiple times).

For example, suppose that the following jobs match the patterns in your specification file:

ABC
DEFGH
123
9999
GIRAFFE

The specification file already provides input/output/constant information for all of these jobs, and command-line flags provide additional instructions. However, you may wish to fine-tune these instructions for just jobs DEFGH and 123. In the exceptions file, you can specify these two jobs and give your additional instructions:

DEFGH -h /hosts/special_host_name
123 -v -IN /tmp/myJob/ExtraStuff

The first line tells Legion that you wish to run job DEFGH on a specific host. The second line says that job 123 requires an additional input file and that you wish to run the job in a verbose form.

These instructions combine with any other specifications that were provided on the command line and/or in the specification file.

Limitations, caveats, warnings

  1. Be sure that you use legitimate syntax in the exceptions file. Do not specify flags that do not exist for that particular command (e.g., there are no -CONSTANT/-constant flags in legion_run).
  2. If you specify a host in the exceptions file and an architecture on the command line (with the -a flag), be sure that they match. If you specify conflicting architectures Legion will be unable to process that particular job (but it will run any other jobs).

Parameters

You must specify at least one of the following two parameters:

-n <number of processors>

Specify the number of processes running at one time.

-s[chedule] <schedule file>

Specify a set of host context paths and how many processes are to be run on each host (e.g., /hosts/Foo 3 to run up to three jobs on host object Foo). The format of the schedule file is shown below, one host per line.

/hosts/firsthost	2
/hosts/secondhost	1
/hosts/thirdhost	7

You must include both of the following parameters:

-f[ile] <specification file path>

Local path name of specification file.

<program class name>

The program's Legion class name (created when the program was registered in Legion).

Options

The following options are available with this command:

-help

Print command syntax and exit.

-debug

Turn debugging mode on. In this mode, legion_run_multi prints copious output regarding which patterns must be run. This flag also turns on the verbose mode (-v).

-v[erbose]

Provide a verbose output as the command is running.

-z[ero]

Consider input and output files of zero size to be valid. If this option is used, a zero-sized input file can be used to start a job and a zero-sized output file can be used to signal a completed job.

-r[estart]

Restart incomplete jobs (will attempt five times max).

-e[xec] <command>

Run some other <command>. The default is legion_run.

-x <exceptions file path>

The exceptions file's Unix path.

-a[rch] <architecture>

Platform architecture for remote hosts.

-d[ir] <dir>

Remote directory to which each job must change before execution.

-D <var=value>

Environment variable to be set for each job.

-t[ime] <timefile>

Specify a file for recording the job history of the processes. This file lists jobs by name and gives the time that each job was checked (measured in seconds since January 1, 1970), the status that was returned (e.g., Started, Error), and (if known) the host that it was run on. The file is created in your local directory.

--

End switches for legion_run_multi.

<arg1> <arg2>... <argn>

Provide arbitrary command-line arguments for the registered program.

The possible architectures that can be used with the -a flag are limited. At the moment they are:

Architecture    Corresponds to                         Comments

solaris         Sun workstations running Solaris 5.x
sgi             SGI workstations running IRIX 6.5
linux           x86 running Red Hat 6.x Linux
alpha_linux     DEC Alphas running Red Hat 6.x Linux
alpha_DEC       DEC Alphas running OSF1 v4
rs6000          IBM RS/6000s running AIX 4.2.1
hppa_hpux       HPUX 11
t90             Cray T90s running Unicos 10.x          virtual hosts only (see page 69 in
                                                       the System Administrator Manual)
t3e             Cray T3E running Unicos/mk 2.x         virtual hosts only (see page 69 in
                                                       the System Administrator Manual)

rpc.lmountd
     [{-f <exports file local path>} |
      {--exports-file=<exports file local path>}]
     [{-d | --debug} <facility>]
     [{-P | --port} <port number>]
     [-F | --foreground] [-h | --help]
     [-n | --allow-non-root] [-p | --promiscuous]
     [-t | --no-spoof-trace] [-v | --version]

The Legion NFS mount daemon enables you to NFS-mount a Legion file system on your local machine. Your local machine must have access to a working Legion net. Please note that you must also run lnfsd (below) in order to start a Legion NFS session. Currently, the Legion NFS mount daemon only works on Linux and Alpha Linux platforms. Please contact us at <legion-help@cs.virginia.edu> if you need to run it on another architecture.

Legion context space is mounted via the same means used to mount a normal NFS file system (i.e., mount). When you NFS-mount context space, lmountd checks your MOUNT request against your local exports file (the list of exported file systems). If your machine is listed, lmountd creates and returns a file handle and adds an entry in /etc/rmtab.

To end the session, run umount. Upon receipt of your UMOUNT request, lmountd removes your client's entry from rmtab. The client kernel unmounts the file system.

Please note that a Legion NFS mount does not give you permission to view or use Legion objects: if your Legion system requires a Legion login you must run legion_login (page 68) as a separate step.
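
For example, assuming both daemons are running locally and /legion is exported to the local host, a hypothetical mount (run as root) might look like this; the mount point is illustrative and any port options depend on your local configuration:

# mount localhost:/legion /mnt/legion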

The lmountd should be run locally.

Please read the Security and Example sections of the rpc.lnfsd command (page 111).

Exports file

The exports file contains a list of the file systems that are available for export and the machines that are allowed to mount them. The lmountd daemon treats the root of context space as /legion, so it should be listed in the exports file in the following format:

/legion <host name>(rw)

The (rw) indicates that the named host is allowed to send read and write requests to /legion.

The export file's default path is /etc/exports but you can specify another path with the -f/--exports-file flag.
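
For example, a hypothetical invocation that names an explicit exports file and port and stays in the foreground might look like this (the values are illustrative):

$ rpc.lmountd -f /etc/exports -P 2050 -F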

The following options are supported:

-f | --exports-file <exports file local path>

Specify a local path for the exports file (page 109). Default is /etc/exports. For security reasons, the Legion NFS daemons are run locally, so only the local host should be listed in this file. In the future, this will be mandatory and this option will be eliminated.

-d | --debug <facility>

Log operations verbosely. The current legal values for <facility> are call for logging RPC calls and arguments, fhcache for file handle cache operation, and auth for authentication routines.

-P | --port <port number>

Specify a port number for lmountd. Default is a random number under 1024. Please note that unlike NFS mountd, lmountd does not register with portmap (this ensures that NFS RPC requests bound for NFS servers or mount daemons are not inadvertently intercepted). So while you are not required to specify a port, you will need to know the port number in order to pass it to mount, so we recommend choosing one explicitly.

-F | --foreground

Run lmountd in the foreground. If debugging is also requested, it will be sent to stderr.

-h | --help

Print command syntax and exit.

-n | --allow-non-root

Allow incoming NFS-mount requests to be honored even if they do not originate from reserved IP ports. Please read the Security section of rpc.lnfsd (page 111), before using this option.

-p | --promiscuous

Run in promiscuous mode, so that the server will serve any host on the network. This violates the Legion NFS daemons' security premises, so it is deprecated.

-t | --no-spoof-trace

By default, lmountd logs every access by unauthorized clients. This option turns off logging of such spoof attempts for all hosts listed explicitly in the exports file.

-v | --version

Print the current version number of the program.

The daemon recognizes the following signals:

SIGHUP

Causes lmountd to reread the exports file and any access restrictions in /etc/hosts.allow and /etc/hosts.deny. If you have altered these files and wish your changes to take effect, you must send SIGHUP to lnfsd as well (see the sketch after this list).

SIGTERM

Kills the daemon. This is the preferred method, since it allows proper termination of the object.

SIGUSR1

If lmountd is invoked with debugging options, this signal toggles generation of debugging information.
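
A minimal sketch of sending SIGHUP to both daemons after editing the exports file, assuming they appear in the process table under the names rpc.lmountd and rpc.lnfsd and that pidof is available (the daemons are Linux-only):

$ kill -HUP $(pidof rpc.lmountd)
$ kill -HUP $(pidof rpc.lnfsd)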

rpc.lnfsd
     [{-f <exports file local path>} |
      {--exports-file=<exports file local path>}]
     [{-d | --debug} <facility>]
     [{-P | --port} <port number>]
     [-F | --foreground] [-h | --help]
     [-l | --log-transfers] [-n | --allow-non-root]
     [-p | --promiscuous] [-t | --no-spoof-trace]
     [-v | --version]

The Legion NFS process server daemon interposes itself between an NFS client and the Legion file system.7 It is a user-level process derived from the Linux nfsd. It accepts NFS RPC requests from a kernel client and translates them into invocations on Legion objects. The results are returned to the daemon, translated into an RPC reply, and forwarded to the client. Currently, the Legion NFS process server daemon works only on Linux and Alpha Linux platforms. Please contact us at <legion-help@cs.virginia.edu> if you need to run it on another architecture.

The lnfsd should be run locally.

You must also run lmountd (see rpc.lmountd, page 109) in order to mount Legion NFS. Both daemons must run on the client Legion machine. We recommend that you run them as root (see Security, below, for more information).

Security

The two Legion NFS daemons, lnfsd and lmountd, provide an NFS-like file system while preserving Legion's stronger security features. The NFS client authenticates itself to lnfsd by passing the requesting user's Unix uid. This authentication is predicated on trust between the client's kernel (and thus NFS) and the NFS daemon (in this case, lnfsd). The interface between the NFS kernel client and lnfsd follows the NFS protocol.

We recommend that lnfsd run as root.

Both lnfsd and lmountd are responsible for bridging semantic differences between Unix and Legion security mechanisms and must translate a Unix uid to Legion credentials. Users must be logged in with the legion_login (page 68) utility to access objects in a secure Legion system. This utility creates a session file, corresponding to a Unix uid, in the user's local /tmp directory. An object's ACL determines who has access rights to the object. In an NFS-mounted Legion system, once a Unix uid has been mapped to Legion credentials, access to Legion objects is authenticated and authorized via Legion mechanisms. This prevents unauthorized access to the object through Unix.

We expect that the daemons will execute with root privileges on the client's host and will accept access only from privileged ports. Assuming that root and root-privileged processes (including the NFS kernel client) are not compromised, lnfsd and lmountd are secure from malicious attacks. File handles are never exposed on the wire.

If a malicious remote user becomes root on the client host and impersonates a valid user by using a valid uid in an NFS request, he may attempt to subvert lnfsd via a bogus RPC request. This type of attack is avoided by cryptographically signing file handles.

A malicious user may also attempt to subvert lmountd. However, the daemon will only respond to the local host, so a rogue user cannot obtain the file handle needed for subsequent transactions.

Legion uses fine-grained ACLs for authorization: they record individual and group users' abilities to invoke individual methods. Translations between Legion ACLs and Unix permission bits are therefore lossy. Individual methods must be classified as read, write, execute, or some combination of these in order to correspond to Unix permission bits.

Legion provides a global name space, including individual accounts, which may not correspond to any particular Unix account on a local machine. Legion does not assign group ownership to an object, so object ownership and groups within a Legion NFS-mounted file system do not directly adhere to Unix semantics. A Legion object's owner may not hold an account on the client machine, so the owner's Legion credentials cannot always be mapped to a Unix uid. Therefore, all objects in a Legion NFS-mounted file system appear to be owned by the user accessing them. Unix permission bits are set according to that user's ability to access the object. Group permission bits mirror the other (world) permission bits.

Example

To NFS-mount a Legion system, you need to start the two Legion NFS daemons, lmountd and lnfsd, then run mount. Both lmountd and lnfsd need static ports, which are passed to mount. We suggest that you use the -P option to manually select a port. These steps might look something like this:

$ rpc.lmountd -f /local_path/exports -P 2001 
$ rpc.lnfsd -f /local_path/exports -P 2000 
$ mount -t nfs -o port=2000,mountport=2001 \
    MyHost:/legion /legion-dir

The first command starts lmountd and the second lnfsd. The third mounts Legion context space in local file space. Note that the first two commands provide a path for the exports file and a port number. If you do not manually select a port, one will be chosen for you, but you will need to know its number in order to pass it to mount.

Note that lmountd interprets the root of Legion context space (/) as /legion. In the mount command, therefore, you must pass /legion as the root of the directory to be mounted. In the example above, the /legion-dir directory is the local path pointing to the mounted directory, but you can use any existing path.
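
Once the mount succeeds, context space can be browsed with ordinary Unix tools; for example, to list the root of context space through the mount point used above:

$ ls /legion-dir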

The following options are supported:

-f | --exports-file <exports file local path>

Specify a local path for the exports file. Default is /etc/exports. For security reasons, the Legion NFS daemons are run locally, so only the local host should be listed in this file. In the future, this will be mandatory and this option will be eliminated.

-d | --debug <facility>

Log operations verbosely. The current legal values for <facility> are call for logging RPC calls and arguments, fhcache for file handle cache operation, and auth for authentication routines.

-P | --port <port number>

Specify a port number for lnfsd. The default is 2049. Note that unlike a normal NFS server, lnfsd does not register with portmap (this ensures that NFS RPC requests bound for NFS servers or mount daemons are not inadvertently intercepted).

-F | --foreground

Run in the foreground. If debugging is also requested, it will be sent to stderr.

-h | --help

Print command syntax and exit.

-l | --log-transfers

Try to catch all files retrieved from and written to lnfsd. For each file stored or retrieved, a single line containing the client's IP address and the file's name is written to the system log daemon.

-n | --allow-non-root

Allow incoming NFS requests to be honored even if they do not originate from reserved IP ports. Please see the Security section on page 111 before using this option.

-p | --promiscuous

Run in promiscuous mode, so that the server will serve any host on the network. This violates lnfsd's security premises, so it is deprecated.

-t | --no-spoof-trace

By default, lnfsd logs every access by unauthorized clients. This option turns off logging of such spoof attempts for all hosts listed explicitly in the exports file.

-v | --version

Print the current version number of the program.

The following signals are supported:

SIGHUP

Reread the exports file and flush the file handle cache.

SIGIOT

When compiled with the -DCALL_PROFILING option, this signal will cause lnfsd to dump the average execution times per NFS operation into /tmp/nfsd.profile.

SIGTERM

Kills the daemon. This is the preferred method, since it allows proper termination of the object.

SIGUSR1

If lnfsd is invoked with debugging options, this signal toggles generation of debugging information.

Known bugs

Currently, chown is unsupported, since the mapping between a Unix target UID and the corresponding Legion principal may not be available.

Legion contexts do not share the "append-only" implementation of Unix directories. On some platforms (such as x86 Linux), this discrepancy manifests itself as an error when you attempt to remove large directories that require multiple invocations of readdir. For example, if you run "rm -rf" on a large context, some subcontext entries may be left intact, and you will see an error message indicating that the parent context could not be deleted because it was not empty. To get around this, keep retrying the operation until all entries are removed and the parent context can be deleted (a sketch follows).
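
A minimal workaround sketch: assuming context space is NFS-mounted at the hypothetical mount point /legion-dir and that the context to be removed is named big-context, this loop simply repeats the removal until it succeeds:

$ until rm -rf /legion-dir/big-context 2>/dev/null; do sleep 1; done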


1. There is a set of special flags for running in a fault-tolerant mode, listed on page 90.

2. The output files will be copied out when the program calls MPI_Finalize() or MPI_Abort().

3. The output files will be copied out when the program calls MPI_Finalize() or MPI_Abort().

4. In a secure system, the default context is /home/<user_name>/mpi/instances/<program_name>.

5. In all cases, legion_run blocks for ten seconds after you start the command. This is to be sure that there are no fatal errors that prevent the job from starting, such as a dead remote host or conflicting parameters.

6. Legion NFS mount relies on the mount and umount Unix command-line utilities. Please see the associated man pages for information about using these tools. For more information about NFS mounts, you can also see the exports, mountd, and nfsd man pages. The two Legion NFS daemons and supporting materials were derived from the Unix User-Space NFS Server Version 2.2. Key contributors/authors were Mark Shand, Don Becker, Rick Sladkey, and Olaf Kirch.

7. Legion NFS mount relies on the mount and umount Unix command-line utilities. Please see the associated man pages for information about using these tools. For more information about NFS mounts, you can also see the exports, mountd, and nfsd man pages. The two Legion NFS mount daemons and supporting materials were derived from the Unix User-Space NFS Server Version 2.2. Key contributors/authors were Mark Shand, Don Becker, Rick Sladkey, and Olaf Kirch.
