|Before running the command|
|1. ||What does this command do?||[go]|
|2. ||What are the prerequisites?||[go]|
|3. ||What is a probe file?||[go]|
|4. ||Can I have more than one probe file for a job?||[go]|
|5. ||What if I don't know the probe file name?||[go]|
|6. ||Can other people use my probe files to check my jobs?||[go]|
|Moving input and output files|
|1. ||How do I pass an input file to a program that has already started?||[go]|
|2. ||How do I pick up an output file?||[go]|
|3. ||What if my output file hasn't been created yet?||[go]|
|4. ||Can I check to see what files are in the job's current working directory?||[go]|
|5. ||Can I check a particular file's status?||[go]|
|1. ||How do I find out what host the program is on?||[go]|
|2. ||How do I find out what directory the job is in?||[go]|
|3. ||How do I check my job's status?||[go]|
|The -kill option|
|1. ||What does the -kill option do?||[go]|
|2. ||What if I don't clean up after a terminated job?||[go]|
|3. ||What happens if I kill an active job?||[go]|
|4. ||What is context scratch space and how do I find out where it is?||[go]|
|5. ||How do I change my context scratch space?||[go]|
|6. ||How do I get my job's directory out of context scratch space?||[go]|
|1. ||Some hints to make life easier||[go]|
What does this command do?
- The legion_probe_run command checks on programs started with legion_run. You can get the name of the remote host, the remote job's directory, and the job's status. You can also move input and output files between your local host, the remote host, and context space. The command is fully documented here.
What are the prerequisites?
- You need to know the name of the job's probe file. This command is run after you have started a job on a remote host with legion_run. When you run legion_run you can create a probe file by means of the -p flag. This file is created in your local execution environment, so you'll need to run legion_probe_run in the same environment.
What is a probe file?
- A probe file is a text file that contains information for contacting the remote job. It is only associated with the one job.
Can I have more than one probe file for a job?
- No. You also can't reuse a probe file. You can create serial files with the same name, however: if an old probe file still exists under that name, Legion will simply write over it.
What if I don't know the probe file name?
- This command can't contact the job if it doesn't read the job's probe file, so you are out of luck if you don't know the file's name. We suggest that you give the probe file a name that is logically related to the job name, to keep things simple.
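One way to keep probe file names predictable is to derive them from the job name. The sketch below follows the "Foo" and "Fooprobe" pairing used in this FAQ's examples; probe_name_for is a hypothetical helper, not part of Legion.

```shell
# Hypothetical helper: build a probe file name from a job name,
# following the "Foo" -> "Fooprobe" pattern used in this FAQ.
probe_name_for() {
    job_name="$1"
    printf '%s\n' "${job_name}probe"
}
```

You could then pass "$(probe_name_for Foo)" wherever a probe file name is expected, so that the name never has to be remembered separately.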
Can other people use my probe files to check my jobs?
How do I pass an input file to a program that has already started?
- You can use the -in or -IN flags to copy files from context space or your local host onto the remote host. For example:
$ legion_probe_run -p Fooprobe -IN myFile
This will copy "myFile" from your local host to whatever host is running "Foo".
If you are not sure which files are already in the job's current working directory, use the -list option to check.
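If you send input files from scripts, it may help to confirm that the local file exists before contacting the remote job. The following is a minimal sketch; send_local_input is a hypothetical wrapper name, and it only uses the -p and -IN flags shown in the example.

```shell
# Sketch: copy a local file to the host running the job, but stop early
# if the local file is missing. send_local_input is a hypothetical name.
send_local_input() {
    probe_file="$1"
    local_file="$2"
    if [ ! -f "$local_file" ]; then
        echo "no such local file: $local_file" >&2
        return 1
    fi
    # -IN copies from the local host, as in the FAQ example.
    legion_probe_run -p "$probe_file" -IN "$local_file"
}
```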
How do I pick up an output file?
- You can use the -out or -OUT flags to copy files from the remote host into context space or your local host. For example:
$ legion_probe_run -p probeName -out myOutput
This will copy "myFile" from whatever host is running "Foo" into a Legion file object called myOutput.
What if an output file hasn't been created yet?
- If you try to pick up an output file that doesn't exist, you'll get an error message. Try again later.
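"Try again later" can be automated with a small retry loop. This sketch assumes that legion_probe_run exits with a nonzero status when the output file does not exist yet; the FAQ only promises an error message, so check that assumption on your installation. The function name, retry count, and delay are all illustrative.

```shell
# Sketch: retry picking up an output file until it exists.
# Assumption: legion_probe_run returns a nonzero exit status on error.
fetch_output_with_retry() {
    probe_file="$1"
    context_name="$2"
    max_tries="${3:-10}"       # illustrative default
    delay_seconds="${4:-60}"   # illustrative default
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if legion_probe_run -p "$probe_file" -out "$context_name"; then
            return 0
        fi
        if [ "$try" -lt "$max_tries" ]; then
            echo "attempt $try failed; retrying in ${delay_seconds}s" >&2
            sleep "$delay_seconds"
        fi
        try=$((try + 1))
    done
    echo "giving up after $max_tries attempts" >&2
    return 1
}
```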
Can I check to see what files are in the job's current working directory?
- Yes. Use the -list option to see a list of files in the job's current working directory.
Can I check a particular file's status?
- Yes. Use the -stat flag to check on a particular file.
How do I find out what host the program is on?
- The -hostname option will return the remote host's DNS name.
How do I find out what directory the job is in?
- The -pwd option will return the job's current working directory.
How do I check my job's status?
- The -statjob option will return the job's status. The status will be shown as "Running", "Error", or "Done".
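For a nonblocking run you may want to wait until the job leaves the "Running" state. The sketch below polls -statjob and assumes the status word appears somewhere in the command's standard output; wait_for_job, the polling interval, and the poll limit are hypothetical choices.

```shell
# Sketch: poll the job's status until it reports "Done" or "Error".
# Assumption: -statjob prints text containing one of the status words.
wait_for_job() {
    probe_file="$1"
    poll_seconds="${2:-30}"   # illustrative default
    max_polls="${3:-120}"     # illustrative default
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        status="$(legion_probe_run -p "$probe_file" -statjob)"
        case "$status" in
            *Done*)  echo "Done";  return 0 ;;
            *Error*) echo "Error"; return 1 ;;
        esac
        polls=$((polls + 1))
        sleep "$poll_seconds"
    done
    echo "still not finished after $max_polls checks" >&2
    return 2
}
```

The function returns 0 for "Done", 1 for "Error", and 2 if the poll limit is reached, so a calling script can branch on the exit status.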
What does the -kill option do?
- This option kills the job, if it is still running, deletes its working directory, and deletes its probe file. If the job has already finished running, it deletes the working directory and probe file. This option can be used in either blocking or nonblocking runs. In nonblocking runs, it should be used to clean up the remote directory after the job is finished.
What if I don't clean up after a terminated job in a nonblocking run?
- The remote host will hold on to the job's working directory for six hours. If you do nothing, the host will tar and compress the directory and move it into your context scratch space. If you run legion_probe_run during that period to check the job's status or pick up output files, the clock will restart and you will have another six hours.
What happens if I kill an active job?
- If you kill an active job, you will lose all data from that job on the remote host. The entire working directory is deleted, including all input and output files.
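Because killing an active job destroys its data, a script can check the status first and refuse to kill a running job unless told to. This is a sketch under the same assumption as before, namely that -statjob prints the status word on standard output; cleanup_job and its --force argument are hypothetical and belong to the wrapper, not to legion_probe_run.

```shell
# Sketch: clean up a job, but refuse to kill it while it is still running
# unless the caller passes --force (an argument of this wrapper only).
cleanup_job() {
    probe_file="$1"
    force="${2:-}"
    status="$(legion_probe_run -p "$probe_file" -statjob)"
    case "$status" in
        *Running*)
            if [ "$force" != "--force" ]; then
                echo "job is still running; not killing it" >&2
                return 1
            fi
            ;;
    esac
    # Removes the remote working directory and the probe file.
    legion_probe_run -p "$probe_file" -kill
}
```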
What is context scratch space and how do I find out where it is?
- Context scratch space is simply a portion of your context space that has been designated as a backup storage spot for any remote jobs that are not cleaned up by either legion_run or the user. The default space is /tmp but you can reset it. You can check to see where it is set by either looking at the $LEGION_SCRATCH_CONTEXT variable or using the legion_probe_run -showscratch flag.
How do I change my context scratch space?
- There are several ways:
- You can set the $LEGION_SCRATCH_CONTEXT variable,
- Use legion_run's -setscratch or -D flag, or
- Use legion_probe_run's -setscratch flag.
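As an illustration of the first method, the sketch below reads the variable and falls back to /tmp, the default named above. It only reflects the environment variable, so a value set through one of the -setscratch flags may not show up here. The function name and the context path in the comment are hypothetical.

```shell
# Sketch: report the scratch context according to the environment,
# falling back to the documented default of /tmp.
current_scratch_context() {
    printf '%s\n' "${LEGION_SCRATCH_CONTEXT:-/tmp}"
}

# To change it for later commands in this shell (hypothetical context path):
#   export LEGION_SCRATCH_CONTEXT=/home/me/scratch
```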
How do I get my job's directory out of context scratch space?
- First you need to get it out of context space and onto your local host. Use the legion_cp tool:
$ legion_cp -localdest <tar_file> <local_file_name.tar.Z>
Be sure to give the tar file a "tar.Z" suffix. You then need to run uncompress and tar.
$ uncompress <local_file_name.tar.Z>
$ tar -xf <local_file_name.tar>
These are common Unix tools available on all platforms. If you can't use them, please contact us at firstname.lastname@example.org.
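The retrieval steps can be gathered into one function. This is a sketch, not an official tool: fetch_job_archive is a hypothetical name, and it stops at the first step that fails so that a partial copy is never unpacked.

```shell
# Sketch: copy a job's archive out of context scratch space and unpack it.
fetch_job_archive() {
    context_tar="$1"   # path of the tar file in context scratch space
    local_base="$2"    # local file name without the .tar.Z suffix
    legion_cp -localdest "$context_tar" "${local_base}.tar.Z" || return 1
    # uncompress replaces the .tar.Z file with a plain .tar file.
    uncompress "${local_base}.tar.Z" || return 1
    tar -xf "${local_base}.tar"
}
```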
Some hints to make life easier
- Determine whether or not you'll need to run this command before you start your program.
- Give your probe file a name that is reasonably related to the program name.
- If you are running the program in nonblocking mode, calculate how long it will run so that you can check on it after it finishes.