From Beocat
== Submitting your first job ==
'''Note:''' SGE is gone, long live [[SlurmBasics|slurm]].
To submit a job to run under SGE, we use the <code>qsub</code> command. <code>qsub</code> (queue submit) takes the commands you give it and runs them through the scheduler, which finds the best place for your job to run. With over 150 nodes and 2500 cores to schedule, as well as differing priorities, hardware, and individual resources, the scheduler's job is not trivial.
 
There are a few things you'll need to know before running qsub.
* How many cores you need. Note that unless your program is written to use multiple cores (called "threading"), asking for more cores will not speed up your job. This is a common misconception. '''Beocat will not magically make your program use multiple cores!''' For this reason the default is 1 core.
* How much time you need. Many users new to Beocat neglect to specify a time requirement. The default is one hour, and we are often asked why a job died after one hour. We usually point those users to the [[FAQ]].
* How much memory you need. The default is 1GB. If your job uses significantly more than you asked for, it will be placed on hold until you fix your request.
* Any advanced options. See the [[AdvancedSGE]] page for these requests. For our basic examples here, we will ignore these.
 
So let's now create a small script to test our ability to submit jobs. Create the following file, either by copying it to Beocat or by editing a text file, and name it <code>myhost.sh</code>. Both of these methods are documented on our [[LinuxBasics]] page.
<syntaxhighlight lang="bash" line>
#!/bin/sh
hostname
</syntaxhighlight>
 
Be sure to make it executable:
chmod u+x myhost.sh
 
Now, let's first run it on the headnode. As I write this, I'm logged into the headnode named 'minerva'. When I run it, it looks like this:
% ./myhost.sh
minerva
 
So, now let's submit it as a job and see what happens. Here I'm going to use three options:
* <code>-l mem=</code> tells how much memory I need. In my example, I'm using our system minimum of 512 MB, which is more than enough. Note that your memory request is '''per core''', which doesn't make much difference for this example, but will as you submit more complex jobs.
* <code>-l h_rt=</code> tells how much runtime I need. This can be in the form of ''seconds'', or ''hours'':''minutes'':''seconds''. This is a very short job, so 60 seconds should be plenty. Note that if you submit a job that needs to run for days or weeks, you'll need to translate that into hours. The runtime can't be changed after the job has started, so please make sure you have requested a sufficient amount of time.
* <code>-pe single 1</code> tells SGE that I need only a single core on one machine. The [[AdvancedSGE]] page has much more on the "Parallel Environment" switch.
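
Translating days into the ''hours'':''minutes'':''seconds'' form is simple arithmetic. The following is only a sketch: the function name <code>to_h_rt</code> and the 3 days 12 hours figure are made up for illustration and are not part of SGE or Beocat.

```shell
# Hypothetical helper: turn a runtime given as days and hours into the
# hours:minutes:seconds form described for -l h_rt=.
to_h_rt() {
    days=$1
    hours=$2
    # One day is 24 hours; minutes and seconds are left at zero.
    echo "$(( days * 24 + hours )):00:00"
}

# Example: a job expected to need 3 days and 12 hours.
to_h_rt 3 12
```

The printed value could then be passed to <code>qsub</code> after <code>-l h_rt=</code>.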
 
% '''ls'''
myhost.sh
% '''qsub -l h_rt=60 -l mem=512M -pe single 1 ./myhost.sh'''
INFO: Requested resources for this job: pe=single cores=1 time(HH:MM:SS or SS)=60 mem(per-core)=512M
Your job 1483446 ("myhost.sh") has been submitted
 
Since this is such a small job, it is likely to be scheduled almost immediately. A minute or so later, I see:
% '''ls'''
myhost.sh
myhost.sh.po1483446
myhost.sh.pe1483446
myhost.sh.e1483446
myhost.sh.o1483446
 
The four additional files that were created are in the form ''scriptname''.''XXjjjjjjj'', where ''scriptname'' is the script you submitted, ''XX'' is 'po' (parallel output), 'pe' (parallel error), 'e' (error), or 'o' (output), and ''jjjjjjj'' is the job number (which is given when you submitted your job, in this case 1483446).
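
The file names in the listing can be reproduced from the script name and the job number. This is an illustrative sketch only; <code>job_output_files</code> is a hypothetical helper, not a Beocat command.

```shell
# Hypothetical helper: print the four file names expected for a job,
# given the script name and the job number.
job_output_files() {
    script=$1
    job_id=$2
    for kind in po pe e o; do
        echo "${script}.${kind}${job_id}"
    done
}

# Values taken from the example submission.
job_output_files myhost.sh 1483446
```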
 
If everything goes as planned, the po, pe, and e files will be blank...
% '''cat myhost.sh.po1483446'''
% '''cat myhost.sh.pe1483446'''
% '''cat myhost.sh.e1483446'''
...and the .o file will show the hostname of the node that ran the job...
% '''cat myhost.sh.o1483446'''
mage03
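
Instead of running <code>cat</code> on each file, the check for blank files can be done in one step. The sketch below assumes a hypothetical helper named <code>nonempty_files</code>; it relies only on the standard <code>-s</code> file test.

```shell
# Hypothetical helper: print the name of every given file that is not empty.
nonempty_files() {
    for f in "$@"; do
        # -s is true when the file exists and its size is greater than zero.
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
}

# File names from the example job; no output means all three are blank.
nonempty_files myhost.sh.po1483446 myhost.sh.pe1483446 myhost.sh.e1483446
```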
 
== Monitoring Your Job ==
 
The <B>kstat</B> Perl script has been developed at K-State to provide you with all the available information about your jobs on Beocat. <B>kstat --help</B> will give you a full description of how to use it.
 
<B>kstat -z</B>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
This will give you a good summary of your jobs that are running and in the queue.
 
<b>
<font color=Brown>Hero43 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=Blue>24 of 24 cores &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>Load 23.4 / 24 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=Red>495.3 / 512 GB used</font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=lightgreen>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>unafold &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234567 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=cyan>1 core run &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 127 MB &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 0 d  5 h 35 m </font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>octopus &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234568 &nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=cyan>16 core run &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=red> 125 GB &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 8 d 15 h 42 m </font><br>
<font color=green> ##################################  BeoCat Queue    ################################### </font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>NetPIPE &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234569 &nbsp;&nbsp;&nbsp;&nbsp; </font>
<font color=cyan>2 core &nbsp;&nbsp;&nbsp;</font>
<font color=green> qw &nbsp;</font>
<font color=black> 2h &nbsp;</font>
<font color=black> 4 GB &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=red> killable &nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 0 d 1 h 2 m </font><br>
</b>
 
<b>kstat</b> produces a separate line for each host.  Use <b>kstat -h</b> to see information on all hosts without the jobs.
For the example above we are listing our jobs and the hosts they are on.
 
Host names -
yellow background means reserved,
red background means down,
red means owned by the
group in orange on the right.<BR>
Core usage - yellow for empty, cyan for partially used, blue for all cores used.<BR>
Load level - yellow or yellow background indicates the node is being inefficiently used.  Red just means more threads than cores.<br>
Memory usage - yellow or red means most memory is used.  Exceeding memory will show disk swap in background red.
 
Each job line will contain the username, program name, job ID number, number of cores, maximum memory used, whether the job is killable, and the
amount of time the job has run.  If the job is still in the queue, the line may instead show the requested run time and memory per core, and the time
shown is how long the job has been in the queue.
 
In this case, I have 2 jobs running on Hero43.  <i>unafold</i> is using 1 core while <i>octopus</i> is using 16 cores.  The most useful information here
is the memory being used in each case.  While <i>unafold</i> is taking very little memory, <i>octopus</i> is using 125 GB and the red
font indicates that it is close to the amount requested.  If the memory on a job is over the requested amount it will have a
red background and you should request more memory in future runs.  If the memory is flashing with a red background, you are more
than 50% over your requested amount and your code will be forced to use disk swap which can slow it down enormously.  You're usually
better off killing the job and restarting with an appropriate memory request.
If the code accesses large files, there may be an IO value reported.  This number is not very accurate.
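
The two over-request conditions described above (over the requested amount, and more than 50% over it) can be expressed as a small comparison. This is a sketch only: <code>mem_status</code> is a hypothetical helper, the numbers are invented, and it does not reproduce how <b>kstat</b> itself decides on colors.

```shell
# Hypothetical helper: compare memory used with memory requested
# (whole numbers in the same unit) and name the condition.
mem_status() {
    used=$1
    requested=$2
    # used > 1.5 * requested, written with integers as 2*used > 3*requested
    if [ $(( used * 2 )) -gt $(( requested * 3 )) ]; then
        echo "more than 50% over request"
    elif [ "$used" -gt "$requested" ]; then
        echo "over request"
    else
        echo "within request"
    fi
}

# Invented values in GB: used, then requested.
mem_status 100 128
mem_status 150 128
mem_status 200 128
```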
 
<b> kstat -d 7 </b>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
This will show you information about the jobs that have completed in the last 7 days.
 
<b> kstat -c </b>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
This provides a global view of Beocat showing how many cores each person is using.
 
 
You can also use the '<code>status</code>' command. By default this command will give you all of your queued and running jobs. If you have no jobs in the queue or running, it will simply quit with no output.
 
% status
Running jobs for user: kylehutson
  job-ID  # name                      start time          running in
-------------------------------------------------------------------
1550047  1 test1.sh                  05/22/2014 16:09:34 batch.q
Waiting jobs for user: kylehutson
  job-ID  # name                      submit time
--------------------------------------------------------
1550048 80 test2.sh                  05/22/2014 16:09:33
 
As you can see, this gives the start time for running jobs and submit time for jobs that have not yet started. You can use the '<code>-r</code>' switch to view a relative time.
 
% status -r
Running jobs for user: kylehutson
  job-ID  # name                      start time          running in
-------------------------------------------------------------------
1550047  1 test1.sh                        -0d 00:00:27 batch.q
Waiting jobs for user: kylehutson
  job-ID  # name                      submit time
--------------------------------------------------------
1550049 80 test2.sh                        -0d 00:00:03
 
That leaves you with the question, though - when will my waiting job start? We can't really give an absolute answer to that question, but we can show you where you stand relative to others. The tool for this is '<code>qstat</code>'.
 
Here is an edited version (shortened for the sake of space) of the jobs on Beocat at the time I'm writing this.
% qstat
job-ID  prior  name      user        state submit/start at    queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1437729 0.25310 couplings  emilieg      Rr    04/22/2014 17:22:15 highmem.q@mage11.beocat          60
1457075 0.20816 au9_D2H_MC nvkarimova  r    04/30/2014 11:29:03 highmem.q@mage07.beocat          40
1457078 0.20815 au8_C3V_MC nvkarimova  r    05/03/2014 04:25:42 highmem.q@mage09.beocat          40
1460509 0.17885 MF_ncbd1  khlee        r    04/30/2014 11:13:22 long.q@scout64.beocat            16
1478510 0.13297 TDSE4.scri huiwei      r    05/05/2014 19:01:53 highmem.q@mage09.beocat            1
1478511 0.13297 TDSE5.scri huiwei      r    05/05/2014 19:01:53 highmem.q@mage09.beocat            1
1488568 0.08904 F_ncbd_270 khlee        r    05/15/2014 16:55:20 batch.q@elf61.beocat              24
1488553 0.83423 p.1bxl.10  beugels2    r    05/12/2014 16:08:16 batch.q@elf29.beocat              8
[...]
1550045 0.00093 cufflinks_ wrutter      r    05/22/2014 15:33:56 long.q@paladin12.beocat            4
1541856 0.35737 l149h      mmohan      qw    05/20/2014 11:53:32                                    8 1
1541857 0.35466 l149h      mmohan      qw    05/20/2014 11:53:34                                    8 1
1541858 0.35199 l149h      mmohan      qw    05/20/2014 11:53:36                                    8 1
[...]
1541888 0.31023 p.1zy3.43  beugels2    qw    05/20/2014 12:13:38                                    8
1549982 0.29499 p.1g5j.30  beugels2    qw    05/22/2014 05:25:02                                    8
1532569 0.04061 H9_310_Bil zhiguang    qw    05/18/2014 16:50:14                                  32
1549979 0.03024 zzz_bwrang blkpawn      qw    05/21/2014 23:58:40                                  80 8-36:1
1549937 0.02694 Mira_denov wrutter      qw    05/21/2014 14:15:14                                  60
1549989 0.02647 zzz_bwrcor blkpawn      qw    05/22/2014 11:43:19                                  80 1-3:1
1451971 0.02127 run.sh    zhangs84    hRq  04/25/2014 18:54:21                                    1
1549913 0.01965 H9_298_Mle zhiguang    qw    05/21/2014 10:03:42                                  32
1549914 0.01965 H9_310_Ml2 zhiguang    qw    05/21/2014 10:04:29                                  32
1549957 0.01741 H_5310BN  zhiguang    qw    05/21/2014 17:01:46                                  32
1549915 0.01412 pbwa.sh    jpoland      qw    05/21/2014 10:11:18                                  16
1549985 0.01211 H5_298_Bil zhiguang    qw    05/22/2014 09:32:34                                  32
1549987 0.00917 F_1VII    khlee        qw    05/22/2014 10:40:49                                  24
1549988 0.00917 C_1VII    khlee        qw    05/22/2014 10:40:59                                  24
1550046 0.00239 ni_bulk    songliu      qw    05/22/2014 15:47:57                                    8
1489958 0.00000 align      liu3zhen    hqw  05/12/2014 17:55:43                                  128
 
The column I want you to pay attention to first is the 'state'. The two most common states you'll see are 'r' ('Running') and 'qw' ('Queued and Waiting'). ''Generally speaking'', jobs that are higher on the list will start running before the ones lower on the list, so you can see your relative position. It also helps to see how busy Beocat is; [http://ganglia.beocat.ksu.edu/ http://ganglia.beocat.ksu.edu/] will give you those statistics. A job you submit may start immediately or may take up to several weeks, depending on the priority of your job, the resources you ask for, the resources available, and the requested resources of the jobs ahead of you in the queue.
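
To find a relative position among the waiting jobs, the 'qw' rows can be counted down to a given job ID. The sketch below assumes a hypothetical helper named <code>waiting_position</code> and assumes the state is the fifth whitespace-separated column, as in the listing above. It runs on a few sample rows rather than on live <code>qstat</code> output.

```shell
# Hypothetical helper: read qstat-style rows on standard input and print the
# 1-based position of the given job ID among rows whose state is 'qw'.
waiting_position() {
    awk -v id="$1" '$5 == "qw" { n++; if ($1 == id) { print n; exit } }'
}

# A few rows taken from the listing above (one running, three waiting).
sample='1550045 0.00093 cufflinks_ wrutter r 05/22/2014 15:33:56 long.q@paladin12.beocat 4
1541856 0.35737 l149h mmohan qw 05/20/2014 11:53:32 8 1
1541857 0.35466 l149h mmohan qw 05/20/2014 11:53:34 8 1
1541888 0.31023 p.1zy3.43 beugels2 qw 05/20/2014 12:13:38 8'

printf '%s\n' "$sample" | waiting_position 1541888
```

On a live system the output of <code>qstat</code> could be piped into the function instead of the sample text. As noted above, the ordering is only a general guide to which jobs start first.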
 
If you want to read more, continue on to our [[AdvancedSGE]] page.

Latest revision as of 22:10, 19 September 2018
