= SlurmBasics =

                        "*": "== The CentOS/Slurm nodes ==\n\nWe have converted Beocat from Gentoo Linux to CentOS Linux on December 26th of 2017.  Any applications or libraries from the old system must be recompiled.  We also converted Beocat to use the Slurm scheduler instead of SGE.  You will therefore also need to convert all your old qsub scripts over to sbatch scripts.  We have developed tools to make this process as easy as possible.  \n\n<H3>Using Modules</H3>\n\nIf you're using a common code that others may also be using, we may already have it compiled in a module.  You can list the modules available and load an application as in the example below for Vasp.\n\neos>  <B>module avail</B><BR>\neos>  <B>module load VASP</B><BR>\neos>  <B>module list</B>\n\nWhen a module gets loaded, all the necessary libraries are also loaded and the paths to the libraries and executables are automatically set up.  Loading Vasp for example also loads the OpenMPI library needed to run it and adds the path to the MPI commands and Vasp executables.   To see how the path is set up, try executing <B><I>which vasp_std</I></B>.  The module system allows you to easily switch between different version of applications, libraries, or languages as well.\n\nIf you are using a custom code or one that is not installed in a module, you'll need to recompile it yourself.  This process is easier under CentOS as some of the work just involves loading the necessary set of modules.  The first step is to decide whether to use the Intel compiler toolchain or the GNU toolchain, each of which includes the compilers and other math libraries.  The module commands for each are below, and you can load these automatically when you log in by adding one of these module load statements to your .bashrc file.  See <B>/homes/daveturner/.bashrc</B> as an example, where I put the module load statements .\n\nTo load the Intel compiler tool chain including the Intel Math Kernel Library (and OpenMPI):<BR>\neos>  <B>module load iomkl</B><BR>\n\nTo load the GNU compiler tool chain including OpenMPI, OpenBLAS, FFTW, and ScalaPack load foss (free open source software):<BR>\neos>  <B>module load foss</B><BR>\n\nModules provide an easy way to set up the compilers and libraries you may need to compile your code.  Beyond that there are many different ways to compile codes so you'll just need to follow the directions.  
If you need help, you can always email us at <B>beocat@cs.ksu.edu</B>.

=== Converting your qsub script for sbatch using <I>kstat.convert</I> ===

If you already have a qsub script, I have created a new perl program called kstat.convert that will automatically convert your qsub script over to an sbatch script.

<B>kstat.convert --sge qsub_script.sh --slurm slurm_script.sh</B>

Below is an example of a simple qsub script and the resulting sbatch script after conversion.

<syntaxhighlight lang="bash">
#!/bin/bash
#$ -j y
#$ -cwd
#$ -N netpipe
#$ -P KSU-CIS-HPC

#$ -l mem=4G
#$ -l h_rt=100:00:00
#$ -pe single 32

#$ -M youreID@ksu.edu
#$ -m ab

mpirun -np $NSLOTS NPmpi -o np.out
</syntaxhighlight>

<syntaxhighlight lang="bash">
#!/bin/bash -l
#SBATCH --job-name=netpipe

#SBATCH --mem-per-cpu=4G   # Memory per core, use --mem= for memory per node
#SBATCH --time=4-04:00:00   # Use the form DD-HH:MM:SS
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32

#SBATCH --mail-user=youreID@ksu.edu
#SBATCH --mail-type=ALL   # same as =BEGIN,FAIL,END

mpirun -np $SLURM_NPROCS NPmpi -o np.out
</syntaxhighlight>

The sbatch file uses <B>#SBATCH</B> to identify command options for the scheduler, where the qsub file uses <B>#$</B>. Most options are similar but simply use a different syntax. The memory can still be defined on a per-core basis as with SGE, or you can use <B>--mem=128G</B> to specify the total memory per node if you'd prefer. The <B>--nodes=</B> and <B>--ntasks-per-node=</B> options provide an easy way to request the core configuration you want. If your code can be distributed across multiple nodes and you don't care what the arrangement is, you can instead just specify the total number of cores using <B>--ntasks=</B>. For more in-depth documentation on converting from SGE to Slurm, follow the links below:

https://srcc.stanford.edu/sge-slurm-conversion<BR>
https://slurm.schedmd.com/sbatch.html

=== Submitting jobs to Slurm ===

Once your qsub script has been converted to an sbatch script and you have an application compiled for CentOS, you can submit the job using the <B>sbatch</B> command.

eos> <B>sbatch sbatch_script.sh</B><BR>
eos> <B>kstat --me</B>

This will submit the script and show you a list of your jobs that are running and the jobs you have in the queue. By default the output for each job will go into a <B>slurm-###.out</B> file, where ### is the job ID number. If you need to kill a job, you can use the <B>scancel</B> command with the job ID number.
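For example, to kill a job whose ID is 1483446 (as reported by sbatch or shown by kstat):

 eos> <B>scancel 1483446</B>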
== Submitting your first job ==
To submit a job to run under Slurm, we use the <B><I>sbatch</I></B> (submit batch) command. The scheduler finds the optimum place for your job to run. With over 300 nodes and 7500 cores to schedule, as well as differing priorities, hardware, and individual resources, the scheduler's job is not trivial, and it can take some time for a job to start even when there are empty nodes available.

There are a few things you'll need to know before running sbatch.
* How many cores you need. Note that unless your program is created to use multiple cores (called "threading"), asking for more cores will not speed up your job. This is a common misconception. '''Beocat will not magically make your program use multiple cores!''' For this reason the default is 1 core.
* How much time you need. Many users when beginning to use Beocat neglect to specify a time requirement. The default is one hour, and we then get asked why their job died after one hour. We usually point them to the [[FAQ]].
* How much memory you need. The default is 1 GB. If your job uses significantly more than you ask for, it will be killed off.
* Any advanced options. See the [[AdvancedSlurm]] page for these requests. For our basic examples here, we will ignore these.

So let's now create a small script to test our ability to submit jobs. Create the following file (either by copying it to Beocat or by editing a text file) and name it <code>myhost.sh</code>. Both of these methods are documented on our [[LinuxBasics]] page.
<syntaxhighlight lang="bash" line>
#!/bin/sh
srun hostname
</syntaxhighlight>

Be sure to make it executable
 chmod u+x myhost.sh

So, now let's submit it as a job and see what happens. Here I'm going to use five options; the last bullet below shows an alternative combination for larger jobs.
* <code>--mem-per-cpu=</code> tells how much memory I need. In my example, I'm using our system minimum of 512 MB, which is more than enough. Note that your memory request is '''per core''', which doesn't make much difference for this example, but will as you submit more complex jobs.
* <code>--time=</code> tells how much runtime I need. This can be in the form of "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", or "days-hours:minutes:seconds". This is a very short job, so 1 minute should be plenty. This can't be changed after the job has started, so please make sure you have requested a sufficient amount of time.
* <code>--cpus-per-task=1</code> tells Slurm that I need only a single core per task. The [[AdvancedSlurm]] page has much more on the "cpus-per-task" switch.
* <code>--ntasks=1</code> tells Slurm that I only need to run 1 task. The [[AdvancedSlurm]] page has much more on the "ntasks" switch.
* <code>--nodes=1</code> tells Slurm that this must be run on one machine. The [[AdvancedSlurm]] page has much more on the "nodes" switch.
* As an alternative for larger jobs, <code>--nodes=4 --ntasks-per-node=16 --constraint=elves</code> would request 4 nodes with 16 cores on each and restrict the job to the Elves.

 % '''ls'''
 myhost.sh
 % '''sbatch --time=1 --mem-per-cpu=512M --cpus-per-task=1 --ntasks=1 --nodes=1 ./myhost.sh'''
 salloc: Granted job allocation 1483446

Since this is such a small job, it is likely to be scheduled almost immediately, so a minute or so later, I now see
 % '''ls'''
 myhost.sh
 slurm-1483446.out

 % '''cat slurm-1483446.out'''
 mage03
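Rather than typing the options on the command line every time, the same requests can be embedded in the script itself as <B>#SBATCH</B> lines, just like the converted scripts shown earlier. Here is a minimal sketch of <code>myhost.sh</code> rewritten that way, using the same requests as the example above:

<syntaxhighlight lang="bash">
#!/bin/sh
#SBATCH --time=1              # 1 minute of runtime
#SBATCH --mem-per-cpu=512M    # memory per core
#SBATCH --cpus-per-task=1
#SBATCH --ntasks=1
#SBATCH --nodes=1

srun hostname
</syntaxhighlight>

With the options in the script, the job can be submitted with just <B>sbatch ./myhost.sh</B>.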
== Monitoring Your Job ==

The <B>kstat</B> perl script has been developed at K-State to provide you with all the available information about your jobs on Beocat. <B>kstat --help</B> will give you a full description of how to use it.
The Slurm version of kstat is very similar to the SGE version, with two exceptions: the actual memory usage of each job is not always available, in which case the memory requested is reported, and the memory usage on each node is not always accurate since Slurm includes disk cache.
We are continuing to look for better ways to get the memory usage for each job, but at the moment you may need to use [http://ganglia.beocat.ksu.edu/ Ganglia] and look at the memory graph for the node you are running on to get an accurate idea of the memory being used by your application.

Eos>  kstat --help

 USAGE: kstat [-q] [-c] [-g] [-l] [-u user] [-p NaMD] [-j 1234567] [--part partition]
       kstat alone dumps all info except for the core summaries
       choose -q -c for only specific info on queued or core summaries.
       then specify any searchables for the user, program name, or job id
 
 kstat                 info on running and queued jobs
 kstat -q              info on the queued jobs only
 kstat -c              core usage for each user
 kstat -g              gpu nodes only
 kstat -l -h           long list - prints full node list
 kstat -u daveturner   job info for one user only
 kstat --me            job info for my jobs only
 kstat -j 1234567      info on a given job id
 kstat --nocolor       do not use any color
 
 --------------------------------------------------------------------------
   Multi-node jobs are highlighted in Magenta
      The switch and nodes/switch are on the right
      highlighted in Yellow when nodes are spread across multiple switches
   Shared jobs are highlighted in Cyan
   Memory requested is reported along with the total used when available
      Total RSS / Total VMSize / Total requested
   Runtime is colorized with yellow then red for jobs nearing their time limit
   Time in the queue is colorized yellow then red for jobs waiting long times
 --------------------------------------------------------------------------

kstat can be used to give you a summary of your jobs that are running and in the queue:<BR>
<B>Eos>  kstat --me</B><BR>

<b>
<font color=Brown>Hero43 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=Blue>24 of 24 cores &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>Load 23.4 / 24 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=Red>495.3 / 512 GB used</font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=lightgreen>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>unafold &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234567 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=cyan>1 core &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=green>running &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 4gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 0 d  5 h 35 m </font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>octopus &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234568 &nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=cyan>16 core &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=green>running &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=red> 128gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 8 d 15 h 42 m </font><br>
<font color=green> ##################################   BeoCat Queue    ################################### </font><br>
&nbsp;&nbsp;&nbsp;&nbsp;
<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black>NetPIPE &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234569 &nbsp;&nbsp;&nbsp;&nbsp; </font>
<font color=cyan>2 core &nbsp;&nbsp;&nbsp;</font>
<font color=red> PD &nbsp;</font>
<font color=black> 2h &nbsp;</font>
<font color=black> 4gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>
<font color=black> 0 d 1 h 2 m </font><br>
</b>

<b>kstat</b> produces a separate line for each host. Use <b>kstat -h</b> to see information on all hosts without the jobs.
For the example above we are listing our jobs and the hosts they are on.

Core usage - yellow for empty, red for empty on owned nodes, cyan for partially used, blue for all cores used.<BR>
Load level - yellow or a yellow background indicates the node is being used inefficiently. Red just means more threads than cores.<br>
Memory usage - yellow or red means most memory is used.<BR>
If the node is owned, the group name will be in orange on the right. Killable jobs can still be run on those nodes.<BR>

Each job line will contain the username, program name, job ID, number of cores, the status (which may be colored red for killable jobs),
the maximum memory used or memory requested, and the amount of time the job has run.
Jobs in the queue may contain information on the requested memory and run time, priority access, constraints, and
how long the job has been in the queue.
In this case, I have 2 jobs running on Hero43. <i>unafold</i> is using 1 core while <i>octopus</i> is using 16 cores. Slurm did not provide
any information on the actual memory use, so the memory request is reported.

<B>Detailed information about a single job</B>

kstat can provide a great deal of information on a particular job, including a very rough estimate of when it will run. This estimate is a worst-case scenario, as it will
be adjusted as other jobs finish early. This is a good way to check for job submission problems before contacting us.
kstat colorizes the more important information to make it easier to identify.

Eos>  kstat -j 157054
 
 ##################################   Beocat Queue    ###################################
  daveturner  netpipe     157054   64 cores  PD       dwarves fabric  CS HPC     8gb req   0 d  0 h  0 m
 
 JobId 157054  Job Name  netpipe
   UserId=daveturner GroupId=daveturner_users(2117) MCS_label=N/A
   Priority=11112 Nice=0 Account=ksu-cis-hpc QOS=normal
   Status=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=00:40:00 TimeMin=N/A
   SubmitTime=2018-02-02T18:18:31 EligibleTime=2018-02-02T18:18:31
   Estimated Start Time is 2018-02-03T06:17:49 EndTime=2018-02-03T06:57:49 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partitions killable.q,ksu-cis-hpc.q AllocNode:Sid=eos:1761
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null) SchedNodeList=dwarf[01-02]
   NumNodes=2-2 NumCPUs=64 NumTasks=64 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES 2 nodes 64 cores 8192  mem gres/fabric 2
   Socks/Node=* NtasksPerN:B:S:C=32:0:*:* CoreSpec=*
   MinCPUsNode=32 MinMemoryNode=4G MinTmpDiskNode=0
   Constraint=dwarves DelayBoot=00:00:00
   Gres=fabric Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Slurm script  /homes/daveturner/perf/NetPIPE-5.x/sb.np
   WorkDir=/homes/daveturner/perf/NetPIPE-5.x
   StdErr=/homes/daveturner/perf/NetPIPE-5.x/0.o157054
   StdIn=/dev/null
   StdOut=/homes/daveturner/perf/NetPIPE-5.x/0.o157054
   Switches=1@00:05:00
 
 #!/bin/bash -l
 #SBATCH --job-name=netpipe
 #SBATCH -o 0.o%j
 #SBATCH --time=0:40:00
 #SBATCH --mem=4G
 #SBATCH --switches=1
 #SBATCH --nodes=2
 #SBATCH --constraint=dwarves
 #SBATCH --ntasks-per-node=32
 #SBATCH --gres=fabric:roce:1
 
 host=`echo $SLURM_JOB_NODELIST | sed s/[^a-z0-9]/\ /g | cut -f 1 -d ' '`
 nprocs=$SLURM_NTASKS
 openmpi_hostfile.pl $SLURM_JOB_NODELIST 1 hf.$host
 opts="--printhostnames --quick --pert 3"
 
 echo "*******************************************************************"
 echo "Running on $SLURM_NNODES nodes $nprocs cores on nodes $SLURM_JOB_NODELIST"
 echo "*******************************************************************"
 
 mpirun -np 2 --hostfile hf.$host NPmpi $opts -o np.${host}.mpi
 mpirun -np 2 --hostfile hf.$host NPmpi $opts -o np.${host}.mpi.bi --async --bidir
 mpirun -np $nprocs NPmpi $opts -o np.${host}.mpi$nprocs --async --bidir


<B>Completed jobs and memory usage</B>

 kstat -d #

This will provide information on the jobs you have currently running and those that have completed
in the last '#' days.
This is currently the only reliable way to get the memory used per node for your job.
This also provides information on whether the job completed normally, was canceled with <I>scancel</I>,
timed out, or was killed because it exceeded its memory request.

Eos>  kstat -d 10

 ###########################  sacct -u daveturner  for 10 days  ###########################
                                      max gb used on a node /   gb requested per node
  193037   ADF         dwarf43           1 n  32 c   30.46gb/100gb    05:15:34  COMPLETED
  193289   ADF         dwarf33           1 n  32 c   26.42gb/100gb    00:50:43  CANCELLED
  195171   ADF         dwarf44           1 n  32 c   56.81gb/120gb    14:43:35  COMPLETED
  209518   matlab      dwarf36           1 n   1 c    0.00gb/  4gb    00:00:02  FAILED

<B>Summary of core usage</B>

kstat can also provide a listing of the core usage and cores requested for each user.<BR>

Eos>  kstat -c
 
 ##############################   Core usage    ###############################
   antariksh       1512 cores   %25.1 used     41528 cores queued
   bahadori         432 cores   % 7.2 used        80 cores queued
   eegoetz            0 cores   % 0.0 used         2 cores queued
   fahrialkan        24 cores   % 0.4 used        32 cores queued
   gowri             66 cores   % 1.1 used        32 cores queued
   jeffcomer        160 cores   % 2.7 used         0 cores queued
   ldcoates12        80 cores   % 1.3 used       112 cores queued
   lukesteg         464 cores   % 7.7 used         0 cores queued
   mike5454        1060 cores   %17.6 used       852 cores queued
   nilusha          344 cores   % 5.7 used         0 cores queued
   nnshan2014       136 cores   % 2.3 used         0 cores queued
   ploetz           264 cores   % 4.4 used        60 cores queued
   sadish           812 cores   %13.5 used         0 cores queued
   sandung           72 cores   % 1.2 used        56 cores queued
   zhiguang          80 cores   % 1.3 used       688 cores queued


If you want to read more, continue on to our [[AdvancedSlurm]] page.

= Tips and Tricks =

                        "*": "Beocat has a number of tools to make your work easier, some which you may not know about. This is a simple list of these programs and some basic usage scenarios.\n\n== Submitting your job to run the fastest ==\n=== Size your jobs to use the fastest nodes ===\n==== Specify the proper number of cores ====\nBeocat (nor any other computer or cluster) can make your job run on more than one core at a time if your program isn't designed to take advantage of this. Many people think \"I can run this on 40 cores and it will run 40 times faster\". This isn't true.\n\nWhile we have many programs that are designed to take advantage of multiple cores, do not assume this is the case\n\n==== Optimize your jobs for speed, not for number of cores ====\nIt seems that many people pick an arbitrary large number of cores for their jobs. 20 seems to be a common one. However, some of our fastest nodes have 16 cores. It's quite likely if your job will fit on an Elf (16 cores, 8 GB/RAM/core (64 GB RAM total)), it will run faster with 16 cores than by specifying more cores and having it run on slower nodes.\n\n==== Don't request resources you don't need ====\nThe most common culprit here is people specifying they need infiniband when the job is run on a single node. This limits the scheduling such that a perfectly good node for your job may be idle while your job is still waiting.\n\n== Programs that make using Beocat easier ==\n=== [[wikipedia:nmon|nmon]] ===\nThe name is short for \"Nigel's Monitor\", it's a program written by Nigel Griffiths from IBM.\n=== [http://www.ibm.com/developerworks/aix/library/au-nmon_analyser/ nmon analyser] ===\nA tool for producing graphs and spreadsheets from output generated by nmon.\n=== [http://hisham.hm/htop/ htop] ===\nA prettier, easier to use top. Shows CPU and memory usage in an easy-to-digest format.\n=== [http://www.gnu.org/software/screen/ screen] ===\nA sort of terminal multiplexer, allows you to run many terminal programs at once without mixing them up. Also allows you to disconnect and reconnect sessions. There is a good explanation of how to use screen at [http://www.mattcutts.com/blog/a-quick-tutorial-on-screen/ http://www.mattcutts.com/blog/a-quick-tutorial-on-screen/].\n=== Ganglia ===\nThe web-based load monitoring tool for the cluster. [http://ganglia.beocat.cis.ksu.edu http://ganglia.beocat.ksu.edu] . From there, you can see how busy Beocat is.\n=== [http://dag.wieers.com/home-made/dstat/ dstat] ===\nA very detailed performance analyzer.\n\n== Increasing file write performance ==\nCredit for this goes to [http://moo.nac.uci.edu/~hjm/bduc/BDUC_USER_HOWTO.html#writeperfongl http://moo.nac.uci.edu/~hjm/bduc/BDUC_USER_HOWTO.html#writeperfongl]\n\n=== Use gzip ===\nIf you have written your own code or are using an app that writes zillions of tiny chunks of data to STDOUT, and you are storing the results on Beocat, you should consider passing the output thru gzip to consolidate the writes into a continuous stream. 
If you don't do this, each write will be considered a separate IO event and the write performance will suffer.

If, however, STDOUT is passed through gzip, the wallclock runtime decreases, even below the usual runtime, and you end up with an output file that is already compressed to about 1/5 the usual size.

Here's how to do it:

 someapp --opt1 --opt2 --input=/path/to/input_file | gzip > /path/to/output_file
=== Use named pipes ===
Named pipes are special files that don't actually write to the filesystem, and can be used to communicate between processes. Since these pipes are in memory rather than going directly to disk, they can be used to buffer writes:

<syntaxhighlight lang="bash">
# Create the named pipe
mkfifo /path/to/MyNamedPipe

# Write some data to it
MyProgram --infile=/path/to/InputData1 --outfile=/path/to/MyNamedPipe &
MyOtherProgram < /path/to/InputData2 > /path/to/MyNamedPipe

# Extract the output
cat < /path/to/MyNamedPipe > $HOME/MyOutput
## OR, we could compress the output
gzip < /path/to/MyNamedPipe > $HOME/MyOutput.gz

# Delete the named pipe like you would a file
rm /path/to/MyNamedPipe
</syntaxhighlight>
One cautionary word: unlike normal files, named pipes cannot be used between machines, but they can be used among processes running on the same machine. So, if you're running an MPI job that will run completely on one node, you could set up a named pipe and do all your writes to that pipe, then flush it at the end; but if you're running a multi-node MPI job and your named pipe is on a shared filesystem (like $HOME), each process will need to flush its named pipe to a regular file before the job quits.
=== Use one big file instead of many small ones ===
This may seem to be a non-issue, but it's a performance problem we've seen on Beocat many times. I love the term coined by UCI at the link above: they call making many small files "Zillions Of Tiny files (ZOTfiles)". Using files like this is an inefficient use of our shared resources. A tiny file by itself is no more inefficient than a huge one. If you have only 100 bytes to store, store it in a single file. However, the problems start compounding when there are many of them. Because of the way data is stored on disk, 10 MB stored in ZOTfiles of 100 bytes each can easily take up NOT 10 MB, but more than 400 MB, 40 times more space. Worse, data stored in this manner makes many operations very slow: instead of looking up 1 directory entry, the OS has to look up 100,000. This means 100,000 times more disk head movement, with a concomitant decrease in performance and disk lifetime. We have had Beocat users with several million files of less than 1 kB each. Just creating a directory listing with ls would take nearly half an hour. Not only is that inefficient for you, but it also degrades the performance of everybody using that filesystem and degrades our backups as well.

Please use large files instead of ZOTfiles any chance you can!

As a defense against too much abuse of tiny files, there is a limit of 100,000 entries in any directory in our shared filesystem space.

== Programming for Performance ==
=== BLAS ===
BLAS (Basic Linear Algebra Subprograms) is a standard set of linear algebra subroutines.
The standard was set so that software could be written against a standardized library interface, and optimized libraries could be "plug-and-play." There are lots of implementations of the BLAS libraries, with the most common ones being [http://software.intel.com/en-us/intel-mkl/ Intel's MKL] and [http://developer.amd.com/tools/cpu/acml/pages/default.aspx AMD's ACML].

==== Beocat BLAS Libraries ====
Since BLAS is a modular standard, we have installed a few (free) BLAS libraries.

* The BLAS reference library: an unoptimized reference library
* [http://developer.amd.com/tools/cpu/acml/pages/default.aspx AMD's ACML]: an optimized BLAS library for AMD systems
* [http://www.openblas.net OpenBLAS]: an optimized BLAS library for some AMD, and most Intel, systems

The default BLAS library is OpenBLAS.

==== Using a different BLAS library ====
If you want or need to use a different BLAS library, list the available libraries with 'ls -1 /etc/env.d/alternatives/blas' (ignore _current and _current_list):

 $ ls -1 /etc/env.d/alternatives/blas
 _current
 _current_list
 acml-gfortran64
 acml-gfortran64-openmp
 acml-ifort64
 acml-ifort64-openmp
 mkl32-dynamic
 mkl32-dynamic-openmp
 mkl32-gfortran
 mkl32-gfortran-openmp
 mkl32-intel
 mkl32-intel-openmp
 mkl64-dynamic
 mkl64-dynamic-openmp
 mkl64-gfortran
 mkl64-gfortran-openmp
 mkl64-int64-dynamic
 mkl64-int64-dynamic-openmp
 mkl64-int64-gfortran
 mkl64-int64-gfortran-openmp
 mkl64-int64-intel
 mkl64-int64-intel-openmp
 mkl64-intel
 mkl64-intel-openmp
 openblas-openmp
 reference
To change your default BLAS version, you need to determine which shell you are using:

===== CSH or TCSH =====
If your tool simply uses pkg-config to find the right BLAS, you can just run the following:
<syntaxhighlight lang="bash">
setenv PKG_CONFIG_PATH /etc/env.d/alternatives/blas/openblas-openmp/usr/lib64/pkgconfig
</syntaxhighlight>
where openblas-openmp is replaced with the name of your preferred BLAS. You can put that line in your job script, or in your ~/.cshrc file.

If it needs actual library names and options for the compiler, then after you have run the above you can run these to get the right arguments/library names for your compiler:
<syntaxhighlight lang="bash">
pkg-config --cflags blas
pkg-config --libs blas
</syntaxhighlight>
===== SH, BASH, or ZSH =====
If your tool simply uses pkg-config to find the right BLAS, you can just run the following:
<syntaxhighlight lang="bash">
export PKG_CONFIG_PATH=/etc/env.d/alternatives/blas/openblas-openmp/usr/lib64/pkgconfig
</syntaxhighlight>
where openblas-openmp is replaced with the name of your preferred BLAS. You can put that line in your job script, or in your ~/.bashrc or ~/.zshrc file.

If it needs actual library names and options for the compiler, then after you have run the above you can run these to get the right arguments/library names for your compiler:
<syntaxhighlight lang="bash">
pkg-config --cflags blas
pkg-config --libs blas
</syntaxhighlight>
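As a quick sketch of how that pkg-config output is typically used (the source file <code>myblasprog.c</code> is just a placeholder for your own code), you can hand the flags straight to the compiler:

<syntaxhighlight lang="bash">
# Compile and link against whichever BLAS the PKG_CONFIG_PATH above points to
gcc -O2 myblasprog.c $(pkg-config --cflags --libs blas) -o myblasprog
</syntaxhighlight>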
=== LAPACK ===
LAPACK (Linear Algebra PACKage) is a standard set of linear algebra subroutines. Like BLAS, these are very optimized, but LAPACK handles a different set of functions. The standard was set so that software could be written against a standardized library interface, and optimized libraries could be "plug-and-play." There are lots of implementations of the LAPACK libraries, with the most common ones being [http://software.intel.com/en-us/intel-mkl/ Intel's MKL] and [http://developer.amd.com/tools/cpu/acml/pages/default.aspx AMD's ACML].

==== Beocat LAPACK Libraries ====
Since LAPACK is a modular standard, we have installed a few (free) LAPACK libraries.
* [http://www.netlib.org/lapack/ The LAPACK reference library]: an unoptimized reference library
* [http://developer.amd.com/tools/cpu/acml/pages/default.aspx AMD's ACML]: an optimized LAPACK library for AMD systems

The default LAPACK library is ACML.

==== Using a different LAPACK library ====
If you want or need to use a different LAPACK library, list the available libraries with 'ls -1 /etc/env.d/alternatives/lapack' (ignore _current and _current_list):

 $ ls -1 /etc/env.d/alternatives/lapack
 _current
 _current_list
 acml-gfortran64
 acml-gfortran64-openmp
 acml-ifort64
 acml-ifort64-openmp
 mkl32-dynamic
 mkl32-dynamic-openmp
 mkl32-gfortran
 mkl32-gfortran-openmp
 mkl32-intel
 mkl32-intel-openmp
 mkl64-dynamic
 mkl64-dynamic-openmp
 mkl64-gfortran
 mkl64-gfortran-openmp
 mkl64-int64-dynamic
 mkl64-int64-dynamic-openmp
 mkl64-int64-gfortran
 mkl64-int64-gfortran-openmp
 mkl64-int64-intel
 mkl64-int64-intel-openmp
 mkl64-intel
 mkl64-intel-openmp
 reference
To change your default LAPACK version, you need to determine which shell you are using:

===== CSH or TCSH =====
If your tool simply uses pkg-config to find the right LAPACK, you can just run the following:
<syntaxhighlight lang="bash">
setenv PKG_CONFIG_PATH /etc/env.d/alternatives/lapack/acml-ifort64/usr/lib64/pkgconfig
</syntaxhighlight>
where acml-ifort64 is replaced with the name of your preferred LAPACK. You can put that line in your job script, or in your ~/.cshrc file.

If it needs actual library names and options for the compiler, then after you have run the above you can run these to get the right arguments/library names for your compiler:
<syntaxhighlight lang="bash">
pkg-config --cflags lapack
pkg-config --libs lapack
</syntaxhighlight>
===== SH, BASH, or ZSH =====
If your tool simply uses pkg-config to find the right LAPACK, you can just run the following:
<syntaxhighlight lang="bash">
export PKG_CONFIG_PATH=/etc/env.d/alternatives/lapack/acml-ifort64/usr/lib64/pkgconfig
</syntaxhighlight>
where acml-ifort64 is replaced with the name of your preferred LAPACK. You can put that line in your job script, or in your ~/.bashrc or ~/.zshrc file.

If it needs actual library names and options for the compiler, then after you have run the above you can run these to get the right arguments/library names for your compiler:
<syntaxhighlight lang="bash">
pkg-config --cflags lapack
pkg-config --libs lapack
</syntaxhighlight>

=== [http://openmp.org/wp/ OpenMP] ===
OpenMP is a set of directives for C, C++, and Fortran which greatly simplifies parallelizing applications on a single node.
There is a good tutorial for OpenMP at [https://computing.llnl.gov/tutorials/openMP/ https://computing.llnl.gov/tutorials/openMP/].
To compile an OpenMP-enabled program, you need to tell GCC that OpenMP is available; this is done like so:
 gcc -fopenmp myOpenMPprogram.c
By default OpenMP will use all available cores for its computation, which is a problem for shared resources like Beocat.

To make use of only the cores assigned to you, you must first make sure you have requested the 'single' parallel environment, and in your job script you will need something like the following (before the application you are trying to run):

==== bash, sh, zsh ====
<syntaxhighlight lang="bash">
export OMP_NUM_THREADS=${NSLOTS}
</syntaxhighlight>

==== csh or tcsh ====
<syntaxhighlight lang="bash">
setenv OMP_NUM_THREADS ${NSLOTS}
</syntaxhighlight>
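Note that $NSLOTS and the 'single' parallel environment are SGE-era names (the conversion example on the [[SlurmBasics]] page swaps $NSLOTS for a Slurm variable). Under Slurm, the core count is typically exposed as $SLURM_CPUS_PER_TASK when it is requested with --cpus-per-task. Below is a minimal sketch of a Slurm job script along those lines, assuming the OpenMP executable has been compiled to <code>myOpenMPprogram</code> (for example with <code>gcc -fopenmp -o myOpenMPprogram myOpenMPprogram.c</code>); the resource numbers are placeholders:

<syntaxhighlight lang="bash">
#!/bin/bash -l
#SBATCH --nodes=1            # OpenMP runs within a single node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8    # number of cores (OpenMP threads) to request
#SBATCH --mem-per-cpu=1G
#SBATCH --time=1:00:00

# Limit OpenMP to the cores Slurm assigned to this job
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

./myOpenMPprogram
</syntaxhighlight>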