- 1 Drinking from the Firehose
- 2 Toolchains
- 3 Most Commonly Used Software
- 3.1 OpenMPI
- 3.2 R
- 3.3 Java
- 3.4 Python
- 3.5 Spark
- 3.6 Perl
- 3.7 Octave for MatLab codes
- 3.8 MatLab compiler
- 3.9 COMSOL
- 3.10 .NET Core
- 4 Installing my own software
Drinking from the Firehose
For a complete list of all installed modules, see ModuleList
Toolchains
A toolchain is a set of compilers, libraries and applications that are needed to build software. Some software functions better when using specific toolchains.
We provide a good number of toolchains, and several versions of each, to make sure your applications will compile and/or run correctly.
These toolchains include (you can run 'module keyword toolchain'):
- GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
- GNU Compiler Collection (GCC) based compiler toolchain based on FOSS with CUDA support.
- GNU Compiler Collection (GCC) based compiler toolchain, including MVAPICH2 for MPI support. DEPRECATED
- GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support.
- GCC based compiler toolchain with CUDA support, and including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK. DEPRECATED
- Intel Cluster Toolchain Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MKL & OpenMPI.
You can run 'module spider $toolchain/' to see the versions we have:
$ module spider iomkl/
If you load one of those (module load iomkl/2017b), you can see the other modules and versions of software that it loaded with 'module list':
$ module list
Currently Loaded Modules:
  1) icc/2017.4.196-GCC-6.4.0-2.28
  2) binutils/2.28-GCCcore-6.4.0
  3) ifort/2017.4.196-GCC-6.4.0-2.28
  4) iccifort/2017.4.196-GCC-6.4.0-2.28
  5) GCCcore/6.4.0
  6) numactl/2.0.11-GCCcore-6.4.0
  7) hwloc/1.11.7-GCCcore-6.4.0
  8) OpenMPI/2.1.1-iccifort-2017.4.196-GCC-6.4.0-2.28
  9) iompi/2017b
 10) imkl/2017.3.196-iompi-2017b
 11) iomkl/2017b
As you can see, toolchains can depend on each other. For instance, the iomkl toolchain depends on iompi, which depends on iccifort, which depends on icc and ifort, which depend on GCCcore, which depends on GCC. Hence it is very important that the correct versions of all related software are loaded.
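Inside a job script it can be useful to record exactly which modules were loaded, so that a run can be reproduced later. The following is a minimal sketch rather than an official Beocat tool: the function name list_loaded_modules is hypothetical, and it assumes the module system keeps the colon-separated LOADEDMODULES variable up to date in your environment.

```shell
#!/bin/bash
# Print each loaded module on its own line. LOADEDMODULES is a colon-separated
# list that the module system is assumed to maintain in your environment.
list_loaded_modules() {
    printf '%s\n' "${LOADEDMODULES:-}" | tr ':' '\n' | sed '/^$/d'
}

# Record the modules in the job output so a run can be reproduced later.
echo "Modules loaded for this run:"
list_loaded_modules
```

If no modules are loaded, the function prints nothing after the heading line.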
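With software we provide, the toolchain used to compile it is always specified in the "version" of the software that you want to load. For example, Python/3.6.3-iomkl-2017beocatb was built with the iomkl toolchain.
If you mix toolchains, libraries built with different compilers or MPI versions can end up loaded together, which may cause build failures or unpredictable behavior at run time.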
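The naming convention described above can be used to check for mixed toolchains before loading modules together. This is a minimal sketch, not an official Beocat tool: the function name toolchain_suffix is hypothetical, and it assumes that the software version itself contains no hyphen, so everything after the first hyphen names the toolchain.

```shell
#!/bin/bash
# Hypothetical helper: print the toolchain part of a module name such as
# "Python/3.6.3-iomkl-2017beocatb". Assumes the software version itself has
# no hyphen, so everything after the first hyphen names the toolchain.
toolchain_suffix() {
    local version="${1#*/}"                     # drop the "Name/" prefix
    case "$version" in
        *-*) printf '%s\n' "${version#*-}" ;;   # text after the first hyphen
        *)   printf '\n' ;;                     # no toolchain in the name
    esac
}

# Compare two modules before loading them together.
a=$(toolchain_suffix "Python/3.6.3-iomkl-2017beocatb")
b=$(toolchain_suffix "Octave/4.2.1-foss-2017beocatb-enable64")
if [ "$a" != "$b" ]; then
    echo "Warning: toolchains differ: '$a' vs '$b'"
fi
```

Here the two modules were built with iomkl and foss respectively, so the warning is printed.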
Most Commonly Used Software
OpenMPI
We provide many versions. You are most likely better off loading a toolchain or application directly to make sure you get the right version, but you can see the versions we have with 'module spider OpenMPI/'.
R
To see the R versions we currently provide, run 'module spider R/'.
We provide a small number of R packages installed by default. These are generally packages that are needed by more than one person.
Installing your own R Packages
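To install your own package, log in to Beocat and start R interactively:
module load R
R
Then install the package using R's install.packages() function, for example install.packages("PACKAGENAME"), where PACKAGENAME stands for the package you want.
Follow the prompts. Note that there is a CRAN mirror at KU - it will be listed as "USA (KS)".
After installing, you can test before leaving interactive mode by loading the package with library(), for example library("PACKAGENAME").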
Running R Jobs
You cannot submit an R script directly. 'sbatch myscript.R' will result in an error. Instead, you need to make a bash script that will call R appropriately. Here is a minimal example. We'll save this as submit-R.sbatch
#!/bin/bash
#SBATCH --mem-per-cpu=1G
# Now we tell Slurm how long we expect our work to take: 15 minutes (D-H:MM:SS)
#SBATCH --time=0-0:15:00
# Now let's do some actual work. This starts R and loads the file myscript.R
module load R
R --no-save -q < myscript.R
Now, to submit your R job, you would type 'sbatch submit-R.sbatch'.
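If you have several R scripts, writing a submit script for each one by hand gets tedious. The sketch below generates one from the minimal example above. The function name make_r_job and its default values are hypothetical, and sbatch is only called if that command is available on the machine.

```shell
#!/bin/bash
# Hypothetical helper: print a submit script for one R script.
# Arguments: R script, memory per CPU (default 1G), time limit (default 15 minutes).
make_r_job() {
    local script="$1" mem="${2:-1G}" time_limit="${3:-0-0:15:00}"
    cat <<EOF
#!/bin/bash
#SBATCH --mem-per-cpu=${mem}
#SBATCH --time=${time_limit}
module load R
R --no-save -q < ${script}
EOF
}

# Write the submit script, then submit it only where Slurm is installed.
make_r_job myscript.R 2G 0-1:00:00 > submit-R.sbatch
if command -v sbatch >/dev/null 2>&1; then
    sbatch submit-R.sbatch
else
    echo "sbatch not found; submit-R.sbatch was written but not submitted"
fi
```

Keeping the generator separate from the submission step lets you inspect the generated file before anything is queued.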
Java
To see the Java versions we currently provide, run 'module spider Java/'.
Python
To see the Python versions we currently provide, run 'module spider Python/'.
If you need modules that we do not have installed, you should use virtualenv to set up a virtual Python environment in your home directory. This will let you install Python modules as you please.
Setting up your virtual environment
- Load Python
module load Python/3.6.3-iomkl-2017beocatb
(After running this command Python is loaded. After you log off and then log on again, Python will not be loaded, so you must rerun this command every time you log on.)
- Create a location for your virtual environments (optional, but helps keep things organized)
mkdir ~/virtualenvs
cd ~/virtualenvs
- Create a virtual environment. Here I will create a default virtual environment called 'test'. Note that 'virtualenv --help' lists many more useful options.
virtualenv test
- Let's look at our virtual environments (the virtual environment name should be in the output):
ls ~/virtualenvs
- Activate one of these
source ~/virtualenvs/test/bin/activate
(After running this command your virtual environment is activated. After you log off and then log on again, your virtual environment will not be loaded, so you must rerun this command every time you log on.)
- You can now install the python modules you want. This can be done using pip.
pip install numpy biopython
Using your virtual environment within a job
Here is a simple job script using the virtual environment test
#!/bin/bash
module load Python/3.6.3-iomkl-2017beocatb
source ~/virtualenvs/test/bin/activate
export PYTHONDONTWRITEBYTECODE=1
python ~/path/to/your/python/script.py
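Because the module and the virtual environment must be set up again in every new login session and in every job, it can help to wrap the steps above in one function. This is a sketch under assumptions: the function name use_venv is hypothetical, and when the environment does not exist yet it expects the virtualenv command from the loaded Python module to be on your PATH.

```shell
#!/bin/bash
# Hypothetical helper: activate a virtual environment, creating it first if
# it does not exist. Load the Python module before calling this function.
use_venv() {
    local venv_dir="$1"
    if [ ! -f "$venv_dir/bin/activate" ]; then
        # First use: create the environment (requires virtualenv on PATH).
        mkdir -p "$(dirname "$venv_dir")"
        virtualenv "$venv_dir" || return 1
    fi
    # Activation changes the current shell, so the file must be sourced.
    . "$venv_dir/bin/activate"
}

# Example use on Beocat (commented out so the definition can be sourced anywhere):
#   module load Python/3.6.3-iomkl-2017beocatb
#   use_venv ~/virtualenvs/test
#   pip install numpy biopython
```

The same two lines (module load, then use_venv) can be placed at the top of a job script in place of the explicit source command.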
Spark
Spark is a cluster-computing framework for large-scale data processing. It can be used in conjunction with Python, R, Scala, Java, and SQL. Spark can be run on Beocat interactively or through the Slurm queue.
To run interactively, you must first request a node or nodes from the Slurm queue. The line below requests 1 node, 1 core and 10 GB of memory for 24 hours and, if available, will drop you into the bash shell on that node.
srun -J srun -N 1 -n 1 -t 24:00:00 --mem=10G --pty bash
We have some sample python based Spark code you can try out that came from the exercises and homework from the PSC Spark workshop.
mkdir spark-test
cd spark-test
cp -rp /homes/daveturner/projects/PSC-BigData-Workshop/Shakespeare/* .
The sample code requires 'nltk' and 'numpy' packages, so the first time you run it, you need to create the virtualenv and install these packages.
module load Python
mkdir ~/virtualenvs
cd ~/virtualenvs
virtualenv spark-test
source ~/virtualenvs/spark-test/bin/activate
pip install nltk
pip install numpy
On any subsequent runs, you can then just enter that virtualenv without running all of the above commands:
module load Python
source ~/virtualenvs/spark-test/bin/activate
Then load the Spark module (Python should already be loaded from above), change to the sample directory, fire up pyspark, and run the sample code.
module load Spark
cd ~/spark-test/Shakespeare
pyspark
>>> exec(open("shakespeare.py").read())
You can work interactively from the pyspark prompt (>>>) in addition to running scripts as above.
The Shakespeare directory also contains a sample sbatch submit script that will run the same shakespeare.py code through the Slurm batch queue.
#!/bin/bash -l
#SBATCH --job-name=shakespeare
#SBATCH --mem=10G
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
# Load Spark and Python (version 3 here)
module load Spark
module load Python
spark-submit shakespeare.py
When you run interactively, pyspark initializes your spark context sc. You will need to do this manually as in the sample python code when you want to submit jobs through the Slurm queue.
# If there is no Spark Context (not running interactive from pyspark), create it
try:
    sc
except NameError:
    from pyspark import SparkConf, SparkContext
    conf = SparkConf().setMaster("local").setAppName("App")
    sc = SparkContext(conf = conf)
Perl
The system-wide version of Perl tracks the stable releases of Perl. Unfortunately, there are some features that we do not include in the system distribution of Perl, namely threads.
If you need a newer version (or threads), just load one we provide in our modules (module spider Perl/).
Submitting a job with Perl
Much like R (above), you cannot simply 'sbatch myProgram.pl', but you must create a submit script which will call perl. Here is an example:
#!/bin/bash
#SBATCH --mem-per-cpu=1G
# Now we tell Slurm how long we expect our work to take: 15 minutes (D-H:MM:SS)
#SBATCH --time=0-0:15:00
# Now let's do some actual work.
module load Perl
perl /path/to/myProgram.pl
Octave for MatLab codes
module load Octave/4.2.1-foss-2017beocatb-enable64
The 64-bit version of Octave can be loaded using the command above. Octave can then be used to work with MatLab codes on the head node and to submit jobs to the compute nodes through the sbatch scheduler. Octave is made to run MatLab code, but it does have limitations and does not support everything that MatLab itself does.
#!/bin/bash -l
#SBATCH --job-name=octave
#SBATCH --output=octave.o%j
#SBATCH --time=1:00:00
#SBATCH --mem=4G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
module purge
module load Octave/4.2.1-foss-2017beocatb-enable64
octave < matlab_code.m
MatLab compiler
Beocat also has a single-user license for the MatLab compiler and the most common toolboxes including the Parallel Computing Toolbox, Optimization Toolbox, Statistics and Machine Learning Toolbox, Image Processing Toolbox, Curve Fitting Toolbox, Neural Network Toolbox, Symbolic Math Toolbox, Global Optimization Toolbox, and the Bioinformatics Toolbox.
Since we only have a single-user license, this means that you will be expected to develop your MatLab code with Octave or elsewhere on a laptop or departmental server. Once you're ready to do large runs, then you move your code to Beocat, compile the MatLab code into an executable, and you can submit as many jobs as you want to the scheduler. To use the MatLab compiler, you need to load the MATLAB module to compile code and load the mcr module to run the resulting MatLab executable.
module load MATLAB
mcc -m matlab_main_code.m -o matlab_executable_name
If you have addpath() commands in your code, you will need to wrap them in an "if ~deployed" block and tell the compiler to include that path via the -I flag.
% wrap addpath() calls like so:
if ~deployed
    addpath('./another/folder/with/code/')
end
NOTE: The license manager checks the mcc compiler out for a minimum of 30 minutes, so if another user compiles a code you unfortunately may need to wait for up to 30 minutes to compile your own code.
Compiling with additional paths:
module load MATLAB
mcc -m matlab_main_code.m -I ./another/folder/with/code/ -o matlab_executable_name
Any directories added with addpath() will need to be added to the list of compile options as -I arguments. You can have multiple -I arguments in your compile command.
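When many directories are involved, the list of -I arguments can be built in a loop rather than typed by hand. The sketch below is only an illustration: the directory and file names are placeholders taken from the examples on this page, directory names containing spaces are not handled, and the command is printed but only run if mcc is actually available.

```shell
#!/bin/bash
# Directories that the MatLab code adds with addpath() (placeholders).
# Names containing spaces are not supported by this simple list.
code_dirs="./another/folder/with/code/ ./matlabPyrTools/ ./FastICA/"

# Start the argument list with the main file, then add one -I per directory.
set -- -m matlab_main_code.m
for dir in $code_dirs; do
    set -- "$@" -I "$dir"
done
set -- "$@" -o matlab_executable_name

# Show the command; run it only if the MATLAB module has put mcc on PATH.
mcc_command="mcc $*"
echo "$mcc_command"
if command -v mcc >/dev/null 2>&1; then
    mcc "$@"
fi
```

Printing the command first makes it easy to copy into a compile script once it looks right.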
Here is an example job submission script. Modify time, memory, tasks-per-node, and job name as you see fit:
#!/bin/bash -l
#SBATCH --job-name=matlab
#SBATCH --output=matlab.o%j
#SBATCH --time=1:00:00
#SBATCH --mem=4G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
module purge
module load mcr
./matlab_executable_name
For those who make use of mex files - compiled C and C++ code with matlab bindings - you will need to add these files to the compiled archive via the -a flag. See the behavior of this flag in the compiler documentation. You can either target specific .mex files or entire directories.
Because codes often require adding several directories to the Matlab path as well as mex files from several locations, we recommend writing a script to preserve and help document the steps to compile your Matlab code. Here is an abbreviated example from a current user:
#!/bin/bash -l
module load MATLAB
cd matlabPyrTools/MEX/
# compile mex files
mex upConv.c convolve.c wrap.c edges.c
mex corrDn.c convolve.c wrap.c edges.c
mex histo.c
mex innerProd.c
cd ../..
mcc -m mongrel_creation.m \
    -I ./matlabPyrTools/MEX/ \
    -I ./matlabPyrTools/ \
    -I ./FastICA/ \
    -a ./matlabPyrTools/MEX/ \
    -a ./texturesynth/ \
    -o mongrel_creation_binary
Again, we only have a single-user license for MatLab so the model is to develop and debug your MatLab code elsewhere or using Octave on Beocat, then you can compile the MatLab code into an executable and run it without limits on Beocat.
For more info on the mcc compiler see: https://www.mathworks.com/help/compiler/mcc.html
COMSOL
Beocat has no license for COMSOL. If you want to use it, you must provide your own.
module spider COMSOL/
----------------------------------------------------------------------------
  COMSOL: COMSOL/5.3
----------------------------------------------------------------------------
    Description:
      COMSOL Multiphysics software, an interactive environment for modeling and simulating scientific and engineering problems
    This module can be loaded directly: module load COMSOL/5.3
    Help:
      Description
      ===========
      COMSOL Multiphysics software, an interactive environment for modeling and simulating scientific and engineering problems
      You must provide your own license.
      export LM_LICENSE_FILE=/the/path/to/your/license/file
      *OR*
      export LM_LICENSE_FILE=$LICENSE_SERVER_PORT@$LICENSE_SERVER_HOSTNAME
      e.g. export LM_LICENSE_FILEemail@example.com
      More information
      ================
       - Homepage: https://www.comsol.com/
Running COMSOL in graphical mode on a cluster is generally a bad idea. If you choose to run it in graphical mode on a compute node, you will need to do something like the following:
# Connect to the cluster with X11 forwarding (ssh -Y or mobaxterm)
# load the comsol module on the headnode
module load COMSOL
# export your comsol license as mentioned above, and tell the scheduler to run the software
srun --nodes=1 --time=1:00:00 --mem=1G --pty --x11 comsol -3drend sw
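The two forms of LM_LICENSE_FILE shown in the module help can be set with a small helper. This is a sketch only: the function name set_comsol_license is hypothetical, and the port and hostname in the example are made-up placeholders that you must replace with the details of your own license server.

```shell
#!/bin/bash
# Hypothetical helper: export LM_LICENSE_FILE either as a path to a license
# file (one argument) or as port@hostname (two arguments).
set_comsol_license() {
    case "$#" in
        1) LM_LICENSE_FILE="$1" ;;
        2) LM_LICENSE_FILE="$1@$2" ;;
        *) echo "usage: set_comsol_license FILE | PORT HOSTNAME" >&2
           return 1 ;;
    esac
    export LM_LICENSE_FILE
}

# Placeholder values, not a real license server.
set_comsol_license 12345 license.example.org
echo "LM_LICENSE_FILE=$LM_LICENSE_FILE"
```

Run this before the srun command so that the variable is already set in the environment the job inherits.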
.NET Core
mozes@[eunomia] ~ $ module load dotNET-Core-SDK
Create an application
Following instructions from here, we'll create a simple 'Hello World' application
mozes@[eunomia] ~ $ mkdir Hello
mozes@[eunomia] ~ $ cd Hello
mozes@[eunomia] ~/Hello $ export DOTNET_SKIP_FIRST_TIME_EXPERIENCE=true
mozes@[eunomia] ~/Hello $ dotnet new console
The template "Console Application" was created successfully.
Processing post-creation actions...
Running 'dotnet restore' on /homes/mozes/Hello/Hello.csproj...
Restoring packages for /homes/mozes/Hello/Hello.csproj...
Generating MSBuild file /homes/mozes/Hello/obj/Hello.csproj.nuget.g.props.
Generating MSBuild file /homes/mozes/Hello/obj/Hello.csproj.nuget.g.targets.
Restore completed in 358.43 ms for /homes/mozes/Hello/Hello.csproj.
Restore succeeded.
Edit your program
mozes@[eunomia] ~/Hello $ vi Program.cs
Run your .NET application
mozes@[eunomia] ~/Hello $ dotnet run
Hello World!
Build and run the built application
mozes@[eunomia] ~/Hello $ dotnet build
Microsoft (R) Build Engine version 15.8.169+g1ccb72aefa for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.
Restore completed in 106.12 ms for /homes/mozes/Hello/Hello.csproj.
Hello -> /homes/mozes/Hello/bin/Debug/netcoreapp2.1/Hello.dll
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:00:02.86
mozes@[eunomia] ~/Hello $ dotnet bin/Debug/netcoreapp2.1/Hello.dll
Hello World!
Installing my own software
Installing and maintaining software for the many different users of Beocat would be very difficult, if not impossible. For this reason, we don't generally install user-run software on our cluster. Instead, we ask that you install it into your home directories.
In many cases, the software vendor or support site will incorrectly assume that you are installing the software system-wide or that you need 'sudo' access.
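For software that uses the common configure and make build steps, installing without 'sudo' usually comes down to choosing a prefix inside your home directory. The sketch below is a generic illustration, not a Beocat-specific procedure: the prefix ~/software/mytool is hypothetical, and the build commands are left as comments because they depend on the package.

```shell
#!/bin/bash
# Hypothetical install location inside your home directory (no sudo needed).
prefix="$HOME/software/mytool"

# Typical build steps for a configure-based package, run in its source tree:
#   ./configure --prefix="$prefix"
#   make
#   make install

# Make the installed programs and libraries visible to your shell.
# Add these lines to ~/.bashrc to apply them at every login.
export PATH="$prefix/bin:$PATH"
export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "Programs in $prefix/bin are now found first on PATH"
```

Packages with other build systems generally offer an equivalent option for choosing the install location; check the package's own documentation.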
As a quick example of installing software in your home directory, we have a sample video on our Training Videos page. If you're still having problems or questions, please contact support as mentioned on our Main Page.