Latest revision as of 14:23, 14 May 2017

Drinking from the Firehose

For a complete list of all installed software, see NodePackageList

Most Commonly Used Software

OpenMPI

Version 2.0.1

Scilab

Version 6.0.0

R

Version 3.3.1

Modules

We provide a small number of R modules installed by default. These are generally modules that are needed by more than one person.

Installing your own modules

To install your own module, log in to Beocat and start R interactively:

R

Then install the package using

install.packages("PACKAGENAME")

Follow the prompts. Note that there is a CRAN mirror at KU - it will be listed as "USA (KS)".

After installing you can test before leaving interactive mode by issuing the command

library("PACKAGENAME")

Running R Jobs

You cannot submit an R script directly; 'qsub myscript.R' will result in an error. Instead, you need to make a bash script that calls R appropriately. Here is a minimal example, which we'll save as submit-R.qsub:

 #!/bin/bash
 # First we tell qsub how much memory we need: 1 gigabyte
 #$ -l mem=1G
 # Now we tell qsub how long we expect our work to take: 15 minutes (H:MM:SS)
 #$ -l h_rt=0:15:00
 
 # Now let's do some actual work. This starts R and runs the file myscript.R
 R --no-save -q < myscript.R

Now, to submit your R job, you would type

qsub submit-R.qsub
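If you submit many R scripts with different resource needs, writing the submit script by hand gets repetitive. The following is a minimal sketch of a helper that prints a submit script like the one above. The function name make_r_submit and its argument order are our own invention for illustration, not a Beocat tool; the defaults simply mirror the example values (1G, 0:15:00).

```shell
# Hypothetical helper (not part of Beocat): print a submit script for an R file.
# Usage: make_r_submit SCRIPT.R [MEM] [H:MM:SS]
make_r_submit() {
    local rscript="$1"
    local mem="${2:-1G}"           # default memory request, as in the example above
    local runtime="${3:-0:15:00}"  # default run time, as in the example above
    # \$ keeps a literal dollar sign so the output contains "#$ -l ..." directives
    cat <<EOF
#!/bin/bash
#\$ -l mem=${mem}
#\$ -l h_rt=${runtime}
R --no-save -q < ${rscript}
EOF
}
```

You could then run something like make_r_submit myscript.R 2G 1:00:00 > submit-R.qsub and submit the resulting file with qsub as shown above.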

Java

We support 4 versions of the Java VM on Beocat. IcedTea 7 and 8 (based on OpenJDK), Oracle JDK 1.7 (Java 7), and Oracle JDK 1.8 (Java 8).

We allow each user to select his or her Java version individually. If you do not select one, we default to IcedTea 8. This was changed from Oracle JDK 1.7 on May 29, 2015 due to an EOL notice from Oracle.

Selecting your Java version

First, let's list the available versions. This can be done with the command eselect java-vm list:

% eselect java-vm list
Available Java Virtual Machines:
  [1]   icedtea-bin-7
  [2]   icedtea-bin-8  system-vm
  [3]   oracle-jdk-bin-1.7
  [4]   oracle-jdk-bin-1.8

Note that icedtea-bin-8 (marked "system-vm") is the default for all users. If you have a custom version set, it will be marked with "user-vm". If you wanted to use icedtea-bin-7, you could run the following:

eselect java-vm set user 1

Running the list command again now shows the difference:

% eselect java-vm list
Available Java Virtual Machines:
  [1]   icedtea-bin-7 user-vm
  [2]   icedtea-bin-8  system-vm
  [3]   oracle-jdk-bin-1.7
  [4]   oracle-jdk-bin-1.8
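In a script it can be handy to find out which VM is currently in effect without reading the list by eye. Here is a small sketch that reads the listing format shown above on standard input and prints the entry marked "user-vm" if there is one, otherwise the one marked "system-vm". The function name active_java_vm is hypothetical, and it assumes the output keeps the layout shown here.

```shell
# Hypothetical helper: read `eselect java-vm list` output on stdin and
# print the VM that applies to you (user-vm wins over system-vm).
active_java_vm() {
    awk '
        /user-vm/   { user = $2 }   # $2 is the VM name after the [N] index
        /system-vm/ { sys = $2 }
        END { print (user != "" ? user : sys) }
    '
}
```

Typical use would be eselect java-vm list | active_java_vm.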

To verify you are seeing the correct Java version, you can run java -version:

% java -version
java version "1.7.0_121"
OpenJDK Runtime Environment (IcedTea 2.6.8) (Gentoo icedtea-7.2.6.8)
OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)
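If a job depends on a particular Java release, it may be worth checking the version at the top of the submit script. The sketch below pulls the quoted version string out of output like the above and reduces it to a major version number. Both function names are made up for this example. Note that java -version writes to standard error, so the output has to be redirected before it can be piped.

```shell
# Hypothetical helper: extract the quoted version from the first line of
# `java -version` output read on stdin, e.g. 1.7.0_121
java_version_string() {
    sed -n '1s/.*"\(.*\)".*/\1/p'
}

# Hypothetical helper: reduce a version string to its major version.
# The old "1.x" naming scheme means 1.7.0_121 is Java 7.
java_major() {
    local v
    case "$1" in
        1.*) v="${1#1.}"; echo "${v%%.*}" ;;
        *)   echo "${1%%.*}" ;;
    esac
}
```

For example, java_major "$(java -version 2>&1 | java_version_string)" could be compared against the version your program needs before the real work starts.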

Python

We have several versions of Python available:

  • CPython 2.7
  • CPython 3.4
  • PyPy 5.4.1 (Python 2.7.10)
  • PyPy3 5.5.0-alpha0 (Python 3.3.5)

For the uninitiated, PyPy provides just-in-time compilation for Python code. While it doesn't support all modules, code which does run under PyPy can see a significant performance increase.

If you just need Python and its default modules, you can use python2, python3, or pypy as you would any other application.

If, however, you need modules that we do not have installed, you should use virtualenvwrapper to set up a virtual Python environment in your home directory. This will let you install Python modules as you please.

Setting up your virtual environment

  • Change your shell to bash
  • Make sure ~/.bash_profile exists
if [ ! -f ~/.bash_profile ]; then cp /etc/skel/.bash_profile ~/.bash_profile; fi
  • Add a line like source /usr/bin/virtualenvwrapper.sh to your .bashrc.
echo "source /usr/bin/virtualenvwrapper.sh" >> ~/.bashrc
  • CRITICAL: Logout, and then log back in
  • Show your existing environments
workon
  • Create a virtual environment. Here I will create a python2 virtual environment called 'testp2', a python3 virtual environment called 'testp3', and a pypy environment called 'testpypy'. Note that mkvirtualenv --help lists many more useful options.
 mkvirtualenv -p $(which python2) testp2
 mkvirtualenv -p $(which python3) testp3
 mkvirtualenv -p $(which pypy) testpypy
  • Let's look at our virtual environments
% workon
testp2
testp3
testpypy
  • Activate one of these
% workon testp2
  • You can now install the python modules you want. This can be done using pip.
pip install numpy biopython

Using your virtual environment within a job

Here is a simple job script using the virtual environment testp2

#!/bin/bash
source /usr/bin/virtualenvwrapper.sh
workon testp2
~/path/to/your/python/script.py

A note on mpi4py

If you want to use MPI with your Python script and are using a virtual environment, you will need to send the correct environment variables to all of the MPI processes to make the virtual environment work.

#!/bin/bash
# sample mpi4py submit script
source /usr/bin/virtualenvwrapper.sh
workon testp2
# figure out the location of the python interpreter in the virtual environment
PYTHON_BINARY=$(which python)
# mpirun the python interpreter within the virtual environment
# if you don't use the interpreter within the virtual environment, i.e. just using 'python'
# the system python interpreter (without access to your other modules) will be used.
mpirun ${PYTHON_BINARY} ~/path/to/your/mpi-enabled/python/script.py
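Since the script above only works if the interpreter really comes from the virtual environment, it can help to fail early when it does not. Here is a minimal sketch of such a check. The function name python_in_venv is hypothetical; the check relies on the VIRTUAL_ENV variable, which activating a virtualenv sets to the environment's directory.

```shell
# Hypothetical helper: succeed only if the given interpreter path lives
# under the currently active virtual environment.
python_in_venv() {
    # No active environment means the check cannot pass
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    case "$1" in
        "$VIRTUAL_ENV"/*) return 0 ;;   # path is inside the environment
        *) return 1 ;;                  # e.g. the system interpreter
    esac
}
```

In the submit script you might add a line such as python_in_venv "${PYTHON_BINARY}" || { echo "virtualenv not active" >&2; exit 1; } right before the mpirun call.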

If you are using comm.send and comm.recv to communicate Python objects and receive an output message like the one below, you will need to use InfiniBand to allow MPI to communicate properly.

--------------------------------------------------------------------------
[[33053,1],52]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: host

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------

A note on scipy

SciPy requires numpy. Unfortunately, it doesn't properly define a dependency on numpy, so you have to install numpy first.

source /usr/bin/virtualenvwrapper.sh
workon testp2
pip install numpy
# now scipy needs lapack and it doesn't detect the system one. let's fix it
export LAPACK=/usr/lib/libreflapack.so
export BLAS=/usr/lib/libopenblas_openmp.so
pip install scipy

Perl

The system-wide version of Perl tracks the stable releases of Perl. Unfortunately, there are some features that we do not include in the system distribution of Perl, namely threads.

Submitting a job with Perl

Much like R (above), you cannot simply 'qsub myProgram.pl', but you must create a submit script which will call perl. Here is an example:

#!/bin/bash
#$ -l mem=1G
# Now we tell qsub how long we expect our work to take: 15 minutes (H:MM:SS)
#$ -l h_rt=0:15:00
# Now let's do some actual work.
perl /path/to/myProgram.pl

Getting Perl with threads

  • Set up perlbrew
    • Change your shell to bash
    • Install perlbrew
curl -L http://install.perlbrew.pl | bash
    • Make sure that ~/.bash_profile exists
if [ ! -f ~/.bash_profile ]; then cp /etc/skel/.bash_profile ~/.bash_profile; fi
    • Add source ~/perl5/perlbrew/etc/bashrc to ~/.bash_profile
echo "source ~/perl5/perlbrew/etc/bashrc" >> ~/.bash_profile
    • Then source your bash profile
source ~/.bash_profile
  • Now, install perl with threads within perlbrew
    • Find the current Perl version.
% perl -version

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
(with 22 registered patches, see perl -V for more detail)
(...several more lines deleted)
    • In this case the version is 5.16.3, so we run
perlbrew install -f -n -D usethreads perl-5.16.3
    • To temporarily use the new version of perl in the current shell, we now run
perlbrew use perl-5.16.3
    • To switch versions of perl for every new login or job, run
perlbrew switch perl-5.16.3
    • You can reverse this switch with
perlbrew switch-off
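After switching, you may want to confirm that the Perl you are now running was really built with threads. Perl reports its build configuration through perl -V:NAME, and for a threaded build perl -V:usethreads prints usethreads='define';. The small wrapper below is only a sketch (the function name perl_has_threads is made up) that tests that output.

```shell
# Hypothetical helper: read the output of `perl -V:usethreads` on stdin
# and succeed only if the build has threads enabled.
perl_has_threads() {
    grep -q "^usethreads='define';"
}
```

For example: perl -V:usethreads | perl_has_threads && echo "threads are enabled".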

Installing my own software

Installing and maintaining software for the many different users of Beocat would be very difficult, if not impossible. For this reason, we don't generally install user-run software on our cluster. Instead, we ask that you install it into your home directories.

In many cases, the software vendor or support site will incorrectly assume that you are installing the software system-wide or that you need 'sudo' access.
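As a rough illustration, many source packages that ship a configure script accept a --prefix option, which lets you point the install at a directory you own instead of a system location. After that, the only remaining step is usually to put the new bin directory on your PATH. The sketch below assumes such a package; the directory names and the prepend_path helper are examples of ours, not Beocat conventions.

```shell
# Hypothetical helper: prepend DIR to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Typical use after a home-directory install (paths are examples only):
#   ./configure --prefix="$HOME/software/mytool" && make && make install
#   prepend_path "$HOME/software/mytool/bin"
```

Putting the prepend_path call in your ~/.bashrc would make the program available in new logins as well.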

As a quick example of installing software in your home directory, we have a sample video on our Training Videos page. If you're still having problems or questions, please contact support as mentioned on our Main Page.