ParallelPage
Several protocols may be run in parallel on a wide range of multi-processor environments. In theory, only the number of parallel jobs and a list of the CPUs should be needed to execute the programs, but unfortunately the way this information has to be provided varies dramatically between clusters. Therefore, two steps have to be performed: fill in the parallelization fields of the protocol GUI (described first), and, if your cluster uses a queueing system, provide a job submission command or script (described further below):
Figure 1: Parallelization issues: MPI-related questions (mpi.jpg)
- Set Number of threads to 1 unless you know what a thread is and which programs use them. Note that threaded programs may be run in parallel on a shared-memory multi-core machine without using distributed-memory parallelization.
- Set Distributed-memory parallelization (MPI)? to Yes (otherwise what follows will be ignored!)
- Set Number of MPI processes to the number of CPUs you want to use divided by the Number of threads.
- Set System Flavour depending on your queueing system and MPI implementation. The following values are available:
  - SLURM-MPICH: SLURM queue with MPICH implementation
  - TORQUE-OPENMPI: Torque (PBS) queue with openMPI implementation
  - SGE-OPENMPI: Sun Grid Engine with openMPI implementation
  - PBS: basic PBS queue
  - XMIPP_MACHINEFILE: the environment variable $XMIPP_MACHINEFILE points to the machinefile
  - HOME_MACHINEFILE: the machinefile is called $HOME/machinefile.dat
  - Leave it blank: run locally (for most personal computers)
- If you are in doubt about the System Flavour, ask the person who installed Xmipp on your cluster, or read the source code of launch_job.py, which is the class that launches parallel jobs in the protocols.
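For the XMIPP_MACHINEFILE and HOME_MACHINEFILE flavours, the machinefile is simply a text file listing the hosts on which the MPI processes may run. Its exact syntax depends on the MPI implementation; the sketch below shows a typical openMPI-style hostfile and how one might point $XMIPP_MACHINEFILE at it. The host names, slot counts and file location are only placeholders, not part of any particular installation:

# $HOME/machinefile.dat -- hypothetical example; openMPI uses "slots=",
# MPICH-style machinefiles usually list just "hostname" or "hostname:ncpus"
node01 slots=8
node02 slots=8
node03 slots=8

# make Xmipp use it via the XMIPP_MACHINEFILE flavour
export XMIPP_MACHINEFILE=$HOME/machinefile.dat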
Apart from filling in the protocol GUI fields, if your cluster uses a queueing system it may be necessary to write a dedicated job submission script (we call this script qsub.py, and generally place it somewhere in the $PATH of the user). When submitting the job by pressing the Save & Execute button on the protocol GUI, one has to answer Yes to the question whether one wants to use a job queueing system, and give this qsub.py command in the pop-up window (see Figure 2).
If you don't know how to write such a script, ask the person who installed Xmipp for you, or have a look at the examples below (e.g. Crunchy, MareNostrum).
Figure 2: Parallel job submission using a queueing system (submit.jpg)
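The cluster-specific pages linked below (QsubPyCrunchy, QsubPyBsc, FinisterraeQsub.py) are the reference for what qsub.py actually contains on each machine. Purely as an orientation, such a wrapper essentially takes the command the protocol wants to run, wraps it in a job script and hands it to the queue. The following is a hedged shell sketch for a Torque/PBS queue; job name, resources and walltime are placeholders:

#!/bin/bash
# hypothetical qsub.py-style wrapper for Torque/PBS:
# "$*" is the (MPI) command that the protocol asked to execute
JOBSCRIPT=$(mktemp)
cat > "$JOBSCRIPT" <<EOF
#PBS -N xmipp_protocol
#PBS -l nodes=1:ppn=8
#PBS -l walltime=24:00:00
cd \$PBS_O_WORKDIR
$*
EOF
qsub "$JOBSCRIPT"

In the pop-up window of Figure 2 one then gives the name of this wrapper as the submission command.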
Regarding the machinefile, the following three options are available:
- XMIPP_MACHINEFILE: the environment variable $XMIPP_MACHINEFILE points to the machinefile
- HOME_MACHINEFILE: the machinefile is called $HOME/machines.dat
- nothing: all MPI jobs run on localhost
- Define XMIPP_MACHINEFILE as ~biologia/machines.dat (export XMIPP_MACHINEFILE=~biologia/machines.dat)
- Set system flavour to XMIPP_MACHINEFILE
- In the queue submit pop-up window, type:
bsub -q 1week_parallel
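bsub is the LSF submission command, so standard LSF options can be added to the same line; for instance, if a specific number of slots has to be reserved, LSF's -n option does that (the value 8 below is only an example, not a recommendation for this queue):

bsub -q 1week_parallel -n 8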
Mainframes are very picky about how jobs have to be submitted. On the more popular ones we have installed a launching script called qsub.py. This script differs from machine to machine but presents the same syntax to the user.
Crunchy is our new (IBM) cluster. It has 28 nodes, each with 8 cores and at least 2 GB per core (some nodes have 4 GB/core). Xmipp runs using openMPI, and there is a Torque/Moab resource management system.
- Set system flavour to TORQUE-OPENMPI
- Place the following QsubPyCrunchy in the $PATH of the user, and execute the job as given in Figure 2 (a rough sketch of such a job script is given below).
If you want to modify the default values, see more details at RunningXmippOnCrunchy.
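QsubPyCrunchy is the authoritative script; as an orientation only, a Torque job script for an openMPI run on a cluster like Crunchy (8 cores per node) typically looks like the sketch below. Job name, node count, walltime and the Xmipp command are placeholders:

#!/bin/bash
#PBS -N xmipp_job
#PBS -l nodes=2:ppn=8
#PBS -l walltime=48:00:00
cd $PBS_O_WORKDIR
# openMPI built with Torque support takes the host list from the queue,
# so no explicit machinefile is needed here; 2 nodes x 8 cores = 16 processes
mpirun -np 16 $XMIPP_MPI_COMMAND   # replace with the actual Xmipp MPI command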
On MareNostrum (the BSC machine):
- Set system flavour to SLURM-MPICH
- Use this QsubPyBsc
- The graphic interface is not supported on this computer, therefore you must edit the python scripts manually and execute them from the command line (see the note after this list). For example, to execute the Projection Matching protocol you must edit the protocol_projmatch.py file.
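Without the GUI, the usual workflow is to edit the parameter section at the top of the protocol script and then launch it with Python from the command line. The exact invocation can differ between Xmipp installations, so treat the following only as a sketch:

# edit the parameters defined in the header of the protocol script
vi protocol_projmatch.py
# then run it with the Python interpreter used by your Xmipp installation
python protocol_projmatch.py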
On Vermeer:
- Set system flavour to PBS
- As the GUI is not installed yet, edit your own PBS script. For an example see PbsScript. If you need more memory you may ask for a whole node but use only one of the two available CPUs; see Example2PBS (this trick is sketched after this list).
- Submit this script from the command line, using:
qsub example.pbs
More expert tips regarding Vermeer: TipsVermeer.
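Example2PBS is the reference for the memory trick; the idea of reserving a whole node but starting only one process, so that it can use all of the node's memory, can be sketched as follows. The two-CPU node geometry comes from the text above; everything else is a placeholder:

#!/bin/bash
#PBS -N xmipp_bigmem
#PBS -l nodes=1:ppn=2
#PBS -l walltime=24:00:00
cd $PBS_O_WORKDIR
# the whole 2-CPU node is reserved, but only one process is started,
# so it has all of the node's memory to itself
mpirun -np 1 $XMIPP_MPI_COMMAND   # replace with the actual Xmipp MPI command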
On Trueno:
- Set system flavour to PBS.
- The graphic interface is not supported on this computer, therefore you must edit the python scripts manually and execute them from the command line. For example, to execute the Projection Matching protocol you must edit the protocol_projmatch.py file.
- Available queues (selecting one is sketched after this list):
  - exe-x86_64: 4 CPUs per node and 16 GB memory. Example PBS file for x86_64: PbsTrueno
  - exe-ia64: 20? CPUs in 1 node and 64 GB memory. Example PBS file for ia64: PbsTruenoIa64
NOTE: x86_64 and ia64 use different binaries.
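Selecting one of these queues in a PBS script is a matter of the -q directive; the rest of the script is as in the earlier PBS sketches. Hypothetically:

#PBS -q exe-x86_64
#PBS -l nodes=1:ppn=4
# for the ia64 queue use "-q exe-ia64" instead, and remember that the
# x86_64 and ia64 binaries are different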
On Finisterrae (16 CPUs per node and 142 nodes, max memory per node up to 112):
- Set system flavour to PBS
- Available queues: no idea, the default one seems to be OK (check http://www.cesga.es/File/Computacion/DO-SIS-Guia-uso-FT.pdf for details)
- If you want to modify the default parameters, check:
  - an example PBS file is available: FinisterraePbs
  - an example qsub is available: FinisterraeQsub
  - an example qsub.py file: FinisterraeQsub.py
  - TerraeCommand
- Finisterrae only accepts connections from the academic network and not from home. To overcome this problem you may tunnel your ssh connections (see a description here):
ssh -N -f -L 2123:ft.cesga.es:22 USERNAMEJUMILLA@jumilla.cnb.csic.es
scp -P 2123 USERNAME_FINISTERRAE@localhost:/home/csic/eda/msp/Adeno/*py .
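The first command opens a local tunnel on port 2123 that is forwarded to ft.cesga.es:22 through the jumilla gateway; the second copies files back through it. To log in interactively through the same tunnel (standard ssh usage, nothing specific to this setup):

ssh -p 2123 USERNAME_FINISTERRAE@localhost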
- A couple of links related to Finisterrae:
For the BlueGene in Martinsried, see RunningXmippOnBlueGeneMartinsried.
We have done a limited comparison of the speed of some of our clusters. A summary of the results is available at ComputerComparison.
If you want to define an environment variable with your machinefile for MPI, use: export OMPI_MCA_orte_default_hostfile=/home/roberto/machinefile.dat
-- Main.RobertoMarabini - 2012-10-09 15:41
-- Main.RobertoMarabini - 08 Oct 2007