Supported schedulers

The list below describes the supported schedulers, i.e. the batch job schedulers that manage the job queues and execution on any given computer.

PBSPro

The PBSPro scheduler is supported (and it has been tested with version 12.1).

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like).

SLURM

The SLURM scheduler is supported (and it has been tested with version 2.5.4).

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like).

SGE

The SGE scheduler (Sun Grid Engine, now called Oracle Grid Engine) is supported (and it has been tested with version GE 6.2u3), together with some of the main variants/forks.

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the ParEnvJobResource (SGE-like).

LSF

The IBM LSF scheduler is supported and has been tested with version 9.1.3 on the CERN lxplus cluster.

Torque

Torque (based on OpenPBS) is supported (and it has been tested with Torque v.2.4.16 from Ubuntu).

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like).

Direct execution (bypassing schedulers)

The direct scheduler, to be used mainly for debugging, is an implementation of a scheduler plugin that does not require a real scheduler installed, but instead directly executes a command, puts it in the background, and checks for its process ID (PID) to discover if the execution is completed.
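The PID-based completion check described above can be illustrated with a minimal sketch. It is not the plugin's actual code: the helper name pid_is_alive is hypothetical, and the sketch assumes a POSIX system and uses only the Python standard library.

```python
import os
import subprocess
import sys


def pid_is_alive(pid):
    """Return True if a process with this PID exists on the local machine.

    Hypothetical helper, POSIX only: signal 0 checks for existence
    without actually delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


# Start a short command without waiting for it, and remember its PID.
proc = subprocess.Popen([sys.executable, "-c", "pass"])
job_pid = proc.pid

# Reap the child so that its PID is released, then check the PID again.
proc.wait()
finished = not pid_is_alive(job_pid)
```

Because the check relies only on a PID that is valid on the local machine, it says nothing about processes started on another node.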

Warning

The direct execution mode is very fragile. Currently, it spawns a separate Bash shell to execute each job and tracks each shell by its process ID (PID). This poses the following problems:

  • PID numbering is reset on reboot;
  • PID numbering differs from machine to machine, so direct execution is not possible on multi-machine clusters that redirect each SSH login to a different node in round-robin fashion;
  • there is no real queueing, hence all calculations that are started run in parallel.

Warning

Direct execution bypasses schedulers, so it should be used with care in order not to disturb the functioning of machines.

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like).

Job resources

When asking a scheduler to allocate some nodes/machines for a given job, we have to specify some job resources, such as the number of required nodes or the number of MPI processes per node.

Unfortunately, the way of specifying this information is different on different clusters. In AiiDA, this is implemented in different subclasses of the aiida.scheduler.datastructures.JobResource class. The subclass that should be used is given by the scheduler, as described in the previous section.
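As a quick reference, the scheduler-to-class association given in the previous section can be written down as a lookup table. This is only a sketch: the dictionary keys are the scheduler names used as headings on this page (not necessarily the plugin names used inside AiiDA), the helper job_resource_class_name is hypothetical, and LSF is omitted because its class is not stated above.

```python
# Scheduler name (as in the headings of this page) -> JobResource subclass name.
# LSF is left out on purpose: the previous section does not name its class.
JOB_RESOURCE_BY_SCHEDULER = {
    "PBSPro": "NodeNumberJobResource",
    "SLURM": "NodeNumberJobResource",
    "Torque": "NodeNumberJobResource",
    "Direct": "NodeNumberJobResource",
    "SGE": "ParEnvJobResource",
}


def job_resource_class_name(scheduler):
    """Return the JobResource subclass name documented for a scheduler.

    Hypothetical helper, for illustration only.
    """
    try:
        return JOB_RESOURCE_BY_SCHEDULER[scheduler]
    except KeyError:
        raise ValueError(
            "no JobResource class documented here for {!r}".format(scheduler)
        )
```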

The interfaces of these subclasses are not all exactly the same. Instead, specifying the resources is similar to writing a scheduler script. All classes define at least one method, get_tot_num_mpiprocs, that returns the total number of MPI processes requested.

In the following, the different JobResource subclasses are described:

Note

You can manually load a specific JobResource subclass by importing it directly, e.g.

from aiida.scheduler.datastructures import NodeNumberJobResource

However, in general, you will pass the fields to set directly to the set_option method of a JobCalculation object with the resources key. For instance:

calc = JobCalculation(computer=...) # select here a given computer configured
                                    # in AiiDA

# This assumes that the computer is configured to use a scheduler with
# job resources of type NodeNumberJobResource
calc.set_option('resources', {"num_machines": 4, "num_mpiprocs_per_machine": 16})

NodeNumberJobResource (PBS-like)

This is the way of specifying the job resources in PBS and SLURM. The class is aiida.scheduler.datastructures.NodeNumberJobResource.

Once an instance of the class is obtained, you have the following fields that you can set:

  • res.num_machines: specify the number of machines (also called nodes) on which the code should run
  • res.num_mpiprocs_per_machine: number of MPI processes to use on each machine
  • res.tot_num_mpiprocs: the total number of MPI processes that this job is requesting
  • res.num_cores_per_machine: specify the number of cores to use on each machine
  • res.num_cores_per_mpiproc: specify the number of cores to run each MPI process

Note that you need to specify only two among the first three fields above, for instance:

res = NodeNumberJobResource()
res.num_machines = 4
res.num_mpiprocs_per_machine = 16

asks the scheduler to allocate 4 machines, with 16 MPI processes on each machine. This automatically requests a total of 4*16=64 MPI processes.

The same can be achieved by passing the fields directly to the constructor:

res = NodeNumberJobResource(num_machines=4, num_mpiprocs_per_machine=16)

or, even better, directly calling the set_option method of the JobCalculation class (assuming here that calc is your calculation object) for the resources key:

calc.set_option('resources', {"num_machines": 4, "num_mpiprocs_per_machine": 16})

Note

If you specify all three of the res.num_machines, res.num_mpiprocs_per_machine, and res.tot_num_mpiprocs fields (not recommended), make sure that they satisfy:

res.num_machines * res.num_mpiprocs_per_machine = res.tot_num_mpiprocs

Moreover, if you specify res.tot_num_mpiprocs, make sure that this is a multiple of res.num_machines and/or res.num_mpiprocs_per_machine.
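The consistency rules above can be checked before submitting a job. The following is a minimal sketch in plain Python, not the checks performed by NodeNumberJobResource itself; the function name resolve_mpi_layout is hypothetical.

```python
def resolve_mpi_layout(num_machines=None, num_mpiprocs_per_machine=None,
                       tot_num_mpiprocs=None):
    """Return (num_machines, num_mpiprocs_per_machine, tot_num_mpiprocs).

    Hypothetical helper: at least two values must be given; a missing
    third one is derived, and a full triple is checked for consistency.
    """
    values = (num_machines, num_mpiprocs_per_machine, tot_num_mpiprocs)
    if sum(v is not None for v in values) < 2:
        raise ValueError("specify at least two of the three fields")
    if any(v is not None and v <= 0 for v in values):
        raise ValueError("all fields must be positive integers")

    if num_machines is None:
        if tot_num_mpiprocs % num_mpiprocs_per_machine != 0:
            raise ValueError("tot_num_mpiprocs must be a multiple of "
                             "num_mpiprocs_per_machine")
        num_machines = tot_num_mpiprocs // num_mpiprocs_per_machine
    elif num_mpiprocs_per_machine is None:
        if tot_num_mpiprocs % num_machines != 0:
            raise ValueError("tot_num_mpiprocs must be a multiple of "
                             "num_machines")
        num_mpiprocs_per_machine = tot_num_mpiprocs // num_machines
    elif tot_num_mpiprocs is None:
        tot_num_mpiprocs = num_machines * num_mpiprocs_per_machine
    elif num_machines * num_mpiprocs_per_machine != tot_num_mpiprocs:
        raise ValueError("num_machines * num_mpiprocs_per_machine must "
                         "equal tot_num_mpiprocs")

    return num_machines, num_mpiprocs_per_machine, tot_num_mpiprocs
```

For instance, resolve_mpi_layout(num_machines=4, tot_num_mpiprocs=64) derives 16 MPI processes per machine, while resolve_mpi_layout(num_machines=3, tot_num_mpiprocs=64) raises a ValueError because 64 is not a multiple of 3.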

Note

When creating a new computer, you will be asked for a default_mpiprocs_per_machine. If you specify it, you can omit num_mpiprocs_per_machine when creating the resources for that computer, and the default number will be used.

Of course, all the requirements between num_machines, num_mpiprocs_per_machine and tot_num_mpiprocs still apply.

Moreover, you can explicitly specify num_mpiprocs_per_machine if you want to use a value different from the default one.

The num_cores_per_machine and num_cores_per_mpiproc fields are optional. If you specify both the num_mpiprocs_per_machine and num_cores_per_machine fields, make sure that:

res.num_cores_per_mpiproc * res.num_mpiprocs_per_machine = res.num_cores_per_machine

If you want to specify a single value among num_mpiprocs_per_machine and num_cores_per_machine, make sure that res.num_cores_per_machine is a multiple of res.num_cores_per_mpiproc and/or res.num_mpiprocs_per_machine.
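The relation between the core-related fields can be sketched in the same way. Again, this is an illustration in plain Python under the relation stated above, not AiiDA's own validation; the function name resolve_cores is hypothetical.

```python
def resolve_cores(num_mpiprocs_per_machine, num_cores_per_machine=None,
                  num_cores_per_mpiproc=None):
    """Return (num_cores_per_machine, num_cores_per_mpiproc).

    Hypothetical helper: both core fields are optional; if only one is
    given the other is derived, and if both are given they are checked
    against num_mpiprocs_per_machine.
    """
    if num_cores_per_machine is None and num_cores_per_mpiproc is None:
        return None, None  # nothing requested, nothing to check

    if num_cores_per_machine is None:
        num_cores_per_machine = num_cores_per_mpiproc * num_mpiprocs_per_machine
    elif num_cores_per_mpiproc is None:
        if num_cores_per_machine % num_mpiprocs_per_machine != 0:
            raise ValueError("num_cores_per_machine must be a multiple of "
                             "num_mpiprocs_per_machine")
        num_cores_per_mpiproc = num_cores_per_machine // num_mpiprocs_per_machine
    elif num_cores_per_mpiproc * num_mpiprocs_per_machine != num_cores_per_machine:
        raise ValueError("num_cores_per_mpiproc * num_mpiprocs_per_machine "
                         "must equal num_cores_per_machine")

    return num_cores_per_machine, num_cores_per_mpiproc
```

With 16 MPI processes per machine, requesting num_cores_per_machine=32 yields 2 cores per MPI process, whereas num_cores_per_machine=30 is rejected.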

Note

In PBSPro, the num_mpiprocs_per_machine and num_cores_per_machine fields are used for mpiprocs and ppn respectively.

Note

In Torque, the num_mpiprocs_per_machine field is used for ppn unless num_cores_per_machine is specified, in which case num_cores_per_machine is used instead.
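The Torque note is easiest to read as a precedence rule. The sketch below assumes that num_cores_per_machine, when given, is the value used for ppn, and that num_mpiprocs_per_machine is the fallback; the function name torque_ppn is hypothetical and this is not the plugin's actual code.

```python
def torque_ppn(num_mpiprocs_per_machine, num_cores_per_machine=None):
    """Return the value assumed to be used for ppn on Torque.

    Hypothetical helper: num_cores_per_machine takes precedence when
    set (an assumption), otherwise num_mpiprocs_per_machine is used.
    """
    if num_cores_per_machine is not None:
        return num_cores_per_machine
    return num_mpiprocs_per_machine
```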

ParEnvJobResource (SGE-like)

In SGE and similar schedulers, one has to specify a parallel environment and the total number of CPUs requested. The class is aiida.scheduler.datastructures.ParEnvJobResource.

Once an instance of the class is obtained, you have the following fields that you can set:

  • res.parallel_env: specify the parallel environment in which you want to run your job (a string)
  • res.tot_num_mpiprocs: the total number of MPI processes that this job is requesting

Remember to always specify both fields. No checks are done on the consistency between the specified parallel environment and the total number of MPI processes requested (for instance, some parallel environments may have been configured by your cluster administrator to run on a single machine). It is your responsibility to make sure that the information is valid, otherwise the submission will fail.

Some examples:

  • setting the fields one by one:

    res = ParEnvJobResource()
    res.parallel_env = 'mpi'
    res.tot_num_mpiprocs = 64
    
  • setting the fields directly in the class constructor:

    res = ParEnvJobResource(parallel_env='mpi', tot_num_mpiprocs=64)
    
  • even better, directly calling the set_option method of the JobCalculation class (assuming here that calc is your calculation object) for the resources key:

    calc.set_option('resources', {"parallel_env": 'mpi', "tot_num_mpiprocs": 64})
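Since no consistency checks are done for you, it can help to verify at least that both fields are present and well-formed before passing the dictionary to set_option. The following is a minimal sketch in plain Python; the function name check_parenv_resources is hypothetical, and the sketch cannot tell whether the parallel environment actually exists on your cluster.

```python
def check_parenv_resources(resources):
    """Validate a ParEnvJobResource-style dictionary and return a copy.

    Hypothetical helper: it only checks that both required fields are
    present and have a sensible type and value.
    """
    missing = [key for key in ("parallel_env", "tot_num_mpiprocs")
               if key not in resources]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))

    parallel_env = resources["parallel_env"]
    if not isinstance(parallel_env, str) or not parallel_env:
        raise ValueError("parallel_env must be a non-empty string")

    tot = resources["tot_num_mpiprocs"]
    if isinstance(tot, bool) or not isinstance(tot, int) or tot <= 0:
        raise ValueError("tot_num_mpiprocs must be a positive integer")

    return dict(resources)


# The dictionary from the last example above passes the check.
resources = check_parenv_resources({"parallel_env": "mpi", "tot_num_mpiprocs": 64})
```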