Batch Job Schedulers#

Batch job schedulers manage the job queues and execution on a compute resource. AiiDA ships with plugins for a range of schedulers, and this section describes the interface of these plugins.

To add support for a custom scheduler, follow the instructions in the Developing a plugin section below.

PBSPro#

The PBSPro scheduler is supported (tested: version 12.1).

All the main features are supported with this scheduler.

Use the NodeNumberJobResource (PBS-like) when setting job resources.

SLURM#

The SLURM scheduler is supported (tested: version 2.5.4).

All the main features are supported with this scheduler.

Use the NodeNumberJobResource (PBS-like) when setting job resources.

SGE#

The SGE scheduler (Sun Grid Engine, now called Oracle Grid Engine) and some of its main variants/forks are supported (tested: version GE 6.2u3).

All the main features are supported with this scheduler.

Use the ParEnvJobResource (SGE-like) when setting job resources.

LSF#

The IBM LSF scheduler is supported (tested: version 9.1.3 on the CERN lxplus cluster).

Torque#

Torque (based on OpenPBS) is supported (tested: version 2.4.16 from Ubuntu).

All the main features are supported with this scheduler.

Use the NodeNumberJobResource (PBS-like) when setting job resources.

Direct execution (bypassing schedulers)#

The direct scheduler plugin simply executes the command in a new bash shell, puts it in the background, and polls its process ID (PID) to determine when execution has completed.

Its main purpose is debugging on the local machine. Use a proper batch scheduler for any production calculations.

Warning

Compared to a proper batch scheduler, direct execution mode is fragile. In particular:

  • There is no queueing, i.e. all calculations run in parallel.

  • PID numbering is reset when the machine reboots.

Warning

Do not use the direct scheduler for running on a supercomputer: the job will end up running on the login node (which is typically forbidden). Moreover, if your centre has multiple login nodes, subsequent SSH connections may land on a different login node where the PID is not found, causing AiiDA to conclude, incorrectly, that the job has completed.

All the main features are supported with this scheduler.

Use the NodeNumberJobResource (PBS-like) when setting job resources.

Job resources#

Unsurprisingly, different schedulers have different ways of specifying the resources for a job (such as the number of required nodes or the numbers of MPI processes per node).

In AiiDA, these differences are accounted for by subclasses of the JobResource class. The previous section lists which subclass to use with a given scheduler.

All subclasses define at least the get_tot_num_mpiprocs() method, which returns the total number of MPI processes requested, but otherwise have slightly different interfaces, described below.

Note

You can manually load a specific JobResource subclass by directly importing it, e.g.

from aiida.schedulers.datastructures import NodeNumberJobResource

In practice, however, the appropriate class is inferred from the scheduler configured for the relevant AiiDA computer, and you can simply set the relevant fields in the metadata.options input dictionary of the CalcJob.

For a scheduler with job resources of type NodeNumberJobResource, this could be:

from aiida.orm import load_code

inputs = {
    'code': load_code('somecode@localhost'),  # The configured code to be used, which also defines the computer
    'metadata': {
        'options': {
            'resources': {'num_machines': 4, 'num_mpiprocs_per_machine': 16}
        }
    }
}

NodeNumberJobResource (PBS-like)#

The NodeNumberJobResource class is used for specifying job resources in PBS and SLURM.

The class has the following attributes:

  • res.num_machines: the number of machines (also called nodes) on which the code should run

  • res.num_mpiprocs_per_machine: number of MPI processes to use on each machine

  • res.tot_num_mpiprocs: the total number of MPI processes that this job requests

  • res.num_cores_per_machine: the number of cores to use on each machine

  • res.num_cores_per_mpiproc: the number of cores to run each MPI process on

Only two of the first three fields need to be specified, but they must be given when the resource is constructed. We suggest using the first two, for instance:

res = NodeNumberJobResource(num_machines=4, num_mpiprocs_per_machine=16)

asks the scheduler to allocate 4 machines with 16 MPI processes each, i.e. a total of 4*16=64 MPI processes.
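
As stated above, every job resource class exposes get_tot_num_mpiprocs(); for the example above this gives (a minimal, self-contained illustration):

from aiida.schedulers.datastructures import NodeNumberJobResource

res = NodeNumberJobResource(num_machines=4, num_mpiprocs_per_machine=16)
assert res.get_tot_num_mpiprocs() == 64  # 4 machines * 16 MPI processes each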

Note

When creating a new computer, you will be asked for a default_mpiprocs_per_machine. If specified, it will automatically be used as the default value for num_mpiprocs_per_machine whenever creating the resources for that computer.
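
For example, if the computer was configured with a default of 16 MPI processes per machine, the resources dictionary only needs the number of machines (a hedged sketch, assuming such a default was set):

options = {
    # num_mpiprocs_per_machine is taken from the computer's
    # default_mpiprocs_per_machine (assumed here to be 16)
    'resources': {'num_machines': 4},
}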

Note

If you prefer to specify res.tot_num_mpiprocs instead, make sure it is a multiple of whichever of res.num_machines or res.num_mpiprocs_per_machine you provide alongside it.

The first three fields are related by the equation:

res.num_machines * res.num_mpiprocs_per_machine = res.tot_num_mpiprocs

The num_cores_per_machine and num_cores_per_mpiproc fields are optional, but if specified they must satisfy the equation:

res.num_cores_per_mpiproc * res.num_mpiprocs_per_machine = res.num_cores_per_machine
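
As an illustration of these relations (a hedged sketch; exact validation details may vary between aiida-core versions):

from aiida.schedulers.datastructures import NodeNumberJobResource

# 64 MPI processes spread over 4 machines: num_mpiprocs_per_machine is
# then 16, since 4 * 16 == 64
res = NodeNumberJobResource(num_machines=4, tot_num_mpiprocs=64)

# Requesting 2 cores per MPI process, i.e. 2 * 16 == 32 cores per machine
res = NodeNumberJobResource(
    num_machines=4,
    num_mpiprocs_per_machine=16,
    num_cores_per_mpiproc=2,
    num_cores_per_machine=32,
)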

Note

In PBSPro, the num_mpiprocs_per_machine and num_cores_per_machine fields are used for mpiprocs and ppn respectively.

In Torque, the num_mpiprocs_per_machine field is used for ppn unless num_cores_per_machine is specified.

ParEnvJobResource (SGE-like)#

The ParEnvJobResource class is used for specifying the resources of SGE and similar schedulers, which require specifying a parallel environment and the total number of CPUs requested.

The class has the following attributes:

  • res.parallel_env: the parallel environment in which you want to run your job (a string)

  • res.tot_num_mpiprocs: the total number of MPI processes that this job requests

Both attributes are required. No checks are done on the consistency between the specified parallel environment and the total number of MPI processes requested (for instance, some parallel environments may have been configured by your cluster administrator to run on a single machine). It is your responsibility to make sure that the information is valid, otherwise the submission will fail.

Setting the fields directly in the class constructor:

res = ParEnvJobResource(parallel_env='mpi', tot_num_mpiprocs=64)

And setting the fields using the metadata.options input dictionary of the CalcJob:

inputs = {
    'metadata': {
        'options': {
            'resources': {'parallel_env': 'mpi', 'tot_num_mpiprocs': 64}
        }
    }
}

Developing a plugin#

A scheduler plugin allows AiiDA to communicate with a specific type of scheduler. The plugin should subclass the Scheduler class and implement a number of methods that tell AiiDA how to perform key operations, such as submitting a new job or querying the currently active jobs. To get you started, you can download this template and implement the following methods:

  1. _get_joblist_command: returns the command that reports full information on the existing jobs.

  2. _get_detailed_job_info_command: returns the command that retrieves detailed information on a job, even after the job has finished.

  3. _get_submit_script_header: returns the header of the submit script.

  4. _get_submit_command: returns the command to submit a given script.

  5. _parse_joblist_output: parses the queue output string, as returned by executing the command from _get_joblist_command.

  6. _parse_submit_output: parses the output of the submit command, as returned by executing the command from _get_submit_command.

  7. _get_kill_command: returns the command to kill the job with the specified job ID.

  8. _parse_kill_output: parses the output of the kill command.

  9. parse_output: parses the output of the scheduler.

All of these methods have to be implemented, except _get_detailed_job_info_command and parse_output, which are optional. In addition, the _job_resource_class class attribute needs to be set to a subclass of JobResource. For schedulers that work like SLURM, Torque and PBS, one can most likely reuse the NodeNumberJobResource class that ships with aiida-core; schedulers that work like LSF and SGE may be able to reuse ParEnvJobResource instead. If neither of these fits, implement a custom subclass; a template for this, the TemplateJobResource class, is already included in the template file.
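
A minimal skeleton, to give an idea of the overall structure, could look like the following (this is a sketch rather than the downloadable template; the class name is a placeholder, and the method signatures should be checked against the Scheduler base class of your aiida-core version):

from aiida.schedulers import Scheduler
from aiida.schedulers.datastructures import NodeNumberJobResource


class MyCustomScheduler(Scheduler):
    """Hypothetical plugin for a PBS-like batch scheduler."""

    # Reuse the PBS-like resource class; use ParEnvJobResource or a custom
    # subclass if the scheduler specifies resources differently.
    _job_resource_class = NodeNumberJobResource

    def _get_joblist_command(self, jobs=None, user=None):
        # Command that reports full information on the existing jobs.
        raise NotImplementedError

    def _get_submit_script_header(self, job_tmpl):
        # Header of the submit script (scheduler directives, resource requests, ...).
        raise NotImplementedError

    def _get_submit_command(self, submit_script):
        # Command that submits the given script to the queue.
        raise NotImplementedError

    def _parse_joblist_output(self, retval, stdout, stderr):
        # Parse the output of the joblist command into a list of JobInfo objects.
        raise NotImplementedError

    def _parse_submit_output(self, retval, stdout, stderr):
        # Parse the output of the submit command and return the job ID as a string.
        raise NotImplementedError

    def _get_kill_command(self, jobid):
        # Command that kills the job with the given job ID.
        raise NotImplementedError

    def _parse_kill_output(self, retval, stdout, stderr):
        # Return whether the kill command was handled correctly.
        raise NotImplementedError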

Note

To inform AiiDA about your new scheduler plugin, you must register an entry point in the aiida.schedulers entry point group. Refer to the section on how to register plugins for instructions.
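
With setuptools, for example, the registration could look like the following (a hedged sketch; the distribution name aiida-myscheduler, the entry point name myscheduler and the module path are hypothetical):

from setuptools import setup

setup(
    name='aiida-myscheduler',
    packages=['aiida_myscheduler'],
    entry_points={
        # Entry point group for scheduler plugins; the left-hand side is the
        # name users will select, the right-hand side points to the class.
        'aiida.schedulers': [
            'myscheduler = aiida_myscheduler.scheduler:MyCustomScheduler',
        ],
    },
)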