aiida.schedulers package¶

Module for classes and utilities to interact with cluster schedulers.

class aiida.schedulers.JobState[source]¶

Bases: enum.Enum

Enumeration of possible scheduler states of a CalcJob.

There is no FAILED state as every completed job is put in DONE, regardless of success.

DONE = 'done'¶

QUEUED = 'queued'¶

QUEUED_HELD = 'queued held'¶

RUNNING = 'running'¶

SUSPENDED = 'suspended'¶

UNDETERMINED = 'undetermined'¶

__module__ = 'aiida.schedulers.datastructures'¶
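As an illustration of how a plugin might use these states, the sketch below maps raw scheduler status codes onto the enumeration. It uses a local stand-in for JobState (same member names and values as listed above) so that it runs without AiiDA installed. The status codes in `_STATE_MAP` and the helper `to_job_state` are hypothetical and are not taken from any real scheduler plugin.

```python
from enum import Enum


class JobState(Enum):
    """Local stand-in mirroring the members documented above."""

    UNDETERMINED = 'undetermined'
    QUEUED = 'queued'
    QUEUED_HELD = 'queued held'
    RUNNING = 'running'
    SUSPENDED = 'suspended'
    DONE = 'done'


# Hypothetical raw status codes; a real plugin would use its scheduler's codes.
_STATE_MAP = {
    'Q': JobState.QUEUED,
    'H': JobState.QUEUED_HELD,
    'R': JobState.RUNNING,
    'S': JobState.SUSPENDED,
    'C': JobState.DONE,  # completed jobs are DONE whether they succeeded or not
}


def to_job_state(raw_code):
    """Translate a raw status code, falling back to UNDETERMINED."""
    return _STATE_MAP.get(raw_code, JobState.UNDETERMINED)
```

Falling back to UNDETERMINED for unknown codes is one possible design choice; it keeps the parser from failing on a status the plugin author did not anticipate.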

class aiida.schedulers.JobResource(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

A class to store the job resources. It must be subclassed by the specific plugin, which should define a _job_resource_class attribute pointing to the correct JobResource subclass.

It should at least define the get_tot_num_mpiprocs() method, plus an __init__ to accept its set of variables.

Typical attributes are:

num_machines
num_mpiprocs_per_machine

or (e.g. for SGE)

tot_num_mpiprocs
parallel_env

The __init__ should take care of checking the values. The init should raise only ValueError or TypeError on invalid parameters.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ()¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

Should be implemented in each subclass.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

classmethod get_valid_keys()[source]¶: Return a list of valid keys to be passed to the __init__
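The contract described above (validate in __init__, raise only ValueError or TypeError, expose get_tot_num_mpiprocs) can be sketched as follows. `SimpleNodeResource` is a hypothetical plain class written so the example runs without AiiDA; a real plugin would subclass JobResource instead, and the exact choice between ValueError and TypeError for each case is an assumption.

```python
class SimpleNodeResource:
    """Hypothetical stand-in; a real plugin would subclass JobResource."""

    _default_fields = ('num_machines', 'num_mpiprocs_per_machine')

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self._default_fields)
        if unknown:
            raise ValueError(f'unrecognized keys: {sorted(unknown)}')
        for key in self._default_fields:
            if key not in kwargs:
                raise TypeError(f'missing required key: {key}')
            value = kwargs[key]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f'{key} must be an integer')
            if value < 1:
                raise ValueError(f'{key} must be at least 1')
            setattr(self, key, value)

    @classmethod
    def get_valid_keys(cls):
        """Return the keys accepted by __init__."""
        return list(cls._default_fields)

    @classmethod
    def accepts_default_mpiprocs_per_machine(cls):
        return True

    def get_tot_num_mpiprocs(self):
        """Total MPI processes: machines times processes per machine."""
        return self.num_machines * self.num_mpiprocs_per_machine
```

For example, `SimpleNodeResource(num_machines=2, num_mpiprocs_per_machine=8).get_tot_num_mpiprocs()` evaluates to 16.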

class aiida.schedulers.JobTemplate(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

A template for submitting jobs. This contains all required information to create the job header.

The required fields are: working_directory, job_name, num_machines, num_mpiprocs_per_machine, argv.

Fields:

shebang: the first line of the submission script

submit_as_hold: if set, the job will be in a ‘hold’ status right after the submission

rerunnable: if the job is rerunnable (boolean)

job_environment: a dictionary with environment variables to set before the execution of the code.

working_directory: the working directory for this job. During submission, the transport will first do a ‘chdir’ to this directory, and then possibly set a scheduler parameter, if this is supported by the scheduler.

email: an email address for sending emails on job events.

email_on_started: if True, ask the scheduler to send an email when the job starts.

email_on_terminated: if True, ask the scheduler to send an email when the job ends. This should also send emails on job failure, when possible.

job_name: the name of this job. The actual name of the job can be different from the one specified here, e.g. if there are unsupported characters, or the name is too long.

sched_output_path: a (relative) file name for the stdout of this job

sched_error_path: a (relative) file name for the stderr of this job

sched_join_files: if True, write both stdout and stderr on the same file (the one specified for stdout)

queue_name: the name of the scheduler queue (sometimes also called partition), on which the job will be submitted.

account: the name of the scheduler account (sometimes also called projectid), on which the job will be submitted.

qos: the quality of service of the scheduler account, on which the job will be submitted.

job_resource: a suitable JobResource subclass with information on how many nodes and cpus it should use. It must be an instance of the aiida.schedulers.Scheduler.job_resource_class class. Use the Scheduler.create_job_resource method to create it.

num_machines: how many machines (or nodes) should be used

num_mpiprocs_per_machine: how many MPI procs should be used on each machine (or node).

priority: a priority for this job. Should be in the format accepted by the specific scheduler.

max_memory_kb: The maximum amount of memory the job is allowed to allocate ON EACH NODE, in kilobytes

max_wallclock_seconds: The maximum wall clock time that all processes of a job are allowed to exist, in seconds

custom_scheduler_commands: a string that will be inserted right after the last scheduler command, and before any other non-scheduler command; useful if some specific flag needs to be added and is not supported by the plugin

prepend_text: a (possibly multi-line) string to be inserted in the scheduler script before the main execution line

append_text: a (possibly multi-line) string to be inserted in the scheduler script after the main execution line

import_sys_environment: import the system environment variables

codes_info: a list of aiida.common.datastructures.CodeInfo objects. Each contains the information necessary to run a single code. At the moment, it can contain:

cmdline_params: a list of strings with the command line arguments of the program to run. This is the main program to be executed. NOTE: The first one is the executable name. For MPI runs, this will probably be “mpirun” or a similar program; this has to be chosen at a higher level.

stdin_name: the (relative) file name to be used as stdin for the program specified with argv.

stdout_name: the (relative) file name to be used as stdout for the program specified with argv.

stderr_name: the (relative) file name to be used as stderr for the program specified with argv.

join_files: if True, stderr is redirected to the same file specified for stdout.

codes_run_mode: sets the run_mode with which the (multiple) codes have to be executed. For example, parallel execution:

    mpirun -np 8 a.x &
    mpirun -np 8 b.x &
    wait

The serial execution would be without the &’s. Values are given by aiida.common.datastructures.CodeRunMode.
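The difference between the two run modes can be sketched as follows. `RunMode` is a hypothetical local stand-in for aiida.common.datastructures.CodeRunMode (its member values here are assumptions), and `join_run_lines` is an illustrative helper, not part of the AiiDA API.

```python
from enum import Enum


class RunMode(Enum):
    """Hypothetical stand-in for aiida.common.datastructures.CodeRunMode."""

    SERIAL = 0
    PARALLEL = 1


def join_run_lines(run_lines, run_mode):
    """Combine per-code run lines into the body of a submit script."""
    if run_mode is RunMode.PARALLEL:
        # Background every code, then wait for all of them to finish.
        return '\n'.join(f'{line} &' for line in run_lines) + '\nwait'
    # Serial: one code after the other, no backgrounding.
    return '\n'.join(run_lines)
```

Calling `join_run_lines(['mpirun -np 8 a.x', 'mpirun -np 8 b.x'], RunMode.PARALLEL)` reproduces the three-line parallel example given for codes_run_mode.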

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('shebang', 'submit_as_hold', 'rerunnable', 'job_environment', 'working_directory', 'email', 'email_on_started', 'email_on_terminated', 'job_name', 'sched_output_path', 'sched_error_path', 'sched_join_files', 'queue_name', 'account', 'qos', 'job_resource', 'priority', 'max_memory_kb', 'max_wallclock_seconds', 'custom_scheduler_commands', 'prepend_text', 'append_text', 'import_sys_environment', 'codes_run_mode', 'codes_info')¶

class aiida.schedulers.JobInfo(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

Contains properties for a job in the queue. Most of the fields are taken from DRMAA v.2.

Note that default fields may be undefined. This is expected behavior and the application must cope with this case. Examples are the exit_status of jobs that have not finished yet, or features not supported by the given scheduler.

Fields:

job_id: the job ID on the scheduler

title: the job title, as known by the scheduler

exit_status: the exit status of the job as reported by the operating system on the execution host

terminating_signal: the UNIX signal that was responsible for the end of the job.

annotation: human-readable description of the reason for the job being in the current state or substate.

job_state: the job state (one of those defined in aiida.schedulers.datastructures.JobState)

job_substate: a string with the implementation-specific sub-state

allocated_machines: a list of machines used for the current job. This is a list of aiida.schedulers.datastructures.MachineInfo objects.

job_owner: the job owner as reported by the scheduler

num_mpiprocs: the total number of requested MPI procs

num_cpus: the total number of requested CPUs (cores) [may be undefined]

num_machines: the number of machines (i.e., nodes), required by the job. If allocated_machines is not None, this number must be equal to len(allocated_machines). Otherwise, for schedulers not supporting the retrieval of the full list of allocated machines, this attribute can be used to know at least the number of machines.

queue_name: The name of the queue in which the job is queued or running.

account: The account/projectid in which the job is queued or running.

qos: The quality of service in which the job is queued or running.

wallclock_time_seconds: the accumulated wallclock time, in seconds

requested_wallclock_time_seconds: the requested wallclock time, in seconds

cpu_time: the accumulated cpu time, in seconds

submission_time: the absolute time at which the job was submitted, of type datetime.datetime

dispatch_time: the absolute time at which the job first entered the ‘started’ state, of type datetime.datetime

finish_time: the absolute time at which the job first entered the ‘finished’ state, of type datetime.datetime

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('job_id', 'title', 'exit_status', 'terminating_signal', 'annotation', 'job_state', 'job_substate', 'allocated_machines', 'job_owner', 'num_mpiprocs', 'num_cpus', 'num_machines', 'queue_name', 'account', 'qos', 'wallclock_time_seconds', 'requested_wallclock_time_seconds', 'cpu_time', 'submission_time', 'dispatch_time', 'finish_time')¶

static _deserialize_date(value)[source]¶

Deserialise a date

Parameters

value – The date value

Returns

The deserialised date

static _deserialize_job_state(job_state)[source]¶: Return an instance of JobState from the job_state string.

static _serialize_date(value)[source]¶

Serialise a date value

Parameters

value – The value to serialise

Returns

The serialised value

static _serialize_job_state(job_state)[source]¶: Return the serialized value of the JobState instance.

_special_serializers = {'dispatch_time': 'date', 'finish_time': 'date', 'job_state': 'job_state', 'submission_time': 'date'}¶

deserialize_field(value, field_type)[source]¶

Deserialise the value of a particular field with a type

Parameters

value – The value
field_type – The field type

Returns

The deserialised value

load_from_serialized(data)[source]¶

Load value from serialised data

Parameters

data – The data to load from

Returns

The value after loading

serialize()[source]¶

Serialise the current data

Returns

A serialised representation of the current data

serialize_field(value, field_type)[source]¶

Serialise a particular field value

Parameters

value – The value to serialise
field_type – The field type

Returns

The serialised value
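The interplay between _special_serializers and the per-field (de)serialisation methods can be sketched as below. This is a standalone illustration using plain dictionaries and a reduced stand-in JobState; the ISO 8601 date format is an assumption made for the example and may differ from what AiiDA actually stores.

```python
from datetime import datetime, timezone
from enum import Enum


class JobState(Enum):
    """Reduced stand-in for the documented JobState enumeration."""

    RUNNING = 'running'
    DONE = 'done'


# Same shape as _special_serializers: field name -> field type.
SPECIAL = {'job_state': 'job_state', 'submission_time': 'date'}


def serialize_field(value, field_type):
    """Serialise one value; fields without a special type pass through."""
    if value is None or field_type is None:
        return value
    if field_type == 'date':
        return value.isoformat()  # assumed format, for illustration only
    if field_type == 'job_state':
        return value.value
    raise ValueError(f'unknown field type: {field_type}')


def deserialize_field(value, field_type):
    """Inverse of serialize_field."""
    if value is None or field_type is None:
        return value
    if field_type == 'date':
        return datetime.fromisoformat(value)
    if field_type == 'job_state':
        return JobState(value)
    raise ValueError(f'unknown field type: {field_type}')


def serialize(fields):
    return {k: serialize_field(v, SPECIAL.get(k)) for k, v in fields.items()}


def load_from_serialized(data):
    return {k: deserialize_field(v, SPECIAL.get(k)) for k, v in data.items()}
```

A round trip through `serialize` and `load_from_serialized` should give back the original values, which is the property the special serialisers exist to guarantee for dates and job states.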

class aiida.schedulers.NodeNumberJobResource(**kwargs)[source]¶

Bases: aiida.schedulers.datastructures.JobResource

An implementation of JobResource for schedulers that support the specification of a number of nodes and a number of cpus per node

__init__(**kwargs)[source]¶

Initialize the job resources from the passed arguments (the valid keys can be obtained with the function self.get_valid_keys()).

Should raise only ValueError or TypeError on invalid parameters.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('num_machines', 'num_mpiprocs_per_machine', 'num_cores_per_machine', 'num_cores_per_mpiproc')¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶: Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

classmethod get_valid_keys()[source]¶: Return a list of valid keys to be passed to the __init__

class aiida.schedulers.ParEnvJobResource(**kwargs)[source]¶

Bases: aiida.schedulers.datastructures.JobResource

An implementation of JobResource for schedulers that support the specification of a parallel environment (a string) and the total number of MPI processes

__init__(**kwargs)[source]¶

Initialize the job resources from the passed arguments (the valid keys can be obtained with the function self.get_valid_keys()).

Raises

ValueError – on invalid parameters.
TypeError – on invalid parameters.
aiida.common.ConfigurationError – if default_mpiprocs_per_machine was set for this computer, since ParEnvJobResource cannot accept this parameter.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('parallel_env', 'tot_num_mpiprocs', 'default_mpiprocs_per_machine')¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶: Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

class aiida.schedulers.MachineInfo(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

Similarly to what is defined in the DRMAA v.2 as SlotInfo; this identifies each machine (also called ‘node’ on some schedulers) on which a job is running, and how many CPUs are being used. (Some of them could be undefined)

name: name of the machine
num_cpus: number of cores used by the job on this machine
num_mpiprocs: number of MPI processes used by the job on this machine

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('name', 'num_mpiprocs', 'num_cpus')¶

class aiida.schedulers.Scheduler[source]¶

Bases: object

Base class for all schedulers.

__abstractmethods__ = frozenset({'_get_joblist_command', '_get_submit_command', '_get_submit_script_header', '_parse_joblist_output', '_parse_submit_output'})¶

__init__()[source]¶: Initialize self. See help(type(self)) for accurate signature.

__module__ = 'aiida.schedulers.scheduler'¶

__weakref__¶: list of weak references to the object (if defined)

_abc_impl = <_abc_data object>¶

_features = {}¶

_get_detailed_jobinfo_command(jobid)[source]¶

Return the command to run to get the detailed information on a job. This is typically called after the job has finished, to retrieve the most detailed information possible about the job. This is done because most schedulers just make finished jobs disappear from the ‘qstat’ command, and instead sometimes it is useful to know some more detailed information about the job exit status, etc.

Raises: aiida.common.exceptions.FeatureNotAvailable

abstract _get_joblist_command(jobs=None, user=None)[source]¶

Return the qstat (or equivalent) command to run with the required command-line parameters to get the most complete description possible; also specifies the output format of qstat to be the one expected by the _parse_joblist_output method.

Must be implemented in the plugin.

Parameters

jobs – either None to get a list of all jobs in the machine, or a list of jobs.
user – either None, or a string with the username (to show only jobs of the specific user).

Note: typically one can pass only either jobs or user, depending on the specific plugin. The choice can be made according to the value returned by self.get_feature('can_query_by_user').

_get_kill_command(jobid)[source]¶

Return the command to kill the job with specified jobid.

To be implemented by the plugin.

_get_run_line(codes_info, codes_run_mode)[source]¶

Return a string with the line to execute a specific code with specific arguments.

Parameters

codes_info – a list of aiida.common.datastructures.CodeInfo objects. Each contains the information needed to run the code, i.e. cmdline_params, stdin_name, stdout_name, stderr_name, join_files. See the documentation of JobTemplate and CodeInfo.
codes_run_mode – contains the information on how to launch the multiple codes, as described in aiida.common.datastructures.CodeRunMode.

The relevant values are:

argv: an array with the executable and the command line arguments. The first argument is the executable. This should contain everything, including the mpirun command etc.

stdin_name: the filename to be used as stdin, relative to the working dir, or None if no stdin redirection is required.

stdout_name: the filename to be used to store the standard output, relative to the working dir, or None if no stdout redirection is required.

stderr_name: the filename to be used to store the standard error, relative to the working dir, or None if no stderr redirection is required.

join_files: if True, stderr is redirected to stdout; the value of stderr_name is ignored.

Return a string with the following format: [executable] [args] {[ < stdin ]} {[ > stdout ]} {[2>&1 | 2> stderr]}
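A minimal sketch of how such a run line could be assembled is given below, with stdout redirected using `>`. The function name `build_run_line` is hypothetical, and escaping each argument with shlex.quote is an assumption about a reasonable implementation rather than a description of what AiiDA does internally.

```python
import shlex


def build_run_line(argv, stdin_name=None, stdout_name=None,
                   stderr_name=None, join_files=False):
    """Build '[executable] [args] < stdin > stdout 2>&1|2> stderr'."""
    # Escape every argument so spaces or shell metacharacters are safe.
    parts = [shlex.quote(arg) for arg in argv]
    if stdin_name:
        parts.append(f'< {shlex.quote(stdin_name)}')
    if stdout_name:
        parts.append(f'> {shlex.quote(stdout_name)}')
    if join_files:
        # stderr follows stdout; stderr_name is ignored in this case.
        parts.append('2>&1')
    elif stderr_name:
        parts.append(f'2> {shlex.quote(stderr_name)}')
    return ' '.join(parts)
```

For instance, `build_run_line(['mpirun', '-np', '8', 'pw.x'], 'aiida.in', 'aiida.out', join_files=True)` yields `mpirun -np 8 pw.x < aiida.in > aiida.out 2>&1`.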

abstract _get_submit_command(submit_script)[source]¶

Return the string to execute to submit a given script.

To be implemented by the plugin.

Parameters: submit_script (str) – the path of the submit script relative to the working directory. IMPORTANT: submit_script should be already escaped.
Returns: the string to execute to submit a given script.

_get_submit_script_footer(job_tmpl)[source]¶

Return the submit script final part, using the parameters from the job_tmpl.

Parameters: job_tmpl – a JobTemplate instance with relevant parameters set.

abstract _get_submit_script_header(job_tmpl)[source]¶

Return the submit script header, using the parameters from the job_tmpl.

Parameters: job_tmpl – a JobTemplate instance with relevant parameters set.

_job_resource_class = None¶

_logger = <Logger aiida.scheduler (REPORT)>¶

abstract _parse_joblist_output(retval, stdout, stderr)[source]¶

Parse the joblist output (‘qstat’), as returned by executing the command returned by _get_joblist_command method.

To be implemented by the plugin.

Return a list of JobInfo objects, one for each job, each with at least its default params implemented.

_parse_kill_output(retval, stdout, stderr)[source]¶

Parse the output of the kill command.

To be implemented by the plugin.

Returns: True if everything seems ok, False otherwise.

abstract _parse_submit_output(retval, stdout, stderr)[source]¶

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Returns: a string with the JobID.
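As a rough sketch of what a plugin implementation might look like, the function below extracts the job ID from a submission message of the form `Submitted batch job <id>`, which is what Slurm's sbatch prints on success. The exception classes are local stand-ins for the ones documented in this package, and the choice of which error to raise in each case is an assumption.

```python
import re


class SchedulerError(Exception):
    """Local stand-in for aiida.schedulers.SchedulerError."""


class SchedulerParsingError(SchedulerError):
    """Local stand-in for aiida.schedulers.SchedulerParsingError."""


def parse_submit_output(retval, stdout, stderr):
    """Return the job ID as a string, or raise if it cannot be found."""
    if retval != 0:
        raise SchedulerError(
            f'submission failed (retval={retval}): {stderr.strip()}')
    match = re.search(r'Submitted batch job (\d+)', stdout)
    if match is None:
        raise SchedulerParsingError(
            f'could not find a job ID in: {stdout.strip()!r}')
    return match.group(1)
```

Returning the ID as a string rather than an integer matches the documented return type and avoids assuming that every scheduler uses purely numeric identifiers.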

classmethod create_job_resource(**kwargs)[source]¶: Create a suitable job resource from the kwargs specified

get_detailed_jobinfo(jobid)[source]¶

Return a string with the output of the detailed_jobinfo command.

At the moment, the output text is just retrieved and stored for logging purposes, but no parsing is performed.

Raises: aiida.common.exceptions.FeatureNotAvailable

get_feature(feature_name)[source]¶

get_jobs(jobs=None, user=None, as_dict=False)[source]¶

Get the list of jobs and return it.

Typically, this function does not need to be modified by the plugins.

Parameters

jobs (list) – a list of jobs to check; only these are checked
user (str) – a string with a user: only jobs of this user are checked
as_dict (bool) – if False (default), a list of JobInfo objects is returned. If True, a dictionary is returned, having as key the job_id and as value the JobInfo object.

Note: typically, only either jobs or user can be specified. See also comments in _get_joblist_command.
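The effect of the as_dict flag can be illustrated with a small standalone helper. Plain dictionaries stand in for JobInfo objects here, `index_jobs` is a hypothetical name, and raising ValueError on duplicate IDs is an assumption made for the sketch.

```python
def index_jobs(job_list, as_dict=False):
    """Return the jobs as a list, or keyed by job_id when as_dict is True."""
    if not as_dict:
        return job_list
    jobs_by_id = {job['job_id']: job for job in job_list}
    if len(jobs_by_id) != len(job_list):
        # Two entries shared a job_id; silently dropping one would hide a bug.
        raise ValueError('duplicate job_id in scheduler output')
    return jobs_by_id
```

The dictionary form is convenient when the caller already knows the job IDs it is tracking and wants constant-time lookup of each job's current state.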

classmethod get_short_doc()[source]¶: Return the first non-empty line of the class docstring, if available

get_submit_script(job_tmpl)[source]¶

Return the submit script as a string.

Parameters: job_tmpl – an aiida.schedulers.datastructures.JobTemplate object.

The plugin returns something like

#!/bin/bash <- this shebang line is configurable to some extent
scheduler_dependent stuff to choose numnodes, numcores, walltime, …
prepend_computer [also from calcinfo, joined with the following?]
prepend_code [from calcinfo]
output of _get_script_main_content
postpend_code
postpend_computer
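The layout above amounts to concatenating a handful of optional text blocks in a fixed order. The sketch below shows that idea in isolation; `assemble_submit_script` and its parameter names are hypothetical, chosen to echo the JobTemplate fields, and the real method may order or join the parts differently.

```python
def assemble_submit_script(header, run_lines, shebang='#!/bin/bash',
                           prepend_text=None, append_text=None, footer=None):
    """Join the script sections in order, skipping any that are empty."""
    parts = [shebang, header, prepend_text, run_lines, append_text, footer]
    return '\n'.join(part for part in parts if part) + '\n'
```

For example, `assemble_submit_script('#SBATCH --nodes=1', 'mpirun a.x', prepend_text='module load x')` produces a four-line script: the shebang, the scheduler directive, the prepended text, and the run line.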

classmethod get_valid_schedulers()[source]¶

job_resource_class = None¶

kill(jobid)[source]¶

Kill a remote job, and try to parse the output message of the scheduler to check if the scheduler accepted the command.

Note: On some schedulers, even if the command is accepted, it may take some seconds for the job to actually disappear from the queue.

Parameters: jobid (str) – the job id to be killed
Returns: True if everything seems ok, False otherwise.

property logger¶: Return the internal logger.

set_transport(transport)[source]¶: Set the transport to be used to query the machine or to submit scripts. This class assumes that the transport is open and active.

submit_from_script(working_directory, submit_script)[source]¶

Goes in the working directory and submits the submit_script.

Return a string with the JobID in a valid format to be used for querying.

Typically, this function does not need to be modified by the plugins.

property transport¶: Return the transport set for this scheduler.

exception aiida.schedulers.SchedulerError[source]¶

Bases: aiida.common.exceptions.AiidaException

__module__ = 'aiida.schedulers.scheduler'¶

exception aiida.schedulers.SchedulerParsingError[source]¶

Bases: aiida.schedulers.scheduler.SchedulerError

__module__ = 'aiida.schedulers.scheduler'¶

Subpackages¶

aiida.schedulers.plugins package
- Submodules

Submodules¶

This module defines the main data structures used by the Scheduler.

In particular, there is the definition of possible job states (JobState), the data structure to be filled for job submission (JobTemplate), and the data structure that is returned when querying for jobs in the scheduler (JobInfo).

class aiida.schedulers.datastructures.JobState[source]¶

Bases: enum.Enum

Enumeration of possible scheduler states of a CalcJob.

There is no FAILED state as every completed job is put in DONE, regardless of success.

DONE = 'done'¶

QUEUED = 'queued'¶

QUEUED_HELD = 'queued held'¶

RUNNING = 'running'¶

SUSPENDED = 'suspended'¶

UNDETERMINED = 'undetermined'¶

__module__ = 'aiida.schedulers.datastructures'¶

class aiida.schedulers.datastructures.JobResource(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

A class to store the job resources. It must be subclassed by the specific plugin, which should define a _job_resource_class attribute pointing to the correct JobResource subclass.

It should at least define the get_tot_num_mpiprocs() method, plus an __init__ to accept its set of variables.

Typical attributes are:

num_machines
num_mpiprocs_per_machine

or (e.g. for SGE)

tot_num_mpiprocs
parallel_env

The __init__ should take care of checking the values. The init should raise only ValueError or TypeError on invalid parameters.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ()¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

Should be implemented in each subclass.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

classmethod get_valid_keys()[source]¶: Return a list of valid keys to be passed to the __init__

class aiida.schedulers.datastructures.JobTemplate(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

A template for submitting jobs. This contains all required information to create the job header.

The required fields are: working_directory, job_name, num_machines, num_mpiprocs_per_machine, argv.

Fields:

shebang: the first line of the submission script

submit_as_hold: if set, the job will be in a ‘hold’ status right after the submission

rerunnable: if the job is rerunnable (boolean)

job_environment: a dictionary with environment variables to set before the execution of the code.

working_directory: the working directory for this job. During submission, the transport will first do a ‘chdir’ to this directory, and then possibly set a scheduler parameter, if this is supported by the scheduler.

email: an email address for sending emails on job events.

email_on_started: if True, ask the scheduler to send an email when the job starts.

email_on_terminated: if True, ask the scheduler to send an email when the job ends. This should also send emails on job failure, when possible.

job_name: the name of this job. The actual name of the job can be different from the one specified here, e.g. if there are unsupported characters, or the name is too long.

sched_output_path: a (relative) file name for the stdout of this job

sched_error_path: a (relative) file name for the stderr of this job

sched_join_files: if True, write both stdout and stderr on the same file (the one specified for stdout)

queue_name: the name of the scheduler queue (sometimes also called partition), on which the job will be submitted.

account: the name of the scheduler account (sometimes also called projectid), on which the job will be submitted.

qos: the quality of service of the scheduler account, on which the job will be submitted.

job_resource: a suitable JobResource subclass with information on how many nodes and cpus it should use. It must be an instance of the aiida.schedulers.Scheduler.job_resource_class class. Use the Scheduler.create_job_resource method to create it.

num_machines: how many machines (or nodes) should be used

num_mpiprocs_per_machine: how many MPI procs should be used on each machine (or node).

priority: a priority for this job. Should be in the format accepted by the specific scheduler.

max_memory_kb: The maximum amount of memory the job is allowed to allocate ON EACH NODE, in kilobytes

max_wallclock_seconds: The maximum wall clock time that all processes of a job are allowed to exist, in seconds

custom_scheduler_commands: a string that will be inserted right after the last scheduler command, and before any other non-scheduler command; useful if some specific flag needs to be added and is not supported by the plugin

prepend_text: a (possibly multi-line) string to be inserted in the scheduler script before the main execution line

append_text: a (possibly multi-line) string to be inserted in the scheduler script after the main execution line

import_sys_environment: import the system environment variables

codes_info: a list of aiida.common.datastructures.CodeInfo objects. Each contains the information necessary to run a single code. At the moment, it can contain:

cmdline_params: a list of strings with the command line arguments of the program to run. This is the main program to be executed. NOTE: The first one is the executable name. For MPI runs, this will probably be “mpirun” or a similar program; this has to be chosen at a higher level.

stdin_name: the (relative) file name to be used as stdin for the program specified with argv.

stdout_name: the (relative) file name to be used as stdout for the program specified with argv.

stderr_name: the (relative) file name to be used as stderr for the program specified with argv.

join_files: if True, stderr is redirected to the same file specified for stdout.

codes_run_mode: sets the run_mode with which the (multiple) codes have to be executed. For example, parallel execution:

    mpirun -np 8 a.x &
    mpirun -np 8 b.x &
    wait

The serial execution would be without the &’s. Values are given by aiida.common.datastructures.CodeRunMode.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('shebang', 'submit_as_hold', 'rerunnable', 'job_environment', 'working_directory', 'email', 'email_on_started', 'email_on_terminated', 'job_name', 'sched_output_path', 'sched_error_path', 'sched_join_files', 'queue_name', 'account', 'qos', 'job_resource', 'priority', 'max_memory_kb', 'max_wallclock_seconds', 'custom_scheduler_commands', 'prepend_text', 'append_text', 'import_sys_environment', 'codes_run_mode', 'codes_info')¶

class aiida.schedulers.datastructures.JobInfo(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

Contains properties for a job in the queue. Most of the fields are taken from DRMAA v.2.

Note that default fields may be undefined. This is expected behavior and the application must cope with this case. Examples are the exit_status of jobs that have not finished yet, or features not supported by the given scheduler.

Fields:

job_id: the job ID on the scheduler

title: the job title, as known by the scheduler

exit_status: the exit status of the job as reported by the operating system on the execution host

terminating_signal: the UNIX signal that was responsible for the end of the job.

annotation: human-readable description of the reason for the job being in the current state or substate.

job_state: the job state (one of those defined in aiida.schedulers.datastructures.JobState)

job_substate: a string with the implementation-specific sub-state

allocated_machines: a list of machines used for the current job. This is a list of aiida.schedulers.datastructures.MachineInfo objects.

job_owner: the job owner as reported by the scheduler

num_mpiprocs: the total number of requested MPI procs

num_cpus: the total number of requested CPUs (cores) [may be undefined]

num_machines: the number of machines (i.e., nodes), required by the job. If allocated_machines is not None, this number must be equal to len(allocated_machines). Otherwise, for schedulers not supporting the retrieval of the full list of allocated machines, this attribute can be used to know at least the number of machines.

queue_name: The name of the queue in which the job is queued or running.

account: The account/projectid in which the job is queued or running.

qos: The quality of service in which the job is queued or running.

wallclock_time_seconds: the accumulated wallclock time, in seconds

requested_wallclock_time_seconds: the requested wallclock time, in seconds

cpu_time: the accumulated cpu time, in seconds

submission_time: the absolute time at which the job was submitted, of type datetime.datetime

dispatch_time: the absolute time at which the job first entered the ‘started’ state, of type datetime.datetime

finish_time: the absolute time at which the job first entered the ‘finished’ state, of type datetime.datetime

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('job_id', 'title', 'exit_status', 'terminating_signal', 'annotation', 'job_state', 'job_substate', 'allocated_machines', 'job_owner', 'num_mpiprocs', 'num_cpus', 'num_machines', 'queue_name', 'account', 'qos', 'wallclock_time_seconds', 'requested_wallclock_time_seconds', 'cpu_time', 'submission_time', 'dispatch_time', 'finish_time')¶

static _deserialize_date(value)[source]¶: Deserialise a date :param value: The date vlue :return: The deserialised date

static _deserialize_job_state(job_state)[source]¶: Return an instance of JobState from the job_state string.

static _serialize_date(value)[source]¶

Serialise a date value.

Parameters: value – The value to serialise
Returns: The serialised value

static _serialize_job_state(job_state)[source]¶: Return the serialized value of the JobState instance.

_special_serializers = {'dispatch_time': 'date', 'finish_time': 'date', 'job_state': 'job_state', 'submission_time': 'date'}¶

deserialize_field(value, field_type)[source]¶

Deserialise the value of a particular field with a type.

Parameters

value – The value
field_type – The field type

Returns

The deserialised value

load_from_serialized(data)[source]¶

Load value from serialised data.

Parameters: data – The data to load from
Returns: The value after loading

serialize()[source]¶

Serialise the current data.

Returns: A serialised representation of the current data

serialize_field(value, field_type)[source]¶

Serialise a particular field value

Parameters

value – The value to serialise
field_type – The field type

Returns

The serialised value
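
The serialisation scheme described above can be sketched without AiiDA installed. The snippet below is an illustration only: SPECIAL_SERIALIZERS mirrors the documented _special_serializers mapping, but the stand-in JobState enum, the module-level helper functions, and the ISO-8601 date encoding are assumptions and not the library's implementation.

```python
from datetime import datetime
from enum import Enum


class JobState(Enum):
    """Stand-in for aiida.schedulers.JobState (subset of members)."""
    RUNNING = 'running'
    DONE = 'done'


# Mirrors the documented _special_serializers mapping: field name -> field type.
SPECIAL_SERIALIZERS = {
    'dispatch_time': 'date',
    'finish_time': 'date',
    'job_state': 'job_state',
    'submission_time': 'date',
}


def serialize_field(value, field_type):
    """Convert one field to a JSON-friendly value (hypothetical encoding)."""
    if value is None or field_type is None:
        return value
    if field_type == 'date':
        return value.isoformat()  # assumption: dates stored as ISO-8601 strings
    if field_type == 'job_state':
        return value.value
    raise ValueError(f'unknown field type: {field_type}')


def deserialize_field(value, field_type):
    """Inverse of serialize_field."""
    if value is None or field_type is None:
        return value
    if field_type == 'date':
        return datetime.fromisoformat(value)
    if field_type == 'job_state':
        return JobState(value)
    raise ValueError(f'unknown field type: {field_type}')


def serialize(fields):
    """Serialise every field, using a special serializer where one is registered."""
    return {key: serialize_field(value, SPECIAL_SERIALIZERS.get(key))
            for key, value in fields.items()}


def load_from_serialized(data):
    """Rebuild the original field values from their serialised form."""
    return {key: deserialize_field(value, SPECIAL_SERIALIZERS.get(key))
            for key, value in data.items()}


job = {'job_id': '1234', 'job_state': JobState.RUNNING,
       'submission_time': datetime(2020, 1, 2, 3, 4, 5)}
roundtrip = load_from_serialized(serialize(job))
```

Fields without an entry in the mapping (such as job_id here) pass through unchanged, which is why only dates and the job state need dedicated serializers.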

class aiida.schedulers.datastructures.NodeNumberJobResource(**kwargs)[source]¶

Bases: aiida.schedulers.datastructures.JobResource

An implementation of JobResource for schedulers that support the specification of a number of nodes and a number of cpus per node

__init__(**kwargs)[source]¶

Initialize the job resources from the passed arguments (the valid keys can be obtained with the function self.get_valid_keys()).

Should raise only ValueError or TypeError on invalid parameters.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('num_machines', 'num_mpiprocs_per_machine', 'num_cores_per_machine', 'num_cores_per_mpiproc')¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶: Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

classmethod get_valid_keys()[source]¶: Return a list of valid keys to be passed to the __init__
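
As a rough illustration of the contract described for this class, the following standalone helper validates a node-based request and computes the total number of MPI processes. It assumes the total is num_machines multiplied by num_mpiprocs_per_machine; the function name and the exact validation rules are hypothetical and are not taken from the AiiDA source.

```python
def tot_num_mpiprocs(num_machines, num_mpiprocs_per_machine):
    """Total MPI processes for a node-based resource request (illustrative helper)."""
    for name, value in (('num_machines', num_machines),
                        ('num_mpiprocs_per_machine', num_mpiprocs_per_machine)):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f'{name} must be an integer')
        if value < 1:
            raise ValueError(f'{name} must be at least 1')
    # Assumption: every machine runs the same number of MPI processes.
    return num_machines * num_mpiprocs_per_machine
```

Only TypeError and ValueError are raised on bad input, matching the requirement stated for __init__.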

class aiida.schedulers.datastructures.ParEnvJobResource(**kwargs)[source]¶

Bases: aiida.schedulers.datastructures.JobResource

An implementation of JobResource for schedulers that support the specification of a parallel environment (a string) + the total number of nodes

__init__(**kwargs)[source]¶

Initialize the job resources from the passed arguments (the valid keys can be obtained with the function self.get_valid_keys()).

Raises

ValueError – on invalid parameters.
TypeError – on invalid parameters.
aiida.common.ConfigurationError – if default_mpiprocs_per_machine was set for this computer, since ParEnvJobResource cannot accept this parameter.

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('parallel_env', 'tot_num_mpiprocs', 'default_mpiprocs_per_machine')¶

classmethod accepts_default_mpiprocs_per_machine()[source]¶: Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]¶: Return the total number of MPI processes of this job resource.

class aiida.schedulers.datastructures.MachineInfo(dictionary=None)[source]¶

Bases: aiida.common.extendeddicts.DefaultFieldsAttributeDict

Similar to what DRMAA v.2 defines as SlotInfo, this identifies each machine (also called ‘node’ on some schedulers) on which a job is running, and how many CPUs are being used. (Some of the fields may be undefined.)

name: name of the machine
num_cpus: number of cores used by the job on this machine
num_mpiprocs: number of MPI processes used by the job on this machine

__module__ = 'aiida.schedulers.datastructures'¶

_default_fields = ('name', 'num_mpiprocs', 'num_cpus')¶
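
To make the relationship between per-machine records and job-level counts concrete, here is a small sketch using a stand-in dataclass instead of MachineInfo. The rule that num_machines equals len(allocated_machines) comes from the JobInfo field description; summing the per-machine values to obtain totals, and treating a total as undefined when any machine leaves it undefined, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MachineRecord:
    """Stand-in for MachineInfo: one machine used by a job."""
    name: str
    num_mpiprocs: Optional[int] = None  # may be undefined
    num_cpus: Optional[int] = None      # may be undefined


def summarize(allocated_machines):
    """Derive job-level counts from the per-machine records."""
    def total(attribute):
        values = [getattr(machine, attribute) for machine in allocated_machines]
        # If any machine leaves the value undefined, the total is undefined too.
        return None if any(value is None for value in values) else sum(values)

    return {
        'num_machines': len(allocated_machines),
        'num_mpiprocs': total('num_mpiprocs'),
        'num_cpus': total('num_cpus'),
    }


machines = [MachineRecord('node01', num_mpiprocs=16, num_cpus=16),
            MachineRecord('node02', num_mpiprocs=16)]
summary = summarize(machines)
```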

Implementation of Scheduler base class.

class aiida.schedulers.scheduler.Scheduler[source]¶

Bases: object

Base class for all schedulers.

__abstractmethods__ = frozenset({'_get_joblist_command', '_get_submit_command', '_get_submit_script_header', '_parse_joblist_output', '_parse_submit_output'})¶


__init__()[source]¶: Initialize self. See help(type(self)) for accurate signature.

__module__ = 'aiida.schedulers.scheduler'¶

__weakref__¶: list of weak references to the object (if defined)

_abc_impl = <_abc_data object>¶

_features = {}¶

_get_detailed_jobinfo_command(jobid)[source]¶

Return the command to run to get the detailed information on a job. This is typically called after the job has finished, to retrieve the most detailed information possible about the job. This is needed because most schedulers simply make finished jobs disappear from the ‘qstat’ output, whereas it is sometimes useful to have more detailed information about the job, such as its exit status.

Raises: aiida.common.exceptions.FeatureNotAvailable

abstract _get_joblist_command(jobs=None, user=None)[source]¶

Return the qstat (or equivalent) command to run with the required command-line parameters to get the most complete description possible; the command also selects the output format expected by the _parse_joblist_output method.

Must be implemented in the plugin.

Parameters

jobs – either None to get a list of all jobs in the machine, or a list of jobs.
user – either None, or a string with the username (to show only jobs of the specific user).

Note: typically one can pass only either jobs or user, depending on the specific plugin. The choice can be made according to the value returned by self.get_feature(‘can_query_by_user’).
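
The choice between jobs and user can be sketched as follows. This is a standalone illustration, not a real plugin: the command name myqstat and its --full and --user flags are invented placeholders, and the can_query_by_user argument stands in for the value that get_feature(‘can_query_by_user’) would return.

```python
import shlex


def build_joblist_command(jobs=None, user=None, can_query_by_user=True):
    """Build a qstat-like command line for a hypothetical scheduler CLI."""
    if jobs and user:
        # Assumption: this imaginary scheduler cannot combine both filters.
        raise ValueError('pass either jobs or user, not both')
    command = ['myqstat', '--full']  # hypothetical command and flag
    if user:
        if not can_query_by_user:
            raise ValueError('this scheduler cannot filter by user')
        command += ['--user', shlex.quote(user)]
    elif jobs:
        if isinstance(jobs, str):
            jobs = [jobs]  # accept a single job id as well as a list
        command += [shlex.quote(str(job)) for job in jobs]
    return ' '.join(command)
```

With neither argument the command lists all jobs; every user-supplied value is escaped with shlex.quote before it reaches the shell.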

_get_kill_command(jobid)[source]¶

Return the command to kill the job with specified jobid.

To be implemented by the plugin.

_get_run_line(codes_info, codes_run_mode)[source]¶

Return a string with the line to execute a specific code with specific arguments.

Parameters

codes_info – a list of aiida.common.datastructures.CodeInfo objects. Each contains the information needed to run the code, i.e. cmdline_params, stdin_name, stdout_name, stderr_name, join_files. See the documentation of JobTemplate and CodeInfo.
codes_run_mode – contains the information on how to launch the multiple codes, as described in aiida.common.datastructures.CodeRunMode.

argv: an array with the executable and the command line arguments. The first argument is the executable. This should contain everything, including the mpirun command etc.

stdin_name: the filename to be used as stdin, relative to the working dir, or None if no stdin redirection is required.

stdout_name: the filename to be used to store the standard output, relative to the working dir, or None if no stdout redirection is required.

stderr_name: the filename to be used to store the standard error, relative to the working dir, or None if no stderr redirection is required.

join_files: if True, stderr is redirected to stdout; the value of stderr_name is ignored.

Returns: a string with the following format: [executable] [args] {[ < stdin ]} {[ > stdout ]} {[2>&1 | 2> stderr]}
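
A minimal sketch of how such a run line could be assembled for a single code is shown below. The function name build_run_line is hypothetical; it takes the fields described above as plain arguments, uses < for stdin and > for stdout redirection, and escapes every token with shlex.quote. The real method operates on CodeInfo objects and a run mode, which this sketch leaves out.

```python
import shlex


def build_run_line(argv, stdin_name=None, stdout_name=None,
                   stderr_name=None, join_files=False):
    """Return '[executable] [args] < stdin > stdout 2>&1' style command line."""
    parts = [shlex.quote(str(arg)) for arg in argv]
    if stdin_name:
        parts.append(f'< {shlex.quote(stdin_name)}')
    if stdout_name:
        parts.append(f'> {shlex.quote(stdout_name)}')
    if join_files:
        # stderr follows stdout; stderr_name is ignored in this case.
        parts.append('2>&1')
    elif stderr_name:
        parts.append(f'2> {shlex.quote(stderr_name)}')
    return ' '.join(parts)
```

Placing 2>&1 after the stdout redirection matters: the shell applies redirections left to right, so stderr only joins the output file if stdout has already been redirected.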

abstract _get_submit_command(submit_script)[source]¶

Return the string to execute to submit a given script.

To be implemented by the plugin.

Parameters: submit_script (str) – the path of the submit script relative to the working directory. IMPORTANT: submit_script should be already escaped.
Returns: the string to execute to submit a given script.

_get_submit_script_footer(job_tmpl)[source]¶

Return the submit script final part, using the parameters from the job_tmpl.

Parameters: job_tmpl – a JobTemplate instance with relevant parameters set.

abstract _get_submit_script_header(job_tmpl)[source]¶

Return the submit script header, using the parameters from the job_tmpl.

Parameters: job_tmpl – a JobTemplate instance with relevant parameters set.

_job_resource_class = None¶

_logger = <Logger aiida.scheduler (REPORT)>¶

abstract _parse_joblist_output(retval, stdout, stderr)[source]¶

Parse the joblist output (‘qstat’), as returned by executing the command returned by _get_joblist_command method.

To be implemented by the plugin.

Return a list of JobInfo objects, one for each job, each with at least its default params implemented.

_parse_kill_output(retval, stdout, stderr)[source]¶

Parse the output of the kill command.

To be implemented by the plugin.

Returns: True if everything seems ok, False otherwise.

abstract _parse_submit_output(retval, stdout, stderr)[source]¶

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Returns: a string with the JobID.
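
What a plugin's submit-output parser might look like is sketched below. Everything specific here is an assumption: the message format "Submitted job <id>" belongs to an imaginary scheduler, SubmitParsingError is a stand-in exception, and raising on a non-zero return value is one possible policy rather than documented behaviour.

```python
import re


class SubmitParsingError(Exception):
    """Stand-in for a scheduler parsing error."""


def parse_submit_output(retval, stdout, stderr):
    """Extract the job id from the output of a hypothetical submit command."""
    if retval != 0:
        raise SubmitParsingError(
            f'submit failed (exit code {retval}): {stderr.strip()}')
    # Look for a line of the (invented) form 'Submitted job <id>'.
    match = re.search(r'^Submitted job (\S+)$', stdout.strip(), flags=re.MULTILINE)
    if match is None:
        raise SubmitParsingError(f'no job id found in: {stdout!r}')
    return match.group(1)
```

The job id is returned as a string, so identifiers such as 77.server survive unchanged and can be passed back to the joblist command.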

classmethod create_job_resource(**kwargs)[source]¶: Create a suitable job resource from the kwargs specified

get_detailed_jobinfo(jobid)[source]¶

Return a string with the output of the detailed_jobinfo command.

At the moment, the output text is just retrieved and stored for logging purposes, but no parsing is performed.

Raises: aiida.common.exceptions.FeatureNotAvailable

get_feature(feature_name)[source]¶

get_jobs(jobs=None, user=None, as_dict=False)[source]¶

Get the list of jobs and return it.

Typically, this function does not need to be modified by the plugins.

Parameters

jobs (list) – a list of jobs to check; only these are checked
user (str) – a string with a user: only jobs of this user are checked
as_dict (bool) – if False (default), a list of JobInfo objects is returned. If True, a dictionary is returned, having as key the job_id and as value the JobInfo object.

Note: typically, only either jobs or user can be specified. See also comments in _get_joblist_command.
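
The effect of as_dict can be illustrated with plain dictionaries standing in for JobInfo objects. The helper name collect_jobs and the decision to raise ValueError when a job has no job_id are assumptions made for this sketch.

```python
def collect_jobs(joblist, as_dict=False):
    """Return the parsed jobs as a list, or as a dict keyed by job_id."""
    if not as_dict:
        return joblist
    jobdict = {job.get('job_id'): job for job in joblist}
    if None in jobdict:
        # Assumption: a job without an id cannot be looked up, so reject it.
        raise ValueError('found at least one job without a job_id')
    return jobdict


parsed = [{'job_id': '101', 'job_state': 'running'},
          {'job_id': '102', 'job_state': 'queued'}]
by_id = collect_jobs(parsed, as_dict=True)
```

The dictionary form is convenient when the caller already holds a set of job ids and wants to look up each state directly.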

classmethod get_short_doc()[source]¶: Return the first non-empty line of the class docstring, if available

get_submit_script(job_tmpl)[source]¶

Return the submit script as a string.

Parameters: job_tmpl – an aiida.schedulers.datastructures.JobTemplate object.

The plugin returns something like

#!/bin/bash <- this shebang line is configurable to some extent
scheduler_dependent stuff to choose numnodes, numcores, walltime, …
prepend_computer [also from calcinfo, joined with the following?]
prepend_code [from calcinfo]
output of _get_script_main_content
postpend_code
postpend_computer
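
The ordering of the script sections can be sketched as a simple concatenation. The function assemble_submit_script and its parameter names are hypothetical; the real method derives each section from the JobTemplate, whereas this sketch takes them as ready-made strings.

```python
def assemble_submit_script(shebang, header, run_line,
                           prepend_text='', append_text='', footer=''):
    """Join the script sections in order, skipping the empty ones."""
    sections = [shebang, header, prepend_text, run_line, append_text, footer]
    return '\n'.join(section for section in sections if section) + '\n'


script = assemble_submit_script(
    shebang='#!/bin/bash',
    header='# scheduler directives go here',  # placeholder header text
    prepend_text='module load mycode',        # placeholder environment setup
    run_line='mycode < input > output',
)
```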

classmethod get_valid_schedulers()[source]¶

job_resource_class = None¶

kill(jobid)[source]¶

Kill a remote job, and try to parse the output message of the scheduler to check if the scheduler accepted the command.

Note: on some schedulers, even if the command is accepted, it may take some seconds for the job to actually disappear from the queue.

Parameters: jobid (str) – the job id to be killed
Returns: True if everything seems ok, False otherwise.

property logger¶: Return the internal logger.

set_transport(transport)[source]¶: Set the transport to be used to query the machine or to submit scripts. This class assumes that the transport is open and active.

submit_from_script(working_directory, submit_script)[source]¶

Change to the working directory and submit the submit_script.

Return a string with the JobID in a valid format to be used for querying.

Typically, this function does not need to be modified by the plugins.

property transport¶: Return the transport set for this scheduler.

exception aiida.schedulers.scheduler.SchedulerError[source]¶

Bases: aiida.common.exceptions.AiidaException

__module__ = 'aiida.schedulers.scheduler'¶

exception aiida.schedulers.scheduler.SchedulerParsingError[source]¶

Bases: aiida.schedulers.scheduler.SchedulerError

__module__ = 'aiida.schedulers.scheduler'¶

Datastructures test

class aiida.schedulers.test_datastructures.TestNodeNumberJobResource(methodName='runTest')[source]¶

Bases: unittest.case.TestCase

Unit tests for the NodeNumberJobResource class.

__module__ = 'aiida.schedulers.test_datastructures'¶

test_init()[source]¶: Test the __init__ of the NodeNumberJobResource class