.. _topics:processes:usage:

=====
Usage
=====

.. note:: This chapter assumes knowledge of the previous section on the :ref:`basic concept of processes`.

This section will explain the aspects of working with processes that apply to all processes.
Details that only pertain to a specific sub type of process are documented in their respective sections:

* :ref:`calculation functions`
* :ref:`calculation jobs`
* :ref:`work functions`
* :ref:`work chains`

.. _topics:processes:usage:defining:

Defining processes
==================

.. _topics:processes:usage:spec:

Process specification
---------------------

How a process defines the inputs that it requires or can optionally take depends on the process type.
The inputs of a :py:class:`~aiida.engine.processes.calcjobs.calcjob.CalcJob` and a :py:class:`~aiida.engine.processes.workchains.workchain.WorkChain` are given by the :py:class:`~aiida.engine.processes.process_spec.ProcessSpec` class, which is defined through the :py:meth:`~aiida.engine.processes.process.Process.define` method.
For process functions, the :py:class:`~aiida.engine.processes.process_spec.ProcessSpec` is dynamically generated by the engine from the signature of the decorated function.
Therefore, to determine what inputs a process takes, one simply has to look at the process specification in the ``define`` method or at the function signature.
For the :py:class:`~aiida.engine.processes.calcjobs.calcjob.CalcJob` and :py:class:`~aiida.engine.processes.workchains.workchain.WorkChain` there is also the concept of the :ref:`process builder`, which allows one to inspect the inputs with tab-completion and help strings in the shell.

The three most important attributes of the :py:class:`~aiida.engine.processes.process_spec.ProcessSpec` are:

* ``inputs``
* ``outputs``
* ``exit_codes``

Through these attributes, one can define what inputs a process takes, what outputs it will produce and what potential exit codes it can return in case of errors.
Just by looking at a process specification, then, one knows exactly *what* will happen, just not *how* it will happen.
The ``inputs`` and ``outputs`` attributes are *namespaces* that contain so-called *ports*, each one of which represents a specific input or output.
The namespaces can be arbitrarily nested with ports, which is why they are called *port namespaces*.
The port and port namespace are implemented by the :py:class:`~plumpy.Port` and :py:class:`~aiida.engine.processes.ports.PortNamespace` classes, respectively.

.. _topics:processes:usage:ports_portnamespaces:

Ports and Port namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^

To define an input for a process specification, we only need to add a port to the ``inputs`` port namespace, as follows:

.. code:: python

    spec = ProcessSpec()
    spec.input('parameters')

The ``input`` method will create an instance of :py:class:`~aiida.engine.processes.ports.InputPort`, a sub class of the base :py:class:`~plumpy.Port`, and will add it to the ``inputs`` port namespace of the spec.
Creating an output is just as easy, but one should use the :py:meth:`~plumpy.ProcessSpec.output` method instead:

.. code:: python

    spec = ProcessSpec()
    spec.output('result')

This will cause an instance of :py:class:`~aiida.engine.processes.ports.CalcJobOutputPort`, also a sub class of the base :py:class:`~plumpy.Port`, to be created and added to the ``outputs`` specification attribute.
Recall that ``inputs`` and ``outputs`` are instances of a :py:class:`~aiida.engine.processes.ports.PortNamespace`, which means that they can contain any port.
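To see how these pieces fit together, the following is a minimal sketch of how a hypothetical work chain could declare its ports in the ``define`` method; the class name, port names and the single outline step are made up for illustration:

.. code:: python

    from aiida.engine import WorkChain
    from aiida.orm import Int


    class AddWorkChain(WorkChain):
        """Hypothetical work chain that adds two integers, for illustration only."""

        @classmethod
        def define(cls, spec):
            super().define(spec)
            # Input ports are added to the ``inputs`` namespace of the spec.
            spec.input('x', valid_type=Int)
            spec.input('y', valid_type=Int)
            # Output ports are added to the ``outputs`` namespace of the spec.
            spec.output('sum', valid_type=Int)
            spec.outline(cls.add)

        def add(self):
            # Arithmetic on ``Int`` nodes returns a new, unstored ``Int`` node; in a
            # real work chain the result would typically be created by a calculation
            # function in order to preserve the provenance.
            result = self.inputs.x + self.inputs.y
            self.out('sum', result.store())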
The :py:class:`~aiida.engine.processes.ports.PortNamespace` is itself also a port, so it can be added to another port namespace, which allows one to create nested port namespaces.
Creating a new namespace in, for example, the ``inputs`` namespace is as simple as:

.. code:: python

    spec = ProcessSpec()
    spec.input_namespace('namespace')

This will create a new ``PortNamespace`` named ``namespace`` in the ``inputs`` namespace of the spec.
You can create arbitrarily nested namespaces in one statement by separating them with a ``.``, as shown here:

.. code:: python

    spec = ProcessSpec()
    spec.input_namespace('nested.namespace')

This command will result in a ``PortNamespace`` named ``namespace`` being nested inside another ``PortNamespace`` called ``nested``.

.. note:: Because the period is reserved to denote different nested namespaces, it cannot be used in the name of terminal input and output ports, as that could be misinterpreted later as a port nested in a namespace.

Graphically, this can be visualized as a nested dictionary and will look like the following:

.. code:: python

    'inputs': {
        'nested': {
            'namespace': {}
        }
    }

The ``outputs`` attribute of the ``ProcessSpec`` is also a ``PortNamespace``, just like ``inputs``, with the only difference that it will create ``OutputPort`` instead of ``InputPort`` instances.
Therefore, the same concept of nesting through ``PortNamespaces`` applies to the outputs of a ``ProcessSpec``.

.. _topics:processes:usage:validation_defaults:

Validation and defaults
^^^^^^^^^^^^^^^^^^^^^^^

In the previous section, we saw that the ``ProcessSpec`` uses the ``PortNamespace``, ``InputPort`` and ``OutputPort`` to define the input and output structure of the ``Process``.
The underlying concept that allows this nesting of ports is that the ``PortNamespace``, ``InputPort`` and ``OutputPort`` are all subclasses of :py:class:`~plumpy.ports.Port`.
As subclasses of the same class, they share several properties and attributes, for example those related to validation and default values.
All three have the following attributes (with the exception of the ``OutputPort``, which has no ``default`` attribute):

* ``default``
* ``required``
* ``valid_type``
* ``validator``

These attributes can all be set upon construction of the port or after the fact, as long as the spec has not been sealed; in practice, this means that they can be altered freely within the ``define`` method of the corresponding ``Process``.
An example input port that explicitly sets all these attributes is the following:

.. code:: python

    spec.input('positive_number', required=False, default=lambda: Int(1), valid_type=(Int, Float), validator=is_number_positive)

Here we define an input named ``positive_number`` that should be of type ``Int`` or ``Float`` and should pass the test of the ``is_number_positive`` validator.
If no value is passed, the default will be used.

.. warning::

    In Python, it is good practice to avoid mutable defaults for function arguments, `since they are instantiated at function definition and reused for each invocation `_.
    This can lead to unexpected results when the default value is changed between function calls.
    In the context of AiiDA, nodes (both stored and unstored) are considered *mutable* and should therefore *not* be used as default values for process ports.
    However, it is possible to use a lambda that returns a node instance, as done in the example above.
    This will return a new instance of the node with the given value each time the process is instantiated.
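To make this point concrete, here is a minimal sketch (assuming a ``spec`` that is still under construction, as in the examples above; the ``count`` port is made up for illustration):

.. code:: python

    from aiida.orm import Int

    # Anti-pattern: a node instance as default would be shared across invocations.
    # spec.input('count', valid_type=Int, default=Int(1))

    # Correct: the lambda is called anew for every instantiation, so each process
    # receives a fresh, unstored `Int` node.
    spec.input('count', valid_type=Int, default=lambda: Int(1))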
Note that the validator is nothing more than a free function that takes a single argument, namely the value that is to be validated.
If nothing is returned, the value is considered to be valid.
To signal that the value is invalid and to have a validation error raised, simply return a string with the validation error message, for example:

.. code:: python

    def is_number_positive(number):
        if number < 0:
            return 'The number has to be greater than or equal to zero'

The ``valid_type`` can define a single type or a tuple of valid types.

.. versionadded:: 2.1

    Optional ports can now accept ``None``

    If a port is marked as optional through ``required=False`` and defines a ``valid_type``, the port will also accept ``None`` as a value, whereas before this would raise a validation error.
    This is accomplished by automatically adding ``NoneType`` to the ``valid_type`` tuple.
    Ports that do not define a ``valid_type`` are not affected.

.. note:: Note that by default all ports are required, but specifying a default value implies that the input is not required, so specifying ``required=False`` is not necessary in that case.
    It was added to the example above simply for clarity.

The validation of input and output values with respect to the specification of the corresponding port happens at the instantiation of the process and when it is finalized, respectively.
If the inputs are invalid, a corresponding exception will be thrown and the process instantiation will fail.
When the outputs fail to be validated, likewise an exception will be thrown and the process state will be set to ``Excepted``.

.. _topics:processes:usage:dynamic_namespaces:

Dynamic namespaces
^^^^^^^^^^^^^^^^^^

In the previous section we described the various attributes related to validation and claimed that all the port variants share those attributes, yet we only discussed the ``InputPort`` and ``OutputPort`` explicitly.
The statement, however, is still correct, and the ``PortNamespace`` has the same attributes.
You might then wonder what the meaning of a ``valid_type`` or ``default`` is for a ``PortNamespace``, if all it does is contain ``InputPorts``, ``OutputPorts`` or other ``PortNamespaces``.
The answer to this question lies in the ``PortNamespace`` attribute ``dynamic``.

Often, when designing the specification of a ``Process``, we cannot know exactly which inputs we want to be able to pass to the process.
However, with the concept of the ``InputPort`` and ``OutputPort`` one *does* need to know exactly how many values one expects at least, as they do have to be defined.
This is where the ``dynamic`` attribute of the ``PortNamespace`` comes in.
By default this is set to ``False``, but by setting it to ``True`` one indicates that the namespace can take a number of values that is unknown at the time of the definition of the specification.
This now explains the meaning of the ``valid_type``, ``validator`` and ``default`` attributes in the context of the ``PortNamespace``: if you do mark a namespace as dynamic, you may still want to limit the set of values that are acceptable, which you can do by specifying a valid type and/or validator.
The values that are eventually passed to the port namespace will then be validated according to these rules, exactly as a value for a regular input port would be.
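A minimal sketch of a dynamic namespace, again assuming a ``spec`` under construction (the ``integers`` namespace and the input names are made up for illustration):

.. code:: python

    from aiida.orm import Int

    # Accept an arbitrary number of `Int` inputs in the `integers` namespace.
    spec.input_namespace('integers', dynamic=True, valid_type=Int)

    # At launch time, any nested dictionary of `Int` nodes would then be accepted:
    inputs = {'integers': {'first': Int(1), 'second': Int(2)}}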
.. _topics:processes:usage:non_db:

Non-storable inputs
^^^^^^^^^^^^^^^^^^^

In principle, the only valid types for inputs and outputs are instances of a :py:class:`~aiida.orm.nodes.data.data.Data` node, or one of its sub classes, as that is the only data type that can be recorded in the provenance graph as an input or output of a process.
However, there are cases where you might want to pass an input to a process whose provenance you do not care about, and for which you would therefore want to pass a non-storable type anyway.

.. note:: AiiDA allows you to break the provenance so as not to be too restrictive, but it always tries to urge you and guide you in a direction that keeps the provenance.
    There are legitimate reasons to break it regardless, but make sure you think about the implications and whether you are really willing to lose the information.

For this situation, the ``InputPort`` has the attribute ``non_db``.
By default this is set to ``False``, but by setting it to ``True`` we can indicate that the values passed to the port should not be stored as a node in the provenance graph and linked to the process node.
This allows one to pass any normal value that one would also be able to pass to a normal function.

.. _topics:processes:usage:serialize_inputs:

Automatic input serialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Quite often, inputs which are given as Python data types need to be cast to the corresponding AiiDA type before being passed to a process.
Doing this manually can be cumbersome, so you can define a function when defining the process specification which does the conversion automatically.
This function, passed as the ``serializer`` parameter to ``spec.input``, is invoked if the given input is not ``None`` *and* not already an AiiDA type.

For inputs which are stored in the database (``non_db=False``), the serialization function should return an AiiDA data type.
For ``non_db`` inputs, the function must be idempotent because it might be applied more than once.

The following example work chain takes three inputs ``a``, ``b``, ``c``, and simply returns the given inputs.
The :func:`~aiida.orm.nodes.data.base.to_aiida_type` function is used as the serialization function.

.. include:: include/snippets/serialize/workchain_serialize.py
    :code: python

This work chain can now be called with native Python types, which will automatically be converted to AiiDA types by the :func:`~aiida.orm.nodes.data.base.to_aiida_type` function.
Note that the module which defines the corresponding AiiDA type must be loaded for it to be recognized by :func:`~aiida.orm.nodes.data.base.to_aiida_type`.

.. include:: include/snippets/serialize/run_workchain_serialize.py
    :code: python

Of course, you can also use the serialization feature to perform a more complex serialization of the inputs.
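As a sketch of such a custom serializer (the ``serialize_list`` helper and the ``values`` port are hypothetical), one could wrap plain Python lists in an AiiDA ``List`` node:

.. code:: python

    from aiida.orm import List

    def serialize_list(value):
        """Wrap a plain Python list in an AiiDA `List` node."""
        return List(list=value)

    spec.input('values', serializer=serialize_list)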
.. _topics:processes:usage:exit_codes:

Exit codes
^^^^^^^^^^

Any ``Process`` will most likely have one or multiple expected failure modes.
To clearly communicate to the caller what went wrong, the ``Process`` supports setting its ``exit_status``.
This ``exit_status``, a non-negative integer, is an attribute of the process node and, by convention, a value of zero means the process was successful, whereas any other value indicates failure.
This concept of an exit code, with an integer exit status, `is a common concept in programming `_ and a standard way for programs to communicate the result of their execution.

Potential exit codes for the ``Process`` can be defined through the ``ProcessSpec``, just like inputs and outputs.
Any exit code consists of a positive non-zero integer, a string label to reference it and a more detailed description of the problem that triggers the exit code.
Consider the following example:

.. code:: python

    spec = ProcessSpec()
    spec.exit_code(418, 'ERROR_I_AM_A_TEAPOT', 'the process had an identity crisis')

This defines an exit code for the ``Process`` with exit status ``418`` and exit message ``the process had an identity crisis``.
The string ``ERROR_I_AM_A_TEAPOT`` is a label that the developer can use to reference this particular exit code somewhere in the ``Process`` code itself.
Whenever a ``Process`` exits through a particular exit code, the caller will be able to introspect it through the ``exit_status`` and ``exit_message`` attributes of the node.
Assume, for example, that we ran a ``Process`` that returned the exit code described above; the caller would be able to do the following:

.. code:: python

    In [1]: node = load_node()

    In [2]: node.exit_status
    Out[2]: 418

    In [3]: node.exit_message
    Out[3]: 'the process had an identity crisis'

This is useful, because the caller can now programmatically decide, based on the ``exit_status``, how to proceed.
This is an infinitely more robust way of communicating specific errors to a non-human than parsing text-based logs or reports.
Additionally, the exit codes make it very easy to query for failed processes with specific error codes.

.. seealso::

    Additional documentation, specific to certain process types, can be found in the following sections:

    - :ref:`Process functions`
    - :ref:`Work functions`
    - :ref:`CalcJob parsers`
    - :ref:`Workchain exit code specification`
    - :ref:`External code plugins`
    - :ref:`Restart workchains`

.. _topics:processes:usage:exit_code_conventions:

Exit code conventions
.....................

In principle, the only restriction on the exit status of an exit code is that it should be a positive integer or zero.
However, to make effective use of exit codes, there are some guidelines and conventions as to which integers to use.
Note that since the following rules are *guidelines*, you can choose to ignore them and currently the engine will not complain, but this might change in the future.
Regardless, we advise you to follow the guidelines since they will improve the interoperability of your code with other existing plugins.
The following integer ranges are reserved or suggested:

* 0 - 99: Reserved for internal use by ``aiida-core``
* 100 - 199: Reserved for errors parsed from the scheduler output of calculation jobs (note: this is not yet implemented)
* 200 - 299: Suggested to be used for process input validation errors
* 300 - 399: Suggested for critical process errors

For any other exit codes, one can use the integers from 400 and up.
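To illustrate how a defined exit code is typically used, here is a minimal, hypothetical work chain that returns it from an outline step; returning an exit code terminates the process and sets the corresponding ``exit_status`` and ``exit_message`` on the node:

.. code:: python

    from aiida.engine import WorkChain


    class TeapotWorkChain(WorkChain):
        """Hypothetical work chain that always fails with a predefined exit code."""

        @classmethod
        def define(cls, spec):
            super().define(spec)
            spec.outline(cls.check)
            spec.exit_code(418, 'ERROR_I_AM_A_TEAPOT', 'the process had an identity crisis')

        def check(self):
            # Exit codes defined on the spec are available through `self.exit_codes`.
            return self.exit_codes.ERROR_I_AM_A_TEAPOT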
.. _topics:processes:usage:metadata:

Process metadata
----------------

Each process, in addition to the normal inputs defined through its process specification, can take optional 'metadata' inputs.
These metadata differ from inputs in the sense that they are not nodes that will show up as inputs in the provenance graph of the executed process.
Rather, they are inputs that slightly modify the behavior of the process or allow attributes to be set on the process node that represents its execution.
The following metadata inputs are available for *all* process classes:

* ``label``: will set the label on the ``ProcessNode``
* ``description``: will set the description on the ``ProcessNode``
* ``store_provenance``: a boolean flag, by default ``True``, that when set to ``False`` will ensure that the execution of the process is **not** stored in the provenance graph

Sub classes of the :py:class:`~aiida.engine.processes.process.Process` class can specify further metadata inputs; refer to their specific documentation for details.
To pass any of these metadata options to a process, simply pass them in a dictionary under the key ``metadata`` in the inputs when launching the process.
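For example, a minimal sketch of an inputs dictionary carrying metadata (the ``x`` and ``y`` inputs are hypothetical and depend on the process being launched):

.. code:: python

    from aiida.orm import Int

    inputs = {
        'x': Int(1),
        'y': Int(2),
        'metadata': {
            'label': 'my-first-addition',
            'description': 'An example process launched with explicit metadata.',
        },
    }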
How a process can be launched is explained in the following section.

.. _topics:processes:usage:launching:

Launching processes
===================

Any process can be launched by 'running' or 'submitting' it.
Running means to run the process in the current Python interpreter in a blocking way, whereas submitting means to send it to a daemon worker over RabbitMQ.
For long-running processes, such as calculation jobs or complex workflows, it is advisable to submit to the daemon.
This has the added benefit that it will directly return control to your interpreter and allow the daemon to save intermediate progress during checkpoints and reload the process from those if it has to restart.
Running processes can be useful for trivial computational tasks, such as simple calcfunctions or workfunctions, or for debugging and testing purposes.

.. _topics:processes:usage:launch:

Process launch
--------------

To launch a process, one can use the free functions that can be imported from the :py:mod:`aiida.engine` module.
There are four different functions:

* :py:func:`~aiida.engine.launch.run`
* :py:func:`~aiida.engine.launch.run_get_node`
* :py:func:`~aiida.engine.launch.run_get_pk`
* :py:func:`~aiida.engine.launch.submit`

As the names suggest, the first three will 'run' the process and the last will 'submit' it to the daemon.
Running means that the process will be executed in the same interpreter in which it is launched, blocking the interpreter until the process is terminated.
Submitting to the daemon, in contrast, means that the process will be sent to the daemon for execution, and the interpreter is released straight away.

All functions have the exact same interface ``launch(process, inputs)`` where:

* ``process`` is the process class or process function to launch
* ``inputs`` is the dictionary of inputs to pass to the process.

.. versionchanged:: 2.5

    Before AiiDA v2.5, the inputs could only be passed as keyword arguments.
    This behavior is still supported, e.g., one can launch a process as ``launch(process, **inputs)`` or ``launch(process, input_a=value_a, input_b=value_b)``.
    However, the recommended approach is now to use an input dictionary passed as the second positional argument.
    The reason is that certain launchers define arguments themselves which can overlap with inputs of the process.
    For example, the ``submit`` method defines the ``wait`` keyword.
    If the process being launched *also* defines an input named ``wait``, the launcher method cannot tell them apart.

What inputs can be passed depends on the exact process class that is to be launched.
For example, when we want to run an instance of the :py:class:`~aiida.calculations.arithmetic.add.ArithmeticAddCalculation` process, which takes two :py:class:`~aiida.orm.nodes.data.int.Int` nodes as inputs under the names ``x`` and ``y`` [#f1]_, we would do the following:

.. include:: include/snippets/launch/launch_submit.py
    :code: python

The function will submit the calculation to the daemon and immediately return control to the interpreter, returning the node that is used to represent the process in the provenance graph.

.. warning::

    For a process to be submittable, the class or function needs to be importable in the daemon environment by a) giving it an :ref:`associated entry point` or b) :ref:`including its module path` in the ``PYTHONPATH`` that the daemon workers will have.

.. versionadded:: 2.5

    Waiting on a process

    Use ``wait=True`` when calling ``submit`` to wait for the process to complete before returning the node.
    This can be useful for tutorials and demos in interactive notebooks where the user should not continue before the process is done.
    One could of course also use ``run`` (see below), but then the process would be lost if the interpreter gets accidentally shut down.
    By using ``submit``, the process is run by the daemon, which takes care of saving checkpoints so it can always be restarted in case of problems.

If you need to launch multiple processes in parallel and want to wait for all of them to be finished, simply use ``submit`` with the default ``wait=False`` and collect the returned nodes in a list.
You can then pass them to :func:`aiida.engine.launch.await_processes`, which will return once all processes have terminated:

.. code:: python

    from aiida.engine import submit, await_processes

    nodes = []

    for i in range(5):
        node = submit(...)
        nodes.append(node)

    await_processes(nodes, wait_interval=10)

The ``await_processes`` function will loop every ``wait_interval`` seconds and check whether all processes (represented by the ``ProcessNode`` instances in the ``nodes`` list) have terminated.

The ``run`` function is called identically:

.. include:: include/snippets/launch/launch_run.py
    :code: python

except that it does not submit the process to the daemon, but executes it in the current interpreter, blocking it until the process is terminated.
The return value of the ``run`` function is also **not** the node that represents the executed process, but the results returned by the process, which is a dictionary of the nodes that were produced as outputs.
If you would still like to have the process node or the pk of the process node, you can use one of the following variants:

.. include:: include/snippets/launch/launch_run_alternative.py
    :code: python

Finally, the :py:func:`~aiida.engine.launch.run` launcher has two attributes, ``get_node`` and ``get_pk``, that are simple proxies to the :py:func:`~aiida.engine.launch.run_get_node` and :py:func:`~aiida.engine.launch.run_get_pk` functions.
This is a handy shortcut, as now you can choose to use any of the three variants with just a single import:

.. include:: include/snippets/launch/launch_run_shortcut.py
    :code: python

If you want to launch a process class that takes a lot more inputs, it is often useful to define them in a dictionary and use the Python syntax ``**`` that automatically expands it into keyword argument and value pairs.
The example used above would look like the following:

.. include:: include/snippets/launch/launch_submit_dictionary.py
    :code: python

Process functions, i.e. :ref:`calculation functions` and :ref:`work functions`, can be launched like any other process as explained above.
Process functions have two additional methods of being launched:

* Simply *calling* the function
* Using the internal run method attributes

Using a calculation function that adds two numbers as an example, these two methods look like the following:

.. include:: include/snippets/launch/launch_process_function.py
    :code: python

.. _topics:processes:usage:builder:

Process builder
---------------

As explained in a :ref:`previous section`, the inputs for a :py:class:`~aiida.engine.processes.calcjobs.calcjob.CalcJob` and a :py:class:`~aiida.engine.processes.workchains.workchain.WorkChain` are defined in the :py:meth:`~aiida.engine.processes.process.Process.define` method.
To know what inputs they take, one would have to read the implementation, which can be annoying if you are not a developer.
To simplify this process, these two process classes provide a utility called the 'process builder'.
The process builder is essentially a tool that helps you build the inputs for the specific process class that you want to run.

To get a *builder* for a particular ``CalcJob`` or ``WorkChain`` implementation, all you need is the class itself, which can be loaded through the :py:class:`~aiida.plugins.factories.CalculationFactory` and :py:class:`~aiida.plugins.factories.WorkflowFactory`, respectively.
Let's take the :py:class:`~aiida.calculations.arithmetic.add.ArithmeticAddCalculation` as an example::

    ArithmeticAddCalculation = CalculationFactory('core.arithmetic.add')
    builder = ArithmeticAddCalculation.get_builder()

The string ``core.arithmetic.add`` is the entry point of the ``ArithmeticAddCalculation``, and passing it to the ``CalculationFactory`` will return the corresponding class.
Calling the ``get_builder`` method on that class will return an instance of the :py:class:`~aiida.engine.processes.builder.ProcessBuilder` class that is tailored for the ``ArithmeticAddCalculation``.
The builder will help you in defining the inputs that the ``ArithmeticAddCalculation`` requires and has a few handy tools to simplify this process.

To find out which inputs the builder exposes, you can simply use tab completion.
In an interactive Python shell, by typing ``builder.`` and hitting the tab key, a complete list of all the available inputs will be shown.
Each input of the builder can also show additional information about what sort of input it expects.
In an interactive shell, you can display this information as follows::

    builder.code?
    Type:           property
    String form:
    Docstring:
        "name": "code",
        "required": "True"
        "non_db": "False"
        "valid_type": ""
        "help": "The Code to use for this job.",

In the ``Docstring`` you will see a ``help`` string that contains more detailed information about the input port.
Additionally, it will display a ``valid_type``, which, when defined, shows which data types are expected.
If a default value has been defined, that will also be displayed.
The ``non_db`` attribute defines whether that particular input will be stored as a proper input node in the database when the process is submitted.

Defining an input through the builder is as simple as assigning a value to the attribute.
The following example shows how to set the ``x`` and ``y`` inputs, as well as the ``description`` and ``label`` metadata inputs::

    builder.metadata.label = 'This is my calculation label'
    builder.metadata.description = 'An example calculation to demonstrate the process builder'
    builder.x = Int(1)
    builder.y = Int(2)

If you evaluate the ``builder`` instance, simply by typing the variable name and hitting enter, the current values of the builder's inputs will be displayed::

    builder
    {
        'metadata': {
            'description': 'An example calculation to demonstrate the process builder',
            'label': 'This is my calculation label',
            'options': {},
        },
        'x': Int,
        'y': Int
    }

In this example, you can see the values that we just set for the ``description`` and the ``label``.
In addition, it will also show any namespaces, as the inputs of processes support nested namespaces, such as the ``metadata.options`` namespace in this example.
Note that nested namespaces are also all autocompleted, and you can traverse them recursively with tab-completion.
All that remains is to fill in all the required inputs and we are ready to launch the process.

When all the inputs have been defined for the builder, it can be used to actually launch the ``Process``.
The process can be launched by passing the builder to any of the free functions of the :py:mod:`~aiida.engine.launch` module, just as you would a normal process, as :ref:`described above`, i.e.:

.. include:: include/snippets/launch/launch_builder.py
    :code: python

Note that the process builder is in principle designed to be used in an interactive shell, as that is where the tab-completion and automatic input documentation really shine.
However, it is perfectly possible to use the same builder in scripts, where you simply use it as an input container instead of a plain Python dictionary.

.. _topics:processes:usage:monitoring:

Monitoring processes
====================

When you have launched a process, you may want to investigate its status, progression and results.
The :ref:`verdi` command line tool provides various commands to do just that.

.. _topics:processes:usage:monitoring_list:

verdi process list
------------------

Your first point of entry will be the ``verdi`` command ``verdi process list``.
This command will print a list of all active processes, through the ``ProcessNode`` instances stored in the database that represent their execution.
A typical example may look something like the following:

.. code-block:: bash

      PK  Created     State         Process label               Process status
    ----  ----------  ------------  --------------------------  ----------------------
     151  3h ago      ⏵ Running     ArithmeticAddCalculation
     156  1s ago      ⏹ Created     ArithmeticAddCalculation

    Total results: 2

The 'State' column is a concatenation of the ``process_state`` and the ``exit_status`` of the ``ProcessNode``.
By default, the command will only show active items, i.e. ``ProcessNodes`` that have not yet reached a terminal state.
If you want to also show the nodes in a terminal state, you can use the ``-a`` flag and call ``verdi process list -a``:

.. code-block:: bash

      PK  Created     State            Process label               Process status
    ----  ----------  ---------------  --------------------------  ----------------------
     143  3h ago      ⏹ Finished [0]   add
     146  3h ago      ⏹ Finished [0]   multiply
     151  3h ago      ⏵ Running        ArithmeticAddCalculation
     156  1s ago      ⏹ Created        ArithmeticAddCalculation

    Total results: 4

For more information on the meaning of the 'State' column, please refer to the documentation of the :ref:`process state `.
The ``-S`` flag lets you query for specific process states, i.e. issuing ``verdi process list -S created`` will return:

.. code-block:: bash

      PK  Created     State         Process label               Process status
    ----  ----------  ------------  --------------------------  ----------------------
     156  1s ago      ⏹ Created     ArithmeticAddCalculation

    Total results: 1

To query for a specific exit status, one can use ``verdi process list -E 0``:

.. code-block:: bash

      PK  Created     State            Process label               Process status
    ----  ----------  ---------------  --------------------------  ----------------------
     143  3h ago      ⏹ Finished [0]   add
     146  3h ago      ⏹ Finished [0]   multiply

    Total results: 2

This simple tool should give you a good idea of the current status of running processes and the status of terminated ones.
For a complete list of all the available options, please refer to the documentation of :ref:`verdi process`.

If you are looking for information about a specific process node, the following three commands are at your disposal:

* ``verdi process report``: shows a list of the log messages attached to the process
* ``verdi process status``: prints the call hierarchy of the process and the status of all its nodes
* ``verdi process show``: prints details about the status, inputs, outputs, callers and callees of the process

In the following sections, we will briefly explain how these commands work.
For the purpose of example, we will show the output of the commands for a completed ``PwBaseWorkChain`` from the ``aiida-quantumespresso`` plugin, which simply calls a ``PwCalculation``.

.. _topics:processes:usage:monitoring_report:

verdi process report
--------------------

The developer of a process can attach log messages to the node of a process through the :py:meth:`~aiida.engine.processes.process.Process.report` method.
The ``verdi process report`` command will display all the log messages in chronological order:

.. code-block:: bash

    2018-04-08 21:18:51 [164 | REPORT]: [164|PwBaseWorkChain|run_calculation]: launching PwCalculation<167> iteration #1
    2018-04-08 21:18:55 [164 | REPORT]: [164|PwBaseWorkChain|inspect_calculation]: PwCalculation<167> completed successfully
    2018-04-08 21:18:56 [164 | REPORT]: [164|PwBaseWorkChain|results]: work chain completed after 1 iterations
    2018-04-08 21:18:56 [164 | REPORT]: [164|PwBaseWorkChain|on_terminated]: remote folders will not be cleaned

Each log message includes a timestamp, followed by the level of the log, which is always ``REPORT``.
The second block has the format ``pk|class name|function name``, detailing information about, in this case, the work chain itself and the step in which the message was fired.
Finally, the message itself is displayed.
Of course, how many messages are logged and how useful they are is up to the process developer.
In general, they can be very useful for a user to understand what has happened during the execution of the process; however, one has to realize that each entry is stored in the database, so overuse can unnecessarily bloat the database.

.. _topics:processes:usage:monitoring_status:

verdi process status
--------------------

This command is most useful for ``WorkChain`` instances, but also works for ``CalcJobs``.
One of the more powerful aspects of work chains is that they can call ``CalcJobs`` and other ``WorkChains`` to create a nested call hierarchy.
If you want to inspect the status of a work chain and all the children that it called, ``verdi process status`` is the go-to tool.
An example output is the following:

.. code-block:: bash
    PwBaseWorkChain [ProcessState.FINISHED] [4:results]
    └── PwCalculation [FINISHED]

The command prints a tree representation of the hierarchical call structure, which recurses all the way down.
In this example, there is just a single ``PwBaseWorkChain``, which called a ``PwCalculation``, as indicated by its being indented one level.
In addition to the call tree, each node also shows its current process state and, for work chains, at which step in the outline it is.
This tool can be very useful while a work chain is running, to inspect at which step in the outline it currently is, as well as the status of all the children calculations it called.

.. _topics:processes:usage:monitoring_show:

verdi process show
------------------

Finally, there is a command that displays detailed information about the ``ProcessNode``, such as its inputs, outputs and, optionally, the other processes it called and/or was called by.
An example output for a ``PwBaseWorkChain`` would look like the following:

.. code-block:: bash

    Property       Value
    -------------  ------------------------------------
    type           WorkChainNode
    pk             164
    uuid           08bc5a3c-da7d-44e0-a91c-dda9ddcb638b
    label
    description
    ctime          2018-04-08 21:18:50.850361+02:00
    mtime          2018-04-08 21:18:50.850372+02:00
    process state  ProcessState.FINISHED
    exit status    0
    code           pw-v6.1

    Inputs            PK  Type
    --------------  ----  -------------
    parameters       158  Dict
    structure        140  StructureData
    kpoints          159  KpointsData
    pseudo_family    161  Str
    max_iterations   163  Int
    clean_workdir    160  Bool
    options          162  Dict

    Outputs              PK  Type
    -----------------  ----  -------------
    output_band         170  BandsData
    remote_folder       168  RemoteData
    output_parameters   171  Dict
    output_array        172  ArrayData

    Called      PK  Type
    --------  ----  -------------
    CALL       167  PwCalculation

    Log messages
    ---------------------------------------------
    There are 4 log messages for this calculation
    Run 'verdi process report 164' to see them

This overview should give you all the information you need to inspect a process's inputs and outputs in closer detail, as it provides their pks.

.. _topics:processes:usage:manipulating:

Manipulating processes
======================

To understand how one can manipulate running processes, one has to understand the principles of the :ref:`process/node distinction` and a :ref:`process' lifetime`, so be sure to have read those sections first.

.. _topics:processes:usage:manipulating_pause_play_kill:

verdi process pause/play/kill
-----------------------------

The ``verdi`` command line interface provides three commands to interact with 'live' processes:

* ``verdi process pause``
* ``verdi process play``
* ``verdi process kill``

The first pauses a process temporarily, the second resumes any paused processes and the third permanently kills them.
The sub command names might seem to tell you this already, and it might look like that is all there is to know, but the functionality underneath is quite complicated and deserves additional explanation nonetheless.

As the section on :ref:`the distinction between the process and the node` explained, manipulating a process means interacting with the live process instance that lives in the memory of the runner that is running it.
By definition, these runners will always run in a different system process than the one from which you want to interact, because otherwise you would *be* the runner, given that there can only be a single runner in an interpreter, and if it were running, the interpreter would be blocked from performing any other operations.
This means that in order to interact with the live process, one has to interact with another interpreter running in a different system process.
This is once again facilitated by the RabbitMQ message broker.
When a runner starts to run a process, it will also add listeners for incoming messages that are being sent for that specific process over RabbitMQ.

.. note::

    This does not just apply to daemon runners, but also to local runners.
    If you were to launch a process in a local runner, that interpreter will be blocked, but it will still set up the listeners for that process on RabbitMQ.
    This means that you can manipulate the process from another terminal, just as you would do with a process that is being run by a daemon runner.

In the case of 'pause', 'play' and 'kill', what is sent is a so-called Remote Procedure Call (RPC) over RabbitMQ.
The RPC will include the identifier of the process for which the action is intended, and RabbitMQ will send it to whoever registered itself to be listening for that specific process, in this case the runner that is running the process.
This immediately reveals a potential problem: the RPC will fall on deaf ears if there is no one listening, which can have multiple causes.
For example, as explained in the section on a :ref:`process' lifetime`, this can be the case for a submitted process whose corresponding task is still queued because all available process slots are occupied.
But even if the task *were* to be with a runner, it might be too busy to respond to the RPC, and the process appears to be unreachable.
Whenever a process is unreachable for an RPC, the command will return an error:

.. code:: bash

    Error: Process<100> is unreachable

Depending on the cause of the process being unreachable, the problem may resolve itself automatically over time and one can try again at a later time, as for example in the case of the runner being too busy to respond.
To minimize these issues, the runner has been designed to have the communication happen over a separate thread and to schedule callbacks for any necessary actions on the main thread, which performs all the heavy lifting.
Unfortunately, there is no easy way of telling what the actual cause of the process being unreachable is.
The problem will manifest itself identically whether the runner simply could not respond in time or the task has accidentally been lost forever due to a bug, even though these are two completely separate situations.

This brings us to another potentially unintuitive aspect of interacting with processes.
The previous paragraph already mentioned it in passing: when a remote procedure call is sent, it first needs to be answered by the responsible runner, if applicable, but the runner will not *directly execute* the call.
This is because the call comes in on the communication thread, which is not allowed to have direct access to the process instance; instead, it will schedule a callback on the main thread, which can perform the action.
The callback will, however, not necessarily be executed directly, as there may be other actions waiting to be performed.
So when you pause, play or kill a process, you are not doing so directly, but rather you are *scheduling* a request to do so.
If the runner has successfully received the request and scheduled the callback, the command will therefore show something like the following:

.. code:: bash

    Success: scheduled killing Process<100>

The 'scheduled' indicates that the actual killing might not necessarily have happened just yet.
This means that even after having called ``verdi process kill`` and getting the success message, the corresponding process may still be listed as active in the output of ``verdi process list``.
By default, the ``pause``, ``play`` and ``kill`` commands will only ask for confirmation from the runner that the request has been scheduled and will not actually wait for the command to have been executed.
To change this behavior, you can use the ``--wait`` flag to actually wait for the action to be completed.
If workers are under heavy load, it may take some time for them to respond to the request and for the command to finish.
If you know that your daemon runners may be experiencing a heavy load, you can also increase the time that the command waits before timing out, with the ``-t/--timeout`` flag.

.. rubric:: Footnotes

.. [#f1] Note that the :py:class:`~aiida.calculations.arithmetic.add.ArithmeticAddCalculation` process class also takes a ``code`` as input, but that has been omitted for the purposes of the example.

.. _topics:processes:usage:processes_api:

The processes API
-----------------

The ``play``, ``pause`` and ``kill`` functionality of ``verdi process`` is also made available through the :mod:`aiida.engine.processes.control` module.
Processes can be played, paused or killed through the :func:`~aiida.engine.processes.control.play_processes`, :func:`~aiida.engine.processes.control.pause_processes` and :func:`~aiida.engine.processes.control.kill_processes` functions, respectively:

.. code-block:: python

    from aiida.engine.processes.control import kill_processes, pause_processes, play_processes

    processes = [load_node(), load_node()]

    pause_processes(processes)  # Pause the processes
    play_processes(processes)   # Play them again
    kill_processes(processes)   # Kill the processes

Instead of specifying an explicit list of processes, the functions also take the ``all_entries`` keyword argument:

.. code-block:: python

    pause_processes(all_entries=True)  # Pause all running processes