Workflows¶

Workflows are a central concept in AiiDA that allow you to string together multiple calculations in a way that encodes the logic of a typical scientific workflow. In this section, we explain what workflows are and how they can be used and run. Finally, we will detail some best practices when designing workflows.

Workchains and workfunctions¶

At the core of a workflow is the logic that defines the sequence of calculations that need to be executed to get from the initial inputs to the desired final answer. In AiiDA, this workflow logic is encoded in workchains and workfunctions. By chaining together workchains and workfunctions, each of which can run calculations within it, we can define a workflow. For simplicity, from here on out we will use the terms workflow, workchain and workfunction interchangeably, as a ‘pars pro toto’ and ‘totum pro parte’.

Workfunctions¶

To illustrate how workfunctions and workchains are defined, how they are used and when to use which, we will consider the following trivial example. Given three integers, sum the first two and then multiply the result by the third. In plain python code, the solution would look something like the following:

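```python
a = 1
b = 2
c = 3

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

# Sum the first two integers, then multiply the result by the third
result = multiply(add(a, b), c)
```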

This simple code snippet achieves the goal of getting the desired result; however, the provenance is lost. There is no connection between the output of the functions and their inputs. The remedy to this problem is the `workfunction`. The `workfunction` in AiiDA is a decorator that transforms a regular python function into a workfunction, which automatically stores the provenance of its output. The following snippet shows how little one has to change the initial solution to add automatic provenance keeping:

```python
from aiida.work.workfunctions import workfunction

a = 1
b = 2
c = 3

@workfunction
def add(a, b):
    return a + b

@workfunction
def multiply(a, b):
    return a * b

result = multiply(add(a, b), c)
```

The only thing we had to do is to decorate the two functions with the `workfunction` decorator. Adding the decorator tells AiiDA that the provenance for this function when it is executed should be stored. This means linking up the inputs and the outputs for a calculation node, which represents the function that was executed. The final change that has to be performed is to make the inputs and the outputs storable. In this example, they are plain python integer types, which cannot be stored in the database. To solve this, one only has to wrap them in the `Int` class, which makes them storable in the database:

```python
from aiida.orm.data.int import Int
from aiida.work.workfunctions import workfunction

a = Int(1)
b = Int(2)
c = Int(3)

@workfunction
def add(a, b):
    return Int(a + b)

@workfunction
def multiply(a, b):
    return Int(a * b)

result = multiply(add(a, b), c)
```

The only difference with the previous snippet is that all inputs and outputs have been wrapped in the `Int` class. With these trivial changes, the full provenance of the result is maintained: each function call is represented by a calculation node that links its `Int` inputs to its `Int` output.

To summarize: to write a workflow that automatically stores the provenance, one only has to decorate the functions with the `workfunction` decorator and make sure that the inputs and outputs are wrapped in database storable types.
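To make the idea of automatic provenance keeping more concrete, the following is a minimal sketch in plain Python of what a provenance-recording decorator does conceptually. It is not AiiDA's implementation: the names `toy_workfunction` and `PROVENANCE_LOG` are hypothetical, and an in-memory list stands in for the database.

```python
import functools

# Toy in-memory "provenance store" (assumption: AiiDA uses a database instead)
PROVENANCE_LOG = []


def toy_workfunction(func):
    """Record the inputs and the output of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Link the inputs and the output to an entry representing this call
        PROVENANCE_LOG.append({
            'function': func.__name__,
            'inputs': {'args': args, 'kwargs': kwargs},
            'output': result,
        })
        return result

    return wrapper


@toy_workfunction
def add(a, b):
    return a + b


@toy_workfunction
def multiply(a, b):
    return a * b


result = multiply(add(1, 2), 3)
```

After running this, `PROVENANCE_LOG` holds one entry for the `add` call and one for the `multiply` call, so the final result can be traced back to the original inputs.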

Workchains¶

Now that we have demonstrated how easily `workfunctions` can be used to write your workflow that automatically keeps the provenance, it is time to confess that workfunctions are not perfect and have their shortcomings. In the simple example of adding and multiplying numbers, the time to execute the functions is very short, but imagine that you are performing a more costly calculation, e.g. you want to run an actual `JobCalculation` that will be submitted to the scheduler and may run for a long time. If anywhere during the chain, the workflow is interrupted, for whatever reason, all progress is lost. There are no ‘checkpoints’, so to speak, by simply chaining workfunctions together.

But fret not! To tackle this problem, AiiDA defines the concept of the workchain. As the name suggests, this construct is a way to chain multiple logical steps of a workflow together in a way that allows the progress to be saved between those steps as soon as they are successfully completed. The workchain is therefore the preferred solution for parts of the workflow that involve more expensive and complex calculations. To define a workchain, AiiDA implements the `WorkChain` class.

If we were to reimplement our workfunction solution of the simple example problem of the previous section, but this time using a workchain, it would look something like the following:

```python
from aiida.orm.data.int import Int
from aiida.work.workchain import WorkChain

class AddAndMultiplyWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super(AddAndMultiplyWorkChain, cls).define(spec)
        spec.input('a', valid_type=Int)
        spec.input('b', valid_type=Int)
        spec.input('c', valid_type=Int)
        spec.outline(
            cls.add,
            cls.multiply,
            cls.results,
        )
        spec.output('result', valid_type=Int)

    def add(self):
        # Store the intermediate sum in the context for the next step
        self.ctx.sum = self.inputs.a + self.inputs.b

    def multiply(self):
        self.ctx.product = self.ctx.sum * self.inputs.c

    def results(self):
        self.out('result', Int(self.ctx.product))
```

There is a lot going on in this snippet, so let’s tackle it step by step. Firstly, a `WorkChain` is a class, and to create your own workchain you subclass it and give it your own name, like `AddAndMultiplyWorkChain` in the example. You can pick any name that is a valid python class name. The most important method of the `WorkChain` class is the `define` class method. Here you define what inputs it takes, what outputs it will generate and the ‘logic’ that will be executed. The class method takes two arguments:

• `cls`, which is the reference to the class itself and is mandatory for any class method
• `spec`, which is the ‘specification’

Note

Do not forget to add the line `super(AddAndMultiplyWorkChain, cls).define(spec)` as the first line of the `define` method, where you replace the class name with the name of your workchain. This will call the `define` method of the parent class, which is necessary for the workchain to work properly.

As the name suggests, the `spec` can be used to specify the properties of the workchain. For example, it can be used to define inputs that the workchain takes. In our example, we need to be able to pass three integers as input, so we define those in the spec by calling `spec.input()`. The first argument is the name of the input. Additionally, as we have done here, you can specify which types are valid for that particular input. Since we expect integers, we specify that the valid type is the database storable `Int` class. Input validation is just one of the advantages of the `WorkChain` over the workfunction that we can already see here.

The outputs are defined in a similar manner: by calling `spec.output()` you can declare a particular output that the workchain will have or is expected to have. Be wary that if you define an output but do not actually add it during the execution, the validation at the end of the workchain will fail, because by default all defined outputs are assumed to be required. If you want to specify an output that is optional, you can pass the keyword argument `required=False`. The final part of the spec definition is the `outline`. This is where you specify the ‘logic’ of the workchain. Since this example is rather contrived, in this case it is just a list of three function calls: `add`, `multiply` and `results`. However, the outline also supports logical constructs, like `if` conditionals, `while` loops and `return` statements. Refer to the advanced workchain section to see all the possibilities the `outline` provides.

The only thing that remains, is to implement the methods that we added to the `outline`. Since they are class instance methods, they only take one argument `self`. Besides that rule, you can add any valid python code in the method that you want. The goal of the `add` method is to take the inputs `a` and `b` that are passed to the workchain and sum them. The inputs passed to the workchain are stored in the `inputs` attribute as an attribute dictionary. Therefore, to get the `a` input for example, you can call `self.inputs.a`.

After we have summed `a` and `b`, we need to be able to store the temporary result and pass it to the next step in the `outline`, in this case the `multiply` method. For this purpose, each workchain has a context, which can be addressed at `self.ctx`. It is a dictionary that is persisted between workchain steps and can therefore be used to pass information and data between outline steps. Since it is a dictionary, we can store the sum of `a` and `b` by assigning it to a key of our choice. Again, any valid python key name is fair game, as long as it does not contain a period. Note that we do not have to return anything: as soon as the function ends, the workchain will save its state, including the result we just stored in the context, and go to the next step, the `multiply` method. By now you should be familiar with what you see. We retrieve the sum we computed in the `add` method, multiply it by the `c` input integer and store it under the `product` key in the context.

The final step, `results`, adds the product as an official output of the workchain by calling `self.out()`. The first argument is the name of the output, which will also be used for the link name in the provenance graph, and the second argument is the actual value. Note that since it again has to be database storable, we wrap the product in the `Int` class. When we run this workchain, the resulting provenance graph connects the three inputs to the `result` output through the node that represents the workchain.

This is a very quick overview of how a workchain works, but of course it has a lot more features. To learn how to write workchains for real life problems, continue reading at the workflow development section.
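The checkpointing behaviour described above can be illustrated with a small sketch in plain Python. This is only a toy model under stated assumptions, not how AiiDA persists workchains: the function `run_outline` and the JSON checkpoint file are hypothetical stand-ins for the workchain engine and the database, and the context is a plain dictionary.

```python
import json
import os
import tempfile


def run_outline(steps, ctx, checkpoint_path):
    """Run `steps` in order, saving the context and step index after each one."""
    start = 0
    if os.path.exists(checkpoint_path):
        # Resume from the saved state instead of starting over
        with open(checkpoint_path) as handle:
            saved = json.load(handle)
        start, ctx = saved['next_step'], saved['ctx']
    for index in range(start, len(steps)):
        steps[index](ctx)
        with open(checkpoint_path, 'w') as handle:
            json.dump({'next_step': index + 1, 'ctx': ctx}, handle)
    return ctx


def add(ctx):
    ctx['sum'] = ctx['a'] + ctx['b']


def multiply(ctx):
    ctx['product'] = ctx['sum'] * ctx['c']


checkpoint = os.path.join(tempfile.mkdtemp(), 'checkpoint.json')

# Simulate an interruption after the first step by only running `add`
run_outline([add], {'a': 1, 'b': 2, 'c': 3}, checkpoint)

# The second call finds the checkpoint, skips `add` and only runs `multiply`
final_ctx = run_outline([add, multiply], {}, checkpoint)
```

Note that the second call is given an empty context: the sum computed before the simulated interruption is recovered from the checkpoint, which is the property that a plain chain of workfunctions lacks.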

When to use which¶

Now that we know how the two workflow components, workfunctions and workchains, work in AiiDA, you might wonder: when should I use which one? For simple operations that do not take long, the simplicity of the workfunction may be all you need, so by all means use it. However, a good rule of thumb is that as soon as the code is expected to take longer, for example when you want to launch a `JobCalculation`, it is best to go for the `WorkChain`. The automatic checkpointing, which guarantees that work between steps is saved, becomes very important. The workchain offers many more features besides checkpointing that may make it preferable to the workfunction, which you can read about in the workflow development section.

Running workflows¶

Run¶

Without realizing it, in the introductory section on workfunctions and workchains we already saw how a workfunction can be run. You can run a workfunction in exactly the same manner as you would run any other python function. Simply call the function with the desired inputs and it will be executed, while AiiDA makes sure to store the provenance automatically in the background. You can run workfunctions from anywhere, including inside an outline step of a workchain.

Running a `WorkChain` on the other hand, is slightly different. Since it is a class, it cannot be ‘run’ directly like a function. Instead, we have to ‘launch’ it. This is done by passing it to the `run` function:

```python
from aiida.orm.data.int import Int
from aiida.work.launch import run
from aiida.work.workchain import WorkChain

...

result = run(AddAndMultiplyWorkChain, a=Int(1), b=Int(2), c=Int(3))
```

As you can see, the `run` function can be imported from `aiida.work.launch`. To launch the workchain (in this example we use the `AddAndMultiplyWorkChain` from the previous section), we simply call the `run` function with the workchain as the first argument, followed by the inputs as keyword arguments. Note that the keys used for each input have to correspond to the name of the inputs defined in the spec of the workchain. One can also define the inputs in a dictionary and then use the standard python expansion method to automatically unwrap the dictionary into keyword arguments, as is shown here:

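```python
from aiida.orm.data.int import Int
from aiida.work.launch import run
from aiida.work.workchain import WorkChain

...

inputs = {
    'a': Int(1),
    'b': Int(2),
    'c': Int(3)
}

# Unwrap the dictionary into keyword arguments
result = run(AddAndMultiplyWorkChain, **inputs)
```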

After the workchain’s execution is finished, the result is returned, which is a dictionary of its outputs. In this example the variable `result` will therefore be equal to `{'result': 9}`. If you would also like to get a reference of the node that represents the `WorkChain` in the database, one can use the `run_get_node` or `run_get_pid` functions:

```python
from aiida.orm.data.int import Int
from aiida.work.launch import run_get_node, run_get_pid
from aiida.work.workchain import WorkChain

...

result, node = run_get_node(AddAndMultiplyWorkChain, a=Int(1), b=Int(2), c=Int(3))
result, pid = run_get_pid(AddAndMultiplyWorkChain, a=Int(1), b=Int(2), c=Int(3))
```

For the former, the `node` will be the `WorkCalculation` node that is used to represent the workchain in the database, whereas for the latter, the `pid` is the pk of that same node. The `run` based functions can actually also be used for `workfunctions`. Calling `run` with a workfunction, does exactly the same as running the workfunction directly as a normal python function and so doesn’t gain anything new. However, if you are interested in also getting the calculation node or the pid of the process, in addition to the result of the function, calling the workfunction through `run_get_node` or `run_get_pid` is the correct solution. Note that for workfunctions you can pass the inputs both as arguments as well as keyword arguments:

```python
from aiida.orm.data.int import Int
from aiida.work.launch import run_get_node, run_get_pid
from aiida.work.workfunctions import workfunction

a = Int(1)
b = Int(2)

@workfunction
def add(a, b):
    return Int(a + b)

# Passing inputs as arguments
result, node = run_get_node(add, a, b)
result, pid = run_get_pid(add, a, b)

# Passing inputs as keyword arguments
result, node = run_get_node(add, a=a, b=b)
result, pid = run_get_pid(add, a=a, b=b)
```

Submit¶

The launch functions, `run`, `run_get_node` and `run_get_pid`, described in the previous section, will execute the process in a blocking manner. That is to say that the interpreter in which you launch the process will be blocked until that process is completed. This might not necessarily be what you want. Imagine for example that you are launching a workchain that will take a long time to complete. The interpreter will be blocked the whole time and cannot do anything else. To circumvent this problem, you can also `submit` a process, for example a workchain:

```python
from aiida.orm.data.int import Int
from aiida.work.launch import submit
from aiida.work.workchain import WorkChain

...

node = submit(AddAndMultiplyWorkChain, a=Int(1), b=Int(2), c=Int(3))
```

Note

Workfunctions cannot be submitted; they can only be run.

The `submit` function will launch the process and send it to the daemon, which will take care of running it to the end. This way the interpreter is freed and regains control immediately. The return value of the `submit` call is the node that represents the process in the database. Note that besides the change in behavior, the syntax for passing the inputs to `submit` is exactly the same as for the `run` launch function and its siblings.
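The difference between blocking and non-blocking launches can be sketched with the standard library alone. This is only an analogy and makes several assumptions: a thread pool stands in for the daemon (which in AiiDA is a separate process), the returned future stands in for the node that `submit` returns, and `slow_process` is a hypothetical placeholder for a long-running workchain.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_process(a, b, c):
    """Stand-in for a process that takes a long time to complete."""
    time.sleep(0.2)
    return (a + b) * c


# Blocking, in the spirit of `run`: the call only returns once the work is done
blocking_result = slow_process(1, 2, 3)

# Non-blocking, in the spirit of `submit`: a handle is returned straight away
# and the interpreter is free to do other things while the work is carried out
executor = ThreadPoolExecutor(max_workers=1)
handle = executor.submit(slow_process, 1, 2, 3)

# ... other work could happen here ...

# Only when the result is actually needed do we wait for it
submitted_result = handle.result()
executor.shutdown()
```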

There is one limitation to the use of the `run` and `submit` launchers. They cannot be used within the steps of a `WorkChain` itself. Instead, the `WorkChain` class has its own `submit` method that should be used.

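```python
from aiida.orm.data.int import Int
from aiida.work.workchain import WorkChain

class AddAndMultiplyWorkChain(WorkChain):

    ...

    def submit_sub_workchain(self):
        # Use the instance method instead of the free function `submit`
        node = self.submit(AddAndMultiplyWorkChain, a=Int(1), b=Int(2), c=Int(3))
```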

In this example, we launch another instance of the `AddAndMultiplyWorkChain` from within the `AddAndMultiplyWorkChain` itself. The only difference is that, instead of using the free function `submit`, we use the class instance method `self.submit`. Note that this example is oversimplified and additional steps are required to be able to continue the workchain after submitting the child workchain. For more details, please refer to the advanced section on submitting calculations and workchains.

Note

When you submit a `WorkChain` or any other process over the daemon, you need to make sure that the daemon can find the class when it needs to load it. Registering your class through the plugin system with a designated entry point is one way to make sure that the daemon will be able to find it. If, however, you simply have a test class and do not want to go through the effort of creating an entry point for it, you should make sure that the module where you define the class is in the python path.
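A quick way to check whether a module can be found on the python path is sketched below, using only the standard library. The helper name `is_importable` is hypothetical, and the check only reflects the interpreter in which it is run: it assumes that the daemon uses the same environment and python path, which you would have to verify separately.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located on the current python path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised, for example, when a parent package of a dotted name is missing
        return False


# A standard library module can always be found
found_json = is_importable('json')

# A module that is not on the python path cannot
found_missing = is_importable('surely_not_a_real_module_xyz')
```

In practice you would pass the name of the module that defines your workchain class instead of the placeholder names used here.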

Process builder¶

There is one final way of launching a process, whether it be a `WorkChain` or a `JobCalculation`. Each process has a method called `get_builder` which will return an instance of the `ProcessBuilder` customised for that particular `Process` class. The builder knows exactly which inputs the process takes and expects and is therefore ideal for interactive usage. For details on how to instantiate and populate a `ProcessBuilder` instance please refer to the process builder section.

Once you have constructed your builder and inserted all the inputs, you can pass it to the launch functions like we did in the previous two sections:

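```python
from aiida.orm.data.int import Int
from aiida.work.launch import submit
from aiida.work.workchain import WorkChain

...

# Get a builder customised for this particular process class
builder = AddAndMultiplyWorkChain.get_builder()
builder.a = Int(1)
builder.b = Int(2)
builder.c = Int(3)

node = submit(builder)
```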

Note that you are free to use this method of launching processes in normal scripts, but the builder really is designed for use in an interactive shell.

Monitoring workflows¶

When you have launched workflows, be it workfunctions or workchains, you may want to investigate their status, progression and the results.

verdi work list¶

Your first point of entry will be the `verdi` command `verdi work list`. This command will print a list of all active `WorkCalculation` nodes, which are the database objects used by `WorkChains` and `workfunctions` to store the details of their execution in the database. A typical example may look something like the following:

```
  PK  Creation    State           Process label
----  ----------  --------------  -----------------------
 151  3h ago      Running | None  AddAndMultiplyWorkChain
 156  1s ago      Created | None  AddAndMultiplyWorkChain

Total results: 2
```

The ‘State’ column is a concatenation of the `process_state` and the `exit_status` of the `WorkCalculation`. By default, the command will only show active items, i.e. `WorkCalculations` that have not yet reached a terminal state. If you want to also show the nodes in a terminal state, you can use the `-a` flag and call `verdi work list -a`:

```
  PK  Creation    State           Process label
----  ----------  --------------  -----------------------
 143  3h ago      Finished | 0    add
 146  3h ago      Finished | 0    multiply
 151  3h ago      Running | None  AddAndMultiplyWorkChain
 156  1s ago      Created | None  AddAndMultiplyWorkChain

Total results: 4
```

For more information on the meaning of the ‘State’ column, please refer to the documentation of the process state. The `-S` flag lets you query for specific process states, e.g. issuing `verdi work list -S created` will return:

```
  PK  Creation    State           Process label
----  ----------  --------------  -----------------------
 156  1s ago      Created | None  AddAndMultiplyWorkChain

Total results: 1
```

To query for a specific exit status, one can use `verdi work list -E 0`:

```
  PK  Creation    State           Process label
----  ----------  --------------  -----------------------
 143  3h ago      Finished | 0    add
 146  3h ago      Finished | 0    multiply

Total results: 2
```

This simple tool should give you a good idea of the current status of running workflows and the status of terminated ones. If you are looking for information about a specific workflow node, the following three commands are at your disposal:

• `verdi work report` gives a list of the log messages attached to the workflow
• `verdi work status` prints the call hierarchy of the workflow and the status of all its nodes
• `verdi calculation show` prints details about the status, inputs, outputs, callers and callees of the workflow

In the following sections, we will explain briefly how the commands work. For the purpose of example, we will show the output of the commands for a completed `PwBaseWorkChain` from the `aiida-quantumespresso` plugin, which simply calls a `PwCalculation`.

verdi work report¶

The developer of a `WorkChain` can attach log messages to the workchain at any place within the function body of one of the outline steps through the `report()` method. The `verdi work report` command will display all the log messages in chronological order:

```
2018-04-08 21:18:51 [164 | REPORT]: [164|PwBaseWorkChain|run_calculation]: launching PwCalculation<167> iteration #1
2018-04-08 21:18:55 [164 | REPORT]: [164|PwBaseWorkChain|inspect_calculation]: PwCalculation<167> completed successfully
2018-04-08 21:18:56 [164 | REPORT]: [164|PwBaseWorkChain|results]: workchain completed after 1 iterations
2018-04-08 21:18:56 [164 | REPORT]: [164|PwBaseWorkChain|on_terminated]: remote folders will not be cleaned
```

Each log message starts with a timestamp, followed by the pk of the workchain and the level of the log, which is always `REPORT`. The second block has the format `pk|class name|step function name`, detailing information about the workchain itself and the step in which the message was fired. Finally, the message itself is displayed. Of course, how many messages are logged and how useful they are is up to the workchain developer. In general they can be very useful for a user to understand what has happened during the execution of the workchain. However, one has to realize that each entry is stored in the database, so overuse can unnecessarily bloat the database.
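If you want to post-process the report output, for example in a script that collects messages per outline step, the line format described above can be parsed with a regular expression. The sketch below is an assumption based solely on the example output shown in this section: the pattern `REPORT_LINE` and the helper `parse_report_line` are hypothetical and not part of AiiDA, and the exact format may differ between versions.

```python
import re

# Pattern derived from the example output above (an assumption, not an AiiDA API)
REPORT_LINE = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<pk>\d+) \| (?P<level>\w+)\]: '
    r'\[(?P<source_pk>\d+)\|(?P<class_name>\w+)\|(?P<step>\w+)\]: '
    r'(?P<message>.*)$'
)


def parse_report_line(line):
    """Return the fields of a report line as a dict, or None if it does not match."""
    match = REPORT_LINE.match(line)
    return match.groupdict() if match else None


example = (
    '2018-04-08 21:18:51 [164 | REPORT]: '
    '[164|PwBaseWorkChain|run_calculation]: '
    'launching PwCalculation<167> iteration #1'
)
parsed = parse_report_line(example)
```

All fields are returned as strings, so the pk would still need to be converted with `int()` before it can be used to look up a node.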

verdi work status¶

One of the more powerful aspects of workchains is that they can call `JobCalculations` and other `WorkChains` to create a nested call hierarchy. If you want to inspect the status of a workchain and all the children that it called, `verdi work status` is the go-to tool. An example output is the following:

```
PwBaseWorkChain <pk=164> [ProcessState.FINISHED] [4:results]
└── PwCalculation <pk=167> [FINISHED]
```

The command prints a tree representation of the hierarchical call structure, that recurses all the way down. In this example, there is just a single `PwBaseWorkChain` which called a `PwCalculation`, which is indicated by it being indented one level. In addition to the call tree, each node also shows its current process state and for workchains at which step in the outline it is. This tool can be very useful to inspect while a workchain is running at which step in the outline it currently is, as well as the status of all the children calculations it called.
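The recursive nature of such a call tree can be sketched in a few lines of plain Python. This is a toy illustration, not the code behind `verdi work status`: the function `render_tree` and the `(label, children)` tuple representation of a node are assumptions made for this example.

```python
def render_tree(node):
    """Return the lines of a call tree for a `(label, children)` tuple."""
    label, children = node
    lines = [label]
    for index, child in enumerate(children):
        is_last = index == len(children) - 1
        connector = '└── ' if is_last else '├── '
        padding = '    ' if is_last else '│   '
        child_lines = render_tree(child)  # recurse all the way down
        lines.append(connector + child_lines[0])
        lines.extend(padding + line for line in child_lines[1:])
    return lines


tree = ('PwBaseWorkChain <pk=164>', [('PwCalculation <pk=167>', [])])
rendered = render_tree(tree)
```

Printing `'\n'.join(rendered)` gives the parent label on the first line and the child indented one level below it, in the same spirit as the example output above.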

verdi calculation show¶

Finally, there is a command that displays detailed information about the `WorkCalculation`, such as its inputs, outputs and the calculations it called and was called by. Since the `WorkCalculation` is a sub class of the `Calculation` class, we can use the same command `verdi calculation show` that one would use to inspect the details of a `JobCalculation`. An example output for a `PwBaseWorkChain` would look like the following:

```
Property       Value
-------------  ------------------------------------
type           WorkCalculation
pk             164
uuid           08bc5a3c-da7d-44e0-a91c-dda9ddcb638b
label
description
ctime          2018-04-08 21:18:50.850361+02:00
mtime          2018-04-08 21:18:50.850372+02:00
process state  ProcessState.FINISHED
exit status    0
code           pw-v6.1

Inputs            PK  Type
--------------  ----  -------------
parameters       158  ParameterData
structure        140  StructureData
kpoints          159  KpointsData
pseudo_family    161  Str
max_iterations   163  Int
clean_workdir    160  Bool
options          162  ParameterData

Outputs              PK  Type
-----------------  ----  -------------
output_band         170  BandsData
remote_folder       168  RemoteData
output_parameters   171  ParameterData
output_array        172  ArrayData

Called      PK  Type
--------  ----  -------------
CALL       167  PwCalculation

Log messages
---------------------------------------------
There are 4 log messages for this calculation
Run 'verdi work report 164' to see them
```

This overview should give you all the information you need if you want to inspect a workchain’s inputs and outputs in closer detail, as it provides you with their pks.

Workflow development¶

This section will be a more in-depth guide to all the features and tools that the workflow system of AiiDA provides. Along the way, we will also highlight how one can use these tools to write maintainable, robust and modular workflows. Just like the definition of a `WorkChain`, we will start with the process specification.

Process specification¶

The process specification of a workchain, implemented by the `ProcessSpec`, is the construct that is used to define the inputs, outputs and the logical outline of the workchain. Defining this specification is therefore one of the more important steps of designing a workflow. A very simple example of the definition of a workchain specification, in the `define()` method, was demonstrated in the introductory section on workchains. In this section we will describe all the features of the process spec in more detail.

Ports and Portnamespaces¶

Two core concepts and components of the `ProcessSpec` that have remained hidden so far, but are crucial to understanding the functionality of the `ProcessSpec`, are that of the `Port` and the `PortNamespace`. In the workchain introduction, we already saw how an input could be defined for the process spec:

```python
spec.input('a')
```

What this directive really accomplishes, is that an `InputPort` is added to the `inputs` attribute of the `ProcessSpec`, which is a `PortNamespace`. This `PortNamespace` is a simple namespace that contains all the `InputPorts` and can even have nested `PortNamespaces`. This allows the designer of a workchain to create any nested structure for the input ports. Creating a new namespace in the inputs namespace is as simple as:

```python
spec.input_namespace('namespace')
```

This will create a new `PortNamespace` named `namespace` in the `inputs` namespace of the spec. You can create arbitrarily nested namespaces in one statement, by separating them with a `.` as shown here:

```python
spec.input_namespace('nested.namespace')
```

This command will result in the `PortNamespace` named `namespace` being nested inside another `PortNamespace` called `nested`.

Note

Because the period is reserved to denote different nested namespaces, it cannot be used in the name of terminal input and output ports as that could be misinterpreted later as a port nested in a namespace.

Graphically, this can be visualized as a nested dictionary and will look like the following:

```
'inputs': {
    'nested': {
        'namespace': {}
    }
}
```

The `outputs` attribute of the `ProcessSpec` is also a `PortNamespace`, just like the `inputs`, with the only difference being that it will create `OutputPort` instead of `InputPort` instances. Therefore the same concept of nesting through `PortNamespaces` applies to the outputs of a `ProcessSpec`.
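The way a dotted name maps onto nested namespaces can be mimicked with plain dictionaries. The sketch below is only an illustration of the nesting rule: the helper `add_namespace` is hypothetical, and plain `dict` objects stand in for `PortNamespace` instances.

```python
def add_namespace(namespaces, dotted_name):
    """Create a nested dictionary for each period-separated part of `dotted_name`."""
    current = namespaces
    for part in dotted_name.split('.'):
        # Reuse an existing namespace or create an empty one, then descend into it
        current = current.setdefault(part, {})
    return namespaces


spec = {'inputs': {}, 'outputs': {}}
add_namespace(spec['inputs'], 'nested.namespace')
add_namespace(spec['inputs'], 'nested.other')
```

This also shows why a period cannot be part of a port name: splitting on it is what determines the nesting.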

Validation and defaults¶

In the previous section, we saw that the `ProcessSpec` uses the `PortNamespace`, `InputPort` and `OutputPort` to define the inputs and outputs structure of the `Process`. The underlying concept that allows this nesting of ports is that the `PortNamespace`, `InputPort` and `OutputPort` are all subclasses of `Port`. As different subclasses of the same class, they have properties and attributes in common, for example those related to the concept of validation and default values. All three have the following attributes (with the exception of the `OutputPort` not having a `default` attribute):

• `default`
• `required`
• `valid_type`
• `validator`

These attributes can all be set upon construction of the port or after the fact, as long as the spec has not been sealed, which means that they can be altered without limit as long as it is within the `define` method of the corresponding `Process`. An example input port that explicitly sets all these attributes is the following:

```python
spec.input('positive_number', required=False, default=Int(1), valid_type=(Int, Float), validator=is_number_positive)
```

Here we define an input named `positive_number` that is not required. If a value is not explicitly passed, the default `Int(1)` will be used; if a value is passed, it should be of type `Int` or `Float` and it should be valid according to the `is_number_positive` validator. Note that the validator is nothing more than a free function that takes a single argument, the value to be validated, and returns `True` if that value is valid or `False` otherwise. The `valid_type` can define a single type or a tuple of valid types. By default all ports are required, but specifying a default value implies that the input is not required, so specifying `required=False` is not necessary in that case. It was added to this example simply for clarity.

The validation of input or output values with respect to the specification of the corresponding port, happens at the instantiation of the process and when it is finalized, respectively. If the inputs are invalid, a corresponding exception will be thrown and the process instantiation will fail. When the outputs fail to be validated, likewise an exception will be thrown and the process state will be set to `Excepted`.

Dynamic namespaces¶

In the previous section we described the various attributes related to validation and claimed that all the port variants share those attributes, yet we only discussed the `InputPort` and `OutputPort`. The statement, however, is still correct and the `PortNamespace` has the same attributes. You might then wonder what the meaning is of a `valid_type` or `default` for a `PortNamespace` if all it does is contain `InputPorts`, `OutputPorts` or other `PortNamespaces`. The answer to this question lies in the `PortNamespace` attribute `dynamic`.

Often when designing the specification of a `Process`, we cannot know exactly which inputs we want to be able to pass to the process. However, with the `InputPort` and `OutputPort` one does need to know at least how many values to expect, as each port has to be defined explicitly. This is where the `dynamic` attribute of the `PortNamespace` comes in. By default this is set to `False`, but by setting it to `True`, one indicates that the namespace can take a number of values that is unknown at the time the specification is defined. This now explains the meaning of the `valid_type`, `validator` and `default` attributes in the context of the `PortNamespace`. If you do mark a namespace as dynamic, you may still want to limit the set of values that are acceptable, which you can do by specifying the valid type and/or validator. The values that will eventually be passed to the port namespace will then be validated according to these rules, exactly as a value for a regular input port would be.
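The effect of the `dynamic` attribute can be illustrated with a small stand-alone function. This is a toy model, not AiiDA's validation code: the function `validate_namespace` and its arguments are hypothetical, and it only covers the type check, leaving out validators and defaults.

```python
def validate_namespace(values, declared_ports, dynamic=False, valid_type=None):
    """Return a list of error messages for the `values` passed to a namespace."""
    errors = []
    for name, value in values.items():
        if name in declared_ports:
            continue  # explicitly declared ports are validated by their own rules
        if not dynamic:
            # A non-dynamic namespace only accepts the ports that were declared
            errors.append('unexpected input: {}'.format(name))
        elif valid_type is not None and not isinstance(value, valid_type):
            # A dynamic namespace accepts unknown names, but can still limit the type
            errors.append('invalid type for input: {}'.format(name))
    return errors


static_errors = validate_namespace({'x': 1}, declared_ports=set())
dynamic_errors = validate_namespace({'x': 1, 'y': 2}, declared_ports=set(), dynamic=True, valid_type=int)
typed_errors = validate_namespace({'x': 'text'}, declared_ports=set(), dynamic=True, valid_type=int)
```

The first call is rejected because the namespace is not dynamic, the second is accepted, and the third fails only because the value does not match the valid type of the dynamic namespace.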

Non storable inputs¶

In the introduction of this section on workflows, we mentioned that valid types for inputs and outputs should be AiiDA data types, as they can be stored in the database and that is the only way the provenance can be kept. However, there are cases where you might want to pass an input to a workchain, whose provenance you do not care about and therefore would want to pass a non-database storable type anyway.

Note

AiiDA allows you to break the provenance so as not to be too restrictive, but it always tries to guide you towards keeping the provenance. There are legitimate reasons to break it regardless, but make sure you think about the implications and whether you are really willing to lose the information.

For this situation, the `InputPort` has the attribute `non_db`. By default this is set to `False`, but by setting it to `True` the port is marked such that the values passed to it are not stored as a node in the provenance graph and are not linked to the calculation node. This allows one to pass any normal value that one would also be able to pass to a normal function.

Outline¶

After the ports, both input and output, have been specified, it is time to define the outline of the workchain. The outline describes the logical flow of the workchain, which makes it an extremely important part of the workchain design process. Since the goal of a workchain should be to execute a very well defined task, it is the goal of the outline to capture the required logic to achieve that goal, in a clear and short yet not overly succinct manner. The outline supports various logical flow constructs, such as conditionals and while loops, so where possible this logic should be expressed in the outline and not in the body of the outline functions. However, one can also go overboard and put too finely grained logical blocks into the outline, causing it to become bulky and difficult to understand.

A good rule of thumb in designing the outline is the following: before you start designing a workchain, define very clearly the task that it should carry out. Once the goal is clear, draw a schematic block diagram of the necessary steps and logical decisions that connect them, in order to accomplish that goal. Converting the resulting flow diagram in a one-to-one fashion into an outline, often results in very reasonable outline designs.

The currently available logical constructs for the workchain outline are:

• `if`, `elif`, `else`
• `while`
• `return`

To distinguish these constructs from the python builtins, they are suffixed with an underscore, like so `while_`. To use these in your workchain design, you will have to import them:

```from aiida.work.workchain import if_, while_, return_
```

The following example shows how to use these logical constructs to define the outline of a workchain:

```spec.outline(
    cls.initialize_to_zero,
    while_(cls.n_is_less_than_hundred)(
        # Test the combined condition first, otherwise it can never be reached
        if_(cls.n_is_multiple_of_three_and_five)(
            cls.report_fizz_buzz,
        ).elif_(cls.n_is_multiple_of_three)(
            cls.report_fizz,
        ).elif_(cls.n_is_multiple_of_five)(
            cls.report_buzz,
        ).else_(
            cls.report_n,
        ),
        # Increment inside the loop so that it terminates
        cls.increment_n_by_one,
    ),
)
```

This is an implementation (and an extremely contrived one at that) of the well known FizzBuzz problem. The idea is that the program is supposed to print in sequence the numbers from zero to some limit, except that for a multiple of three it prints `Fizz`, for a multiple of five `Buzz`, and for a multiple of both `FizzBuzz`. The actual implementation of the outline steps themselves is now trivial:

```def initialize_to_zero(self):
    self.ctx.n = 0

def n_is_less_than_hundred(self):
    return self.ctx.n < 100

def n_is_multiple_of_three(self):
    return self.ctx.n % 3 == 0

def n_is_multiple_of_five(self):
    return self.ctx.n % 5 == 0

def n_is_multiple_of_three_and_five(self):
    return self.ctx.n % 3 == 0 and self.ctx.n % 5 == 0

def increment_n_by_one(self):
    self.ctx.n += 1
```

The intention of this example is to show that with a well designed outline, a user only has to look at the outline to have a good idea what the workchain does and how it does it. One should not have to look at the implementation of the outline steps as all the important information is captured by the outline itself.
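For comparison, the intended control flow of the FizzBuzz outline can be written as an ordinary Python function. This is only a sketch to make the flow explicit, not a workchain; the function name `fizzbuzz` and the choice to collect the reports in a list are ours. Note that the combined condition has to be tested before the individual ones, and the increment has to happen inside the loop.

```python
def fizzbuzz(limit=100):
    """Plain-Python equivalent of the intended outline logic, collecting the reports."""
    reports = []
    n = 0                                # initialize_to_zero
    while n < limit:                     # while_(n_is_less_than_hundred)
        if n % 3 == 0 and n % 5 == 0:    # n_is_multiple_of_three_and_five
            reports.append('FizzBuzz')
        elif n % 3 == 0:                 # n_is_multiple_of_three
            reports.append('Fizz')
        elif n % 5 == 0:                 # n_is_multiple_of_five
            reports.append('Buzz')
        else:
            reports.append(str(n))       # report_n
        n += 1                           # increment_n_by_one
    return reports


print(fizzbuzz(6))
```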

Exit codes¶

Any `WorkChain` will most likely have one or more expected failure modes. To clearly communicate to the caller what went wrong, the `WorkChain` supports setting its `exit_status`. This `exit_status`, a non-negative integer, is an attribute of the calculation node. By convention, zero means the process was successful, whereas any other value indicates failure. This concept of an exit code, with a non-negative integer as the exit status, is common in programming and a standard way for programs to communicate the result of their execution.

Potential exit codes for the `WorkChain` can be defined through the `ProcessSpec`, just like inputs and outputs. Any exit code consists of a positive integer, a string label to reference it and a more detailed description of the problem that triggers the exit code. Consider the following example:

```spec.exit_code(418, 'ERROR_I_AM_A_TEAPOT', 'the workchain had an identity crisis')
```

This defines an exit code for the `WorkChain` with exit status `418` and exit message `the workchain had an identity crisis`. The string `ERROR_I_AM_A_TEAPOT` is a label that the developer can use to reference this particular exit code somewhere in the `WorkChain` code itself. How this is accomplished will be explained in a later section.

Whenever a `WorkChain` exits through a particular exit code, the caller will be able to introspect it through the `exit_status` and `exit_message` attributes of the node. Assume for example that we ran a `WorkChain` that returned the exit code described above, the caller would be able to do the following:

```in[1] workchain = load_node(<pk>)
in[2] workchain.exit_status
out[2] 418
in[3] workchain.exit_message
out[3] 'the workchain had an identity crisis'
```

This is useful because the caller can now decide programmatically, based on the `exit_status`, how to proceed. This is a far more robust way of communicating specific errors to a non-human caller than parsing text-based logs or reports (see the section on reporting).
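As a minimal sketch of such programmatic handling, consider the following snippet. It does not use AiiDA at all: `FakeNode` is a stand-in for a terminated workchain node that exposes only the two attributes discussed here, and the function `decide_next_action` with its return values is purely hypothetical.

```python
from collections import namedtuple

# Stand-in for a terminated workchain node; only the two attributes used here.
FakeNode = namedtuple('FakeNode', ['exit_status', 'exit_message'])


def decide_next_action(node):
    """Hypothetical caller-side logic that branches on the exit status."""
    if node.exit_status == 0:
        return 'continue'
    if node.exit_status == 418:
        # A failure mode the caller knows how to handle
        return 'restart'
    # Any other non-zero status is treated as unrecoverable
    return 'abort'


node = FakeNode(418, 'the workchain had an identity crisis')
print(decide_next_action(node))
```

The caller never has to look at the message text to make its decision, the integer status alone is enough.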

Reporting¶

During the execution of a `WorkChain`, we may want to keep the user abreast of its progress and what is happening. For this purpose, the `WorkChain` implements the `report()` method, which functions as a logger of sorts. It takes a single argument, a string, that is the message that needs to be reported:

```def submit_calculation(self):
self.report('here we will submit a calculation')
```

This will send that message to the internal logger of python, which will cause it to be picked up by the default AiiDA logger, but it will also trigger the database log handler, which will store the message in the database and link it to the node of the workchain. This allows the `verdi work report` command to retrieve all those messages that were fired using the `report` method for a specific `WorkCalculation` node. Note that the report method, in addition to the pk of the workchain, will also automatically record the name of the workchain and the name of the outline step in which the report message was fired. This information will show up in the output of `verdi work report`, so you never have to explicitly reference the workchain name, outline step name or date and time in the message itself.

It is important to note that the report system is a form of logging and as such has been designed to be read by humans only. That is to say, the report system is not designed to pass information programmatically by parsing the log messages.
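The idea of a logger with an additional handler that stores the messages can be illustrated with the standard `logging` module. The sketch below is not how AiiDA implements its database log handler: `ListHandler`, the logger name, the `report` function, the message prefix and the use of the `INFO` level are all assumptions made for the illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps messages in memory, standing in for a database handler."""

    def __init__(self):
        super(ListHandler, self).__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))


logger = logging.getLogger('workchain_report_sketch')
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the sketch silent on the console
handler = ListHandler()
logger.addHandler(handler)


def report(pk, workchain_name, step_name, message):
    """Prefix the message with the metadata that is recorded automatically."""
    logger.info('[%s|%s|%s]: %s', pk, workchain_name, step_name, message)


report(42, 'SomeWorkChain', 'submit_calculation', 'here we will submit a calculation')
print(handler.records)
```

Because the identifying metadata is attached by the reporting machinery, the message itself can stay short and purely descriptive.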

Context¶

In the simplest workchain example presented in the introductory section, we already saw how the context can be used to persist information during the execution of a workchain and pass it between outline steps. The context is essentially a data container, very similar to a dictionary that can hold all sorts of data. The workflow engine will ensure that its contents are saved and persisted in between steps and when the daemon shuts down or restarts. A trivial example of this would be the following:

```def step_one(self):
self.ctx.some_variable = 'store me in the context'

def step_two(self):
assert self.ctx.some_variable == 'store me in the context'
```

In the `step_one` outline step we store the string `store me in the context` in the context, which can be addressed as `self.ctx`, under the key `some_variable`. Note that for the key you can use anything that would be a valid key for a normal python dictionary. In the second outline step `step_two`, we can verify that the string was successfully persisted, by checking the value stored in the context `self.ctx.some_variable`. This was just a simple example to introduce the concept of the context, however, it really is one of the more important parts of the workchain. The context really becomes crucial when you want to submit a calculation or another workchain from within the workchain. How this is accomplished, we will show in the next section.
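The dictionary-like behaviour with attribute access can be imitated in a few lines of plain Python. The class below is only an illustration of the access pattern; it is our own minimal version and it does not persist anything, which is the part that the workflow engine takes care of for the real context.

```python
class AttributeDict(dict):
    """Minimal dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


ctx = AttributeDict()
ctx.some_variable = 'store me in the context'

# Attribute access and key access refer to the same entry
print(ctx['some_variable'] == ctx.some_variable)
```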

Submitting calculations and workchains¶

One of the main tasks of a `WorkChain` will be to launch a `JobCalculation` or even another `WorkChain`. An example in the section on running workflows already showed that the `WorkChain` class provides the `submit()` method, to submit another `WorkChain` or `JobCalculation` to the daemon. However, that is not enough to complete the process. When the `submit` method is called, the process is created and submitted to the daemon, but at that point it is not yet done. So the value that is returned by the `submit` call is not the result of the submitted process, but rather it is a future. When the process is terminated, this future will then be transformed into the results of the process, however, until this happens, the workchain cannot continue: it has to wait for the process to be finished. To do this, control has to be returned to the workflow engine, which can then, when the process is completed, call the next step in the outline, where we can analyse the results.

To context¶

In order to store the future of the submitted process, we can store it in the context with a special construct that will tell the workflow engine that it should wait for that process to finish before continuing the workchain. To illustrate how this works, consider the following minimal example:

```from aiida.orm.data.int import Int
from aiida.work.workchain import WorkChain, ToContext

class SomeWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super(SomeWorkChain, cls).define(spec)
        spec.outline(
            cls.submit_workchain,
            cls.inspect_workchain,
        )

    def submit_workchain(self):
        future = self.submit(SomeWorkChain)
        # Returning ToContext tells the engine to wait for the future and
        # to store the resulting node under the key 'workchain'
        return ToContext(workchain=future)

    def inspect_workchain(self):
        assert self.ctx.workchain.is_finished_ok
```

As explained in the previous section, calling `self.submit` for a given process that you want to submit, will return a future. To add this future to the context, we cannot assign it to the context directly as explained in the context section, but rather we need to use the class `ToContext`. This class has to be imported from the `aiida.work.workchain` module. To add the future to the context, simply construct an instance of `ToContext`, passing the future as a keyword argument, and return it from the outline step. The keyword used, `workchain` in this example, will be the key under which the node is stored in the context once its execution has terminated. Returning an instance of `ToContext` signals to the workflow engine that it has to wait for the futures contained within it to finish execution, store their nodes in the context under the specified keys and then continue to the next step in the outline. In this example, that is the `inspect_workchain` method. At this point we are sure that the process, a workchain in this case, has terminated its execution, although not necessarily successfully, and we can continue the logic of the workchain.
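The waiting mechanism can be pictured with a toy engine built on the standard `concurrent.futures` module. This is a conceptual sketch only and says nothing about how AiiDA's engine is implemented: `run_outline`, the step functions and the plain dictionary that plays the role of `ToContext` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_outline(steps, ctx):
    """Toy engine: after each step, wait for the futures it returned and store the results."""
    for step in steps:
        awaitables = step(ctx) or {}
        for key, future in awaitables.items():
            ctx[key] = future.result()  # blocks until the sub process is done


executor = ThreadPoolExecutor(max_workers=2)


def submit_step(ctx):
    future = executor.submit(lambda: 'finished')  # stand-in for self.submit(...)
    return {'workchain': future}                  # stand-in for ToContext(workchain=future)


def inspect_step(ctx):
    # By the time this step runs, the result is available under the chosen key
    ctx['inspected'] = ctx['workchain']


ctx = {}
run_outline([submit_step, inspect_step], ctx)
executor.shutdown()
print(ctx)
```

The step that submits never sees the result itself; it only hands the future back to the engine, and the next step finds the finished result in the context.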

Sometimes one wants to launch not just one, but multiple processes at the same time that can run in parallel. With the mechanism described above, this will not be possible since after submitting a single process and returning the `ToContext` instance, the workchain has to wait for the process to be finished before it can continue. To solve this problem, there is another way to add futures to the context:

```from aiida.orm.data.int import Int
from aiida.work.workchain import WorkChain, ToContext

class SomeWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super(SomeWorkChain, cls).define(spec)
        spec.outline(
            cls.submit_workchains,
            cls.inspect_workchains,
        )

    def submit_workchains(self):
        for i in range(3):
            future = self.submit(SomeWorkChain)
            key = 'workchain_{}'.format(i)
            # Unpack a dictionary so that the computed key is used as the keyword
            self.to_context(**{key: future})

    def inspect_workchains(self):
        for i in range(3):
            key = 'workchain_{}'.format(i)
            assert self.ctx[key].is_finished_ok
```

Here we submit three workchains in a for loop in a single outline step, but instead of returning an instance of `ToContext`, we call the `to_context()` method. This method has exactly the same syntax as the `ToContext` class, except it is not necessary to return its value, so we can call it multiple times in one outline step. Under the hood the functionality is also the same as the `ToContext` class. At the end of the `submit_workchains` outline step, the workflow engine will find the futures that were added by calling `to_context` and will wait for all of them to be finished. The good thing here is that these three sub workchains can be run in parallel and once all of them are done, the parent workchain will go to the next step, which is `inspect_workchains`. There we can find the nodes of the workchains in the context under the key that was used as the keyword argument in the `to_context` call in the previous step.

Since we do not want the subsequent calls of `to_context` to override the previous future, we had to create unique keys to store them under. In this example, we chose to use the index of the for-loop. The name carries no meaning and is just required to guarantee unique key names. This pattern will occur often where you will want to launch multiple workchains or calculations in parallel and will have to come up with unique names. In essence, however, you are really just creating a list and it would be better to be able to create a list in the context and simply append the future to that list as you submit them. How this can be achieved is explained in the next section.
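One detail of this pattern deserves attention: because the key is computed at run time, it has to be passed by unpacking a dictionary. Writing `key=future` would use the literal name `key` as the keyword. The following plain-Python snippet shows the difference, where the function `collect_keywords` is a hypothetical stand-in for any function that accepts keyword arguments, such as `to_context`.

```python
def collect_keywords(**kwargs):
    """Hypothetical stand-in for a function that takes keyword arguments."""
    return kwargs


key = 'workchain_0'

# The keyword name is taken literally, the variable `key` is not used
literal = collect_keywords(key='future')

# Unpacking a dictionary uses the value of the variable as the keyword
computed = collect_keywords(**{key: 'future'})

print(literal, computed)
```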

Appending¶

When you want to add a future of a submitted sub process to the context, but append it to a list rather than assign it to a key, you can use the `append_()` function. Consider the example from the previous section, but now we will use the `append_` function instead:

```from aiida.orm.data.int import Int
from aiida.work.workchain import WorkChain, ToContext, append_

class SomeWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super(SomeWorkChain, cls).define(spec)
        spec.outline(
            cls.submit_workchains,
            cls.inspect_workchains,
        )

    def submit_workchains(self):
        for i in range(3):
            future = self.submit(SomeWorkChain)
            self.to_context(workchains=append_(future))

    def inspect_workchains(self):
        for workchain in self.ctx.workchains:
            assert workchain.is_finished_ok
```

Notice that in the `submit_workchains` step we no longer have to generate a unique key based on the index but we simply wrap the future in the `append_` function and assign it to the generic key `workchains`. The workflow engine will see the `append_` function and instead of assigning the node corresponding to the future to the key `workchains`, it will append it to the list stored under that key. If the list did not yet exist, it will automatically be created. The `self.ctx.workchains` now contains a list with the nodes of the completed workchains and so in the `inspect_workchains` step we can simply iterate over it to access all of them in order.

Note that the use of `append_` is not just limited to the `to_context` method. You can also use it in exactly the same way with `ToContext` to append a process to a list in the context in multiple outline steps.
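The difference between assigning and appending can be summarised in a small plain-Python model. It is not the engine's code: the marker class `Append` plays the role of `append_`, and `store_in_context` is a hypothetical function that stands in for what happens once the awaited values are available.

```python
class Append(object):
    """Hypothetical marker, playing the role of `append_`, that wraps a value."""

    def __init__(self, value):
        self.value = value


def store_in_context(ctx, **kwargs):
    """Assign plain values to their key; append wrapped values to a list under the key."""
    for key, value in kwargs.items():
        if isinstance(value, Append):
            # The list is created automatically on first use
            ctx.setdefault(key, []).append(value.value)
        else:
            ctx[key] = value


ctx = {}
for i in range(3):
    store_in_context(ctx, workchains=Append('node_{}'.format(i)))
store_in_context(ctx, single='node_x')
print(ctx)
```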

Aborting and exit codes¶

At the end of every outline step, the return value will be inspected by the workflow engine. If a non-zero integer value is detected, the workflow engine will interpret this as an exit code and will stop the execution of the workchain, while setting its process state to `Finished`. In addition, the integer return value will be set as the `exit_status` of the workchain, which combined with the `Finished` process state will denote that the workchain is considered to be `Failed`, as explained in the section on the process state. This is useful because it allows a workflow designer to easily exit from a workchain and use the return value to communicate programmatically the reason for the workchain stopping.

We assume that you have read the section on how to define exit codes through the process specification of the workchain. Consider the following exit code definition:

```spec.exit_code(400, 'ERROR_CALCULATION_FAILED', 'the child calculation did not finish successfully')
```

Now imagine that in the outline, we launch a calculation and in the next step check whether it finished successfully. In the event that the calculation did not finish successfully, the following snippet shows how you can retrieve the corresponding exit code and abort the `WorkChain` by returning it:

```def submit_calculation(self):
    # Assumes that the workchain defines an input port named 'code'
    inputs = {'code': self.inputs.code}
    future = self.submit(SomeJobCalculation, **inputs)
    # Store the calculation in the context so the next step can inspect it
    return ToContext(calculation=future)

def inspect_calculation(self):
    if not self.ctx.calculation.is_finished_ok:
        self.report('the calculation did not finish successfully, there is nothing we can do')
        return self.exit_codes.ERROR_CALCULATION_FAILED

    self.report('the calculation finished successfully')
```

In the `inspect_calculation` outline step, we retrieve the calculation that was submitted and added to the context in the previous step and check if it finished successfully through the property `is_finished_ok`. If this returns `False`, in this example we simply fire a report message and return the exit code corresponding to the label `ERROR_CALCULATION_FAILED`. Note that the specific exit code can be retrieved through the `WorkChain` property `exit_codes`. This will return a collection of exit codes that have been defined for that `WorkChain` and any specific exit code can then be retrieved by accessing it as an attribute. Returning this exit code, which will be an instance of the `ExitCode` named tuple, will cause the workchain to be aborted and the `exit_status` and `exit_message` that were defined in the spec to be set on the node.

Note

The notation `self.exit_codes.ERROR_CALCULATION_FAILED` is just syntactic sugar to retrieve the `ExitCode` tuple that was defined in the spec with that error label. Constructing your own `ExitCode` directly and returning that from the outline step will have exactly the same effect in terms of aborting the workchain execution and setting the exit status and message. However, it is strongly advised to define the exit code through the spec and retrieve it through the `self.exit_codes` collection, as that makes it easily retrievable through the spec by the caller of the workchain.

The best part about this method of aborting a workchain's execution is that the exit status can now be used programmatically, for example by a parent workchain. Imagine that a parent workchain submitted this workchain. After it has terminated its execution, the parent workchain will want to know what happened to the child workchain. As already noted in the report section, the report messages of the workchain should not be used for this. The exit status, however, is a perfect way. The parent workchain can easily request the exit status of the child workchain through the `exit_status` property, and based on its value determine how to proceed.
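The rule that a non-zero return value stops the outline can be sketched with a toy engine in plain Python. Nothing here is AiiDA code: `run_outline` and the step functions are hypothetical, and the `ExitCode` named tuple is a local stand-in whose field names `status` and `message` are assumed.

```python
from collections import namedtuple

# Local stand-in for the exit code tuple; the field names are an assumption
ExitCode = namedtuple('ExitCode', ['status', 'message'])


def run_outline(steps):
    """Toy engine: run the steps in order and stop at the first non-zero exit status."""
    executed = []
    for step in steps:
        executed.append(step.__name__)
        result = step()
        if isinstance(result, ExitCode):
            status, message = result
        elif isinstance(result, int):
            status, message = result, None
        else:
            status, message = 0, None
        if status != 0:
            return ExitCode(status, message), executed
    return ExitCode(0, None), executed


def submit_calculation():
    return None


def inspect_calculation():
    return ExitCode(400, 'the child calculation did not finish successfully')


def never_reached():
    return None


exit_code, executed = run_outline([submit_calculation, inspect_calculation, never_reached])
print(exit_code.status, executed)
```

A caller, such as a parent workchain, only needs to compare `exit_code.status` against the values it knows how to handle.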

Workfunction exit codes¶

The method of setting the exit status for a `WorkChain` by returning an `ExitCode`, as explained in the previous section, works almost exactly the same for `workfunctions`. The only difference is that for a workfunction, we do not have access to the convenience `exit_codes` property of the `WorkChain`, but rather we have to import and return an `ExitCode` ourselves. This named tuple can be constructed with an integer, to denote the desired exit status and an optional message, and when returned, the workflow engine will mark the node of the workfunction as `Finished` and set the exit status and message to the value of the tuple. Consider the following example:

```@workfunction
def exiting_workfunction():
    from aiida.work import ExitCode
    return ExitCode(418, 'I am a teapot')
```

The execution of the workfunction will be immediately terminated as soon as the tuple is returned, and the exit status and message will be set to `418` and `I am a teapot`, respectively. Since no output nodes are returned, the `FunctionCalculation` node will have no outputs and the value returned from the function call will be an empty dictionary.
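The behaviour described here, where the exit code is recorded and the caller receives an empty dictionary, can be imitated with an ordinary decorator. This is a sketch under our own assumptions and not the `workfunction` decorator itself: `toy_workfunction`, the `exit_code` attribute and the local `ExitCode` stand-in with its field names are all hypothetical.

```python
import functools
from collections import namedtuple

# Local stand-in for the exit code tuple; the field names are an assumption
ExitCode = namedtuple('ExitCode', ['status', 'message'])


def toy_workfunction(func):
    """Hypothetical decorator that records an exit code instead of returning it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, ExitCode):
            wrapper.exit_code = result
            return {}  # no outputs when an exit code is returned
        wrapper.exit_code = ExitCode(0, None)
        return result

    return wrapper


@toy_workfunction
def exiting_function():
    return ExitCode(418, 'I am a teapot')


outputs = exiting_function()
print(outputs, exiting_function.exit_code)
```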

Modular workflow design¶

When creating complex workflows, it is a good idea to split them up into smaller, modular parts. At the lowest level, each workflow should perform exactly one task. These workflows can then be wrapped together by a “parent” workflow to create a larger logical unit.

In order to make this approach manageable, it needs to be as simple as possible to glue together multiple workflows in a larger parent workflow. One of the tools that AiiDA provides to simplify this is the ability to expose the ports of another workchain.

Exposing inputs and outputs¶

Consider the following example workchain, which simply takes a few inputs and returns them again as outputs:

```from aiida.orm.data.bool import Bool
from aiida.orm.data.float import Float
from aiida.orm.data.int import Int
from aiida.work import WorkChain

class ChildWorkChain(WorkChain):
    @classmethod
    def define(cls, spec):
        super(ChildWorkChain, cls).define(spec)
        spec.input('a', valid_type=Int)
        spec.input('b', valid_type=Float)
        spec.input('c', valid_type=Bool)
        spec.outline(cls.do_run)
        spec.output('d', valid_type=Int)
        spec.output('e', valid_type=Float)
        spec.output('f', valid_type=Bool)

    def do_run(self):
        self.out('d', self.inputs.a)
        self.out('e', self.inputs.b)
        self.out('f', self.inputs.c)
```

As a first example, we will implement a thin wrapper workflow, which simply forwards its inputs to `ChildWorkChain`, and forwards the outputs of the child to its outputs:

```from aiida.work import ToContext, WorkChain, run
from child import ChildWorkChain

class SimpleParentWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super(SimpleParentWorkChain, cls).define(spec)
        spec.expose_inputs(ChildWorkChain)
        spec.expose_outputs(ChildWorkChain)
        spec.outline(cls.run_child, cls.finalize)

    def run_child(self):
        child = self.submit(ChildWorkChain, **self.exposed_inputs(ChildWorkChain))
        # Store the child in the context so that `finalize` can access it
        return ToContext(child=child)

    def finalize(self):
        self.out_many(
            self.exposed_outputs(self.ctx.child, ChildWorkChain)
        )
```

In the `define` method of this simple parent workchain, we use the `expose_inputs()` and `expose_outputs()` methods of the spec. This creates the corresponding input and output ports in the parent workchain. Additionally, AiiDA remembers which inputs and outputs were exposed from that particular workchain class. This is used when calling the child in the `run_child` method. The `exposed_inputs()` method returns a dictionary of inputs that the parent received which were exposed from the child, and so it can be used to pass these on to the child. Finally, in the `finalize` method, we use `exposed_outputs()` to retrieve the outputs of the child which were exposed to the parent. Using `out_many()`, these outputs are added to the outputs of the parent workchain. This workchain can now be run in exactly the same way as the child itself:

```#!/usr/bin/env runaiida
from __future__ import print_function

from aiida.orm.data.bool import Bool
from aiida.orm.data.float import Float
from aiida.orm.data.int import Int
from aiida.work import run
from simple_parent import SimpleParentWorkChain

if __name__ == '__main__':
    result = run(SimpleParentWorkChain, a=Int(1), b=Float(1.2), c=Bool(True))
    print(result)
    # {u'e': 1.2, u'd': 1, u'f': True}
```

Next, we will see how a more complex parent workchain can be created by using the additional features of the expose functionality. The following workchain launches two children. These children share the input `a`, but have different `b` and `c`. The output `e` will be taken only from the first child, whereas `d` and `f` are taken from both children. In order to avoid name conflicts, we need to create a namespace for each of the two children, where the inputs and outputs which are not shared are stored. Our goal is that the workflow can be called as follows:

```#!/usr/bin/env runaiida
from __future__ import print_function

from aiida.orm.data.bool import Bool
from aiida.orm.data.float import Float
from aiida.orm.data.int import Int
from aiida.work import run
from complex_parent import ComplexParentWorkChain

if __name__ == '__main__':
    result = run(
        ComplexParentWorkChain,
        a=Int(1),
        child_1=dict(b=Float(1.2), c=Bool(True)),
        child_2=dict(b=Float(2.3), c=Bool(False))
    )
    print(result)
    # {
    #     u'e': 1.2,
    #     u'child_1.d': 1, u'child_1.f': True,
    #     u'child_2.d': 1, u'child_2.f': False
    # }
```

This is achieved by the following workflow. In the next section, we will explain each of the steps.

```from aiida.work import ToContext, WorkChain, run

from child import ChildWorkChain

class ComplexParentWorkChain(WorkChain):
    @classmethod
    def define(cls, spec):
        super(ComplexParentWorkChain, cls).define(spec)
        spec.expose_inputs(ChildWorkChain, include=['a'])
        spec.expose_inputs(ChildWorkChain, namespace='child_1', exclude=['a'])
        spec.expose_inputs(ChildWorkChain, namespace='child_2', exclude=['a'])
        spec.outline(cls.run_children, cls.finalize)
        spec.expose_outputs(ChildWorkChain, include=['e'])
        spec.expose_outputs(ChildWorkChain, namespace='child_1', exclude=['e'])
        spec.expose_outputs(ChildWorkChain, namespace='child_2', exclude=['e'])

    def run_children(self):
        child_1_inputs = self.exposed_inputs(ChildWorkChain, namespace='child_1')
        child_2_inputs = self.exposed_inputs(ChildWorkChain, namespace='child_2', agglomerate=False)
        child_1 = self.submit(ChildWorkChain, **child_1_inputs)
        # Without agglomeration the shared input 'a' has to be passed explicitly
        child_2 = self.submit(ChildWorkChain, a=self.inputs.a, **child_2_inputs)
        # Store both children in the context so that `finalize` can access them
        return ToContext(child_1=child_1, child_2=child_2)

    def finalize(self):
        self.out_many(
            self.exposed_outputs(
                self.ctx.child_1,
                ChildWorkChain,
                namespace='child_1'
            )
        )
        self.out_many(
            self.exposed_outputs(
                self.ctx.child_2,
                ChildWorkChain,
                namespace='child_2',
                agglomerate=False
            )
        )
```

First of all, we want to expose the `a` input and the `e` output at the top-level. For this, we again use `expose_inputs()` and `expose_outputs()`, but with the optional keyword `include`. This specifies a list of keys, and only inputs or outputs which are in that list will be exposed. So by passing `include=['a']` to `expose_inputs()`, only the input `a` is exposed.

Additionally, we want to expose the inputs `b` and `c` (outputs `d` and `f`), but in a namespace specific for each of the two children. For this purpose, we pass the `namespace` parameter to the expose functions. However, since we now shouldn’t expose `a` (`e`) again, we use the `exclude` keyword, which specifies a list of keys that will not be exposed.

When calling the children, we again use the `exposed_inputs()` method to forward the exposed inputs. Since the inputs `b` and `c` are now in a specific namespace, we need to pass this namespace as an additional parameter. By default, `exposed_inputs()` will also search through all the parent namespaces of the given namespace for inputs, as shown in the call for `child_1`. If the same input key exists in multiple namespaces, the input in the lowest namespace takes precedence. It’s also possible to disable this behavior, and instead search only in the explicit namespace that was passed. This is done by setting `agglomerate=False`, as shown in the call to `child_2`. Of course, we then need to explicitly pass the input `a`.

Finally, we use `exposed_outputs()` and `out_many()` to forward the outputs of the children to the outputs of the parent. Again, the `namespace` and `agglomerate` options can be used to select which outputs are returned by the `exposed_outputs()` method.
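The effect of the `agglomerate` option on the lookup of inputs can be illustrated with a simplified model in plain Python. This is our own sketch and not the actual `exposed_inputs()` method: the function `lookup_exposed_inputs`, the layout of the `exposed` dictionary and the restriction to a single level of nesting are assumptions made to keep the example short.

```python
def lookup_exposed_inputs(inputs, exposed, namespace=None, agglomerate=True):
    """Toy lookup of exposed inputs for a single level of namespaces.

    inputs:  nested dictionary with the values the parent received
    exposed: maps a namespace (None for the top level) to the exposed port names
    """
    if agglomerate and namespace is not None:
        namespaces = [None, namespace]  # parent first, so the lowest namespace wins
    else:
        namespaces = [namespace]

    result = {}
    for ns in namespaces:
        source = inputs if ns is None else inputs.get(ns, {})
        for name in exposed.get(ns, []):
            if name in source:
                result[name] = source[name]
    return result


inputs = {'a': 1, 'child_1': {'b': 1.2, 'c': True}, 'child_2': {'b': 2.3, 'c': False}}
exposed = {None: ['a'], 'child_1': ['b', 'c'], 'child_2': ['b', 'c']}

child_1_inputs = lookup_exposed_inputs(inputs, exposed, namespace='child_1')
child_2_inputs = lookup_exposed_inputs(inputs, exposed, namespace='child_2', agglomerate=False)
print(child_1_inputs, child_2_inputs)
```

With agglomeration the shared input `a` is picked up from the top level, without it only the inputs of the explicit namespace are returned.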

Automatic input serialization¶

Quite often, inputs which are given as Python data types need to be cast to the corresponding AiiDA type before passing them to a workflow. Doing this manually can be cumbersome, so you can define a function which does this automatically when defining the input spec. This function, passed as `serializer` parameter to `spec.input`, is invoked if the given input is not already an AiiDA type.

For inputs which are stored in the database (`non_db=False`), the serialization function should return an AiiDA data type. For `non_db` inputs, the function must be idempotent because it might be applied more than once.

The following example workchain takes three inputs `a`, `b`, `c`, and simply returns the given inputs. The `to_aiida_type()` function is used as serialization function.

```from aiida.work import WorkChain
from aiida.orm.data import to_aiida_type
# The basic types need to be loaded such that they are registered with
# the 'to_aiida_type' function.
from aiida.orm.data.base import *

class SerializeWorkChain(WorkChain):
    @classmethod
    def define(cls, spec):
        super(SerializeWorkChain, cls).define(spec)

        spec.input('a', serializer=to_aiida_type)
        spec.input('b', serializer=to_aiida_type)
        spec.input('c', serializer=to_aiida_type)

        spec.outline(cls.echo)

    def echo(self):
        self.out('a', self.inputs.a)
        self.out('b', self.inputs.b)
        self.out('c', self.inputs.c)
```

This workchain can now be called with native Python types, which will be automatically converted to AiiDA types by the `to_aiida_type()` function. Note that the module which defines the corresponding AiiDA type must be loaded for it to be recognized by `to_aiida_type()`.

```#!/usr/bin/env runaiida

from __future__ import print_function

from aiida.work.launch import run

from serialize_workchain import SerializeWorkChain

if __name__ == '__main__':
    print(run(
        SerializeWorkChain,
        a=1, b=1.2, c=True
    ))
    # Result: {'a': 1, 'b': 1.2, 'c': True}
```

Of course, you can also use the serialization feature to perform a more complex serialization of the inputs.
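To give an idea of what a type-based and idempotent serialization function can look like, here is a sketch that uses `functools.singledispatch` from the Python 3 standard library. It is independent of AiiDA: the class `Wrapped` is a hypothetical stand-in for a storable data type and `to_storable` is our own function, so this should not be read as the implementation of `to_aiida_type()`.

```python
from functools import singledispatch


class Wrapped(object):
    """Hypothetical stand-in for a storable data type."""

    def __init__(self, value):
        self.value = value


@singledispatch
def to_storable(value):
    """Fallback for types that have no registered conversion."""
    raise TypeError('cannot serialize {}'.format(type(value).__name__))


@to_storable.register(Wrapped)
def _(value):
    # Already converted: returning it unchanged makes the function idempotent
    return value


@to_storable.register(int)
@to_storable.register(float)
def _(value):
    # bool is a subclass of int, so it is handled here as well
    return Wrapped(value)


wrapped = to_storable(1.2)
print(wrapped.value, to_storable(wrapped) is wrapped)
```

Only types with a registered conversion are accepted, which mirrors the remark above that the module defining a type has to be loaded before that type is recognized.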

Upgrading workchains from pre `v1.0.0`¶

The release of version `1.0.0` of `aiida-core` saw some major changes to the workflow engine to improve its robustness and user-friendliness. In doing so, a few changes were introduced that break workchains written before that release. However, these workchains can be updated with just the few minor changes listed here:

• The free function `submit` in any `WorkChain` should be replaced with `self.submit`.
• The `_options` input for `JobCalculation` is now `options`, simply remove the leading underscore.
• The `label` and `description` inputs for `JobCalculation` or a `WorkChain` have also lost the underscore.
• The free functions from `aiida.work.run` have been moved to `aiida.work.launch`, even though for the time being the old import will still work.
• The future returned by `submit` no longer has the `pid` attribute but rather `pk`.
• The `get_inputs_template` class method has been replaced by `get_builder`. See the section on the process builder on how to use it.
• The import `aiida.work.workfunction` has been moved to `aiida.work.workfunctions`.
• The `input_group` has been deprecated and been replaced by namespaces. See the section on port namespaces on how to use them.
• The use of a `.` (period) in output keys is not supported in `Process.out` because that is now reserved to indicate namespaces.