Caching: implementation details
This section covers some details of the caching mechanism which are not discussed in the user guide. If you are developing a plugin and want to modify the caching behavior of your classes, we recommend you read this section first.
Controlling hashing
Below are some methods you can use to control how the hashes of calculation and data classes are computed:
To ignore specific attributes, a Node subclass can have a _hash_ignored_attributes attribute. This is a list of attribute names that are ignored when creating the hash.
For calculations, the _hash_ignored_inputs attribute lists inputs that should be ignored when creating the hash.
To add things that should be considered in the hash, you can override the _get_objects_to_hash() method. Note that doing so overrides the behavior described above, so make sure to call the super() method.
Pass keyword arguments to get_hash(). These are passed on to make_hash().
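The mechanisms listed above can be illustrated with a small standalone model. The sketch below does not import AiiDA: ToyNode, ToyStructure, the hashlib-based make_hash and the 'salt' and 'schema_version' values are hypothetical stand-ins that only borrow the names _hash_ignored_attributes, _get_objects_to_hash, get_hash and make_hash from the text, so the real signatures and hashed objects may differ.

```python
import hashlib
import json


def make_hash(objects, **kwargs):
    """Hypothetical stand-in for make_hash(): SHA-256 of a JSON dump.

    Any keyword arguments (for example a 'salt') are mixed into the digest.
    """
    payload = json.dumps([objects, kwargs], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


class ToyNode:
    """Toy model of a node; NOT the AiiDA Node class."""

    # Attribute names that are left out of the hash.
    _hash_ignored_attributes = ()

    def __init__(self, **attributes):
        self.attributes = attributes

    def _get_objects_to_hash(self):
        # Default behavior: class name plus all attributes that are not ignored.
        return [
            type(self).__name__,
            {
                key: value
                for key, value in self.attributes.items()
                if key not in self._hash_ignored_attributes
            },
        ]

    def get_hash(self, **kwargs):
        # Keyword arguments are forwarded unchanged to make_hash().
        return make_hash(self._get_objects_to_hash(), **kwargs)


class ToyStructure(ToyNode):
    # 'label' is cosmetic, so it should not influence the hash.
    _hash_ignored_attributes = ('label',)

    def _get_objects_to_hash(self):
        # Extend the default list via super() instead of replacing it,
        # so that the ignored-attribute filtering is preserved.
        objects = super()._get_objects_to_hash()
        objects.append({'schema_version': 2})  # hypothetical extra input
        return objects


a = ToyStructure(cell=[1.0, 1.0, 1.0], label='first')
b = ToyStructure(cell=[1.0, 1.0, 1.0], label='second')
c = ToyStructure(cell=[2.0, 1.0, 1.0], label='first')
```

In this model a and b hash identically because they differ only in the ignored label, while c gets a different hash because its cell differs.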
Controlling caching
There are several methods you can use to disable caching for particular nodes:
On the level of the generic aiida.orm.nodes.Node:
The is_valid_cache property determines whether a particular node can be used as a cache. This is used, for example, to disable caching from failed calculations.
Node classes have a _cachable attribute, which can be set to False to completely switch off caching for nodes of that class. This avoids performing queries for the hash altogether.
On the level of aiida.engine.processes.process.Process and aiida.orm.nodes.process.ProcessNode:
ProcessNode.is_valid_cache calls Process.is_valid_cache, passing the node itself. This can be used in Process subclasses (e.g. in calculation plugins) to implement custom ways of invalidating the cache.
spec.exit_code has a keyword argument invalidates_cache. If this is set to True, returning that exit code means the process is no longer considered a valid cache. This is implemented in Process.is_valid_cache.
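How these switches interact can be sketched with a standalone toy model. Nothing below is AiiDA code: ToyExitCode, ToyProcess, ToyProcessNode, the exit status 410, the 'stale' flag and the helper can_be_used_as_cache are hypothetical, and treating unknown exit statuses as valid is a choice of this sketch, not a statement about the library.

```python
from collections import namedtuple

# Hypothetical exit code record; invalidates_cache defaults to False.
ToyExitCode = namedtuple(
    'ToyExitCode', ['status', 'message', 'invalidates_cache'], defaults=[False]
)


class ToyProcess:
    """Toy model of a process class; NOT the AiiDA Process class."""

    exit_codes = {
        0: ToyExitCode(0, 'finished ok'),
        410: ToyExitCode(410, 'out of walltime', invalidates_cache=True),
    }

    @classmethod
    def is_valid_cache(cls, node):
        exit_code = cls.exit_codes.get(node.exit_status)
        # Unknown statuses are treated as valid here (assumption of this sketch).
        return exit_code is None or not exit_code.invalidates_cache


class ToyProcessNode:
    """Toy model of a process node; NOT the AiiDA ProcessNode class."""

    _cachable = True
    process_class = ToyProcess

    def __init__(self, exit_status):
        self.exit_status = exit_status

    @property
    def is_valid_cache(self):
        # Delegate to the process class, passing the node itself.
        return self.process_class.is_valid_cache(self)


class UncachedProcessNode(ToyProcessNode):
    # Class-level switch: nodes of this class are never used as a cache.
    _cachable = False


class PickyProcess(ToyProcess):
    @classmethod
    def is_valid_cache(cls, node):
        # Custom rule on top of the default: reject nodes marked as 'stale'.
        return super().is_valid_cache(node) and not getattr(node, 'stale', False)


class PickyNode(ToyProcessNode):
    process_class = PickyProcess


def can_be_used_as_cache(node):
    # Check the class-level switch first, so the per-node check is skipped
    # entirely when caching is disabled for the class.
    return node._cachable and node.is_valid_cache
```

The class-level _cachable check comes first in can_be_used_as_cache, mirroring the point that switching off caching for a class avoids any further lookup.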
The WorkflowNode example
As discussed in the user guide, nodes which can have RETURN links cannot be cached.
This is enforced on two levels:
The _cachable attribute is set to False in Node, and only re-enabled in CalculationNode (which affects CalcJobs and calcfunctions). This means that a WorkflowNode will not be cached.
The _store_from_cache method, which is used to “clone” an existing node, will raise an error if the existing node has any RETURN links. This extra safeguard prevents cases where a user might incorrectly override the _cachable attribute on a WorkflowNode subclass.
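The two levels of protection can be modelled in a few lines. This is a standalone sketch, not the AiiDA implementation: the Toy classes, the return_links list, the store() helper and the choice of ValueError are all hypothetical and only mirror the names _cachable and _store_from_cache used above.

```python
class ToyNode:
    """Toy model of a node; NOT the AiiDA Node class."""

    # First level: caching is switched off by default.
    _cachable = False

    def __init__(self):
        # Hypothetical record of outgoing RETURN links.
        self.return_links = []
        self.cached_from = None

    def _store_from_cache(self, cache_node):
        # Second level: refuse to clone a node that has RETURN links,
        # even if a subclass has (incorrectly) enabled _cachable.
        if cache_node.return_links:
            raise ValueError('cannot clone a node that has RETURN links')
        self.cached_from = cache_node

    def store(self, cache_node=None):
        # Only cachable classes ever reach _store_from_cache.
        if self._cachable and cache_node is not None:
            self._store_from_cache(cache_node)
            return 'cached'
        return 'stored'


class ToyCalculationNode(ToyNode):
    # Caching is re-enabled only for calculation-like nodes.
    _cachable = True


class ToyWorkflowNode(ToyNode):
    # Inherits _cachable = False, so it is never stored from the cache.
    pass


class MisconfiguredWorkflowNode(ToyWorkflowNode):
    # Incorrect override that the second safeguard is meant to catch.
    _cachable = True


workflow_source = ToyWorkflowNode()
workflow_source.return_links.append('result')
```

With this setup, a ToyWorkflowNode is simply stored without consulting the cache, while MisconfiguredWorkflowNode().store(workflow_source) raises because the source node has a RETURN link.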
Design guidelines
When modifying the hashing/caching behavior of your classes, keep in mind that cache matches can go wrong in two ways:
False negatives, where two nodes should have the same hash but do not
False positives, where two different nodes get the same hash by mistake
False negatives are highly preferable because they only increase the runtime of your calculations, while false positives can lead to wrong results.
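Both failure modes are easy to reproduce in a toy setting. The sketch below uses a hypothetical toy_hash helper built on hashlib and json, with made-up 'cutoff' and 'label' inputs; it is not how AiiDA hashes nodes and only illustrates the distinction.

```python
import hashlib
import json


def toy_hash(inputs, ignored=()):
    """Hypothetical hash of an input dictionary, skipping ignored keys."""
    kept = {key: value for key, value in inputs.items() if key not in ignored}
    return hashlib.sha256(json.dumps(kept, sort_keys=True).encode()).hexdigest()


# False negative: the same intended input, once with floating-point noise.
# The hashes differ, so the calculation is rerun unnecessarily.
x = {'cutoff': 0.1 + 0.2}
y = {'cutoff': 0.3}
false_negative = toy_hash(x) != toy_hash(y)

# False positive: 'cutoff' affects the result but is wrongly ignored.
# The hashes match, so a result computed with a different cutoff is reused.
p = {'cutoff': 0.3, 'label': 'coarse'}
q = {'cutoff': 0.9, 'label': 'coarse'}
false_positive = toy_hash(p, ignored=('cutoff',)) == toy_hash(q, ignored=('cutoff',))
```

The first case only costs an extra run, whereas the second silently returns the result of a different calculation, which is why ignoring an input or attribute deserves more scrutiny than hashing too much.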