Caching: implementation details

This section covers details of the caching mechanism that are not discussed in the user guide. If you are developing a plugin and want to modify the caching behavior of your classes, read this section first.

Controlling hashing

The following mechanisms control how the hashes of calculation and data classes are computed:

  • To ignore specific attributes, a Node subclass can define a _hash_ignored_attributes attribute. This is a list of attribute names that are ignored when creating the hash.
  • For calculations, the _hash_ignored_inputs attribute lists inputs that should be ignored when creating the hash.
  • To add things that should be considered in the hash, you can override the _get_objects_to_hash() method. Note that doing so replaces the behavior described above, so make sure to call the parent implementation through super() and extend its result.
  • Pass keyword arguments to get_hash(). These are forwarded to make_hash().
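
The hooks above can be illustrated with a minimal, self-contained sketch. It does not use the real Node class: ToyNode, ToyStructure, code_version, the salt keyword and the body of make_hash() are all hypothetical stand-ins, chosen only to show how the ignored-attribute list, the _get_objects_to_hash() override and the keyword forwarding interact.

```python
import hashlib
import json


def make_hash(objects, **kwargs):
    """Stand-in hashing helper: hash a JSON dump of the objects.

    The ``salt`` keyword is hypothetical; it only shows kwargs arriving here.
    """
    payload = json.dumps([objects, kwargs.get('salt', '')], sort_keys=True)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


class ToyNode:
    """Simplified stand-in for a node class (not the real implementation)."""

    _hash_ignored_attributes = []

    def __init__(self, **attributes):
        self.attributes = attributes

    def _get_objects_to_hash(self):
        # Default behavior: all attributes except the ignored ones.
        return [{
            key: value
            for key, value in self.attributes.items()
            if key not in self._hash_ignored_attributes
        }]

    def get_hash(self, **kwargs):
        # Keyword arguments are forwarded unchanged to make_hash().
        return make_hash(self._get_objects_to_hash(), **kwargs)


class ToyStructure(ToyNode):
    _hash_ignored_attributes = ['label']
    code_version = '1.0'  # hypothetical extra object to include in the hash

    def _get_objects_to_hash(self):
        # Extend the parent list instead of replacing it, so that the
        # ignored-attribute filtering still applies.
        objects = super()._get_objects_to_hash()
        objects.append(self.code_version)
        return objects


a = ToyStructure(cell=[1, 2, 3], label='first')
b = ToyStructure(cell=[1, 2, 3], label='second')
c = ToyStructure(cell=[1, 2, 4], label='first')
```

In this sketch a and b receive the same hash because they differ only in the ignored label, while c does not match because its cell differs.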

Controlling caching

There are two ways to disable caching for particular nodes:

  • The is_valid_cache property determines whether a particular node can be used as a cache source. This is used, for example, to disable caching from failed calculations.
  • Node classes have a _cachable attribute, which can be set to False to switch off caching completely for nodes of that class. This avoids performing queries for the hash altogether.
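
The difference between the two switches can be sketched with a toy lookup. Everything here is a hypothetical stand-in (ToyNode, ToyUncachedNode, the failed flag, find_cache_source() and query_log are not part of the real API); the sketch only shows that a class-level _cachable = False skips the hash query entirely, whereas is_valid_cache filters individual candidates after the query.

```python
class ToyNode:
    """Simplified stand-in for a node class (not the real implementation)."""

    _cachable = True

    def __init__(self, node_hash, failed=False):
        self.node_hash = node_hash
        self.failed = failed

    @property
    def is_valid_cache(self):
        # Per-node switch: a failed node is never offered as a cache source.
        return not self.failed


class ToyUncachedNode(ToyNode):
    # Class-level switch: caching is off for every node of this class.
    _cachable = False


def find_cache_source(new_node, stored_nodes, query_log):
    """Return a stored node to clone from, or None if there is no match."""
    if not new_node._cachable:
        # No hash query is performed at all.
        return None
    query_log.append(new_node.node_hash)  # stands in for a database query
    for candidate in stored_nodes:
        if candidate.node_hash == new_node.node_hash and candidate.is_valid_cache:
            return candidate
    return None


query_log = []
stored = [ToyNode('abc', failed=True), ToyNode('abc')]

match = find_cache_source(ToyNode('abc'), stored, query_log)
no_match = find_cache_source(ToyUncachedNode('abc'), stored, query_log)
```

Here match is the second stored node, since the first one is marked as failed, and the lookup for ToyUncachedNode returns None without adding an entry to query_log.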

The WorkflowNode example

As discussed in the user guide, nodes which can have RETURN links cannot be cached. This is enforced on two levels:

  • The _cachable attribute is set to False in ProcessNode, and only re-enabled in CalcJobNode and CalcFunctionNode. This means that a WorkflowNode will not be cached.
  • The _store_from_cache method, which is used to “clone” an existing node, raises an error if the existing node has any RETURN links. This extra safeguard covers cases where a user incorrectly overrides the _cachable attribute on a WorkflowNode subclass.
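
A minimal sketch of this two-level protection follows. The Toy* classes, the return_links list and the use of ValueError are assumptions made for illustration; the real classes, link storage and exception type may differ.

```python
class ToyProcessNode:
    """Simplified stand-in for the process node base class."""

    _cachable = False  # first level: caching is off by default

    def __init__(self):
        self.return_links = []  # hypothetical storage for RETURN links
        self.cloned_from = None

    def _store_from_cache(self, cache_node):
        # Second level: refuse to clone a node that has RETURN links,
        # regardless of what _cachable says.
        if cache_node.return_links:
            raise ValueError('cannot clone a node that has RETURN links')
        self.cloned_from = cache_node
        return self


class ToyCalcJobNode(ToyProcessNode):
    _cachable = True  # re-enabled for calculations


class ToyWorkflowNode(ToyProcessNode):
    pass  # inherits _cachable = False


class MisconfiguredWorkflowNode(ToyWorkflowNode):
    _cachable = True  # the mistake the second level guards against


existing = MisconfiguredWorkflowNode()
existing.return_links.append('result')

try:
    MisconfiguredWorkflowNode()._store_from_cache(existing)
    second_level_triggered = False
except ValueError:
    second_level_triggered = True
```

Even though the misconfigured subclass claims to be cachable, the clone attempt fails because the source node has a RETURN link.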

Design guidelines

When modifying the hashing or caching behavior of your classes, keep in mind that cache matches can go wrong in two ways:

  • False negatives, where two nodes should have the same hash but do not
  • False positives, where two different nodes get the same hash by mistake

False negatives are strongly preferable: they only increase the runtime of your calculations, whereas false positives can lead to wrong results.
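
Both failure modes can be reproduced with a toy hash function. The toy_hash() helper and the attribute names cutoff and description are hypothetical; the sketch only illustrates how the choice of ignored attributes produces one error or the other.

```python
import hashlib
import json


def toy_hash(attributes, ignored=()):
    """Hash all attributes except the ignored ones (illustrative only)."""
    kept = {key: value for key, value in attributes.items() if key not in ignored}
    return hashlib.sha256(json.dumps(kept, sort_keys=True).encode('utf-8')).hexdigest()


run_1 = {'cutoff': 30, 'description': 'first try'}
run_2 = {'cutoff': 30, 'description': 'second try'}
run_3 = {'cutoff': 60, 'description': 'first try'}

# False negative: the free-text description does not affect the result,
# but hashing it makes two equivalent runs look different.
false_negative = toy_hash(run_1) != toy_hash(run_2)

# Ignoring the description removes the false negative.
fixed = toy_hash(run_1, ignored=['description']) == toy_hash(run_2, ignored=['description'])

# False positive: ignoring a parameter that does affect the result makes
# two different runs look identical, so a wrong result could be reused.
false_positive = toy_hash(run_1, ignored=['cutoff']) == toy_hash(run_3, ignored=['cutoff'])
```

When in doubt about whether an attribute influences the result, keeping it in the hash errs on the side of a false negative.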