QueryBuilder documentation

The QueryBuilder: A class that allows you to query the AiiDA database, independent from backend. Note that the backend implementation is enforced and handled with a composition model! QueryBuilder() is the frontend class that the user can use. It inherits from object and contains backend-specific functionality. Backend specific functionality is provided by the implementation classes.

These inherit from aiida.backends.general.querybuilder_interface.QueryBuilderInterface(), an interface classes which enforces the implementation of its defined methods. An instance of one of the implementation classes becomes a member of the QueryBuilder() instance when instantiated by the user.

class aiida.orm.querybuilder.QueryBuilder(*args, **kwargs)[source]

QueryBuilder: The class to query the AiiDA database. Usage:

from aiida.orm.querybuilder import QueryBuilder
qb = QueryBuilder()
# Querying nodes:
qb.append(Node)
# retrieving the results:
results = qb.all()
__init__(*args, **kwargs)[source]

Instantiates a QueryBuilder instance.

Which backend is used decided here based on backend-settings (taken from the user profile). This cannot be overriden so far by the user.

Parameters:
  • with_dbpath (bool) – Whether to use the DbPath table (if existing) to query ancestor-descendant relations. The default now is True. Set to False if you want to use the recursive functionality. This gives you the ability to project the path which constructed on the fly. It also allows to have the AiiDA instance without the DbPath, which can consume a lot of memory for heavy usage of AiiDA. Check QueryBuilder.set_with_dbpath() for details.
  • expand_path (bool) – If set to True (default is False) allows to project the path when querying ancestor-descendant relationships. This can cause the query to be much slower, which is why it’s optional. Check QueryBuilder.set_expand_path() for details
  • debug (bool) – Turn on debug mode. This feature prints information on the screen about the stages of the QueryBuilder. Does not affect results.
  • path (list) – A list of the vertices to traverse. Leave empty if you plan on using the method QueryBuilder.append().
  • filters – The filters to apply. You can specify the filters here, when appending to the query using QueryBuilder.append() or even later using QueryBuilder.add_filter(). Check latter gives API-details.
  • project – The projections to apply. You can specify the projections here, when appending to the query using QueryBuilder.append() or even later using QueryBuilder.add_projection(). Latter gives you API-details.
  • limit (int) – Limit the number of rows to this number. Check QueryBuilder.limit() for more information.
  • offset (int) – Set an offset for the results returned. Details in QueryBuilder.offset().
  • order_by – How to order the results. As the 2 above, can be set also at later stage, check QueryBuilder.order_by() for more information.
add_filter(tagspec, filter_spec)[source]

Adding a filter to my filters.

Parameters:
  • tagspec – The tag, which has to exist already as a key in self._filters
  • filter_spec – The specifications for the filter, has to be a dictionary

Usage:

qb = QueryBuilder()         # Instantiating the QueryBuilder instance
qb.append(Node, tag='node') # Appending a Node
#let's put some filters:
qb.add_filter('node',{'id':{'>':12}})
# 2 filters together:
qb.add_filter('node',{'label':'foo', 'uuid':{'like':'ab%'}})
# Now I am overriding the first filter I set:
qb.add_filter('node',{'id':13})
add_projection(tag_spec, projection_spec)[source]

Adds a projection

Parameters:
  • tag_spec – A valid specification for a tag
  • projection_spec – The specification for the projection. A projection is a list of dictionaries, with each dictionary containing key-value pairs where the key is database entity (e.g. a column / an attribute) and the value is (optional) additional information on how to process this database entity.

If the given projection_spec is not a list, it will be expanded to a list. If the listitems are not dictionaries, but strings (No additional processing of the projected results desired), they will be expanded to dictionaries.

Usage:

qb = QueryBuilder()
qb.append(StructureData, tag='struc')

# Will project the uuid and the kinds
qb.add_projection('struc', ['uuid', 'attributes.kinds'])

The above example will project the uuid and the kinds-attribute of all matching structures. There are 2 (so far) special keys.

The single star * will project the ORM-instance:

qb = QueryBuilder()
qb.append(StructureData, tag='struc')
# Will project the ORM instance
qb.add_projection('struc', '*')
print type(qb.first()[0])
# >>> aiida.orm.data.structure.StructureData

The double start ** projects all possible projections of this entity:

QueryBuilder().append(StructureData,tag=’s’, project=’**’).limit(1).dict()[0][‘s’].keys()

# >>> u’user_id, description, ctime, label, extras, mtime, id, attributes, dbcomputer_id, nodeversion, type, public, uuid’

Be aware that the result of ** depends on the backend implementation.

all(batch_size=None)[source]

Executes the full query with the order of the rows as returned by the backend. the order inside each row is given by the order of the vertices in the path and the order of the projections for each vertice in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of lists of all projected entities.
append(cls=None, type=None, tag=None, filters=None, project=None, subclassing=True, edge_tag=None, edge_filters=None, edge_project=None, outerjoin=False, **kwargs)[source]

Any iterative procedure to build the path for a graph query needs to invoke this method to append to the path.

Parameters:
  • cls – The Aiida-class (or backend-class) defining the appended vertice
  • type (str) – The type of the class, if cls is not given
  • autotag (bool) – Whether to find automatically a unique tag. If this is set to True (default False),
  • tag (str) – A unique tag. If none is given, I will create a unique tag myself.
  • filters – Filters to apply for this vertice. See add_filter(), the method invoked in the background, or usage examples for details.
  • project – Projections to apply. See usage examples for details. More information also in add_projection().
  • subclassing (bool) – Whether to include subclasses of the given class (default True). E.g. Specifying a Calculation as cls will include JobCalculations, InlineCalculations, etc..
  • outerjoin (bool) – If True, (default is False), will do a left outerjoin instead of an inner join
  • edge_tag (str) – The tag that the edge will get. If nothing is specified (and there is a meaningful edge) the default is tag1–tag2 with tag1 being the entity joining from and tag2 being the entity joining to (this entity).
  • edge_filters (str) – The filters to apply on the edge. Also here, details in add_filter().
  • edge_project (str) – The project from the edges. API-details in add_projection().

A small usage example how this can be invoked:

qb = QueryBuilder()             # Instantiating empty querybuilder instance
qb.append(cls=StructureData)    # First item is StructureData node
# The
# next node in the path is a PwCalculation, with
# the structure joined as an input
qb.append(
    cls=PwCalculation,
    output_of=StructureData
)
Returns:self
count()[source]

Counts the number of rows returned by the backend.

Returns:the number of rows as an integer
dict(batch_size=None)[source]

Executes the full query with the order of the rows as returned by the backend. the order inside each row is given by the order of the vertices in the path and the order of the projections for each vertice in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of dictionaries of all projected entities. Each dictionary consists of key value pairs, where the key is the tag of the vertice and the value a dictionary of key-value pairs where key is the entity description (a column name or attribute path) and the value the value in the DB.

Usage:

qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['type', 'id'],  # returns type (string) and id (string)
    tag='descendant'
)

# Return the dictionaries:
print "qb.iterdict()"
for d in qb.iterdict():
    print '>>>', d

results in the following output:

qb.iterdict()
>>> {'descendant': {
        'type': u'calculation.job.quantumespresso.pw.PwCalculation.',
        'id': 7716}
    }
>>> {'descendant': {
        'type': u'data.remote.RemoteData.',
        'id': 8510}
    }
distinct()[source]

Asks for distinct rows, which is the same as asking the backend to remove duplicates. Does not execute the query!

If you want a distinct query:

qb = QueryBuilder()
# append stuff!
qb.append(...)
qb.append(...)
...
qb.distinct().all() #or
qb.distinct().dict()
Returns:self
first()[source]

Executes query asking for one instance. Use as follows:

qb = QueryBuilder(**queryhelp)
qb.first()
Returns:One row of results as a list
get_alias(tag)[source]

In order to continue a query by the user, this utility function returns the aliased ormclasses.

Parameters:tag – The tag for a vertice in the path
Returns:the alias given for that vertice
get_aliases()[source]
Returns:the list of aliases
get_json_compatible_queryhelp()[source]

Makes the queryhelp a json - compatible dictionary. In this way,the queryhelp can be stored in the database or a json-object, retrieved or shared and used later. See this usage:

qb = QueryBuilder(limit=3).append(StructureData, project='id').order_by({StructureData:'id'})
queryhelp  = qb.get_json_compatible_queryhelp()

# Now I could save this dictionary somewhere and use it later:

qb2=QueryBuilder(**queryhelp)

# This is True if no change has been made to the database.
# Note that such a comparison can only be True if the order of results is enforced
qb.all()==qb2.all()
Returns:the json-compatible queryhelp
get_query()[source]

Instantiates and manipulates a sqlalchemy.orm.Query instance if this is needed. First, I check if the query instance is still valid by hashing the queryhelp. In this way, if a user asks for the same query twice, I am not recreating an instance.

Returns:an instance of sqlalchemy.orm.Query that is specific to the backend used.
get_results_dict()[source]

Deprecated, use dict() instead

get_used_tags(vertices=True, edges=True)[source]

Returns a list of all the vertices that are being used. Some parameter allow to select only subsets. :param bool vertices: Defaults to True. If True, adds the tags of vertices to the returned list :param bool edges: Defaults to True. If True, adds the tags of edges to the returnend list.

Returns:A list of all tags, including (if there is) also the tag give for the edges
iterall(batch_size=100)[source]

Same as all(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of lists
iterdict(batch_size=100)[source]

Same as dict(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of dictionaries
limit(limit)[source]

Set the limit (nr of rows to return)

Parameters:limit (int) – integers of number of rows of rows to return
offset(offset)[source]

Set the offset. If offset is set, that many rows are skipped before returning. offset = 0 is the same as omitting setting the offset. If both offset and limit appear, then offset rows are skipped before starting to count the limit rows that are returned.

Parameters:offset (int) – integers of nr of rows to skip
order_by(order_by)[source]

Set the entity to order by

Parameters:order_by – This is a list of items, where each item is a dictionary specifies what to sort for an entity

In each dictionary in that list, keys represent valid tags of entities (tables), and values are list of columns.

Usage:

#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':['id']})

# or
#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'asc'}}]})

# for descending order:
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'desc'}}]})

# or (shorter)
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':'desc'}]})
set_debug(debug)[source]

Run in debug mode. This does not affect functionality, but prints intermediate stages when creating a query on screen.

Parameters:debug (bool) – Turn debug on or off
set_expand_path(l_expand_path)[source]

Turn this feature on if you want to project the path when querying ancestor-descenendant relationships. This implies the use of the recursive feature, that you have to turn on, as an exception will be raise otherwise (see QueryBuilder.set_with_dbpath() Note that you set the use of recursive feature by setting off the use of the path:

from aiida.orm.querybuilder import QueryBuilder
from aiida.orm.data.structure import StructureData
qb = QueryBuilder()
qb.set_with_dbpath(False) # Setting of the use of DbPath, and enabling recursive queries.
qb.set_expand_path(True)  # enabling the projection on a path
# Now I create a query that search for all descendant of  structure pk=23:
qb.append(StructureData, tag='ancestor', filters={'id':23})
qb.append(Calculation, tag='desc', edge_project='path', descendant_of='ancestor')
# will return the paths:
qb.all()
..note:
There is no way project the path when using the DbPath table, since it is not stored explicitly.
set_with_dbpath(l_with_dbpath)[source]

Sets whether I will use a DbPath table when querying ancestor-dependant relationships. If set to False (default behavior now is True) I will use recursive queries. You can check the source code of _join_ancestors_recursive and _join_descendants_recursive for details. This option allows to run AiiDA without a DbPath table, saving memory, especially for heavy usage. Of course, if left to True, there needs to be a table with the triggers set. Note that this feature behaves the same for the whole query. You cannot query on ancestor-descendant relationship using the DbPath and another using the recursive functionality within the same query.

Parameters:l_with_dbpath (bool) – True to use DbPath, False to use recursive queries.
class aiida.backends.general.querybuilder_interface.QueryBuilderInterface(*args, **kwargs)[source]
AiidaComputer()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Computer

AiidaGroup()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Group

AiidaNode()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Node

AiidaUser()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for User

Computer()[source]

A property, decorated with @property. Returns the implementation for the Computer

Group()[source]

A property, decorated with @property. Returns the implementation for the Group

A property, decorated with @property. Returns the implementation for the DbLink

Node()[source]

Decorated as a property, returns the implementation for DbNode. It needs to return a subclass of sqlalchemy.Base, which means that for different ORM’s a corresponding dummy-model must be written.

User()[source]

A property, decorated with @property. Returns the implementation for the User

count()[source]
Returns:the number of results
first()[source]

Executes query in the backend asking for one instance.

Returns:One row of aiida results
get_aiida_res(key, res)[source]

Some instance returned by ORM (django or SA) need to be converted to Aiida instances (eg nodes)

Parameters:
  • key – the key that this entry would be returned with
  • res – the result returned by the query
Returns:

an aiida-compatible instance

get_filter_expr_from_attributes(operator, value, attr_key, column=None, column_name=None, alias=None)[source]

A classmethod that returns an valid SQLAlchemy expression.

Parameters:
  • operator – The operator provided by the user (‘==’, ‘>’, ...)
  • value – The value to compare with, e.g. (5.0, ‘foo’, [‘a’,’b’])
  • attr_key (str) – The path to that attribute as a tuple of values. I.e. if that attribute I want to filter by is the 2nd element in a list stored under the key ‘mylist’, this is (‘mylist’, ‘2’).
  • column – Optional, an instance of sqlalchemy.orm.attributes.InstrumentedAttribute or
  • column_name (str) – The name of the column, and the backend should get the InstrumentedAttribute.
  • alias – The aliased class.
Returns:

An instance of sqlalchemy.sql.elements.BinaryExpression

get_session()[source]
Returns:a valid session, an instance of sqlalchemy.orm.session.Session
iterall(batch_size=100)[source]
Returns:An iterator over all the results of a list of lists.
iterdict(batch_size=100)[source]
Returns:An iterator over all the results of a list of dictionaries.
modify_expansions(alias, expansions)[source]

Modify names of projections if ** was specified. This is important for the schema having attributes in a different table.

prepare_with_dbpath()[source]

A method to use the DbPath, if this is supported, or throw an exception if not. The overrider must fill add the DbPath-ORM as an attribute to self:

from aiida.backends.implementation.model import DbPath
self.path = DbPath
table_groups_nodes()[source]

A property, decorated with @property. Returns the implementation for the many-to-many relationship between group and nodes.

yield_per(batch_size)[source]
Parameters:batch_size (int) – Number of rows to yield per step

Yields count rows at a time

Returns:a generator