QueryBuilder documentation¶
The QueryBuilder: A class that allows you to query the AiiDA database, independent from backend.
Note that the backend implementation is enforced and handled with a composition model!
QueryBuilder()
is the frontend class that the user can use. It inherits from object and contains
backend-specific functionality. Backend specific functionality is provided by the implementation classes.
These inherit from aiida.backends.general.querybuilder_interface.QueryBuilderInterface()
,
an interface classes which enforces the implementation of its defined methods.
An instance of one of the implementation classes becomes a member of the QueryBuilder()
instance
when instantiated by the user.
-
class
aiida.orm.querybuilder.
QueryBuilder
(*args, **kwargs)[source]¶ QueryBuilder: The class to query the AiiDA database. Usage:
from aiida.orm.querybuilder import QueryBuilder qb = QueryBuilder() # Querying nodes: qb.append(Node) # retrieving the results: results = qb.all()
-
__init__
(*args, **kwargs)[source]¶ Instantiates a QueryBuilder instance.
Which backend is used decided here based on backend-settings (taken from the user profile). This cannot be overriden so far by the user.
Parameters: - debug (bool) – Turn on debug mode. This feature prints information on the screen about the stages of the QueryBuilder. Does not affect results.
- path (list) – A list of the vertices to traverse. Leave empty if you plan on using the method
QueryBuilder.append()
. - filters – The filters to apply. You can specify the filters here, when appending to the query
using
QueryBuilder.append()
or even later usingQueryBuilder.add_filter()
. Check latter gives API-details. - project – The projections to apply. You can specify the projections here, when appending to the query
using
QueryBuilder.append()
or even later usingQueryBuilder.add_projection()
. Latter gives you API-details. - limit (int) – Limit the number of rows to this number. Check
QueryBuilder.limit()
for more information. - offset (int) – Set an offset for the results returned. Details in
QueryBuilder.offset()
. - order_by – How to order the results. As the 2 above, can be set also at later stage,
check
QueryBuilder.order_by()
for more information.
-
add_filter
(tagspec, filter_spec)[source]¶ Adding a filter to my filters.
Parameters: - tagspec – The tag, which has to exist already as a key in self._filters
- filter_spec – The specifications for the filter, has to be a dictionary
Usage:
qb = QueryBuilder() # Instantiating the QueryBuilder instance qb.append(Node, tag='node') # Appending a Node #let's put some filters: qb.add_filter('node',{'id':{'>':12}}) # 2 filters together: qb.add_filter('node',{'label':'foo', 'uuid':{'like':'ab%'}}) # Now I am overriding the first filter I set: qb.add_filter('node',{'id':13})
-
add_projection
(tag_spec, projection_spec)[source]¶ Adds a projection
Parameters: - tag_spec – A valid specification for a tag
- projection_spec – The specification for the projection. A projection is a list of dictionaries, with each dictionary containing key-value pairs where the key is database entity (e.g. a column / an attribute) and the value is (optional) additional information on how to process this database entity.
If the given projection_spec is not a list, it will be expanded to a list. If the listitems are not dictionaries, but strings (No additional processing of the projected results desired), they will be expanded to dictionaries.
Usage:
qb = QueryBuilder() qb.append(StructureData, tag='struc') # Will project the uuid and the kinds qb.add_projection('struc', ['uuid', 'attributes.kinds'])
The above example will project the uuid and the kinds-attribute of all matching structures. There are 2 (so far) special keys.
The single star * will project the ORM-instance:
qb = QueryBuilder() qb.append(StructureData, tag='struc') # Will project the ORM instance qb.add_projection('struc', '*') print type(qb.first()[0]) # >>> aiida.orm.data.structure.StructureData
The double start ** projects all possible projections of this entity:
QueryBuilder().append(StructureData,tag=’s’, project=’**’).limit(1).dict()[0][‘s’].keys()
# >>> u’user_id, description, ctime, label, extras, mtime, id, attributes, dbcomputer_id, nodeversion, type, public, uuid’
Be aware that the result of ** depends on the backend implementation.
-
all
(batch_size=None)[source]¶ Executes the full query with the order of the rows as returned by the backend. the order inside each row is given by the order of the vertices in the path and the order of the projections for each vertice in the path.
Parameters: batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing! Returns: a list of lists of all projected entities.
-
append
(cls=None, type=None, tag=None, filters=None, project=None, subclassing=True, edge_tag=None, edge_filters=None, edge_project=None, outerjoin=False, **kwargs)[source]¶ Any iterative procedure to build the path for a graph query needs to invoke this method to append to the path.
Parameters: - cls – The Aiida-class (or backend-class) defining the appended vertice
- type (str) – The type of the class, if cls is not given
- autotag (bool) – Whether to find automatically a unique tag. If this is set to True (default False),
- tag (str) – A unique tag. If none is given, I will create a unique tag myself.
- filters – Filters to apply for this vertice.
See
add_filter()
, the method invoked in the background, or usage examples for details. - project – Projections to apply. See usage examples for details.
More information also in
add_projection()
. - subclassing (bool) – Whether to include subclasses of the given class (default True). E.g. Specifying a Calculation as cls will include JobCalculations, InlineCalculations, etc..
- outerjoin (bool) – If True, (default is False), will do a left outerjoin instead of an inner join
- edge_tag (str) – The tag that the edge will get. If nothing is specified (and there is a meaningful edge) the default is tag1–tag2 with tag1 being the entity joining from and tag2 being the entity joining to (this entity).
- edge_filters (str) – The filters to apply on the edge. Also here, details in
add_filter()
. - edge_project (str) – The project from the edges. API-details in
add_projection()
.
A small usage example how this can be invoked:
qb = QueryBuilder() # Instantiating empty querybuilder instance qb.append(cls=StructureData) # First item is StructureData node # The # next node in the path is a PwCalculation, with # the structure joined as an input qb.append( cls=PwCalculation, output_of=StructureData )
Returns: self
-
count
()[source]¶ Counts the number of rows returned by the backend.
Returns: the number of rows as an integer
-
dict
(batch_size=None)[source]¶ Executes the full query with the order of the rows as returned by the backend. the order inside each row is given by the order of the vertices in the path and the order of the projections for each vertice in the path.
Parameters: batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing! Returns: a list of dictionaries of all projected entities. Each dictionary consists of key value pairs, where the key is the tag of the vertice and the value a dictionary of key-value pairs where key is the entity description (a column name or attribute path) and the value the value in the DB. Usage:
qb = QueryBuilder() qb.append( StructureData, tag='structure', filters={'uuid':{'==':myuuid}}, ) qb.append( Node, descendant_of='structure', project=['type', 'id'], # returns type (string) and id (string) tag='descendant' ) # Return the dictionaries: print "qb.iterdict()" for d in qb.iterdict(): print '>>>', d
results in the following output:
qb.iterdict() >>> {'descendant': { 'type': u'calculation.job.quantumespresso.pw.PwCalculation.', 'id': 7716} } >>> {'descendant': { 'type': u'data.remote.RemoteData.', 'id': 8510} }
-
distinct
()[source]¶ Asks for distinct rows, which is the same as asking the backend to remove duplicates. Does not execute the query!
If you want a distinct query:
qb = QueryBuilder() # append stuff! qb.append(...) qb.append(...) ... qb.distinct().all() #or qb.distinct().dict()
Returns: self
-
first
()[source]¶ Executes query asking for one instance. Use as follows:
qb = QueryBuilder(**queryhelp) qb.first()
Returns: One row of results as a list
-
get_alias
(tag)[source]¶ In order to continue a query by the user, this utility function returns the aliased ormclasses.
Parameters: tag – The tag for a vertice in the path Returns: the alias given for that vertice
-
get_json_compatible_queryhelp
()[source]¶ Makes the queryhelp a json - compatible dictionary. In this way,the queryhelp can be stored in the database or a json-object, retrieved or shared and used later. See this usage:
qb = QueryBuilder(limit=3).append(StructureData, project='id').order_by({StructureData:'id'}) queryhelp = qb.get_json_compatible_queryhelp() # Now I could save this dictionary somewhere and use it later: qb2=QueryBuilder(**queryhelp) # This is True if no change has been made to the database. # Note that such a comparison can only be True if the order of results is enforced qb.all()==qb2.all()
Returns: the json-compatible queryhelp
-
get_query
()[source]¶ Instantiates and manipulates a sqlalchemy.orm.Query instance if this is needed. First, I check if the query instance is still valid by hashing the queryhelp. In this way, if a user asks for the same query twice, I am not recreating an instance.
Returns: an instance of sqlalchemy.orm.Query that is specific to the backend used.
Returns a list of all the vertices that are being used. Some parameter allow to select only subsets. :param bool vertices: Defaults to True. If True, adds the tags of vertices to the returned list :param bool edges: Defaults to True. If True, adds the tags of edges to the returnend list.
Returns: A list of all tags, including (if there is) also the tag give for the edges
-
iterall
(batch_size=100)[source]¶ Same as
all()
, but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_perParameters: batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Returns: a generator of lists
-
iterdict
(batch_size=100)[source]¶ Same as
dict()
, but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_perParameters: batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Returns: a generator of dictionaries
-
limit
(limit)[source]¶ Set the limit (nr of rows to return)
Parameters: limit (int) – integers of number of rows of rows to return
-
offset
(offset)[source]¶ Set the offset. If offset is set, that many rows are skipped before returning. offset = 0 is the same as omitting setting the offset. If both offset and limit appear, then offset rows are skipped before starting to count the limit rows that are returned.
Parameters: offset (int) – integers of nr of rows to skip
-
one
()[source]¶ Executes the query asking for exactly one results. Will raise an exception if this is not the case :raises: MultipleObjectsError if more then one row can be returned :raises: NotExistent if no result was found
-
order_by
(order_by)[source]¶ Set the entity to order by
Parameters: order_by – This is a list of items, where each item is a dictionary specifies what to sort for an entity In each dictionary in that list, keys represent valid tags of entities (tables), and values are list of columns.
Usage:
#Sorting by id (ascending): qb = QueryBuilder() qb.append(Node, tag='node') qb.order_by({'node':['id']}) # or #Sorting by id (ascending): qb = QueryBuilder() qb.append(Node, tag='node') qb.order_by({'node':[{'id':{'order':'asc'}}]}) # for descending order: qb = QueryBuilder() qb.append(Node, tag='node') qb.order_by({'node':[{'id':{'order':'desc'}}]}) # or (shorter) qb = QueryBuilder() qb.append(Node, tag='node') qb.order_by({'node':[{'id':'desc'}]})
-
-
class
aiida.backends.general.querybuilder_interface.
QueryBuilderInterface
(*args, **kwargs)[source]¶ -
AiidaComputer
()[source]¶ A property, decorated with @property. Returns the implementation for the AiiDA-class for Computer
-
AiidaGroup
()[source]¶ A property, decorated with @property. Returns the implementation for the AiiDA-class for Group
-
AiidaNode
()[source]¶ A property, decorated with @property. Returns the implementation for the AiiDA-class for Node
-
AiidaUser
()[source]¶ A property, decorated with @property. Returns the implementation for the AiiDA-class for User
-
Computer
()[source]¶ A property, decorated with @property. Returns the implementation for the Computer
-
Node
()[source]¶ Decorated as a property, returns the implementation for DbNode. It needs to return a subclass of sqlalchemy.Base, which means that for different ORM’s a corresponding dummy-model must be written.
-
first
()[source]¶ Executes query in the backend asking for one instance.
Returns: One row of aiida results
-
get_aiida_res
(key, res)[source]¶ Some instance returned by ORM (django or SA) need to be converted to Aiida instances (eg nodes)
Parameters: - key – the key that this entry would be returned with
- res – the result returned by the query
Returns: an aiida-compatible instance
-
get_filter_expr_from_attributes
(operator, value, attr_key, column=None, column_name=None, alias=None)[source]¶ A classmethod that returns an valid SQLAlchemy expression.
Parameters: - operator – The operator provided by the user (‘==’, ‘>’, …)
- value – The value to compare with, e.g. (5.0, ‘foo’, [‘a’,’b’])
- attr_key (str) – The path to that attribute as a tuple of values. I.e. if that attribute I want to filter by is the 2nd element in a list stored under the key ‘mylist’, this is (‘mylist’, ‘2’).
- column – Optional, an instance of sqlalchemy.orm.attributes.InstrumentedAttribute or
- column_name (str) – The name of the column, and the backend should get the InstrumentedAttribute.
- alias – The aliased class.
Returns: An instance of sqlalchemy.sql.elements.BinaryExpression
-
iterdict
(batch_size=100)[source]¶ Returns: An iterator over all the results of a list of dictionaries.
-
modify_expansions
(alias, expansions)[source]¶ Modify names of projections if ** was specified. This is important for the schema having attributes in a different table.
-