Each AiiDA installation can have multiple profiles, each with its own database and file repository to store the contents of the provenance graph. Profiles allow you to run multiple projects completely independently of one another with a single AiiDA installation. At least one profile is required to run AiiDA. A new profile can be created using verdi quicksetup or verdi setup; the latter works similarly to the former but gives the user more control.
The verdi profile command line interface provides various commands to manage the profiles of an AiiDA installation. To list the currently configured profiles, use verdi profile list:
$ verdi profile list
Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
* project-one
  project-two
In this example, there are two configured profiles, project-one and project-two. The first one is highlighted and marked with a * symbol, meaning it is the default profile. Being the default simply means that any verdi command is executed for that profile unless another one is specified. You can change the profile on a per-call basis with the -p/--profile option. To change the default profile, use verdi profile setdefault PROFILE.
Each profile defines various parameters, such as the location of the file repository on the file system and the connection parameters for the database. To display these parameters, use verdi profile show:
$ verdi profile show
Info: Profile: project-one
----------------------  ------------------------------------------------
aiidadb_backend         django
aiidadb_engine          postgresql_psycopg2
aiidadb_host            localhost
aiidadb_name            aiida_project_one
aiidadb_pass            correcthorsebatterystaple
aiidadb_port            5432
aiidadb_repository_uri  file:///home/user/.virtualenvs/aiida/repository/
aiidadb_user            aiida
default_user_email      user@email.com
options                 {'daemon_default_workers': 3}
profile_uuid            4c272a87d7f543b08da9fe738d88bb13
----------------------  ------------------------------------------------
By default, the parameters of the default profile are shown. To show those of another profile, pass its name, e.g., verdi profile show project-two.
A profile can be deleted using the verdi profile delete command. By default, deleting a profile will also delete its file repository and the database. This behavior can be changed using the --skip-repository and --skip-db options.
Note
In order to delete the database, the system user needs to have the required rights, which is not always guaranteed depending on the system. In such cases, the database deletion may fail and the user will have to perform the deletion manually through PostgreSQL.
The verdi command line interface has many commands and parameters, which can be tab-completed to simplify its use. To enable tab-completion, the following shell command should be executed:
$ eval "$(_VERDI_COMPLETE=source verdi)"
Place this command in your shell or virtual environment activation script to automatically enable tab completion when opening a new shell or activating an environment. This file is shell specific, but likely one of the following:
the startup file of your shell (.bashrc, .zshrc, …), if AiiDA is installed system-wide
the activation script of your virtual environment
a startup file for your conda environment
Important
After you have added the line to the startup script, make sure to restart the terminal or source the script for the changes to take effect.
AiiDA provides various configuration options for profiles, which can be controlled with the verdi config command. To set an option, pass its name and the value to set: verdi config OPTION_NAME OPTION_VALUE. The available options are tab-completed, so simply type verdi config and hit <TAB> twice to list them.
For example, if you want to change the default number of workers that are created when you start the daemon, you can run:
$ verdi config daemon.default_workers 5
Success: daemon.default_workers set to 5 for profile-one
You can check the currently defined value of any option by simply calling the command without specifying a value, for example:
$ verdi config daemon.default_workers
5
If no value is displayed, no value has ever been explicitly set for this option and the default will be used. By default, any option set through verdi config is applied to the current default profile. To target a different profile, use the -p/--profile option of verdi.
To undo the configuration of a particular option and reset it so the default value is used, you can use the --unset option:
$ verdi config daemon.default_workers --unset
Success: daemon.default_workers unset for profile-one
If you want to set a particular option that should be applied to all profiles, you can use the --global flag:
$ verdi config daemon.default_workers 5 --global
Success: daemon.default_workers set to 5 globally
and just as on a per-profile basis, this can be undone with the --unset flag:
$ verdi config daemon.default_workers --unset --global
Success: daemon.default_workers unset globally
Changes that affect the daemon (e.g. logging.aiida_loglevel) will only take effect after restarting the daemon.
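The interplay between per-profile values, global values and defaults can be illustrated with a small lookup sketch. It assumes that a value set for the profile takes precedence over a value set with --global, which in turn takes precedence over the built-in default; the text above does not spell out this order, so treat it as an assumption. The function name effective_option and the plain dictionaries are hypothetical and are not part of the AiiDA API.

```python
from typing import Any, Mapping


def effective_option(
    name: str,
    profile_options: Mapping[str, Any],
    global_options: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Any:
    """Return the value that would apply for option `name`.

    Assumed lookup order (an assumption, not taken from the AiiDA source):
    profile-specific value, then global value, then built-in default.
    """
    if name in profile_options:
        return profile_options[name]
    if name in global_options:
        return global_options[name]
    return defaults.get(name)


# Hypothetical state after `verdi config daemon.default_workers 5 --global`
# followed by `verdi config daemon.default_workers 3` for one profile.
DEFAULTS = {"daemon.default_workers": 1}
GLOBAL_OPTIONS = {"daemon.default_workers": 5}
PROFILE_OPTIONS = {"daemon.default_workers": 3}

if __name__ == "__main__":
    print(effective_option("daemon.default_workers", PROFILE_OPTIONS, GLOBAL_OPTIONS, DEFAULTS))
```

Under this assumption, unsetting the profile value would make the global value apply again, and unsetting both would fall back to the default.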
An AiiDA instance is defined as the installed source code plus the configuration folder that stores the configuration files with all the configured profiles. It is possible to run multiple AiiDA instances on a single machine, simply by isolating the code and configuration in a virtual environment.
To isolate the code, make sure to install AiiDA into a virtual environment, e.g., with conda or venv, as described here. Whenever you activate this particular environment, you will be running the particular version of AiiDA (and all the plugins) that you installed specifically for it.
This is separate from the configuration of AiiDA, which is stored in the configuration directory. This directory is always named .aiida and is stored in the home directory by default, so its default path is ~/.aiida. By default, each AiiDA instance (each installation) will store its associated profiles in this folder. A best practice is to keep the profiles of each instance together with the code to which they belong, separate from those of other instances. The typical approach is to place the configuration folder in the virtual environment itself and have it automatically selected whenever the environment is activated.
The location of the AiiDA configuration folder can be controlled with the AIIDA_PATH environment variable. This allows us to change the configuration folder automatically by setting the variable in the activation script of a virtual environment. For example, if the path of your virtual environment is /home/user/.virtualenvs/aiida, add the following line:
$ export AIIDA_PATH='/home/user/.virtualenvs/aiida'
Make sure to reactivate the virtual environment, if it was already active, for the changes to take effect.
For conda, create a directory structure etc/conda/activate.d in the root folder of your conda environment (e.g. /home/user/miniconda/envs/aiida), and place a file aiida-init.sh in that folder which exports the AIIDA_PATH.
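The conda step above can be sketched as a small helper that creates the etc/conda/activate.d directory and writes the aiida-init.sh file. This is only an illustration under stated assumptions: the function name write_conda_activation_hook is hypothetical, and the script content is simply the export line shown earlier, with the path shell-quoted.

```python
import shlex
from pathlib import Path


def write_conda_activation_hook(env_root, aiida_path, script_name="aiida-init.sh"):
    """Create etc/conda/activate.d/<script_name> inside a conda environment.

    `env_root` is the root folder of the conda environment and `aiida_path`
    is the value to export as AIIDA_PATH. Returns the path of the script.
    """
    hook_dir = Path(env_root) / "etc" / "conda" / "activate.d"
    hook_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    script = hook_dir / script_name
    # shlex.quote protects paths that contain spaces or other special characters
    script.write_text(f"export AIIDA_PATH={shlex.quote(str(aiida_path))}\n")
    return script
```

For the example environment above, this would be called as write_conda_activation_hook("/home/user/miniconda/envs/aiida", "/home/user/miniconda/envs/aiida"). As with a virtual environment, the conda environment has to be reactivated for the change to take effect.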
You can test that everything works by first echoing the environment variable with echo $AIIDA_PATH to confirm it prints the correct path. Then check that AiiDA also recognizes the new location of the configuration folder by calling verdi profile list. This should display the current location of the configuration directory:
Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
Critical: configuration file /home/user/.virtualenvs/aiida/.aiida/config.json does not exist
You will only see the second line if you haven’t yet set up a profile for this AiiDA instance. For information on setting up a profile, refer to creating profiles.
Besides a single path, the value of AIIDA_PATH can also be a colon-separated list of paths. AiiDA will go through each of the paths and check whether they contain a configuration directory, i.e., a folder with the name .aiida. The first configuration directory that is encountered will be used as the configuration directory. If no configuration directory is found, one will be created in the last path that was considered. For example, the directory structure in your home folder ~/ might look like this:
.
├── .aiida
└── project_a
    ├── .aiida
    └── subfolder
If you leave the AIIDA_PATH variable unset, the default location ~/.aiida will be used. However, if you set:
$ export AIIDA_PATH='~/project_a:'
the configuration directory ~/project_a/.aiida will be used.
Warning
If there was no .aiida directory in ~/project_a, AiiDA would have created it for you, so make sure to set the AIIDA_PATH correctly.
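The lookup rules for AIIDA_PATH described above can be sketched as follows. This is a minimal illustration of the documented behavior, not AiiDA's actual implementation: the function name resolve_config_dir and its home parameter are hypothetical, and the sketch only reports which directory would be selected without creating anything.

```python
from pathlib import Path
from typing import Optional

CONFIG_DIR_NAME = ".aiida"


def resolve_config_dir(aiida_path: Optional[str], home: Path) -> Path:
    """Return the configuration directory selected for a given AIIDA_PATH value.

    Rules taken from the description above:
    - unset or empty variable: use <home>/.aiida
    - otherwise walk the colon-separated paths and use the first one that
      contains a .aiida folder
    - if none contains one, fall back to .aiida in the last path considered
    """
    if not aiida_path:
        return home / CONFIG_DIR_NAME

    candidate = None
    for entry in aiida_path.split(":"):
        if not entry:
            continue  # ignore empty entries such as the one after a trailing colon
        if entry == "~" or entry.startswith("~/"):
            base = home / entry[2:]  # expand a leading ~ relative to `home`
        else:
            base = Path(entry)
        candidate = base / CONFIG_DIR_NAME
        if candidate.is_dir():
            return candidate

    # No existing configuration directory was found
    return candidate if candidate is not None else home / CONFIG_DIR_NAME
```

With the directory structure shown above, resolve_config_dir("~/project_a:", Path.home()) selects ~/project_a/.aiida, while a value that lists only paths without a .aiida folder ends up pointing at the last of them, which is the situation the warning describes.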
The daemon can be set up as a system service, such that it automatically starts at system startup. How to do this is operating system specific. For Ubuntu, here is a template for the service file and ansible instructions to install the service.
AiiDA supports running hundreds of thousands of calculations and graphs with millions of nodes. However, optimal performance at that scale might require some tweaks to the AiiDA configuration to balance the CPU and disk load. Here are a few general tips that might improve the AiiDA performance:
Prevent your operating system from indexing the file repository.
Many Linux distributions include the locate command to quickly find files and folders, and run a daily cron job updatedb.mlocate to create the corresponding index. A large file repository can take a long time to index, up to the point where the hard drive is constantly indexing.
In order to exclude the repository folder from indexing, add its path to the PRUNEPATHS variable in the /etc/updatedb.conf configuration file (use sudo).

Optimize the number of daemon workers.
The verdi daemon can manage an arbitrary number of parallel workers; by default only one is activated. If verdi daemon status shows the daemon worker(s) constantly at high CPU usage, use verdi daemon incr X to add X daemon workers. It is recommended that the number of workers does not exceed the number of CPU cores. Ideally, use one or two cores fewer than the machine has, to avoid degrading the performance of the PostgreSQL database.

Move the PostgreSQL database to a fast disk (SSD), ideally on a large partition.
1. Stop the AiiDA daemon and back up your database.
2. Find the data directory of your postgres installation (something like /var/lib/postgresql/9.6/main, /scratch/postgres/9.6/main, …).
   The best way is to become the postgres UNIX user and enter the postgres shell:
   psql
   SHOW data_directory;
   \q
   If you are unable to enter the postgres shell, try looking for the data_directory variable in a file /etc/postgresql/9.6/main/postgresql.conf or similar.
3. Stop the postgres database service:
   service postgresql stop
4. Copy all files and folders from the postgres data_directory to the new location:
   cp -a SOURCE_DIRECTORY DESTINATION_DIRECTORY
   Note
   Flag -a will create a directory within DESTINATION_DIRECTORY, e.g.:
   cp -a OLD_DIR/main/ NEW_DIR/
   creates NEW_DIR/main. It will also keep the file permissions (necessary).
   The file permissions of the new and old directory need to be identical (including subdirectories). In particular, the owner and group should both be postgres (except for symbolic links in server.crt and server.key that may or may not be present).
   Note
   If the permissions of these links need to be changed, use the -h option of chown to avoid changing the permissions of the destination of the links. In case you have changed the permissions of the link destinations by mistake, they should typically be (beware that this might depend on your actual distribution!):
   -rw-r--r-- 1 root root      989 Mar  1  2012 /etc/ssl/certs/ssl-cert-snakeoil.pem
   -rw-r----- 1 root ssl-cert 1704 Mar  1  2012 /etc/ssl/private/ssl-cert-snakeoil.key
5. Point the data_directory variable in your postgres configuration file (e.g. /etc/postgresql/9.6/main/postgresql.conf) to the new directory.
6. Restart the database daemon:
   service postgresql start
7. Finally, check that the data directory has indeed changed:
   psql
   SHOW data_directory;
   \q
   and try a simple AiiDA query with the new database. If everything went fine, you can delete the old database location.
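Since the copy of the PostgreSQL data directory only works if permissions, owner and group are identical in the old and new location, it can help to compare the two trees before restarting the database. The following is a minimal sketch of such a check; the function name permission_mismatches is hypothetical, and the script would typically have to run as a user that is allowed to read both directories (e.g. postgres or root).

```python
import os
import stat
from pathlib import Path
from typing import List, Tuple


def permission_mismatches(old_root, new_root) -> List[Tuple[str, str]]:
    """Compare mode, owner and group of all entries below two directory trees.

    Returns a list of (relative_path, reason) tuples; an empty list means that
    every entry of `old_root` exists in `new_root` with the same permission
    bits, owner and group. Symbolic links are inspected themselves (lstat),
    not the files they point to.
    """
    old_root, new_root = Path(old_root), Path(new_root)

    # Collect the root itself plus every directory and file below it
    entries = [old_root]
    for dirpath, dirnames, filenames in os.walk(old_root):
        entries.extend(Path(dirpath) / name for name in dirnames + filenames)

    mismatches = []
    for old_path in entries:
        rel = old_path.relative_to(old_root)
        new_path = new_root / rel
        if not os.path.lexists(new_path):
            mismatches.append((str(rel), "missing in new location"))
            continue
        old_stat = os.lstat(old_path)
        new_stat = os.lstat(new_path)
        if stat.S_IMODE(old_stat.st_mode) != stat.S_IMODE(new_stat.st_mode):
            mismatches.append((str(rel), "mode differs"))
        if (old_stat.st_uid, old_stat.st_gid) != (new_stat.st_uid, new_stat.st_gid):
            mismatches.append((str(rel), "owner or group differs"))
    return mismatches
```

For the example above, one would call permission_mismatches("OLD_DIR/main", "NEW_DIR/main") and expect an empty list. The symbolic links server.crt and server.key, if present, are the documented exception and may legitimately show up in the result.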
Whenever updating your AiiDA installation, make sure you follow these instructions very carefully, even when merely upgrading the patch version! Failing to do so may leave your installation in a broken state, or worse, may even damage your data, potentially irreparably.
1. Activate the Python environment where AiiDA is installed.
2. Finish all running processes. All finished processes will be automatically migrated, but it is not possible to resume unfinished processes.
3. Stop the daemon using verdi daemon stop.
4. Create a backup of your database and repository.
   Warning
   Once you have migrated your database, you can no longer go back to an older version of aiida-core (unless you restore your database and repository from a backup).
5. Update your aiida-core installation.
   - If you have installed AiiDA through conda, simply run: conda update aiida-core.
   - If you have installed AiiDA through pip, simply run: pip install --upgrade aiida-core.
   - If you have installed from the git repository using pip install -e ., first delete all the .pyc files (find . -name "*.pyc" -delete) before updating your branch with git pull.
6. Run reentry scan to update the cache of registered entry points.
7. Migrate your database with verdi -p <profile_name> database migrate. Depending on the size of your database and the number of migrations to perform, data migration can take time, so please be patient.
After the database migration finishes, you will be able to continue working with your existing data.
If the update involved a change in the major version number of aiida-core, expect backwards incompatible changes and check whether you also need to update installed plugin packages.
Additional instructions on how to migrate from 0.12.x versions.
Additional instructions on how to migrate from versions 0.4 – 0.11.
For a list of breaking changes between the 0.x and the 1.x series of AiiDA, see here.
A full backup of an AiiDA instance and AiiDA managed data requires a backup of:
the profile configuration in the config.json file located in the .aiida folder. Typically located at ~/.aiida (see also Setup).
files associated with nodes in the repository folder (one per profile). Typically located in the .aiida folder.
queryable metadata in the PostgreSQL database (one per profile).
For small repositories (with fewer than ~100k files), simply back up the .aiida folder using standard backup software. For example, the rsync utility supports incremental backups, and a backup command might look like rsync -avHzPx (verbose) or rsync -aHzx.
For large repositories with millions of files, even incremental backups can take a significant amount of time. AiiDA provides a helper script that takes advantage of the AiiDA database in order to figure out which files have been added since your last backup. The instructions below explain how to use it:
1. Configure your backup using verdi -p PROFILENAME devel configure-backup, where PROFILENAME is the name of the AiiDA profile that should be backed up. This will ask for information on:
   - The “backup folder”, where the backup configuration file will be placed. This defaults to a folder named backup_PROFILENAME in your .aiida directory.
   - The “destination folder”, where the files of the backup will be stored. This defaults to a subfolder of the backup folder, but we strongly suggest backing up to a different drive (see note below).
   The configuration step creates two files in the “backup folder”: a backup_info.json configuration file (which can also be edited manually) and a start_backup.py script.

   Notes on using an SSH mount for the backups (on Linux)
   Using the same disk for your backup forgoes protection against the most common cause of data loss: disk failure. One simple option is to use a destination folder mounted over ssh.
   On Ubuntu, install sshfs using sudo apt-get install sshfs. Imagine you run your calculations on server_1 and would like to back up regularly to server_2. Mount a server_2 directory on server_1 using the following command on server_1:
   sshfs -o idmap=user -o rw backup_user@server_2:/home/backup_user/backup_destination_dir/ /home/aiida_user/remote_backup_dir/
   Use gnome-session-properties in the terminal to add this line to the actions performed at start-up. Do not add it to your shell’s startup file (e.g. .bashrc) or your computer will complain that the mount point is not empty whenever you open a new terminal.

2. Run the start_backup.py script in the “backup folder” to start the backup.
   This will back up all data added after the oldest_object_backedup date. It will only carry out a new backup every periodicity days, until a certain end date if specified (using end_date_of_backup or days_to_backup); see this reference page for a detailed description of all options.
   Once you’ve checked that it works, make sure to run the script periodically (e.g. using a daily cron job).

   Setting up a cron job on Linux
   This is a quick note on how to set up a cron job on Linux (you can find many more resources online).
   On Ubuntu, you can set up a cron job using:
   sudo crontab -u USERNAME -e
   It will open an editor, where you can add a line of the form:
   00 03 * * * /home/USERNAME/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" USER_EMAIL@domain.net
   or (if you need to back up a different profile than the default one):
   00 03 * * * verdi -p PROFILENAME run /home/USERNAME/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" USER_EMAIL@domain.net
   This will launch the backup every day at 3 AM (03:00), and send the output (or any error message) to the email address specified at the end (provided the mail command, from mailutils, is configured appropriately).
You might want to exclude the file repository from any separately set up automatic backups of your home directory.
PostgreSQL typically spreads database information over multiple files that, if backed up directly, are not guaranteed to restore the database. We therefore strongly recommend periodically dumping the database contents to a file (which you can then back up using your method of choice).
A few useful pointers:
In order to avoid having to enter your database password each time you use the script, you can create a file .pgpass in your home directory containing your database credentials, as described in the PostgreSQL documentation.
In order to dump your database, use the pg_dump utility from PostgreSQL. As a starting point, you can use a bash script similar to this file.
You can set up the backup script to run daily using cron (see notes in the previous section).
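As an illustration of the pointers above, the following sketch assembles a pg_dump command line that writes a timestamped plain-text dump. It is not the script referenced above: the function name build_pg_dump_command and the file naming scheme are hypothetical, and the database name, user and host are assumptions that should be replaced by the values reported by verdi profile show. The -w flag tells pg_dump never to prompt for a password, so the credentials are expected to come from the .pgpass file.

```python
import datetime
from pathlib import Path
from typing import List, Optional


def build_pg_dump_command(
    db_name: str,
    db_user: str,
    backup_dir,
    host: str = "localhost",
    port: int = 5432,
    now: Optional[datetime.datetime] = None,
) -> List[str]:
    """Return the argument list for a plain-text pg_dump of `db_name`.

    The dump is written to <backup_dir>/<db_name>-<timestamp>.psql
    (hypothetical naming scheme, chosen for this example).
    """
    now = now or datetime.datetime.now()
    target = Path(backup_dir) / f"{db_name}-{now:%Y%m%d-%H%M%S}.psql"
    return [
        "pg_dump",
        "-h", host,          # database host
        "-p", str(port),     # database port
        "-U", db_user,       # database user
        "-w",                # never prompt for a password (use .pgpass)
        "-f", str(target),   # output file
        db_name,
    ]
```

The resulting list can be passed to subprocess.run(command, check=True) from a script that cron runs daily. A dump created this way is a plain SQL file of the kind that psql -f can load when restoring.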
In order to restore a backup, you will need to:
1. Restore the repository folder that you backed up earlier to the same location as it used to be (you can check the location in the config.json file inside your .aiida folder, or simply using verdi profile show).
2. Create an empty database following the instructions described in database, skipping the verdi setup phase. The database should have the same name and database username as the original one (i.e. if you are restoring on the original PostgreSQL cluster, you may have to either rename or delete the original database).
3. Change directory to the folder containing the database dump created with pg_dump, and load it using the psql command.

   Example commands on Linux Ubuntu
   This is an example command, assuming that your dump is named aiidadb-backup.psql:
   psql -h localhost -U aiida -d aiidadb -f aiidadb-backup.psql
   After supplying your database password, the database should be restored. Note that, if you installed the database on Ubuntu as a system service, you need to type sudo su - postgres to become the postgres UNIX user.
Setups with multiple users for a single AiiDA instance are currently not supported. Instead, each AiiDA user should install AiiDA in a Unix/Windows account on their own computer. Under this account it is possible to configure all the credentials necessary to connect to remote computers. Using independent accounts will ensure that, for instance, the SSH credentials to connect to supercomputers are not shared with others.
Data can be shared between instances using AiiDA’s export and import functionality. Sharing (subsets of) the AiiDA graph can be done as often as needed.