Occasionally the Python venv can get into an inconsistent state, in which case the easiest solution is to delete and recreate it. Symptoms of a broken venv can include errors during provisioning like:
To create a new virtual environment (assuming tpaexec was installed into the default location):
Strange AWS errors regarding credentials
If the time and date on the TPA server are not correct, you can get AWS errors similar to this during provisioning:
Solution: set the time and date correctly.
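As a quick check before changing anything, the sketch below prints the system time in UTC so it can be compared against a trusted clock. The variable name now_utc is illustrative, and the timedatectl step only applies to hosts that use systemd, so it is skipped when that tool is absent.

```shell
# Print the current system time in UTC for comparison with a trusted clock.
now_utc="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo "System time (UTC): $now_utc"

# On systemd hosts, timedatectl reports whether the clock is synchronized.
# The "|| true" keeps the script going where timedatectl cannot reach systemd.
if command -v timedatectl >/dev/null 2>&1; then
  timedatectl status || true
fi
```

If the reported time is off, correct it with whatever time service the host normally uses before re-running the provision step.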
Logging
By default, all tpaexec logging is saved in the logfile <clusterdir>/ansible.log.
To change the logfile location, set the environment variable ANSIBLE_LOG_PATH to the desired location, for example:
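A minimal sketch of setting the variable for the current shell session follows. The log directory is a hypothetical location chosen for illustration; any writable path should work.

```shell
# Hypothetical log directory; substitute any writable location.
log_dir="${TMPDIR:-/tmp}/tpa-logs"
mkdir -p "$log_dir"

# Subsequent tpaexec commands started from this shell inherit the setting.
export ANSIBLE_LOG_PATH="$log_dir/ansible.log"
echo "Logging to: $ANSIBLE_LOG_PATH"
```

The setting lasts only for the current shell session unless it is added to a shell profile.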
To increase the verbosity of logging, add -v, -vv, -vvv, -vvvv, or -vvvvv to the tpaexec command line:
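As an illustration, the sketch below adds -vvv to a deploy run. The cluster directory ./mycluster and the CLUSTER_DIR variable are hypothetical, and the guard simply prints the command on machines where tpaexec is not on the PATH.

```shell
# Hypothetical cluster directory; override by setting CLUSTER_DIR.
cluster_dir="${CLUSTER_DIR:-./mycluster}"
# Any of -v through -vvvvv; more v's means more detail in the log.
verbosity="-vvv"

if command -v tpaexec >/dev/null 2>&1; then
  tpaexec deploy "$cluster_dir" "$verbosity"
else
  echo "tpaexec not found on PATH; would run: tpaexec deploy $cluster_dir $verbosity"
fi
```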
Cluster test
An easy way to smoke-test an existing cluster is to run:
This will do a functional test of the cluster components, followed by a performance test of the cluster, using pgbench. As pgbench can take a while to complete, benchmarking can be omitted by running:
The tags in the test role are repmgr, postgres, barman, and pgbench.
Note that when specifying multiple tags, they should be comma-delimited, with
no spaces; for example:
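One way to avoid stray spaces is to build the tag list programmatically. The sketch below uses a hypothetical helper, join_tags, and prints the resulting command rather than running it. It assumes the tag list is passed with the standard Ansible --skip-tags option, and <clusterdir> stands in for the real cluster directory.

```shell
# Join all arguments with commas and no spaces.
# The function body runs in a subshell, so the IFS change does not leak out.
join_tags() (
  IFS=,
  echo "$*"
)

skip_tags="$(join_tags barman pgbench)"

# Print the command for review; <clusterdir> is a placeholder.
echo "tpaexec test <clusterdir> --skip-tags $skip_tags"
```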
TPA server test
To check the installation of the TPA server itself, run:
Skipping or including specific tags
When re-running a tpaexec provision or deploy after a failure, it can sometimes
save time to skip tasks by excluding specific tags.
For example, to skip the repmgr tasks:
To jump straight to a particular set of tasks, specify a tag. For example,
to immediately run the BDR tasks:
Note that this assumes that the previous tasks all completed successfully.
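The two cases above can be sketched as follows, assuming tpaexec accepts the standard Ansible --skip-tags and --tags options, that the tags are named repmgr and bdr, and that the cluster lives in a hypothetical ./mycluster directory. The run helper and the RUN variable are illustrative: commands are only printed unless RUN=1 is set, which allows the command line to be reviewed first.

```shell
# Print each command; execute it only when RUN=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

cluster_dir="./mycluster"   # hypothetical cluster directory

# Re-run the deploy but leave out the repmgr tasks.
run tpaexec deploy "$cluster_dir" --skip-tags repmgr

# Run only the BDR tasks, assuming everything before them already succeeded.
run tpaexec deploy "$cluster_dir" --tags bdr
```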
To find all the tags for the relevant architecture that might be useful, run:
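A generic sketch for listing tag names is shown below. It assumes the tags are declared in YAML files on lines of the form "tags: name" or "tags: [a, b]", and the directory passed to the hypothetical list_tags helper is whichever directory holds the relevant roles or architecture files. Tags written as multi-line YAML lists are not picked up by this simple pattern.

```shell
# List distinct tag names found on "tags:" lines under the given directory.
list_tags() {
  grep -rhE '^[[:space:]]*tags:' "$1" 2>/dev/null |
    sed -E 's/^[[:space:]]*tags:[[:space:]]*//; s/[][,]/ /g' |
    tr -s ' ' '\n' |
    grep -v '^$' |
    sort -u
}

# Usage: list_tags /path/to/roles
```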