Introduction
This blog describes how to recover the Ansible Automation Platform (AAP) database hosted on AWS RDS for PostgreSQL in case of data corruption. The same method can also be used to clone the database for use in a development environment (for example, from daily snapshots).
Step-By-Step Process
There are several ways to provide disaster recovery for Ansible Automation Platform installed in AWS. AAP provides a built-in backup method that is run with the installation script and the ‘-b’ switch: setup.sh -b. This approach backs up the entire AAP configuration, including the Postgres DB and all the controller, execution, and hub nodes. As a result, we have a backup that can be used to recover the entire environment.
This approach may prove challenging to implement in a cloud environment as it necessitates a prolonged interruption of the entire AAP environment, and it is best to perform recovery in a newly provisioned environment.
This guide provides a detailed explanation of the steps needed to restore the Ansible Automation Platform’s DB in AWS when using RDS, without the need for an AAP backup file.
Prerequisites:
- AAP installed with AWS RDS Database
- RDS DB is being protected with AWS Snapshots
- RDS DB snapshot is available
- Access to relevant sections of AWS console
- Root access to controller and hub nodes
How to restore DB for AAP:
- Log into AWS
- Navigate to RDS > Snapshots
- Select the Snapshot to restore:

- Select Actions > Restore Snapshot:

- In the Restore snapshot configuration, specify a new DB instance identifier, for example, aapdb02

- Ensure all other settings are the same as for the original instance, especially VPC security groups
- Click Restore DB Instance and wait until the new instance is restored and online
- Log into the controller node, preferably as root
- Verify the connection to the new DB using psql or a podman container. The syntax is shown below; adjust the username (-U) and the database name (-d) to match your environment (the database name can be found in the inventory file).
[root@aapcontroller01 ~]# psql -h aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com -U awx -d aapdb
Password for user awx:
psql (13.7, server 12.10)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

aapdb=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 aapdb     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | awx=CTc/postgres
 aaphubdb  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | awx=CTc/postgres
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin
 template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +
           |          |          |             |             | rdsadmin=CTc/rdsadmin
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(6 rows)

aapdb=>
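If you want a quick scripted pre-check before reaching for psql, a plain TCP probe of the PostgreSQL port can catch security group or routing mistakes early. The sketch below is only an illustration: the function name `can_reach` is our own, and a successful TCP connection does not prove that the credentials or the database name are correct, so the psql login above remains the real test.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only checks network reachability (DNS, routing, security groups).
    It does not authenticate against PostgreSQL.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

On a controller node this could be called as `can_reach("aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com")`. A result of `False` usually points at the VPC security groups of the restored instance.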
- Once connectivity has been confirmed, stop the services on all servers (controllers and hubs):
Run the following command on controllers:
automation-controller-service stop
Run the following command on the hubs:
- On controller nodes, change directory to: /etc/tower/conf.d/
cd /etc/tower/conf.d
- Edit the postgres.py file and update the DB Host name on all controller nodes
# Ansible Automation Platform controller database settings.

DATABASES = {
    'default': {
        'ATOMIC_REQUESTS': True,
        'ENGINE': 'awx.main.db.profiled_pg',
        'NAME': 'aapdb',
        'USER': 'awx',
        'PASSWORD': """Password""",
        'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com',
        'PORT': '5432',
        'OPTIONS': {
            'sslmode': 'prefer',
            'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt',
        },
    }
}
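With several controller nodes, editing the file by hand on each one is easy to get wrong. As a rough sketch, the HOST value could be rewritten with a small helper like the one below. The function name `set_db_host` is hypothetical, and the regular expression assumes the `'HOST': '...'` layout shown above; it replaces the HOST of every database alias in the text, so review the result before writing it back.

```python
import re

# Matches a quoted HOST key followed by a quoted value, e.g. 'HOST': 'old-name'.
# Group 1 keeps the key and separator, group 2 remembers the quote character.
HOST_RE = re.compile(r"""(['"]HOST['"]\s*:\s*)(['"])[^'"]*\2""")


def set_db_host(settings_text: str, new_host: str) -> str:
    """Return settings_text with each quoted HOST entry pointing at new_host."""

    def _swap(match: "re.Match[str]") -> str:
        quote = match.group(2)
        return f"{match.group(1)}{quote}{new_host}{quote}"

    updated, count = HOST_RE.subn(_swap, settings_text)
    if count == 0:
        # Fail loudly instead of silently leaving the old endpoint in place.
        raise ValueError("no HOST entry found in settings text")
    return updated
```

A typical use would be to read /etc/tower/conf.d/postgres.py, keep a copy of the original somewhere outside conf.d, pass the text through `set_db_host`, and write the result back while the services are stopped.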
- On the hub nodes, change directory to /etc/pulp:
cd /etc/pulp
- Edit the settings.py file and update the database host name (HOST) in the DATABASES section:
DATABASES = {
    'default': {
        'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'aaphubdb',
        'USER': 'awx',
        'PASSWORD': 'P@ssw0rd',
        'PORT': 5432,
        'OPTIONS': {
            'sslmode': 'prefer',
            'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt'
        }
    }
}
REDIS_HOST = 'localhost'
REDIS_PORT = 6379
CACHE_ENABLED = True
GALAXY_COLLECTION_SIGNING_SERVICE = 'ansible-default'
PRIVATE_KEY_PATH = '/etc/pulp/certs/token_private_key.pem'
PUBLIC_KEY_PATH = '/etc/pulp/certs/token_public_key.pem'
TOKEN_SERVER = 'https://aaphub01.example.net/token'
TOKEN_SIGNATURE_ALGORITHM = 'ES256'
ALLOWED_CONTENT_CHECKSUMS = ['sha224', 'sha256', 'sha384', 'sha512']
SECRET_KEY = 'REDACTED'
CONTENT_ORIGIN = 'https://aaphub01.example.net'
X_PULP_API_PROTO = 'https'
X_PULP_API_HOST = 'aaphub01.example.net'
X_PULP_API_PORT = '443'
X_PULP_API_PREFIX = 'pulp_ansible/galaxy/automation-hub/api'
GALAXY_API_DEFAULT_DISTRIBUTION_BASE_PATH = 'published'
GALAXY_ENABLE_API_ACCESS_LOG = False
GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_ACCESS = False
GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_DOWNLOAD = False
GALAXY_REQUIRE_CONTENT_APPROVAL = True
GALAXY_AUTO_SIGN_COLLECTIONS = False
REDIS_URL = 'unix:///var/run/redis/redis.sock'
ANSIBLE_API_HOSTNAME = 'https://aaphub01.example.net'
ANSIBLE_CONTENT_HOSTNAME = 'https://aaphub01.example.net'
CONTENT_BIND = 'unix:/var/run/pulpcore-content/pulpcore-content.sock'
CONNECTED_ANSIBLE_CONTROLLERS = ['https://aapcontroller01.example.net', 'https://aapcontroller02.example.net']
DEPLOY_ROOT = "/var/lib/pulp"
MEDIA_ROOT = "/var/lib/pulp/media"
STATIC_ROOT = "/var/lib/pulp/assets"
WORKING_DIRECTORY = "/var/lib/pulp/tmp"
FILE_UPLOAD_TEMP_DIR = "/var/lib/pulp/tmp"
DB_ENCRYPTION_KEY = "/etc/pulp/certs/database_fields.symmetric.key"
- Reboot the hub nodes
- Start the service on all controller nodes:
automation-controller-service start
- On the controller node, run the following command as root. The command verifies the connection to the DB and the connection between the controller nodes. All nodes should be visible and have a recent heartbeat timestamp:
[root@aapcontroller01 ~]# awx-manage list_instances
[controlplane capacity=270 policy=100%]
    aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
    aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"

[default capacity=270 policy=100%]
    aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
    aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"
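"Recent" is easy to judge by eye with two nodes, less so with many. As an illustration, the output above could be checked with a small parser like this one. The helper name `stale_instances` and the five-minute threshold are our own choices, and the sketch assumes the output format shown here and that the `now` value you pass in is in the same time zone as the printed heartbeats.

```python
import re
from datetime import datetime, timedelta
from typing import List

# Instance lines look like:
#   host.example.net capacity=135 ... heartbeat="2022-06-23 05:44:45"
# Group header lines such as [controlplane capacity=270 ...] have no heartbeat
# field and therefore do not match.
HEARTBEAT_RE = re.compile(r'^\s*(\S+)\s+capacity=\d+.*heartbeat="([^"]+)"')


def stale_instances(
    output: str,
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),
) -> List[str]:
    """Return the sorted names of instances whose heartbeat is older than max_age."""
    stale = set()
    for line in output.splitlines():
        match = HEARTBEAT_RE.match(line)
        if match is None:
            continue
        host, stamp = match.groups()
        heartbeat = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if now - heartbeat > max_age:
            stale.add(host)
    return sorted(stale)
```

An empty list means every listed node has reported in within the threshold. Any name returned is a node worth checking first, typically for a postgres.py that still points at the old host.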
- Connect to each controller node using a web browser and verify the communication and configuration
- Connect to each hub node using a web browser and verify the communication and configuration
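The AWS console part of this procedure (choosing a snapshot and restoring it to a new instance identifier with the same VPC security groups) can also be scripted. The sketch below uses boto3, which is a third-party package and is imported only inside the function that talks to AWS. The helper names, the source instance name, and the security group and subnet group values you pass in are all assumptions for illustration; the sketch also ignores pagination of the snapshot list and restores only the settings shown, so compare the remaining instance settings with the original by hand.

```python
from typing import Any, Dict, List, Sequence


def latest_snapshot_id(snapshots: List[Dict[str, Any]], source_instance: str) -> str:
    """Pick the newest available snapshot of source_instance.

    `snapshots` is the "DBSnapshots" list from describe_db_snapshots.
    """
    candidates = [
        snap
        for snap in snapshots
        if snap.get("DBInstanceIdentifier") == source_instance
        and snap.get("Status") == "available"
    ]
    if not candidates:
        raise LookupError(f"no available snapshot found for {source_instance}")
    newest = max(candidates, key=lambda snap: snap["SnapshotCreateTime"])
    return newest["DBSnapshotIdentifier"]


def restore_params(
    snapshot_id: str,
    new_instance: str,
    security_group_ids: Sequence[str],
    subnet_group: str,
) -> Dict[str, Any]:
    """Build the keyword arguments for restore_db_instance_from_db_snapshot."""
    return {
        "DBInstanceIdentifier": new_instance,
        "DBSnapshotIdentifier": snapshot_id,
        "VpcSecurityGroupIds": list(security_group_ids),
        "DBSubnetGroupName": subnet_group,
    }


def restore_latest(
    source_instance: str,
    new_instance: str,
    security_group_ids: Sequence[str],
    subnet_group: str,
    region: str,
) -> None:
    """Restore the newest snapshot to new_instance and wait until it is available."""
    import boto3  # third-party; requires AWS credentials to be configured

    rds = boto3.client("rds", region_name=region)
    response = rds.describe_db_snapshots(DBInstanceIdentifier=source_instance)
    snapshot_id = latest_snapshot_id(response["DBSnapshots"], source_instance)
    rds.restore_db_instance_from_db_snapshot(
        **restore_params(snapshot_id, new_instance, security_group_ids, subnet_group)
    )
    # Blocks until the restored instance reports the "available" state.
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=new_instance)
```

Keeping the snapshot selection and parameter building separate from the AWS calls makes it possible to review exactly what will be restored before anything is created.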
I hope this blog gave you the necessary insight into how to recover the database. Please note that the same process works for an AAP database hosted in Azure.
AAP with AWS RDS ensures disaster recovery success. Automated backups and failover guarantee continuous app availability and data protection. With its ease of use and flexible architecture, Ansible is the ideal tool for managing disaster recovery in the cloud. Get ahead of the curve with this flexible and easy-to-use solution. Want to learn more? Contact us today!