Australia | AWS RDS Disaster Recovery for AAP

Sebastian Baszcyj - 14.02.2023

AWS RDS Disaster Recovery for AAP


Introduction

This blog describes a method for recovering the Ansible Automation Platform (AAP) database hosted on an AWS RDS PostgreSQL instance in case of data corruption. The same method can also be used to clone the database for use in a development environment (for example, from daily snapshots).

Step-By-Step Process

There are several ways to provide disaster recovery for Ansible Automation Platform installed in AWS. AAP provides a built-in backup method that runs the installation script with the -b switch: setup.sh -b. This approach backs up the entire AAP configuration, including the PostgreSQL database and all controller, execution, and hub nodes. The result is a backup that can be used to recover the entire environment.

This approach may prove challenging to implement in a cloud environment as it necessitates a prolonged interruption of the entire AAP environment, and it is best to perform recovery in a newly provisioned environment. 

This guide provides a detailed explanation of the steps needed to restore the Ansible Automation Platform’s DB in AWS when using RDS, without the need for an AAP backup file.

Prerequisites:

  1. AAP installed with AWS RDS Database 
  1. RDS DB is being protected with AWS Snapshots 
  1. RDS DB snapshot is available 
  1. Access to relevant sections of AWS console 
  1. Root access to controller and hub nodes
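
The snapshot prerequisite can be checked from a script as well as in the console. The sketch below is one possible way to pick the newest usable snapshot. It assumes a list of dicts shaped like the DBSnapshots entries returned by boto3's describe_db_snapshots call; the source instance name aapdb01 and the snapshot identifiers are hypothetical.

```python
from datetime import datetime, timezone


def latest_available_snapshot(snapshots, db_instance_id):
    """Return the newest 'available' snapshot of db_instance_id, or None."""
    candidates = [
        s for s in snapshots
        if s.get("DBInstanceIdentifier") == db_instance_id
        and s.get("Status") == "available"
        and "SnapshotCreateTime" in s  # absent while a snapshot is still being created
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["SnapshotCreateTime"])


# Hypothetical sample data, shaped like response["DBSnapshots"].
sample_snapshots = [
    {
        "DBSnapshotIdentifier": "rds:aapdb01-2023-02-13-00-05",
        "DBInstanceIdentifier": "aapdb01",
        "Status": "available",
        "SnapshotCreateTime": datetime(2023, 2, 13, 0, 5, tzinfo=timezone.utc),
    },
    {
        "DBSnapshotIdentifier": "rds:aapdb01-2023-02-14-00-05",
        "DBInstanceIdentifier": "aapdb01",
        "Status": "available",
        "SnapshotCreateTime": datetime(2023, 2, 14, 0, 5, tzinfo=timezone.utc),
    },
    {
        "DBSnapshotIdentifier": "rds:aapdb01-2023-02-15-00-05",
        "DBInstanceIdentifier": "aapdb01",
        "Status": "creating",
    },
]

chosen = latest_available_snapshot(sample_snapshots, "aapdb01")
```

Snapshots that are still being created are skipped, so the function returns the most recent restore point that can actually be used.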

How to restore DB for AAP: 

  1. Log into AWS 
  1. Navigate to RDS > Snapshots 
  1. Select the snapshot to restore 
  1. Select Actions > Restore Snapshot 
  1. In the Restore snapshot configuration, specify a new DB instance identifier, for example aapdb02 
  1. Ensure all other settings are the same as for the original instance, especially VPC security groups 
  1. Click Restore DB Instance and wait until the new instance is restored and online 
  1. Log into the controller node, preferably as root
  1. Verify the connection to the new DB using psql or a podman container. The syntax is shown below; substitute your own username (-U) and database name (-d), which can be found in the inventory file. 
      [root@aapcontroller01 ~]# psql -h aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com -U awx -d aapdb
      Password for user awx:
      psql (13.7, server 12.10)
      SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
      Type "help" for help.

      aapdb=> \l
                                        List of databases
         Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
      -----------+----------+----------+-------------+-------------+-----------------------
       aapdb     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
                 |          |          |             |             | postgres=CTc/postgres+
                 |          |          |             |             | awx=CTc/postgres
       aaphubdb  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
                 |          |          |             |             | postgres=CTc/postgres+
                 |          |          |             |             | awx=CTc/postgres
       postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
       rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin
       template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +
                 |          |          |             |             | rdsadmin=CTc/rdsadmin
       template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres
      (6 rows)

      aapdb=>
  1. Once connectivity has been confirmed, stop the services on all servers (controllers and hubs):

Run the following command on controllers: 

automation-controller-service stop 

Stop the automation hub services on the hubs as well.
  1. On controller nodes, change directory to: /etc/tower/conf.d/ 
cd /etc/tower/conf.d 
  1. Edit the postgres.py file and update the DB Host name on all controller nodes
      # Ansible Automation Platform controller database settings.

      DATABASES = {
         'default': {
             'ATOMIC_REQUESTS': True,
             'ENGINE': 'awx.main.db.profiled_pg',
             'NAME': 'aapdb',
             'USER': 'awx',
             'PASSWORD': """Password""",
             'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com',
             'PORT': '5432',
             'OPTIONS': { 'sslmode': 'prefer',
                          'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt',
             },
         }
      }
  1. On the hub nodes, change directory to /etc/pulp: 
cd /etc/pulp 
  1. Edit the settings.py file and update the database host (HOST) in the DATABASES section: 
      DATABASES = {'default': {'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com', 'ENGINE': 'django.db.backends.postgresql_psycopg2', 'NAME': 'aaphubdb', 'USER': 'awx', 'PASSWORD': 'P@ssw0rd', 'PORT': 5432, 'OPTIONS': {'sslmode': 'prefer', 'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt'}}}
      REDIS_HOST = 'localhost'
      REDIS_PORT = 6379
      CACHE_ENABLED = True
      GALAXY_COLLECTION_SIGNING_SERVICE = 'ansible-default'
      PRIVATE_KEY_PATH = '/etc/pulp/certs/token_private_key.pem'
      PUBLIC_KEY_PATH = '/etc/pulp/certs/token_public_key.pem'
      TOKEN_SERVER = 'https://aaphub01.example.net/token'
      TOKEN_SIGNATURE_ALGORITHM = 'ES256'
      ALLOWED_CONTENT_CHECKSUMS = ['sha224', 'sha256', 'sha384', 'sha512']
      SECRET_KEY = 'REDACTED'
      CONTENT_ORIGIN = 'https://aaphub01.example.net'
      X_PULP_API_PROTO = 'https'
      X_PULP_API_HOST = 'aaphub01.example.net'
      X_PULP_API_PORT = '443'
      X_PULP_API_PREFIX = 'pulp_ansible/galaxy/automation-hub/api'
      GALAXY_API_DEFAULT_DISTRIBUTION_BASE_PATH = 'published'
      GALAXY_ENABLE_API_ACCESS_LOG = False
      GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_ACCESS = False
      GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_DOWNLOAD = False
      GALAXY_REQUIRE_CONTENT_APPROVAL = True
      GALAXY_AUTO_SIGN_COLLECTIONS = False
      REDIS_URL = 'unix:///var/run/redis/redis.sock'
      ANSIBLE_API_HOSTNAME = 'https://aaphub01.example.net'
      ANSIBLE_CONTENT_HOSTNAME = 'https://aaphub01.example.net'
      CONTENT_BIND = 'unix:/var/run/pulpcore-content/pulpcore-content.sock'
      CONNECTED_ANSIBLE_CONTROLLERS = ['https://aapcontroller01.example.net', 'https://aapcontroller02.example.net']
      DEPLOY_ROOT = "/var/lib/pulp"
      MEDIA_ROOT = "/var/lib/pulp/media"
      STATIC_ROOT = "/var/lib/pulp/assets"
      WORKING_DIRECTORY = "/var/lib/pulp/tmp"
      FILE_UPLOAD_TEMP_DIR = "/var/lib/pulp/tmp"
      DB_ENCRYPTION_KEY = "/etc/pulp/certs/database_fields.symmetric.key"
  1. Reboot the hub nodes 
  1. Start the service on all controller nodes: 
automation-controller-service start 
  1. On the controller node, run the following command as root. The command verifies the connection to the DB and the connection between the controller nodes. All nodes should be listed and show a recent heartbeat timestamp: 
      [root@aapcontroller01 ~]# awx-manage list_instances
      [controlplane capacity=270 policy=100%]
      aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
      aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"

      [default capacity=270 policy=100%]
      aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
      aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"
  1. Connect to each controller node using a web browser and verify the communication and configuration 
  1. Connect to each hub node using a web browser and verify the communication and configuration 
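
The console part of the procedure (restoring the snapshot under a new identifier with the original network settings) could also be scripted. The following is a minimal sketch, not the method used in this blog: it assumes an RDS client object such as the one created by boto3.client("rds") is passed in, and the function name and parameter names are hypothetical. Only the source instance's class, subnet group, and VPC security groups are copied; any other setting would need to be added explicitly.

```python
def restore_snapshot_like_source(rds, snapshot_id, source_id, new_id):
    """Restore snapshot_id as instance new_id, reusing source_id's network settings.

    `rds` is assumed to behave like a boto3 RDS client. Returns the endpoint
    address of the restored instance.
    """
    source = rds.describe_db_instances(
        DBInstanceIdentifier=source_id
    )["DBInstances"][0]

    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_id,
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceClass=source["DBInstanceClass"],
        DBSubnetGroupName=source["DBSubnetGroup"]["DBSubnetGroupName"],
        # Same security groups as the original, so the AAP nodes can still connect.
        VpcSecurityGroupIds=[
            group["VpcSecurityGroupId"] for group in source["VpcSecurityGroups"]
        ],
    )

    # Block until the new instance reports the "available" status.
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=new_id)

    restored = rds.describe_db_instances(
        DBInstanceIdentifier=new_id
    )["DBInstances"][0]
    return restored["Endpoint"]["Address"]
```

The returned address is the value that goes into the HOST entries of postgres.py and settings.py. Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real account.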
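
Updating the HOST value by hand on every controller and hub is easy to get wrong when there are several nodes. As an illustration only, the sketch below rewrites the HOST entry in the text of a Django-style settings file such as postgres.py or settings.py. The function name is hypothetical, and the regular expression assumes the 'HOST': 'value' layout shown in the examples above.

```python
import re

# Matches a quoted HOST key and its quoted value inside a DATABASES dict.
# Names such as REDIS_HOST or X_PULP_API_HOST do not match, because the
# key must be exactly 'HOST' in quotes followed by a colon.
_HOST_RE = re.compile(r"""(['"]HOST['"]\s*:\s*)(['"])[^'"]*\2""")


def replace_db_host(settings_text, new_host):
    """Return settings_text with every DATABASES HOST value set to new_host."""
    new_text, count = _HOST_RE.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{new_host}{m.group(2)}",
        settings_text,
    )
    if count == 0:
        raise ValueError("no 'HOST' entry found in settings text")
    return new_text


example = (
    "DATABASES = {'default': {'HOST': 'old-db.example.net', 'NAME': 'aaphubdb'}}\n"
    "REDIS_HOST = 'localhost'\n"
)
updated = replace_db_host(example, "aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com")
```

In practice the text would be read from the file, a copy of the original kept, and the updated text written back before the services are started again.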
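
The heartbeat check on the awx-manage list_instances output can be automated in the same spirit. The sketch below parses output in the format shown above and reports nodes whose heartbeat is too old. The function name and the five-minute threshold are assumptions, and the timestamps are treated as being in the same time zone as the reference time passed in.

```python
import re
from datetime import datetime, timedelta

# An instance line starts with the hostname and carries heartbeat="...".
_INSTANCE_RE = re.compile(r'^\s*(\S+)\s.*\bheartbeat="([^"]+)"')


def stale_instances(output, now, max_age=timedelta(minutes=5)):
    """Return sorted hostnames whose last heartbeat is older than max_age."""
    stale = set()
    for line in output.splitlines():
        match = _INSTANCE_RE.match(line)
        if not match:
            continue  # group headers such as "[controlplane ...]" and blank lines
        host, stamp = match.groups()
        heartbeat = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if now - heartbeat > max_age:
            stale.add(host)  # a node can appear in several groups; report it once
    return sorted(stale)
```

An empty result means every listed node has reported recently; a missing node would still have to be caught by comparing the parsed hostnames against the expected inventory.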

I hope this blog has given you the insights you need to recover the database. Please note that the same process works for an AAP database hosted in Azure. 

AAP with AWS RDS supports a straightforward disaster recovery process. Automated backups and failover help maintain application availability and protect data. With its ease of use and flexible architecture, Ansible is well suited to managing disaster recovery in the cloud. Want to learn more? Contact us today!
