New Zealand | Ansible Disaster Recovery Guide AWS

Sebastian Baszcyj - 05.08.2022

Step-by-Step Guide to Disaster Recovery for the Ansible Automation Platform installed in AWS

Not to sound negative, but organisations should always try to prepare for the worst and hope for the best.

Disaster Recovery (DR) is critical for every organisation. Ensuring business remains uninterrupted is key, whether you need to prepare for unforeseen incidents like a data centre outage or reside somewhere susceptible to natural disasters. But how can you guarantee the changes don’t impact the end user?

There are several ways to provide Disaster Recovery for the Red Hat Ansible Automation Platform (AAP) installed in AWS (you might have seen my ‘How-to guide on Ansible Tower Backup and Restore on Azure’). AAP provides a built-in backup method, which is run with the same installation script and the ‘-b’ switch: setup.sh -b. This approach backs up the entire AAP configuration, including the Postgres DB and all the controller, execution, and hub nodes. The result is a backup which can be used to recover the entire environment.

This approach is often difficult to implement in a cloud environment, as it requires a prolonged outage of the entire AAP environment, and the recovery is preferably done in a freshly provisioned environment. This step-by-step guide describes how to restore the Ansible Automation Platform database in AWS when RDS is used, without the need for the AAP backup file.

PREREQUISITES:

  1. AAP installed with AWS RDS Database
  2. RDS DB is being protected with AWS Snapshots
  3. RDS DB snapshot is available
  4. Access to relevant sections of AWS console
  5. Access to controller and hub nodes
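
Prerequisite 3 can also be checked from the command line. The sketch below is only an illustration: it assumes you have saved the JSON printed by `aws rds describe-db-snapshots --db-instance-identifier <your instance>` and want the newest snapshot that is ready to restore. The instance name `aapdb01` and the snapshot identifiers in the sample are hypothetical; only the field names (`DBSnapshots`, `DBSnapshotIdentifier`, `Status`, `SnapshotCreateTime`) come from the AWS CLI output.

```python
from datetime import datetime
from typing import Optional


def latest_available_snapshot(describe_output: dict) -> Optional[str]:
    """Return the identifier of the newest snapshot whose status is 'available'."""
    candidates = [
        snap for snap in describe_output.get("DBSnapshots", [])
        if snap.get("Status") == "available" and "SnapshotCreateTime" in snap
    ]
    if not candidates:
        return None

    def created(snap: dict) -> datetime:
        # The CLI prints ISO 8601 timestamps; normalise a trailing "Z" so that
        # older Python versions can parse it as well.
        return datetime.fromisoformat(snap["SnapshotCreateTime"].replace("Z", "+00:00"))

    return max(candidates, key=created)["DBSnapshotIdentifier"]


# Hypothetical, trimmed-down CLI output, for illustration only.
sample = {
    "DBSnapshots": [
        {"DBSnapshotIdentifier": "rds:aapdb01-2022-06-21-03-10", "Status": "available",
         "SnapshotCreateTime": "2022-06-21T03:10:45.123000+00:00"},
        {"DBSnapshotIdentifier": "rds:aapdb01-2022-06-22-03-10", "Status": "available",
         "SnapshotCreateTime": "2022-06-22T03:10:51.456000+00:00"},
        {"DBSnapshotIdentifier": "rds:aapdb01-2022-06-23-03-10", "Status": "creating",
         "SnapshotCreateTime": "2022-06-23T03:10:40.000000+00:00"},
    ]
}

print(latest_available_snapshot(sample))
```

A snapshot that is still being created is skipped on purpose, since it cannot be restored yet.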

HOW TO RESTORE DB FOR AAP:

  1. Log into AWS
  2. Navigate to RDS > Snapshots
  3. Select the Snapshot to restore
  4. Select Actions > Restore Snapshot
  5. In the Restore snapshot configuration, specify a new DB instance identifier, for example aapdb02
  6. Ensure all other settings are the same as for the original instance (especially VPC security groups)
  7. Click Restore DB Instance and wait until the new instance is restored and online
  8. Log into the controller node, preferably as root
  9. Verify the connection to the new DB using psql or a podman container. The syntax is as shown below; replace the username (-U) and the database (-d) with your own values (the database name can be found in the inventory file):

[root@aapcontroller01 ~]# psql -h aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com -U awx -d aapdb
Password for user awx:
psql (13.7, server 12.10)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

aapdb=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 aapdb     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | awx=CTc/postgres
 aaphubdb  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | awx=CTc/postgres
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin
 template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +
           |          |          |             |             | rdsadmin=CTc/rdsadmin
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(6 rows)

aapdb=>

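
The console restore earlier in this procedure could also be scripted with the AWS CLI. The following is a minimal sketch, not the method used in this guide: it only builds the argument lists for `aws rds restore-db-instance-from-db-snapshot` and `aws rds wait db-instance-available`, so they can be reviewed before anything is run. The snapshot identifier, security group ID and subnet group name are hypothetical placeholders; `aapdb02` is the example identifier used above.

```python
import shlex
from typing import List, Sequence


def restore_commands(
    snapshot_id: str,
    new_instance_id: str,
    security_group_ids: Sequence[str],
    subnet_group: str,
) -> List[List[str]]:
    """Build (but do not run) the AWS CLI calls that restore a snapshot and wait for it."""
    restore = [
        "aws", "rds", "restore-db-instance-from-db-snapshot",
        "--db-instance-identifier", new_instance_id,
        "--db-snapshot-identifier", snapshot_id,
        "--db-subnet-group-name", subnet_group,
        # Reuse the security groups of the original instance so the
        # controller and hub nodes can still reach the database.
        "--vpc-security-group-ids", *security_group_ids,
    ]
    wait = [
        "aws", "rds", "wait", "db-instance-available",
        "--db-instance-identifier", new_instance_id,
    ]
    return [restore, wait]


# Hypothetical values, for illustration only.
commands = restore_commands(
    snapshot_id="rds:aapdb01-2022-06-22-03-10",
    new_instance_id="aapdb02",
    security_group_ids=["sg-0123456789abcdef0"],
    subnet_group="aap-db-subnets",
)
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the values have been checked against the original instance.
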
  10. Once connectivity has been confirmed, stop the services on all servers (controllers and hubs):

Run the following command on the controllers:

automation-controller-service stop

Run the following command on the hubs (the pattern is quoted so that the shell passes it to systemctl unchanged):

systemctl stop 'pulp*' nginx redis
  11. On the controller nodes, change directory to /etc/tower/conf.d:
cd /etc/tower/conf.d
  12. Edit the postgres.py file and update the database host name ('HOST') on all controller nodes:

# Ansible Automation Platform controller database settings.

DATABASES = {
    'default': {
        'ATOMIC_REQUESTS': True,
        'ENGINE': 'awx.main.db.profiled_pg',
        'NAME': 'aapdb',
        'USER': 'awx',
        'PASSWORD': """Password""",
        'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com',
        'PORT': '5432',
        'OPTIONS': {
            'sslmode': 'prefer',
            'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt',
        },
    }
}

  13. On the hub nodes, change directory to /etc/pulp:
cd /etc/pulp
  14. Edit the settings.py file and update the database host name ('HOST') in the DATABASES section:

DATABASES = {
    'default': {
        'HOST': 'aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'aaphubdb',
        'USER': 'awx',
        'PASSWORD': 'redacted',
        'PORT': 5432,
        'OPTIONS': {
            'sslmode': 'prefer',
            'sslrootcert': '/etc/pki/tls/certs/ca-bundle.crt',
        },
    }
}
REDIS_HOST = 'localhost'
REDIS_PORT = 6379
CACHE_ENABLED = True
GALAXY_COLLECTION_SIGNING_SERVICE = 'ansible-default'
PRIVATE_KEY_PATH = '/etc/pulp/certs/token_private_key.pem'
PUBLIC_KEY_PATH = '/etc/pulp/certs/token_public_key.pem'
TOKEN_SERVER = 'https://aaphub01.example.net/token'
TOKEN_SIGNATURE_ALGORITHM = 'ES256'
ALLOWED_CONTENT_CHECKSUMS = ['sha224', 'sha256', 'sha384', 'sha512']
SECRET_KEY = 'redacted'
CONTENT_ORIGIN = 'https://aaphub01.example.net'
X_PULP_API_PROTO = 'https'
X_PULP_API_HOST = 'aaphub01.example.net'
X_PULP_API_PORT = '443'
X_PULP_API_PREFIX = 'pulp_ansible/galaxy/automation-hub/api'
GALAXY_API_DEFAULT_DISTRIBUTION_BASE_PATH = 'published'
GALAXY_ENABLE_API_ACCESS_LOG = False
GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_ACCESS = False
GALAXY_ENABLE_UNAUTHENTICATED_COLLECTION_DOWNLOAD = False
GALAXY_REQUIRE_CONTENT_APPROVAL = True
GALAXY_AUTO_SIGN_COLLECTIONS = False
REDIS_URL = 'unix:///var/run/redis/redis.sock'
ANSIBLE_API_HOSTNAME = 'https://aaphub01.example.net'
ANSIBLE_CONTENT_HOSTNAME = 'https://aaphub01.example.net'
CONTENT_BIND = 'unix:/var/run/pulpcore-content/pulpcore-content.sock'
CONNECTED_ANSIBLE_CONTROLLERS = ['https://aapcontroller01.example.net', 'https://aapcontroller02.example.net']
DEPLOY_ROOT = "/var/lib/pulp"
MEDIA_ROOT = "/var/lib/pulp/media"
STATIC_ROOT = "/var/lib/pulp/assets"
WORKING_DIRECTORY = "/var/lib/pulp/tmp"
FILE_UPLOAD_TEMP_DIR = "/var/lib/pulp/tmp"
DB_ENCRYPTION_KEY = "/etc/pulp/certs/database_fields.symmetric.key"

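
With several controller and hub nodes, editing the host by hand in postgres.py and settings.py is easy to get wrong. The helper below is a hedged sketch of how the change could be applied to the text of either file; it assumes the host appears as a quoted `'HOST': '...'` entry, as in the two listings above. The function name and the old host `aapdb01.example.net` are hypothetical.

```python
import re

# Matches a quoted HOST key and its quoted value, e.g. 'HOST': 'db.example.net'.
# Names such as REDIS_HOST or X_PULP_API_HOST are not touched, because there
# HOST is not a quoted dictionary key followed by a colon.
_HOST_ENTRY = re.compile(r"""(['"]HOST['"]\s*:\s*)(['"])[^'"]*\2""")


def replace_db_host(settings_text: str, new_host: str) -> str:
    """Return settings_text with every quoted 'HOST' entry pointing at new_host."""
    updated, count = _HOST_ENTRY.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{new_host}{m.group(2)}",
        settings_text,
    )
    if count == 0:
        # Fail loudly instead of silently leaving the old database in place.
        raise ValueError("no quoted 'HOST' entry found")
    return updated


# Hypothetical fragment in the style of /etc/pulp/settings.py.
before = (
    "DATABASES = {'default': {'HOST': 'aapdb01.example.net', 'NAME': 'aaphubdb'}}\n"
    "REDIS_HOST = 'localhost'\n"
)
after = replace_db_host(before, "aapdb02.cbr8auhjg1vp.us-west-1.rds.amazonaws.com")
print(after)
```

The function works on strings only, so reading the file, keeping a backup copy and writing the result back is left to whatever tooling you already use on the nodes.
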
  15. Reboot the hub nodes
  16. Start the service on all controller nodes:
automation-controller-service start
  17. On the controller node, run the following command as root. It verifies the connection to the DB and the connection between the controller nodes. All nodes should be listed with a recent heartbeat timestamp:

[root@aapcontroller01 ~]# awx-manage list_instances
[controlplane capacity=270 policy=100%]
    aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
    aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"

[default capacity=270 policy=100%]
    aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
    aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"

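
Checking heartbeats by eye works for two nodes but gets tedious with more. As a rough sketch, the captured text of `awx-manage list_instances` could be parsed and compared against the current time. The function names and the two-minute threshold are assumptions made for this example, and the timestamps are treated as being in the same time zone as the clock you compare them with.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, List

# A node line looks like:
#   <hostname> capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
# Group header lines such as "[default capacity=270 policy=100%]" carry no heartbeat.
_NODE_LINE = re.compile(r'^\s*(?P<host>\S+)\s+capacity=\d+.*\bheartbeat="(?P<ts>[^"]+)"')


def parse_heartbeats(output: str) -> Dict[str, datetime]:
    """Map each hostname in the list_instances output to its heartbeat timestamp."""
    heartbeats: Dict[str, datetime] = {}
    for line in output.splitlines():
        match = _NODE_LINE.match(line)
        if match:
            heartbeats[match.group("host")] = datetime.strptime(
                match.group("ts"), "%Y-%m-%d %H:%M:%S"
            )
    return heartbeats


def stale_nodes(output: str, now: datetime, max_age: timedelta) -> List[str]:
    """Return the hostnames whose last heartbeat is older than max_age."""
    return sorted(
        host for host, seen in parse_heartbeats(output).items() if now - seen > max_age
    )


sample_output = """[controlplane capacity=270 policy=100%]
    aapcontroller01.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:45"
    aapcontroller02.example.net capacity=135 node_type=hybrid version=4.2.0 heartbeat="2022-06-23 05:44:44"
"""

checked_at = datetime(2022, 6, 23, 5, 45, 30)
print(stale_nodes(sample_output, now=checked_at, max_age=timedelta(minutes=2)))
```

An empty list means every node reported in within the threshold; any hostname returned is worth a closer look before moving on to the browser checks.
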
  18. Connect to each controller node using a web browser and verify the communication and configuration
  19. Connect to each hub node using a web browser and verify the communication and configuration

Don’t let an outage catch you off guard! Reach out to Insentra if you would like to explore a joint Disaster Recovery solution tailored to your needs.
