Project Background
Securing distributed software requires configuring TLS to encrypt communications:
Node to node communication
Client to node communication
Setting up TLS can be cumbersome because it covers many different concepts and requires skills that are not readily available in every delivery team. Many providers are available to assist, such as Let's Encrypt; however, it is useful to provide a harness in which a master node acts as the certificate authority (CA) and the created certificates are deployed to the target nodes.
Ansible and OpenSSL are used for this purpose because they give greater flexibility when moving target deployments between Windows and Linux, and they do not require target nodes to have online access to external certificate providers. This process is valid for non-production systems; production systems would need to use an industry-standard certificate authority.
What was my role
Creating security recommendations for regional delivery teams, in line with the product architecture and the global security office.
Development and integration of Ansible roles to handle the certificate creation, management and application hardening to secure environments.
What I learnt on this project
Evaluating several mechanisms for generating TLS certificates.
The full end-to-end process, from the certificate authority through CSRs and certificate signing to keystore and truststore creation.
Introduction
There are several important rules to know when generating certificates:
The name present in the certificate must match the public DNS name of the host. The same certificate cannot be shared across all nodes unless it is a wildcard (star) certificate. Any TLS client connecting to a node checks that the certificate name matches the hostname, unless hostname verification is disabled.
The name present in the certificate should match the reverse DNS name corresponding to the IP of the host. Java clients connecting to a node may do a reverse DNS lookup to get the public name of the host they are connecting to.
These two rules are meant to prevent man-in-the-middle attacks. A TLS certificate lets you check that you’re talking to the intended target, not something in between that could spy on the traffic and steal information.
When a machine has multiple names (think of DNS aliases or virtual hosts), a certificate can contain multiple names. The main name is called the CN (Common Name), while the other names are called SANs (Subject Alternative Names).
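As a rough illustration of the first rule and of how CN and SAN entries interact, the sketch below checks a hostname against the names in a certificate. It is a simplified model rather than what a TLS library actually runs: the function names and host names are hypothetical, a wildcard is accepted only as the entire left-most label, and it assumes that SAN entries take precedence with the CN as a fallback when no SANs are present.

```python
from typing import List


def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name against a hostname, label by label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # Simplification: "*" stands for exactly one non-empty label.
        first_ok = host_labels[0] != ""
    else:
        first_ok = pattern_labels[0] == host_labels[0]
    return first_ok and pattern_labels[1:] == host_labels[1:]


def certificate_covers(hostname: str, common_name: str, sans: List[str]) -> bool:
    """Return True if any name in the certificate matches the hostname."""
    # Assumption: SAN entries win; the CN is only used when there are none.
    names = sans if sans else [common_name]
    return any(name_matches(name, hostname) for name in names)


sans = ["node1.example.internal", "api.example.internal"]
print(certificate_covers("api.example.internal", "node1.example.internal", sans))
print(certificate_covers("node2.example.internal", "node1.example.internal", sans))
print(certificate_covers("node2.example.internal", "*.example.internal", []))
```

The three calls show a DNS alias covered by a SAN, a second node that the first node's certificate does not cover, and a wildcard certificate that covers any single-label host under the domain.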
Certificate automation process
The diagram above shows how the certificate authority is created on the Ansible master node and how certificates are then created for the target node(s) in the solution. The steps are:
1. Create the certificate authority root certificate on the Ansible server. The Ansible master node then becomes the governing authority that handles signing requests for the target nodes.
2. Install the OpenSSL Ansible role and playbook on the target node(s). Note: on Linux, OpenSSL is often part of the base OS; on Windows, OpenSSL needs to be installed via Ansible. Steps 3-8 are repeated for each target node.
3. Generate the private key and certificate signing request (CSR) on the target node. As part of this step, the subject alternative name is set to the public DNS name of the node itself.
4. Ansible pulls the CSR from the target node(s) to the master Ansible server.
5. Ansible signs the CSR (as certificate authority) with the CA key and creates the certificate for the target node(s).
6. The signed certificate and the CA certificate are pushed back to the target node(s).
7. Import the CA certificate into the trust store on the target node(s).
8. Create the keystore in PKCS#12 format, which can be imported into the web container configuration (Tomcat, for example) for TLS communication.
The TLS private keys (.key) are sensitive: they do not travel and stay on the machine where they are generated.
In contrast, certificates (.crt) and CSRs (.csr) contain only public information.
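The steps above could map onto OpenSSL commands roughly as follows. This is a sketch, not the Ansible roles used in the project: the function name, file names, key sizes, validity period, CA subject, keystore alias and the KEYSTORE_PASS environment variable are all illustrative assumptions. The `-addext` option needs OpenSSL 1.1.1 or later, and `-copy_extensions` on `openssl x509` needs OpenSSL 3.0 or later. The function only builds the command lines so they can be reviewed; nothing is executed.

```python
import shlex
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (machine that runs the command, argv)


def certificate_plan(node_fqdn: str, days: int = 365) -> List[Step]:
    """Build, without running, the OpenSSL commands for one target node."""
    san = f"subjectAltName=DNS:{node_fqdn}"
    return [
        # Step 1: CA key and self-signed root certificate on the master.
        ("master", ["openssl", "genrsa", "-out", "ca.key", "4096"]),
        ("master", ["openssl", "req", "-x509", "-new", "-key", "ca.key",
                    "-sha256", "-days", str(days),
                    "-subj", "/CN=Example Internal CA", "-out", "ca.crt"]),
        # Step 3: private key and CSR on the target node; the key stays there.
        ("node", ["openssl", "genrsa", "-out", "node.key", "2048"]),
        ("node", ["openssl", "req", "-new", "-key", "node.key",
                  "-subj", f"/CN={node_fqdn}", "-addext", san,
                  "-out", "node.csr"]),
        # Step 4 (pull node.csr to the master) is a file copy, not shown.
        # Step 5: sign the CSR with the CA key, keeping the requested SAN.
        ("master", ["openssl", "x509", "-req", "-in", "node.csr",
                    "-CA", "ca.crt", "-CAkey", "ca.key", "-CAcreateserial",
                    "-days", str(days), "-sha256",
                    "-copy_extensions", "copy", "-out", "node.crt"]),
        # Step 6 (push node.crt and ca.crt to the node) is a file copy.
        # Step 7 (trust import) is platform specific and left out here.
        # Step 8: bundle key, certificate and CA cert into a PKCS#12 keystore.
        ("node", ["openssl", "pkcs12", "-export", "-in", "node.crt",
                  "-inkey", "node.key", "-certfile", "ca.crt",
                  "-name", "tomcat", "-out", "node.p12",
                  "-passout", "env:KEYSTORE_PASS"]),
    ]


for host, argv in certificate_plan("node1.example.internal"):
    print(f"[{host}] {shlex.join(argv)}")
```

Tagging each command with the machine that runs it makes the key-handling rule visible: no command on the master ever references node.key, and only the CSR and certificates move between machines.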
Ansible scripts
Additional security configuration for TLS
After certificates have been generated for the target nodes, TLS configuration must also be applied at the network, application and web container levels. The following configuration is required:
Tomcat TLS connector to enable TLS communication on web containers. The process is described here for Tomcat; a similar process would need to be put in place for other web containers. The server.xml file must be modified to enable the TLS connector.
Note: change clientAuth to true for mutual TLS (two-way SSL) communication.
I found it easiest to integrate this into the Ansible roles/playbooks by using a Jinja2 template of server.xml for the Tomcat TLS configuration.
Network rules/security group configuration for the additional ports to be opened for TLS communication. Once TLS is fully working, the production environment needs to be closed to non-secure HTTP communication.
This can be done while provisioning the infrastructure via Terraform.
Application TLS configuration needs to be made through the application's own configuration/properties. First, the application itself needs to be designed for TLS support; after that, it is generally a matter of configuring keystore locations, protocols and so on. This differs per module/application.
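For the Tomcat connector item above, the sketch below renders the kind of Connector element that a server.xml template would contain. It is an assumption-laden illustration, not the project's template: the function name, port 8443, keystore path and password are placeholders, and it uses the older connector-level attributes (which is where clientAuth lives). Newer Tomcat releases prefer nested SSLHostConfig and Certificate elements, so check the documentation for the version in use.

```python
import xml.etree.ElementTree as ET


def tls_connector(keystore_file: str, keystore_pass: str,
                  port: int = 8443, mutual_tls: bool = False) -> str:
    """Render a Tomcat HTTPS <Connector> element as an XML string."""
    connector = ET.Element("Connector", {
        "port": str(port),
        "protocol": "org.apache.coyote.http11.Http11NioProtocol",
        "SSLEnabled": "true",
        "scheme": "https",
        "secure": "true",
        # Keystore produced at the end of the certificate process.
        "keystoreFile": keystore_file,
        "keystorePass": keystore_pass,
        "keystoreType": "PKCS12",
        # "true" makes Tomcat require a client certificate (mutual TLS).
        "clientAuth": "true" if mutual_tls else "false",
        "sslProtocol": "TLS",
    })
    return ET.tostring(connector, encoding="unicode")


# Placeholder values; real passwords should come from a secrets store.
print(tls_connector("conf/node.p12", "changeit"))
```

Building the element with an XML library rather than string concatenation keeps attribute values correctly escaped; in an Ansible role the same attributes would typically be variables in the template.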