Background: A leading Swiss bank required a comprehensive system management solution for roughly 200 critical systems. Most of these systems could not be replicated, so redundancy was provided at the hypervisor level rather than through system duplication. Initially, our services were engaged for a few weeks each year for one-off tasks such as migrating Red Hat Satellite from version 5 to 6. As the bank’s Linux segment grew, our involvement increased until we effectively became a dedicated resource within their IT team for four years.
Project Objectives:
- Consolidate the bank’s organically grown Linux landscape based on RHEL.
- Transform the infrastructure into a VMware-based setup with standardized provisioning and maintenance.
- Implement automation and security measures to enhance system management.
- Introduce new operating models, providing choices of IaaS, PaaS, or SaaS depending on departmental needs.
- Ensure compliance with multiple security standards, including PCI DSS, NIST, and HIPAA, with SELinux and FIPS mode as the baseline on every system.
- Manage system maintenance and secure interfacing with external partners.
Tools and Technologies Used:
- System Management: Red Hat Satellite
- Configuration Management: Puppet, Ansible
- Virtualization: VMware
- Security and Compliance: SELinux, FIPS, OpenSCAP, StackRox, Anchore/Grype
- Container Platforms: OpenShift clusters, Docker, Podman, Kubernetes, k3s
- Interfacing Technologies: Google Cloud Storage (GCS), SFTP
- Ticketing and Documentation: Jira, Confluence
Project Execution: Over the years, our architect/engineer consolidated various bare-metal and VMware setups running different RHEL versions into a cohesive, VMware-based infrastructure. Initially, provisioning and maintenance were poorly automated, which led to inconsistencies in disk partitioning, software installation, and configuration management.
The architect/engineer developed a comprehensive plan to standardize these processes and move the infrastructure to a second generation ("2.0") of standards and provisioning procedures. Key measures included:
- Automation: Implementing Puppet and Ansible to automate provisioning and maintenance tasks, ensuring consistency and efficiency across all systems.
- Security: Establishing SELinux and FIPS mode as the security baseline on all systems, and hardening them further to meet standards such as PCI DSS, NIST, and HIPAA.
- Operational Models: Introducing IaaS, PaaS, and SaaS options, allowing different departments to choose the most suitable model for their needs.
- Internal Tools: Introducing Jira for ticketing and Confluence for documentation, streamlining internal processes and improving collaboration.
- Transformation of Services: Transitioning most Docker and k3s setups to rootless equivalents such as rootless Podman and rootless k3s. Some workloads were moved to the OpenShift (OCP) clusters. By the end of the project, no insecure Docker setups remained.
In addition to system maintenance, the team was responsible for interfacing with external partners. This involved secure file exchange (authenticated, encrypted in transit and at rest, and integrity-protected) as well as managing APIs. Technologies used in this area included Google Cloud Storage (GCS) and SFTP.
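The integrity and authentication requirements above can be illustrated with a minimal sketch. It assumes a hypothetical shared key agreed with the partner out of band and uses HMAC-SHA256 from the Python standard library; the actual mechanism used in the project is not described here, and transport encryption (for example via SFTP) would sit on top of this.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag that binds the payload to a shared key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check a received payload against the sender's tag in constant time."""
    return hmac.compare_digest(sign(data, key), expected_tag)


if __name__ == "__main__":
    # Hypothetical key and payload, for illustration only.
    shared_key = b"example-key-not-for-production"
    payload = b"contents of an exchanged file"

    tag = sign(payload, shared_key)                  # sender side
    print(verify(payload, shared_key, tag))          # receiver, untouched file
    print(verify(payload + b"x", shared_key, tag))   # receiver, tampered file
```

The sender would ship the tag alongside the file, and the receiver would reject any file whose tag does not verify.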
As the infrastructure evolved, monitoring, alerting, and security scanning became critical components. Dedicated vulnerability scanning with StackRox and Anchore/Grype ensured that all bare-metal systems, VMs, and containers were accounted for and scanned. Given the diverse setup, this required custom solutions to meet all requirements.
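The "accounted for" part of this requirement is essentially a coverage check: compare the asset inventory with the set of assets that a scanner has reported on. The following is a sketch of that idea only. The `Asset` type, the asset names, and the way scan results are collected are all assumptions for illustration, not the custom tooling built in the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str  # "bare-metal", "vm", or "container"


def unscanned(inventory: set, scanned_names: set) -> list:
    """Return inventory assets with no scan result, sorted by kind and name."""
    missing = (a for a in inventory if a.name not in scanned_names)
    return sorted(missing, key=lambda a: (a.kind, a.name))


if __name__ == "__main__":
    # Hypothetical inventory and scan results.
    inventory = {
        Asset("db01", "bare-metal"),
        Asset("app01", "vm"),
        Asset("registry-ui", "container"),
    }
    scanned = {"db01", "registry-ui"}

    for asset in unscanned(inventory, scanned):
        print(f"not scanned: {asset.kind} {asset.name}")
```

A non-empty result would indicate a gap in scanner coverage that needs follow-up.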
For continuous compliance, OpenSCAP was used for ongoing compliance testing and Puppet for continuous configuration management. Together they maintained a high level of security and compliance across the entire infrastructure.
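To make ongoing compliance testing actionable, scan results need to be summarized per system. As a sketch, assuming results are available in XCCDF 1.2 format (which OpenSCAP can write, for example via `oscap xccdf eval --results`), the pass/fail counts and failing rules can be extracted as follows. The embedded sample document and its rule IDs are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# XCCDF 1.2 namespace used by rule-result elements.
XCCDF_NS = {"x": "http://checklists.nist.gov/xccdf/1.2"}

# Hypothetical, heavily shortened results document.
SAMPLE = """<Benchmark xmlns="http://checklists.nist.gov/xccdf/1.2">
  <TestResult id="xccdf_org.example_testresult_demo">
    <rule-result idref="xccdf_org.example_rule_selinux_enforcing">
      <result>pass</result>
    </rule-result>
    <rule-result idref="xccdf_org.example_rule_fips_enabled">
      <result>fail</result>
    </rule-result>
    <rule-result idref="xccdf_org.example_rule_sshd_no_root_login">
      <result>pass</result>
    </rule-result>
  </TestResult>
</Benchmark>
"""


def rule_results(xml_text: str) -> list:
    """Return (rule id, result) pairs for every rule-result element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for rr in root.iterfind(".//x:rule-result", XCCDF_NS):
        result = rr.findtext("x:result", default="unknown", namespaces=XCCDF_NS)
        pairs.append((rr.get("idref", ""), result.strip()))
    return pairs


def summarize(xml_text: str) -> Counter:
    """Count how many rules ended in each result state."""
    return Counter(result for _, result in rule_results(xml_text))


def failed_rules(xml_text: str) -> list:
    """List the IDs of rules that failed."""
    return [rule for rule, result in rule_results(xml_text) if result == "fail"]


if __name__ == "__main__":
    print(dict(summarize(SAMPLE)))
    print(failed_rules(SAMPLE))
```

A summary like this could feed alerting or a ticket, while configuration management brings the system back to the baseline.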
Challenges:
- Consolidating disparate systems with varying configurations and RHEL versions.
- Automating provisioning and maintenance processes.
- Ensuring high security and compliance standards for critical systems.
- Adapting to a diverse setup involving different container platforms and virtualization technologies.
- Managing secure file exchange and interfacing with external partners.
Solutions and Innovations:
- Standardizing provisioning and maintenance through automation with Puppet and Ansible.
- Implementing robust security measures with SELinux, FIPS, and continuous compliance testing.
- Crafting custom solutions for comprehensive vulnerability scanning across all platforms.
- Introducing operational models to provide flexible infrastructure options for different departments.
- Utilizing Jira and Confluence to enhance internal processes and collaboration.
- Ensuring secure file exchange and interfacing with external partners using GCS and SFTP.
Outcomes and Benefits:
- Consolidated and standardized infrastructure, improving operational efficiency and reliability.
- Enhanced security posture, meeting multiple compliance standards.
- Streamlined provisioning and maintenance processes through automation.
- Flexible operational models, allowing departments to choose IaaS, PaaS, or SaaS as needed.
- Improved internal processes and collaboration with the introduction of Jira and Confluence.
- Secure file exchange and effective interfacing with external partners.
Lessons Learned:
- The importance of standardizing processes to achieve consistency and efficiency.
- The need for continuous security and compliance measures in managing critical systems.
- The value of leveraging automation tools to streamline operations.
- The benefit of flexible operational models to meet varying departmental needs.
- The critical role of secure interfacing with external partners in maintaining data integrity and security.
Conclusion: The project to consolidate and secure critical systems for a leading Swiss bank was a significant success. Through meticulous planning, innovative solutions, and continuous improvement, the team transformed the bank’s infrastructure into a standardized, secure, and efficient environment. The introduction of new operational models, internal tools, and secure interfacing with external partners further enhanced the bank’s capabilities, setting a benchmark for managing and securing critical IT infrastructures.