Infrastructure Engineering-focused Site Reliability Engineers (SREs) are responsible for automating the deployment, configuration, and monitoring of our corporate infrastructure, including cloud-provisioned infrastructure as well as the third-party and internally developed services running on it. We rely most heavily on AWS, but as our customer base grows and diversifies, our Azure and GCP footprints are rapidly expanding.
As a Cloud Infrastructure SRE, you’ll be deeply involved in developing and automating operational controls for our entire corporate cloud infrastructure, DevOps pipeline tools, and operational security systems, relying heavily on Infrastructure as Code (IaC) best practices for defining secure and resilient systems and processes. Evangelizing operational efficiencies, cost savings, and DevOps practices across various parts of the system will make you an effective member of the SRE team.
We’re looking for candidates who love to learn and are able to adapt quickly. If you are passionate about IaC, cybersecurity, and problem solving in a collaborative and modern technical environment, then this role is a great fit for you.
- Develop and manage IaC definitions and templates for secure and resilient cloud and tool management
- Develop operational controls for cloud infrastructure
- Develop operational controls and automations for security infrastructure (e.g. IAM, PKI)
- Administer and maintain development tools at the user, service, and infrastructure level
- Unix shell scripting
- Unix administration/troubleshooting
- IaC Orchestration (Terraform [preferred], Ansible, Chef, Puppet, SaltStack/Heat, CloudFormation)
- Cloud Orchestration (AWS preferred)
- Monitoring tools (Prometheus, Grafana, ELK, CloudWatch)
- Identity and Access Management (Okta, Kerberos, IdP)
- PKI (CSR, CA, Vault)
- Build/packaging tech (Glide, Gradle, Maven)
- CI coding (Jenkinsfile, .gitlab-ci.yml, .travis.yml)
- Any OO or functional programming language