Company Overview:

EdgeCo Holdings comprises several affiliated companies focused on providing a broad array of sophisticated financial products, technology, and support services in the areas of full-service retirement plan administration, brokerage services, and trust & custody solutions. EdgeCo provides these services through numerous subsidiary entities, including American Trust, Mid Atlantic Trust Company, NewEdge Capital Group, and PensionPro Software.

 

Job Summary:

American Trust seeks a motivated Sr. Staff Site Reliability Engineer (SRE) who thrives in a rapidly evolving technology environment. As part of the SRE Team, you will foster a culture of SRE for horizontal activities and DevOps for products and tools across our operations teams. Your team will have diverse expertise in systems, networking, and software engineering to provide the stability, performance, and reliability our customers need. We work with multiple service and development teams to identify cross-team issues that create risk for operations across the organization, and we resolve those issues with a mixture of engineering, troubleshooting expertise, and general operational guidance.

 

Responsibilities:

Solve complex problems related to infrastructure services and build automation to prevent problem recurrence.
Ensure that the underlying infrastructure runs smoothly, and that systems and tools work as expected.
Monitor critical applications and services to minimize downtime and ensure their availability.
Design, write, and deploy software to improve the availability, scalability, and efficiency of products and services used by American Trust.
Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and technology areas.
Partner with development teams in defining and implementing improvements in service architecture.
Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
Utilize a deep understanding of service topology and the dependencies required to troubleshoot issues and define mitigations.

 

Required Skills:

M.S. or B.S. in Computer Science, Computer Engineering, Software Engineering, or a related area is preferred.
4+ years of experience as a Site Reliability Engineer troubleshooting compute, network, storage, and database systems to improve capacity, reliability, scalability, and availability.
Experience with Linux, Windows, and patch automation using Python and PowerShell.
System administration, including Linux and Windows internals, TCP/IP, DNS, and load-balancing technologies.
Experience with Security services and devices, including Firewalls, IDS/IDP, and SIEM.
Cloud network experience.
Experience in Oracle Licensing & HW configuration.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Independent and self-motivated worker.
Creative problem solver, eager to learn and take on new challenges, with the ability to “figure things out” when faced with modern technologies.
Excellent organizational, verbal, and written communication skills.

 

Preferred Skills:

Experience in a 24×7 high-availability production environment
Experience with VDI and Remote Administration Services
Developing/operating large-scale distributed services/applications.
Infrastructure automation through Terraform, Chef, Ansible, Puppet, Packer, or similar
Knowledge of cloud computing technologies, network monitoring
Experience with Cloud Orchestration frameworks, development, and SRE support of these systems
Experience with CI/CD pipelines including VCS (git, svn, etc.), GitLab Runners, Jenkins, Rundeck
Oracle Database knowledge in ATP, ADW, and programming in SQL, PL/SQL
Experience in developing scripts to automate software deployments and installations
Demonstrate a clear understanding of automation and orchestration principles.

 

Benefits:

Compensation consists of a base salary and the opportunity to qualify for a quarterly performance-based bonus program. The EdgeCo Holdings benefits package includes health, dental, vision, short-term disability, long-term disability, life insurance, PTO, and 401(k) match (after applicable waiting periods).