Job Overview
We are seeking a hands-on Site Reliability Engineer (SRE) with strong observability experience to join our team onsite in Fort Mill, SC. The ideal candidate will have robust knowledge of SRE principles, advanced experience with observability tools, and a proven track record in production support and automation. You should be capable of building dashboards from scratch, driving process improvements, and collaborating closely with development teams.
Responsibilities
Build and maintain dashboards using tools such as Grafana, Dynatrace, and ELK to ensure deep visibility into production environments.
Design, implement, and maintain SRE practices, including Error Budgets, SLOs (Service Level Objectives), SLIs (Service Level Indicators), and NFRs (Non-Functional Requirements) to support business reliability objectives.
Lead root cause investigations for incidents and proactively identify and address system anomalies.
Reduce toil by automating repetitive tasks and streamlining processes across the SDLC and IT operations.
Develop, implement, and enhance CI/CD pipelines using Git, GitHub Actions, GitHub Workflows, Jenkins, and similar tools.
Work closely with software engineers to ensure successful releases by improving application design, deployment, and monitoring workflows.
Assess, define, and roll out SRE approaches and solutions for various products while leading the development of SRE dashboards.
Design, develop, and deliver infrastructure automation leveraging Ansible Tower, Terraform, and other Infrastructure-as-Code (IaC) technologies.
Maintain, troubleshoot, and optimize cloud infrastructure, applying strong hands-on experience with AWS and container orchestration.
Leverage observability and monitoring platforms (Dynatrace, Splunk, Elastic Stack, SolarWinds DPA) for real-time alerting, monitoring, and issue resolution.
Mentor development teams on SRE best practices and methodologies, and drive continuous improvement initiatives focused on reliability and cost optimization.
Qualifications
Demonstrable hands-on experience with observability tools (Grafana, Dynatrace, ELK/Elastic Stack) and with scripting.
Deep understanding and application of core SRE principles: CUJs (Critical User Journeys), SLOs, SLIs, error budgets, and NFRs.
Experience building (not just consuming) dashboards in observability platforms.
Proficiency in .NET, SQL, React, Python, and other scripting or programming languages, plus tools such as Ansible Tower, Terraform, Splunk, and SolarWinds DPA.
Cloud platform expertise (AWS strongly preferred).
Proven experience with CI/CD practices using Git, GitHub Actions, GitHub Workflows, and Jenkins.
Strong knowledge of Infrastructure as Code (IaC) and container orchestration (e.g., Kubernetes).
Production support and root cause analysis experience in high-availability environments.
Strong communication skills, proven ability to collaborate across teams, and a problem-solving mindset.
Familiarity with AIOps principles and automation best practices.
Experience with automation design and implementation to reduce manual effort within SDLC and IT operations.
Leadership in building SRE dashboards and developing error budget frameworks for products.
Experience in driving incident management and on-call processes.