SRE

Posted 4 Days Ago
28 Locations
Remote
Senior level
Software • Consulting
The Role
The Site Reliability Engineer (SRE) focuses on integrating software engineering with infrastructure operations, ensuring software reliability and efficiency. Responsibilities include application monitoring, incident response, change management, system reliability, and collaborating with development teams to streamline processes and enhance scalability and performance.
Summary Generated by Built In

Important Information

  • Experience: More than 4 years
  • Job Mode: Full-time
  • Work Mode: Hybrid

Job Summary

  • Site Reliability Engineering (SRE) is a discipline that blends software engineering with infrastructure and operations, aimed at building scalable and highly reliable software systems.
  • Focus on application monitoring, emergency response, and change management to ensure reliability and efficiency.
  • Collaborate with development teams throughout the software lifecycle to solve system-related issues and automate routine tasks.
  • Enhance system reliability, scalability, and performance by leveraging modern tools and processes.

Responsibilities and Duties

  • Application Monitoring: Utilize tools and automation for continuous application monitoring and reliability.
  • Emergency Response: Respond promptly to emergency incidents, perform root cause analysis, and resolve ongoing production issues.
  • Change Management: Manage and streamline release and change management processes to improve system performance.
  • Collaboration: Partner with development teams to solve system issues, automate routine tasks, and eliminate toil.
  • Reliability and Scalability: Ensure systems are highly reliable, scalable, and efficient to meet performance standards.

Qualifications and Skills

  • Strong understanding of monitoring tools such as Azure Monitoring, App Insights, Prometheus, and Grafana.
  • Experience with Infrastructure as Code tools like Terraform, ARM/Bicep, or Pulumi.
  • Proficiency in release management tooling such as ArgoCD, Harness, and Octopus.
  • Familiarity with incident alert tools like PagerDuty or Opsgenie.
  • Expertise in container orchestration tools like Kubernetes and AKS.
  • Proficiency in scripting with at least one of C#, Python, Bash, or PowerShell (mandatory).
  • Strong collaboration and problem-solving abilities to resolve system issues effectively.
  • Knowledge of project tracking and version management tools like JIRA, SVN, and GitHub.

Role-specific Requirements

  • Proven experience in application monitoring and automated reliability processes.
  • Strong background in managing system reliability and performing root cause analysis during emergency responses.
  • Hands-on experience in change management processes and production environment releases.
  • Advanced knowledge of tools and practices for infrastructure automation and incident handling.
  • Familiarity with scalable system architecture principles and best practices.

Technologies

  • Monitoring Tools: Azure Monitoring, App Insights, Prometheus, Grafana
  • Infrastructure as Code: Terraform, ARM/Bicep, Pulumi
  • Release Management Tools: ArgoCD, Harness, Octopus
  • Incident Alert Tools: PagerDuty, Opsgenie
  • Container Orchestration: Kubernetes, AKS
  • Project Tracking and Version Management Tools: JIRA, SVN, GitHub
  • Scripting: C#, Python, Bash, or PowerShell

Skillset Competencies

  • Advanced monitoring and incident management techniques.
  • Infrastructure as Code and automation of routine workflows.
  • Expertise in release and change management processes.
  • Strong knowledge of container orchestration and scalable system design.
  • Excellent communication, collaboration, and problem-solving skills.
  • Ability to work effectively in cross-functional and virtual teams.

About Encora

Encora is a trusted partner for digital engineering and modernization, working with some of the world’s leading enterprises and digital-native companies. With over 9,000 experts in 47+ offices worldwide, Encora offers expertise in areas such as Product Engineering, Cloud Services, Data & Analytics, AI & LLM Engineering, and more. At Encora, hiring is based on skills and qualifications, embracing diversity and inclusion regardless of age, gender, nationality, or background.

Top Skills

AKS
App Insights
ArgoCD
ARM/Bicep
Azure Monitoring
Bash
C#
Git
Grafana
Harness
JIRA
Kubernetes
Octopus
Opsgenie
PagerDuty
PowerShell
Prometheus
Pulumi
Python
SVN
Terraform

The Company
Chennai
7,456 Employees
Hybrid Workplace
Year Founded: 1980

What We Do

Headquartered in Santa Clara, California, and backed by renowned private equity firms Advent International and Warburg Pincus, Encora is the preferred technology modernization and innovation partner to some of the world’s leading enterprise companies. It provides award-winning digital engineering services including Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering. Encora's deep cluster vertical capabilities extend across diverse industries, including HiTech, Healthcare & Life Sciences, Retail & CPG, Energy & Utilities, Banking Financial Services & Insurance, Travel, Hospitality & Logistics, Telecom & Media, Automotive, and other specialized industries.
With over 9,000 associates in 47+ offices and delivery centers across the U.S., Canada, Latin America, Europe, India, and Southeast Asia, Encora delivers nearshore agility to clients anywhere in the world, coupled with expertise at scale in India. Encora’s Cloud-first, Data-first, AI-first approach enables clients to create differentiated enterprise value through technology.

