Lead Site Reliability Engineer

Posted 3 Days Ago
Bengaluru, Bengaluru Urban, Karnataka
Senior level
Security • Software • Cybersecurity
The Role
In this role, you will lead and enhance the reliability and performance of production environments, manage incidents, and drive collaboration across technical teams.
Summary Generated by Built In

Role Overview:

As a Site Reliability Engineer (SRE) Technical Lead, you will be instrumental in overseeing the reliability, availability, and performance of our production environments at an advanced level. You will lead initiatives in proactive monitoring and management of incidents, fostering a culture of rapid resolution and minimal service disruption. Your extensive troubleshooting, log data analysis and debugging skills will facilitate close collaboration with DevOps, Engineering, and internal support teams, allowing us to achieve the highest levels of customer satisfaction.
This is a Hybrid position located in Bangalore. You will be required to be onsite on an as-needed basis, typically 1 to 6 times a month. We are only considering candidates within a commutable distance and are not offering relocation assistance at this time.

About the Role:

  • Proficiently utilizing AWS services like EC2, RDS, VPC, and CloudWatch, with expertise in log query analysis and monitoring optimization.
  • Driving APM monitoring solutions through hands-on experience with Prometheus, Grafana, and scripting in PromQL to enhance automation capabilities.
  • Troubleshooting and debugging issues by analyzing CloudWatch logs and providing detailed insights through log metrics and trend analysis.
  • Optimizing service performance by leveraging AWS tools, analyzing cost utilization, and implementing efficient scaling strategies.
  • Managing Kubernetes-based setups, including deployment, configuration changes, and service lifecycle management, while collaborating with DevOps teams.
  • Overseeing seamless code rollouts, maintaining production integrity, and spearheading root cause analysis for persistent issues.
  • Fostering collaboration across global teams to enhance service availability, reliability, and scalability.

About you:

  • Over 8 years of experience in the web and e-commerce domain, with a strong focus on cloud hosting, primarily AWS.
  • Skilled in log analysis and troubleshooting, with expertise in leveraging observability platforms to trace issues thoroughly.
  • Proficient in Prometheus, Grafana, or similar APM tools, with hands-on ability to optimize and enhance monitoring capabilities.
  • Passionate about digging deep into technical issues and providing actionable insights for resolution.
  • Adept at driving troubleshooting calls and ensuring end-to-end traceability for complex problems.
  • Strong knowledge of cloud environments and monitoring frameworks to support robust and scalable solutions.



Company Overview

McAfee is a leader in personal security for consumers. Focused on protecting people, not just devices, McAfee consumer solutions adapt to users’ needs in an always-online world, empowering them to live securely through integrated, intuitive solutions that protect their families and communities with the right security at the right moment.

Company Benefits and Perks:

We work hard to embrace diversity and inclusion and encourage everyone at McAfee to bring their authentic selves to work every day. We offer a variety of social programs, flexible work hours and family-friendly benefits to all of our employees.

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement

We're serious about our commitment to diversity, which is why McAfee prohibits discrimination based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation, or any other legally protected status.

Top Skills

AWS
CloudWatch
EC2
Grafana
Kubernetes
Prometheus
RDS
VPC

The Company
HQ: Santa Clara, CA
7,996 Employees
On-site Workplace

What We Do

McAfee is a global organization with a 30-year history and a brand known the world over for innovation, collaboration and trust. McAfee’s historical accomplishments are founded upon decades of threat and vulnerability research, product innovation, practical application and a brand which individuals, organizations and governments have come to trust.
