SRE Manager

Posted 10 Days Ago
Kondapur, Sangareddy, Telangana
Senior level
News + Entertainment
The Role
As an SRE Manager, you will ensure the reliability, performance, and scalability of services and infrastructure while leading SRE teams, managing incidents, and promoting automation and problem management strategies.

Company Description

Ivy is a global, cutting-edge software and support services provider, partnering with one of the world’s biggest online gaming and entertainment groups. Founded in 2001, we’ve grown from a small tech company in Hyderabad to one creating innovative software solutions used by millions of consumers around the world, processing billions of transactions at a volume that rivals even some of the biggest technology giants. Focused on quality at scale, we deliver excellence to our customers day in and day out, with everyone working together to make what sometimes feels impossible, possible.

This means that not only do you get to work for a dynamic organization delivering pioneering technology, gaming and business solutions, you can also have an exciting and entertaining career. At Ivy, Bright Minds Shine Brighter. 

Job Description

As an SRE Manager, you will focus on ensuring the reliability, performance, and scalability of services and infrastructure.

Reporting to the Head of Engineering, you will be part of the Product & Technology team and will actively participate in all aspects of Site Reliability Engineering, including technical vision, telemetry and observation decisions, automation strategy, solution delivery, and platform incident and problem management. This is a leadership role with both technical and people leadership responsibilities. As such, the role participates in short- and long-term systems planning, as well as team and organizational planning. This position reports directly to the Director of Engineering.

What you will do

  • Provide technical and people leadership to the SRE teams by facilitating one-on-one, team, and performance review meetings.
  • Fulfil the role of Escalation Manager/Critical Incident Manager on critical/major incidents by facilitating quick and effective incident resolution to minimize player and business impact.
  • Conduct RCA and Post-Incident Reviews (PIRs) in a blameless manner to identify root causes and prevent recurrence.
  • Build advanced Incident Management and Problem Management support (SOPs and run-books) to effectively identify, remediate, and resolve issues related to platform reliability, stability, and performance through careful analysis of telemetry data and system logs.
  • Continuously work to improve problem identification and service restoration of platforms by leading and overseeing efforts to define, enhance, and deliver automated alerting and response systems with intelligent, self-healing capabilities.
  • Collaborate with platform engineers through implementation decisions to achieve highly reliable infrastructure, systems, and integrations (develop synthetic monitoring, health dashboards, reliable alerts and system performance).
  • Promote automation (CI/CD) and infrastructure-as-code (IaC) practices, and develop tools and processes for seamless deployments, rollbacks, monitoring and troubleshooting.
  • Define and ensure proper reviews are built to minimise the Mean Time to Recover/Discover (MTTR/MTTD) and maximise the Mean Time to Failure (MTTF).
  • Work with development teams to set error budgets, SLIs/SLOs and policies. Work with SRE to implement alerts and policies to minimize the impact failures and outages have on players.

Qualifications

  • Graduate or Post-Graduate with a strong engineering background.
  • 10+ years of experience working in global organizations with the ability to effectively communicate with executives, leaders and individual contributors across the organization.
  • 5+ years of SRE experience working with telemetry, observation, self-healing solutions, and platform automation.
  • Proficient in analysing complex technical issues, identifying root causes, and implementing effective solutions under pressure.
  • Experience with monitoring, logging & telemetry tools like New Relic, Splunk, ELK, Nagios, Prometheus, AWS CloudWatch, Datadog, etc.
  • Experience in Disaster Recovery and Chaos Engineering with tools like Chaos Mesh and Chaos Monkey, including periodic testing of resiliency and failovers.
  • Hands-on experience in the monitoring of Exposure with automation and tools such as (but not limited to) GitLab CI, Jenkins, Terraform, Ansible, etc.
  • Expert in designing, creating and supporting Automation (PowerShell, Python, Ruby, AWK, SED, etc.) to run health-checks and self-healing capabilities for the platforms.
  • Experience with Networking, Content Delivery Networks (CDNs, e.g. Akamai, Cloudflare), streaming platform technologies like Apache Kafka, and databases (Oracle, MS SQL, etc.).
  • Experience with Cloud platforms, esp. Amazon Web Services (AWS).
  • Experience with Application Security: safeguarding applications through access control, authentication and authorization (AuthN & AuthZ), data encryption, and secure communication using TLS/SSL and mTLS.
  • Collaboration & Change Management tools: Jira, ServiceNow, SharePoint, etc.
  • Experience in managing relationships with third-party vendors and service providers contributing to the business.

Additional Information

What we offer

At Ivy, we know that signing top players requires a great starting package, and plenty of support to inspire peak performance. Join us, and a competitive salary is just the beginning. Working for us in Hyderabad, you can expect to receive great benefits like:

  • Safe home pickup and home drop
  • Group Mediclaim policy
  • Group Critical Illness policy
  • Communication & Relocation allowance
  • Annual Health check

And outside of this, you’ll have the chance to turn recognition from leaders and colleagues into amazing prizes. Join a winning team of talented people and be a part of an inclusive and supportive community where everyone is celebrated for being themselves.

Should you need any adjustments or accommodations to the recruitment process, at either application or interview, please contact us.

At Ivy, we do what’s right. It’s one of our core values and that’s why we're taking the lead when it comes to creating a diverse, equitable and inclusive future - for our people, and the wider global sports betting and gaming sector. However you identify, across any protected characteristic, our ambition is to ensure our people across the globe feel valued and respected, with their individuality celebrated.

Top Skills

Powershell
Python
Ruby
The Company
HQ: New Jersey, NJ
13,672 Employees
On-site Workplace

What We Do

Welcome to Entain.

Our journey as Entain began when we evolved from GVC Holdings on 9th December 2020, but our brands have been paving the way and making history since the 1880s.

Today, we’re one of the world’s largest sports betting and gaming entertainment groups – a FTSE 100 company that is home to more than 25 widely recognised brands, such as bwin, Coral, Foxy, Gala, Ladbrokes and partypoker.

But that’s just the beginning. We’re constantly broadening our horizons and expanding our global influence. For example, our partnership with MGM Resorts International has allowed us to make waves in the US by powering BetMGM with our bespoke and top-of-the-line technology.

It’s with this unique technology that we’re revolutionising our industry, and we’re boldly working towards being THE world leader in sports betting, gaming and interactive entertainment. Really though, it’s the people that truly make us who we are. There’s over 24,000 of us around the world and counting, but we all play for the same team.

We’re proud to promote a culture that shatters barriers to unite, and encourages uncompromised diversity of background, thought and experience. When we win, we win together.

If you share our values and want to be part of the revolution, we want you on our team. With offices across 19 different countries, we have an excellent history of identifying and nurturing the finest talent on a global scale. We’re all about putting our customers at the heart of the action and, with us, you can help bring moments of excitement into people’s lives.

At Entain, it’s your game. We’re ready to play – are you?
