Sr. Manager of SRE Operations

Posted 6 Days Ago
San Jose, CA
Hybrid
175K-200K Annually
7+ Years Experience
Cloud • Information Technology • Mobile • Security
The Role
As the Sr. Manager of SRE Operations, you will ensure the reliability of our SaaS platform, manage the SRE team, oversee incident resolution, and implement disaster recovery processes while maintaining customer satisfaction. You will lead improvements in platform operations, support on-call responsibilities, and report performance metrics to executive management.
Summary Generated by Built In

ZEDEDA is a simple and scalable cloud-based IoT edge orchestration solution that delivers visibility, control and security for the distributed edge, with the freedom of deploying and managing any app on any hardware at scale and connecting to any cloud or on-premises system. With ZEDEDA, customers can seamlessly manage and deploy any compute node to instantly unlock the value of IoT data, make real-time decisions, maximize operational efficiency and drive new business outcomes. We are looking for an experienced Senior Site Reliability Engineer (SRE) who is seeking new challenges and wants to make their mark by contributing to the design and upkeep of an exciting start-up.


Reporting to the VP of Engineering, the Sr. Manager of SRE Operations is responsible for ensuring the availability of our SaaS platform and exceeding the uptime and performance requirements of our Fortune 500 customers. Together with the SRE Operations team, you will implement processes and procedures that ensure quality and predictability in disaster recovery, performance monitoring, alerting, and reporting. ZEDEDA is ISO 27001 and SOC 2 certified, which means that incidents need to be handled according to those standards. As the lead of the team, you will play a key role in ensuring the team performs beyond expectations, and you will assist in growing the team. On-call responsibility is part of the role, as is implementing a strategy that supports 24 x 7 x 365 availability of the SRE Operations team. You will also be the initial escalation point for incidents and are responsible for ensuring they get resolved, involving other teams if needed.


You will work with the SRE Technical Lead and team, as well as other groups in engineering, to suggest and implement improvements for operating the platform. Regular reporting on the performance of the platform to upper management is expected. This is a hands-on role, and you will perform your duties as part of the SRE Operations team. You will interface with the Customer Experience organization and, when required, meet with our customers. You are an energetic self-starter, fully committed to our customers’ success by putting yourself in our customers’ shoes and constantly striving to make sure they can use our product at all times by:


- Creating ecstatic customers

- Ensuring frictionless deployments

- Managing escalations

- Covering on-call duties

- Radiating energy and enthusiasm

- Being a (technical) leader to the team




Qualifications

  • MS in Computer Science, Information Technology, or equivalent experience
  • 10+ years of experience in SRE, with 5+ years of experience in an SRE operations lead role
  • Leadership qualities and aspirations 
  • Project and escalation management skills
  • Proven technical writing skills
  • Excellent verbal and written communication skills (English)

Requirements

  • An infrastructure with global presence in USA, EMEA, China and GovCloud
  • A large, complex infrastructure with 20+ SaaS instances, 500+ VMs, 100+ databases, 10+ logging services
  • Meeting SLOs and creating robust and insightful metrics for large infrastructures and multiple SaaS instances
  • Capacity planning of a complex solution with 50k+ connected devices
  • Continuously driving cost down to maintain a competitive advantage
  • Managing a successful 24x7x365 on-call team and being the point of escalation
  • Implementing a structured incident management approach, from the start of an incident through resolution to root cause analysis
  • Industry standards compliance: ISO 27001, SOC 2
  • Strong leadership skills with the ability to coach and hire A-players, and to foster a culture of continuous improvement and automation
  • Putting security at the center of everything you do.
  • Hands-on knowledge of:
    • AWS, Azure or GCP
    • Terraform, Ansible
    • Python, shell scripting
    • (Managed) Kubernetes, ArgoCD
    • GitOps, Jenkins, GitHub Actions
    • Datadog, Grafana Stack and OpenTelemetry
    • PostgreSQL, Redis, HashiCorp Vault, InfluxDB and OpenSearch
    • Lacework, Blameless, Vanta

Pay & Benefits

ZEDEDA’s main compensation philosophy is to provide you with the opportunity to progress as you grow and develop with the company. The base pay range for this role, dependent on your skills, qualifications, experience and location, is between $175,000 and $225,000. Total compensation will also include commission, equity and benefits components.

Top Skills

SRE
The Company
HQ: San Jose, CA
72 Employees
On-site Workplace
Year Founded: 2016

What We Do

ZEDEDA, the leader in edge orchestration, delivers visibility, control and security for the distributed edge, with the freedom of deploying and managing any app on any hardware at scale and connecting to any cloud or on-premises systems. Distributed edge solutions require a diverse mix of technologies and domain expertise and ZEDEDA enables customers with an open, vendor-agnostic orchestration framework that breaks down silos and provides the needed agility and future-proofing as they evolve their connected operations. Customers can now seamlessly orchestrate intelligent applications at the distributed edge to gain access to critical insights, make real-time decisions and maximize operational efficiency.

ZEDEDA is a venture-backed Silicon Valley company, headquartered in San Jose, CA, with teams in India and Europe.
