Site Reliability Engineering Manager

Posted 2 Days Ago
Hiring Remotely in Bengaluru, Karnataka
Remote
Senior level
Cloud • Marketing Tech • Professional Services • Social Impact • Software
At Granicus, our mission is to help bring government and citizens closer together.
The Role
The Site Reliability Engineering Manager will lead a team to ensure the reliability and scalability of services, manage production support, automate processes, and improve systems while collaborating with software engineers and adhering to security best practices.
Summary Generated by Built In

Summary Description

Granicus is seeking an experienced and highly skilled Senior Site Reliability Engineering (SRE) Manager to join our SRE team. As a Manager, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts to build and maintain a robust infrastructure, automate processes, and guide the team in implementing site reliability best practices.

Essential Functions

• On-call Production Support: Manage a team of engineers that provides production support in shifts, according to the team's on-call roster.

• Work on tickets raised by customers and by internal engineering/implementation teams when not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface.

• Work on the SRE team's backlog items.

• Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.

• Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention.

• Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.

• System Improvements: Participate in the design and implementation of system improvements to enhance reliability, scalability, and performance.

• Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.

• Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team.

• Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.

• Security: Implement and adhere to security best practices to protect our systems and data.

Knowledge/Skills/Abilities

  • Technical Skills: Good understanding of Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud). Experience with scripting languages such as Python, Bash, or Ruby.
  • Education: Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
  • Experience: 5+ years of experience in site reliability engineering, system administration, or a similar role, with a proven track record of managing large-scale, high-availability systems. 5+ years of experience as a people manager.
  • Technical Skills: Expertise in Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud). Proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++).
  • Tools and Technologies: Advanced knowledge of monitoring and logging tools (Prometheus, Grafana, Splunk), configuration management (Ansible, Chef, Puppet), and CI/CD pipelines.
  • Problem-Solving: Strong analytical and problem-solving skills with the ability to diagnose and resolve complex issues efficiently.
  • Communication: Excellent verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders.
  • Leadership: Demonstrated ability to lead and mentor a team, drive projects to completion, and manage cross-functional initiatives.
  • Experience/Credentials:
  • 8+ years of experience in an SRE, DevOps, or Software Engineering role, and a minimum of 5 years as a people manager.
  • Certifications: Relevant certifications such as AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, or similar.
  • Knowledge: In-depth understanding of containerisation (Docker, Kubernetes) and infrastructure as code (Terraform, CloudFormation).
  • Experience: Experience with database management (SQL, NoSQL), load balancing, and distributed systems.

Other Job Info

  • These statements are intended to describe the general nature and level of work being performed by employees assigned to this job. This is not intended to be an exhaustive list of all responsibilities, duties, and skills required of employees assigned to this job.
  • This role is typically performed on a computer using Zoom or Teams. Individuals will be on camera throughout the day engaging with other employees. The role is typically performed indoors within a home office environment. This role is typically performed while sitting or standing at a desk. The individual will occasionally lift light objects.

  • Academic Qualifications and Certifications:
  • Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.

  • Shift Time
  • The position requires flexibility in working hours to cover for any overlap and attend team meetings as needed.
  • Shift Time: 24/7 on-call, including weekends (typically one week every month).

Security Requirement:

  • Responsible for Granicus information security by appropriately preserving the Confidentiality, Integrity, and Availability (CIA) of Granicus information assets in accordance with the company's information security program.

Top Skills

Bash
C++
Go
Java
Python
Ruby

The Company
HQ: Denver, CO
1,500 Employees
On-site Workplace
Year Founded: 1999

What We Do

Granicus provides technology and services that empower government organizations to create seamless digital experiences for the people they serve. By offering the industry’s leading cloud-based solutions for communications, content management, meeting and agenda management, and digital services to more than 5,500 public sector organizations, Granicus helps turn government missions into quantifiable realities.

Why Work With Us

As a company, Granicus helps empower some of the most creative people in the world who innovate within complex public sector organizations. We help make policies more effective and transform the citizen experience so that everything from road closures to fostering programs is better communicated, understood, and ultimately successful.
