Engineering Manager - Site Reliability

Posted 2 Days Ago
Hyderabad, Telangana
Hybrid
Expert/Leader
Artificial Intelligence • Cloud • Enterprise Web • Software • Business Intelligence
Delight made easy. We make it fast and easy for businesses to delight their customers and employees.
The Role
The Engineering Manager will lead the SRE and Cost Engineering team and own its goals for reliability, scalability, performance, and cost. Responsibilities include overseeing incident response, fostering a culture of continuous improvement, assessing architecture gaps, managing cloud cost, and guiding cross-functional teams toward operational excellence.

Company Description

Freshworks makes it fast and easy for businesses to delight their customers and employees. We do this by taking a fresh approach to building and delivering software that is affordable, quick to implement, and designed for the end user. Headquartered in San Mateo, California, Freshworks has a team operating from 13 global locations to serve more than 65,000 companies, from startups to public companies, that rely on Freshworks software-as-a-service to enable a better customer experience (CRM, CX) and employee experience (ITSM).

Freshworks’ cloud-based software suite includes Freshdesk (omni-channel customer support), Freshsales (sales automation), Freshmarketer (marketing automation), Freshservice (IT service desk), and Freshchat (AI-powered bots), all supported by Neo, our underlying platform of shared services.

Freshworks has been featured in global and national press, including CNBC, Forbes, Fortune, and Bloomberg, and has been a BuiltIn Best Place to Work in San Francisco and Denver for the last 3 years. Our customer ratings have earned Freshworks products TrustRadius Top Rated Software ratings and G2 Best of Awards for Best Feature Set, Best Value for the Price, and Best Relationship.


Job Description

What will you be responsible for:

You will lead the SRE and Cost Engineering team. This team is the custodian of four pillars: reliability, scalability, performance, and cost. You will be responsible for meeting the reliability OKRs and for driving the projects and initiatives that align with these well-architected pillars.

What would your work week look like:

  • Define team goals, objectives, and KPIs to measure progress on OKRs and operational excellence.

  • Lead a high-performing team of SREs

  • Oversee incident response, triaging, and resolution to minimize Mean Time to Recovery (MTTR).

  • Lead post-incident reviews and ensure long-term fixes are implemented

  • Foster a blameless culture of continuous improvement within the team, encouraging open communication and learning from failures.

  • Assess architecture and design gaps in the products, and address them by building tools and solutions that serve the entire product and platform engineering organization.

  • Keep up with emerging technologies and best practices in SRE and DevOps.

Who are we looking for:

  • 10-12+ years of experience in SRE, covering the performance, architecture, and design of applications

  • Strong understanding of cloud computing, networking, Linux systems administration, containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible)

  • Understanding of SRE principles, including SLOs, SLIs, SLAs, and error budgets.

  • Experience managing incidents and retrospectives

  • Experience in cloud cost management and cloud architecture

  • In-depth knowledge of cloud computing platforms (e.g., AWS)

  • Experience with infrastructure as code (IaC) tools and practices

  • Experience with monitoring, logging, and telemetry tools such as New Relic, Splunk, ELK, Nagios, SolarWinds, Prometheus, AWS CloudWatch, Datadog, and OpenTelemetry

  • Expertise in designing, creating, and supporting automation, and in identifying opportunities for self-healing systems, automated deployments, and other scalable solutions.

  • Experience in performance engineering, including identifying opportunities for performance tuning and profiling

  • Experience in prioritizing and managing technical roadmaps.

  • Strong skills in stakeholder communication, requirements gathering, and documentation.

  • Ability to lead cross-functional teams and build consensus around reliability goals

  • Ability to improve operational processes and team practices

  • Ability to provide technical and people leadership to the Site Reliability Engineering teams by facilitating one-on-one, team, and performance review meetings.

  • Problem-solving: Ability to analyze complex systems, troubleshoot issues, and devise effective solutions 

  • Excellent leadership and communication skills, with the ability to inspire and motivate cross-functional teams.

  • Experience in dealing with the intricacies of large-scale distributed systems and ensuring their reliability and performance.

Additional Information

At Freshworks, we are creating a global workplace that enables everyone to find their true potential, purpose, and passion irrespective of their background, gender, race, sexual orientation, religion and ethnicity. We are committed to providing equal opportunity for all and believe that diversity in the workplace creates a more vibrant, richer work environment that advances the goals of our employees, communities and the business.

Top Skills

Ansible
AWS
Docker
Kubernetes
Terraform

The Company
HQ: San Mateo, CA
5,500 Employees
Hybrid Workplace
Year Founded: 2010

What We Do

Freshworks makes it fast and easy for businesses to delight their customers and employees.

We do this by taking a fresh approach to building and delivering software that is affordable, quick to implement, and designed for the end-user.

Headquartered in San Mateo, California, Freshworks has a dedicated team operating from 13 global locations to serve customers throughout the world.

Why Work With Us

Our fresh approach to business software has enabled over 50,000 companies big and small across the globe to exceed customer and employee expectations. We deliver on the unfulfilled promise of easy-to-use SaaS software, and help our customers drive clear business results.



