Sr. Observability Engineer - UI

Posted 19 Days Ago
Pune, Maharashtra
Hybrid
Senior level
Artificial Intelligence • Cloud • Sales • Security • Software • Cybersecurity • Data Privacy
The Role
The Sr. Observability Engineer - UI is responsible for enhancing front-end application reliability through observability solutions, performance optimization, and collaboration with dev teams, ensuring high operational metrics and incident management.
Summary Generated by Built In

Sr. Observability Engineer - UI:

We are seeking a Front-End focused Site Reliability Engineer to enhance the monitoring, visibility, and performance of our front-end applications. This role bridges UI software development with observability best practices, ensuring our enterprise applications deliver seamless, high-performance experiences. You will be responsible for designing and implementing observability and reliability solutions that provide actionable insights into user experience, performance bottlenecks, and system reliability.

Responsibilities:
Observability Solutions: Design and implement observability solutions (logs, metrics, traces, dashboards, alerts) tailored for NodeJS-based services and micro front-end applications. Collaborate with front-end developers to instrument code for better observability and debugging.
Reliability Engineering: Design, develop, and implement solutions to improve the reliability, availability, performance, and scalability of our systems. Work with technical leaders and infrastructure platform services to develop alerts and dashboards.
Operational Excellence: Own and improve key operational metrics (SLIs, SLOs, Error Budgets, monitoring and alerting) for team-related services and drive continuous improvement through post-incident reviews and blameless postmortems of non-functional issues. Develop and maintain comprehensive monitoring and alerting to proactively identify and resolve issues. Conduct ongoing reviews to identify and close gaps. Work with technical leaders and the NOC team to improve operational processes and team practices.
Monitoring and Alerting: Develop and maintain comprehensive monitoring and alerting to proactively identify and resolve issues.
Performance Optimization: Collaborate with performance subject matter experts to identify and address production performance bottlenecks through profiling, tuning, and optimization of services and infrastructure.
Automation: Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.
Collaboration: Work closely with Software, Performance and Test Engineers to influence system design and architecture for operability and reliability.
Documentation: Create and maintain clear and concise documentation for systems, processes, runbooks, and procedures.
On-Call: Participate in on-call rotation.
Incident Management: Participate in on-call rotations and lead incident response efforts, ensuring timely resolution and effective communication. Conduct in-depth incident analysis and help drive completion of post-incident actions.
Troubleshooting skills: Excellent diagnostic and problem-solving skills, with the ability to analyze complex systems and data.
Qualifications:

  • Bachelor’s degree in computer science, a related field, or equivalent practical experience.
  • 5+ years of proven SRE or similar experience.
  • Strong understanding of SRE principles and practices.
  • Experience with cloud platforms (AWS, GCP, or Azure).
  • Proficiency in at least one scripting language (e.g., Python, Bash, Go).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana).
  • Experience with NodeJS-based services (e.g., NestJS) and front-end tooling (e.g., webpack, npm, single-spa).
  • Coding experience beyond simple scripts, in programming languages such as Java, JavaScript, TypeScript, Go, or Python, to support reliability engineering work.
  • Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
  • Understanding of network protocols and security best practices.
  • Familiarity with DevOps culture and practices and experience with CI/CD toolchains.
  • Experience with incident response processes and config management tools (e.g., PagerDuty, Git).
  • Strong problem-solving and troubleshooting skills.
  • Excellent communication and collaboration skills. 
  • Ability to work independently and as part of a team to achieve the SRE agenda.

What success looks like in the role
Within the first 30 days you will:

  • Onboard into your new role, get familiar with our product offering and technology, proactively meet peers and stakeholders, set up your test and development environment.
  • Seek to deeply understand business problems or common engineering challenges and propose software architecture designs to solve them elegantly by abstracting useful common patterns.

By 90 days:

  • Proactively collaborate on, discuss, debate and refine ideas, problem statements, and software designs with different (sometimes many) stakeholders, architects and members of your team.
  • Take a committed approach to prototyping and co-implementing systems alongside less experienced engineers on your team—there’s no room for ivory towers here.

By 6 months:

  • Share support of critical team systems by participating in on-call, learning the characteristics of currently running systems, and participating in improvements.
  • Occasionally serve as a debugging and implementation expert during escalations of systems issues that have evaded the ability of less experienced engineers to solve in a timely manner.
  • Collaborate with Support Management and the Engineering Manager to resolve escalations quickly.

SailPoint is an equal opportunity employer and we welcome everyone to our team.  All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Top Skills

AWS
Azure
Bash
Docker
GCP
Go
Grafana
Java
JavaScript
Kubernetes
Node.js
npm
Prometheus
Python
single-spa
TypeScript
Webpack

The Company
HQ: Austin, TX
2,461 Employees
Hybrid Workplace
Year Founded: 2005

What We Do

SailPoint is the leader in identity security for the cloud enterprise. Our identity security solutions secure and enable thousands of companies worldwide, giving our customers unmatched visibility into the entirety of their digital workforce, ensuring workers have the right access to do their job – no more, no less.

Why Work With Us

Together, we’re redefining identity’s place in the security ecosystem. We love taking on new challenges that seem daunting to others. We hold ourselves to the highest standards and deliver upon our promises to our customers. We bring out the best in each other, and we’re having a lot of fun doing it.

SailPoint Teams

International Culture
Engineering
Professional Services
Sales

SailPoint Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
HQ: Austin, TX
Amsterdam, NL
Coyoacán, Ciudad de México
London, GB
Pune, Maharashtra
Toronto, Ontario
