Staff Engineer - DevOps Site Reliability

Posted 12 Hours Ago
Hiring Remotely in LTU
Remote
Senior level
Artificial Intelligence • Information Technology • Machine Learning • Software • Virtual Reality • Analytics
The Role
This role is for an experienced L3 Site Reliability Engineer (SRE) supporting a business-critical SaaS application: providing full-stack support, automating SRE tooling, and communicating effectively with multiple teams. Responsibilities include incident management, troubleshooting, and work with AWS services, CI/CD pipelines, and monitoring tools.
Summary Generated by Built In

Company Description

We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale — across all devices and digital mediums, and our people exist everywhere in the world (19000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

Job Description

  • Experienced L3 SRE for a business-critical SaaS application.
  • Capacity to provide L3 support across the full stack, including infra, backend, and front-end, before escalation to the engineering business unit.
  • Capacity to automate SRE tools to provide proactive L3 support, close to our tech monitoring strategy.
  • Capacity to work under business pressure for business-critical applications.
  • Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end users during troubleshooting.
  • Experience with incident and problem management.
  • Experience with multitenant applications.
  • Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
  • Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
  • Python and React/Next.js.
  • Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, Loki, or ELK.
  • Experience with AWS, particularly EKS, serverless, queues, and various databases.
  • Solid knowledge of Kubernetes.

Qualifications

Must-Have Skills: EKS, GitHub Actions, Python (Strong), Kubernetes (Expert), Prometheus.

Good to Have Skills: 

  • Previous experience building a user-facing GenAI/LLM software application.
  • Security best practices in cloud environments.
  • AWS Managed Services (RDS, Batch, Lambda, Fargate, Step Functions, SQS/SNS, etc.).
  • FastAPI and Next.js experience (if we're still using the latter).
  • WebSockets, Server-Sent Events, Pub/Sub (RabbitMQ, Kafka, etc.).
  • Cloud security concepts (IAM, access control).
  • Terraform experience. 

Top Skills

Python
The Company
19,994 Employees
On-site Workplace
Year Founded: 1996

What We Do

Nagarro helps future-proof your business through a forward-thinking, fluidic, and CARING mindset. We excel at digital engineering and help our clients become human-centric, digital-first organizations, augmenting their ability to be responsive, efficient, intimate, creative, and sustainable. Today, we are 19,000 experts across 36 countries, forming a Nation of Nagarrians, ready to help our customers succeed.
