L3 Cloud DevOps Engineer / Site Reliability Engineer (SRE)

Posted 8 Days Ago
2 Locations
Remote
Senior level
Information Technology • Software • Consulting
The Role
The L3 Cloud DevOps Engineer / Site Reliability Engineer (SRE) will focus on creating and enhancing monitoring tools using Grafana, Prometheus, and Datadog. Key responsibilities include developing dashboards, managing alerts, monitoring users and systems, and automating processes, with a strong emphasis on Python scripting and cross-functional collaboration.
Summary Generated by Built In

We are seeking an experienced L3 Cloud DevOps Engineer with a strong focus on Site Reliability Engineering (SRE) to join our team. This role centers on creating and enhancing monitoring and alerting tools, with significant emphasis on Grafana, Prometheus, and Datadog. The ideal candidate will have hands-on experience with Python scripting and a solid understanding of user and system monitoring. The role involves proactive dashboard building, cross-functional collaboration, and resolving service issues through monitoring and remediation.

Requirements

  • Extensive hands-on experience with Python scripting.
  • Strong expertise in Site Reliability Engineering (SRE) practices.
  • Proficiency in Grafana, including dashboard creation and modification.
  • In-depth knowledge of Prometheus and Datadog tools for monitoring and alerting.
  • Experience with user and system monitoring, along with the ability to create and enhance dashboards and runbooks.
  • DevOps experience is a secondary but desirable skill set.
  • Relevant certifications or courses in Python, SRE, Grafana, and Prometheus are a plus.

Responsibilities

  • Proactively build and enhance Grafana dashboards to improve monitoring capabilities.
  • Collaborate with cross-functional teams to ensure effective monitoring and alerting.
  • Manage and respond to alerts, focusing on timely remediation and implementation of solutions for service issues.
  • Conduct user and system monitoring to identify and address potential problems.
  • Develop and maintain runbooks to support operational efficiency and incident response.
  • Use Python scripting to automate and improve processes within the DevOps and SRE framework.

NTD Software is an equal opportunity employer. We do not discriminate on the basis of age, race, color, religion or religious creed, sexual orientation, gender, gender identity, marital status, family or parental status, disability, military or veteran status, or any other basis protected by law. All employment decisions at NTD Software are based on a person’s merit, business needs, and role requirements.

Top Skills

Python
The Company
San Francisco, California
21 Employees
On-site Workplace
Year Founded: 2021

What We Do

NTD Software is a Mexican company located in Guadalajara, Jalisco, known as "the Silicon Valley of Mexico." We help both startups and large companies by finding the right people to join their teams and by creating digital solutions using the latest or well-established programming languages and tools. Our expertise is in building software from the ground up and expanding our clients' existing teams, allowing us to work with businesses globally.

