Senior Customer Reliability Engineer, Infrastructure - India

Posted 10 Days Ago
2 Locations
Senior level
Big Data • Cloud • Software • Analytics
The Role
As a Senior Customer Reliability Engineer, you will support customers using Astronomer's Airflow platform, focusing on cloud infrastructure reliability, troubleshooting, and enhancing customer experience.
Summary Generated by Built In

Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform powered by Apache Airflow®. Astro accelerates building reliable data products that unlock insights, unleash AI value, and power data-driven applications. Trusted by more than 700 of the world's leading enterprises, Astronomer lets businesses do more with their data. To learn more, visit www.astronomer.io.

Your background may be unconventional; as long as you have the essential qualifications, we encourage you to apply. While having "bonus" qualifications makes for a strong candidate, Astronomer values diverse experiences. Many of us at Astronomer haven't followed traditional career paths, and we welcome it if yours hasn't either.

About this role:

The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service.

The CRE team is responsible for operating, monitoring, and maintaining the platform to ensure availability, predictability, and reliable operations.

As an infrastructure specialist within the team, you will focus on the reliability of the underlying cloud infrastructure and Kubernetes clusters. This entails responding to incidents, whether raised by a customer or by our monitoring system, and then taking further steps to ensure problems are permanently resolved or monitored. As owners of the observability platform, CRE has unlimited potential to improve the reliability of the product and deliver the best possible outcome for our customers.

This role is directly customer-facing and gives exposure to very diverse problems and requirements. CRE team members get the opportunity to interface with customers from a variety of industries, across different cloud providers, all with different expectations. Your contributions will directly impact customers' success with the Astronomer products, and you will be able to help make meaningful improvements to the customer experience.

What you get to do:

  • Provide solutions to customers to make them successful using our products.

  • Troubleshoot customer environments and engage in active triaging with customers.

  • Provide feedback to the product development teams on customer needs and pain points.

  • Build out our monitoring and alerting systems.

  • Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible. 

  • Help direct the architecture of the products and contribute where possible.

  • Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide “white glove” guidance on the path to production.

  • Participate remotely within a fully distributed team.

  • Enhance and enrich customer documentation.

  • Work on a modern, sophisticated, cloud-native product that customers use to connect to dozens of other systems.

  • Help maintain 24x7 coverage through a specified 6-hour pager period during your work day.

  • Participate in paid on-call rotation for weekend coverage.

What you bring to the role:

  • 5+ years of experience, preferably with large, complex cloud infrastructures operating at scale

  • 3+ years of experience with Kubernetes

  • Experience managing a production distributed system with at least one major cloud provider (AWS, GCP, or Azure)

  • Strong networking experience with one of the major cloud providers

  • Strong Linux experience

  • Knowledge of how to operate distributed systems and monitor them for issues

  • Experience with Observability tools

  • Previous experience in handling customer issues (internal and external)

  • Strong communication skills

  • DevOps or CI/CD experience

  • Python scripting

  • Good troubleshooting skills

Bonus points if you have:

  • Experience as a Site Reliability Engineer

  • Experience working with Kubernetes Custom Resources

  • Depth of knowledge with Azure

  • Airflow/Big Data Orchestration experience

  • IaC experience


At Astronomer, we value diversity. We are an equal opportunity employer: we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Astronomer is a remote-first company.

Top Skills

Airflow
AWS
Azure
GCP
Kubernetes
Linux
Observability Tools
Python

The Company
HQ: Cincinnati, OH
344 Employees
On-site Workplace
Year Founded: 2018

What We Do

Astronomer is the driving force behind Apache Airflow, the de facto standard for expressing data flows as code. Airflow is downloaded more than 4 million times each month and is used by hundreds of thousands of teams around the world.

For data teams looking to increase the availability of trusted data, Astronomer provides Astro, the modern data orchestration platform, powered by Airflow. Astro enables data engineers, data scientists, and data analysts to build, run, and observe pipelines-as-code.

Founded in 2018, Astronomer is a global remote-first company with hubs in Cincinnati, New York, San Francisco, and San Jose. Customers in more than 35 countries trust Astronomer as their partner for data orchestration.
Visit https://www.astronomer.io to learn more.
