Senior Resiliency Lead - DevOps

Atlanta, GA
116K-207K Annually
Senior level
Cloud • Fintech • HR Tech
The Role
As a Senior DevOps Engineer, you will lead efforts to ensure system resilience and fault tolerance, develop strategies to improve reliability, and collaborate with teams on best practices. You'll establish metrics to enhance performance, conduct chaos engineering, and refine incident management protocols, while providing guidance to development teams to maintain high availability in applications.
Summary Generated by Built In

Your work days are brighter here.

At Workday, it all began with a conversation over breakfast. When our founders met at a sunny California diner, they came up with an idea to revolutionize the enterprise software market. And when we began to rise, one thing that really set us apart was our culture, driven by our value of putting our people first. Ever since, the happiness, development, and contribution of every Workmate has been central to who we are. Our Workmates believe a healthy, employee-centric, collaborative culture is the essential mix of ingredients for success in business. That’s why we look after our people, communities and the planet while still being profitable. Feel encouraged to shine, however that manifests: you don’t need to hide who you are. You can feel the energy and the passion; it's what makes us unique. Inspired to make a brighter work day for all and transform with us to the next stage of our growth journey? Bring your brightest version of you and have a brighter work day here.

At Workday, we value our candidates’ privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.

Please be aware of sites that may ask you to input your data in connection with a job posting that appears to be from Workday but is not.

In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.

About the Team

As a Senior DevOps Engineer, you will be joining one of Workday’s most exciting product and technology teams, Core Software. With nearly 750 employees globally, Core Software is responsible for evolving the core technology and runtime components of the Workday Platform and empowering developers to innovate and build on our products. Reporting to the VP of Core Services, this DevOps Engineer will support our growth and continued success by ensuring our systems are resilient and capable of withstanding various challenges.
We are looking for a Senior DevOps Engineer with proven experience leading efforts to embed resilience and fault tolerance within our software architecture. Working closely with engineering, operations, and DevOps teams, this role will be responsible for developing and implementing strategies that improve system reliability, reduce Mean Time to Detect (MTTD) and Mean Time to Recovery (MTTR), and ensure our services can withstand disruptions.

About the Role

  • Design and deploy fault-tolerant and resilient architecture patterns to improve system availability and performance
  • Facilitate FMEA workshops to identify and prioritize potential failure points within critical applications and services
  • Establish and track metrics such as MTTD, MTTR, and Service Level Objectives (SLOs) to measure and improve system resilience
  • Work with QA and Perf teams to implement chaos engineering, stress testing, and other resiliency testing methodologies
  • Collaborate with Site Reliability Engineering (SRE) and operations teams to define incident management playbooks and recovery procedures for high-severity incidents
  • Educate development teams on best practices for building resilient applications, providing guidance on design principles, patterns, and tools
  • Lead post-incident reviews to identify root causes and implement changes to prevent recurrence, creating a culture of learning and continuous improvement
  • Provide program management support for short- and long-term work related to resiliency, quality, and security
  • Drive priorities and address backlogs for ongoing technical work that will advance reactions to incidents, security improvements, version updating, observability enhancements, and long-term system stability and scalability
  • Support Engineering and Product leaders within the Core Services pillar in establishing plans, managing dependencies and risks, and providing organizational visibility and team accountability to ensure the initiative moves forward at the pace needed
  • Implement techniques and best practices in processes and methodology so teams stay aligned and on track to achieve their stated goals
  • In-depth OMS and Workday stack knowledge is a big bonus

About You

Basic Qualifications

  • 5+ years in software engineering, architecture, or DevOps with a focus on reliability, resilience, and high availability
  • Proficiency in cloud platforms (AWS, GCP), distributed systems, containerization (Kubernetes, Docker), and infrastructure as code
  • Strong knowledge of FMEA, Chaos Engineering, Disaster Recovery, and redundancy strategies
  • Experience with monitoring tools (e.g., Prometheus, Grafana) and logging/alerting platforms
  • Proven ability to diagnose complex issues in distributed systems and design/influence/advise resilient solutions
  • Ability to synthesize information and drive alignment to a plan through varying levels of ambiguity

Other Qualifications

  • Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field (or equivalent work experience)
  • Desire to help others succeed, and the ability to quietly assess when and where to act as needed
  • Outstanding relationship-building and partnership skills
  • A high level of organization and attention to detail
  • Excellent training skills and ability to work hands-on with teams in an advisor role
  • Excellent written and verbal communication skills to document processes and influence engineering best practices
  • A keen desire to gain deeper knowledge of technologies and methodologies that may benefit us in the future through self-paced research, learning and experimentation

 

Posting End Date: 02/28/2025

If hired in Colorado, click here for information about Workday's comprehensive benefits in Colorado: https://workdaybenefits.com/us/welcome-to-workday-benefits/prospective-workmates.

The application deadline for this role is the same as the posting end date stated.


Workday Pay Transparency Statement

The annualized base salary ranges for the primary location and any additional locations are listed below.  Workday pay ranges vary based on work location. As a part of the total compensation package, this role may be eligible for the Workday Bonus Plan or a role-specific commission/bonus, as well as annual refresh stock grants. Recruiters can share more detail during the hiring process. Each candidate’s compensation offer will be based on multiple factors including, but not limited to, geography, experience, skills, job duties, and business need, among other things. For more information regarding Workday’s comprehensive benefits, please click here.

Primary Location: USA.GA.Atlanta

Primary Location Base Pay Range: $122,400 USD - $183,600 USD

Additional US Location(s) Base Pay Range: $116,300 USD - $206,500 USD


Our Approach to Flexible Work
 

With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.

Pursuant to applicable Fair Chance law, Workday will consider for employment qualified applicants with arrest and conviction records.

Workday is an Equal Opportunity Employer including individuals with disabilities and protected veterans.

Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!

Top Skills

AWS
Docker
GCP
Kubernetes
The Company
HQ: Pleasanton, CA
14,894 Employees
On-site Workplace
Year Founded: 2005

What We Do

Workday is a leading provider of enterprise cloud applications for finance, HR, and planning. Founded in 2005, Workday delivers financial management, human capital management, and analytics applications designed for the world’s largest companies, educational institutions, and government agencies. Organizations ranging from medium-sized businesses to Fortune 50 enterprises have selected Workday.
