HashiCorp

Sr. Engineer II - HashiCorp Cloud DR (Hybrid)

Bengaluru, Karnataka

Senior level

Cloud • Information Technology • Security • Software

The Role

As a Sr. Engineer in HashiCorp Cloud DR, you will enhance reliability, lead the execution of the disaster recovery strategy, and ensure compliance through testing and tool development.

As a Sr. Engineer on the HashiCorp Cloud DR team, you will play a critical role in enhancing the reliability and DR governance of HashiCorp's cloud products. With at least 8 years of experience in software development in the DR domain, reliability engineering, systems engineering, system testing, or related fields, you will lead efforts to identify, address, and mitigate operational challenges before they impact our customers. Your expertise in system resilience and your understanding of disaster recovery strategies will ensure that our services meet the highest standards of reliability and help achieve operational excellence across our cloud products.

You will play a pivotal role in enhancing our operational resilience and maintaining the reliability of our enterprise and cloud-based products. With a focus on DR and overall quality, you will be at the forefront of ensuring high availability and compliance with DR standards across HashiCorp’s offerings.

You will provide expert execution of the DR test plans. You will work with a wide variety of tools and explore new avenues to ensure all products meet the essential operational readiness criteria.

What you’ll do (responsibilities)

Leadership and Strategy: Provide visionary leadership and strategic direction to the Cloud DR team, ensuring alignment with organizational goals and the highest standards of reliability and disaster recovery.
Best Practices Implementation: Implement and advocate for best practices in system reliability and disaster recovery, including proactive identification of potential failure points and the development of automated mitigations.
DR Testing Strategies: Design and oversee the execution of comprehensive DR testing strategies to identify bottlenecks and failure points affecting RPO and RTO across our cloud products.
DR Compliance: Lead initiatives around DR compliance, implementing best practices and technologies to improve system resilience and ensuring high availability and fault tolerance through the chaos testing framework.
Performance Benchmarking: Conduct rigorous performance benchmarking and testing to validate the efficiency and scalability of the tooling for the orchestration of DR across our cloud products.
Cross-functional Collaboration: Work closely with engineering and product teams to integrate operational readiness into the development lifecycle, enhancing product stability and user satisfaction.
Tool and Framework Development: Build and refine tools and frameworks for automated testing, environment simulation, and incident reproduction, reducing manual effort and increasing test coverage.
Mock Drills and Chaos Tests: Conduct mock drills and drive chaos tests in collaboration with partner teams, analyzing test results, documenting findings, and making actionable recommendations for systemic improvements.
Knowledge Sharing: Share your knowledge and expertise with team members, fostering a culture of learning and continuous improvement.

What you’ll need (basic qualifications)

8+ years of experience in software development, reliability engineering, systems engineering, or non-functional testing roles with a focus on disaster recovery or backup and recovery of cloud-based systems.
Commitment to pursuing a career in the reliability engineering field.
Proficiency in Go (Golang) or a scripting language. Hands-on experience with version control tools such as Git and GitLab.
Deep understanding of microservices architecture and CI/CD processes.
Experience collecting metrics, building data pipelines to analyze them, and building dashboards that show the availability and status of components across the cloud.
Exposure to cloud technologies (AWS, Azure, or GCP) and container technologies such as Nomad or Kubernetes.
Familiarity with chaos engineering principles and practices.
Exceptional communication and collaboration skills, capable of leading cross-functional teams and articulating technical concepts to diverse audiences.

What's nice to have (preferred qualifications)

Exposure to the disaster recovery domain, or experience testing any product for DR, is a plus.
Experience with infrastructure as code (Terraform, CloudFormation) is a plus.
Chaos testing experience is a plus.
Understanding of compliance frameworks such as ISO/IEC 27031, ISO 22301, and SOC 2 is a plus.

“HashiCorp is an IBM subsidiary that has been acquired by IBM and will be integrated into the IBM organization. HashiCorp will be the hiring entity. By proceeding with this application, you understand that HashiCorp will share your personal information with other IBM subsidiaries involved in your recruitment process, wherever these are located. More information on how IBM protects your personal information, including the safeguards in case of cross-border data transfer, is available here: link to IBM privacy statement.”

Top Skills

AWS

Azure

CloudFormation

GCP

Git

Gitlab

Kubernetes

Nomad

Terraform

The Company

HQ: San Francisco, CA

1,200 Employees

Hybrid Workplace

Year Founded: 2012

What We Do

HashiCorp was founded by Mitchell Hashimoto and Armon Dadgar in 2012 with the goal of revolutionizing datacenter management: application development, delivery, and maintenance. The datacenter of today is very different from the datacenter of yesterday, and we think the datacenter of tomorrow is just around the corner.