Overview:
- Seeking an experienced and highly skilled Lead Operational Excellence Engineer to join our team as we transform the world through our industry-leading cloud-based supply chain solutions.
- Operational Excellence is about improving the resilience and reliability of our cloud platform, ensuring our customers have the best possible experience and that the service is always available to meet their needs.
Scope:
- This role focuses on leading the enhancement of our operational processes and tools, driving a DevOps culture, and ensuring continuous improvements in reliability and operational efficiency across various functional teams.
- The team is completely new and based in India, and we’re hiring for roles at multiple levels. The incumbent will need leadership qualities because the role involves working cross-functionally with other teams to help drive process and automation improvements.
Our current technical environment:
- Cloud Architecture: Microsoft Azure
- Observability: Elastic, Azure, Grafana, ELK
- Application Architecture: Scalable, resilient, event-driven, observable, and secure multi-tenant microservices architecture
- Infrastructure Architecture: Blue-green deployments, high availability, disaster recovery
- Frameworks/Others: Kubernetes, Docker, Kafka, Elasticsearch, GitHub CI/CD, ArgoCD, Argo Rollouts, Crossplane, Prometheus
What you’ll do:
- Lead the definition and execution of high availability and disaster recovery test plans.
- Oversee the implementation and management of infrastructure change control processes.
- Drive enhancements in on-call and engineer engagement processes.
- Lead the development and maintenance of common operational dashboards and widgets.
- Establish and enforce guidelines for monitoring and alarming, mentoring teams in their implementation.
- Develop and manage processes for reporting operational impacts.
- Lead improvements in incident response processes and root cause analysis documentation.
- Automate the collection of service maturity data to support continuous delivery promotions.
- Collaborate with cross-functional teams to drive a DevOps culture and best practices.
- Identify gaps and implement tools, processes, and best practices for continuous improvement in reliability and operational excellence.
- Mentor junior engineers and provide technical leadership.
What we are looking for:
- Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree preferred.
- 4.6 to 7.6 years of experience specifically in Site Reliability Engineering, DevOps, or a related field.
- Proven leadership experience with a track record of leading projects and mentoring teams.
- Strong understanding of cloud platforms, particularly Microsoft Azure.
- Extensive experience with high availability and disaster recovery planning and execution.
- Proficiency in infrastructure change control processes and tools.
- Expertise in on-call management and paging tools such as Opsgenie or PagerDuty.
- Experience in defining and implementing monitoring, alarming, and operational dashboards.
- Strong problem-solving skills and a proactive approach to identifying and addressing operational issues.
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.
- Passion for continuous improvement and operational excellence.
- Experience with supply chain solutions or similar industries.
- Certification in Microsoft Azure or related cloud platforms.
- Familiarity with incident management tools and processes.
- Knowledge of automation tools and scripting languages.
Our Values
If you want to know the heart of a company, take a look at its values. Ours unite us. They are what drive our success – and the success of our customers. Does your heart beat like ours? Find out here: Core Values
Diversity, Inclusion, Value & Equity (DIVE) is our strategy for fostering an inclusive environment we can be proud of. Check out Blue Yonder's inaugural Diversity Report, which outlines our commitment to change, and our video celebrating the differences in all of us in the words of some of our associates from around the world.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
What We Do
Anglian Water's @one Alliance is a partnership of eight companies, each providing specialised knowledge that allows the @one Alliance to deliver complex projects in the most efficient way, reducing the cost to Anglian Water’s customers.
Within the @one Alliance, we have embarked on a £1.2 billion programme of work. We’ve entered year 5 of our current 5-year Asset Management Period (AMP7), meaning we’re full steam ahead in delivering around 50% of Anglian Water’s capital delivery projects.
Our partners are Anglian Water Asset Delivery, Balfour Beatty, Barhale, Binnies, Mott MacDonald Bentley (MMB), SWECO, SKANSKA and MWH Treatment. Employees in the @one Alliance are employed across all our partner companies and work together to deliver complex programmes of work.
The @one Alliance is currently working on over 700 projects, all designed to improve and expand the Anglian Water network to better serve existing customers and help ensure supply for future customers.
As we move into AMP8 (Asset Management Period 8) in 2025, our work is set to increase from a £1.2 billion to an approximately £2.6 billion programme of works, so there has never been a better time to join us on our journey!
Take a look at our jobs page to see the options available to you.