About the team
Our team owns the core preventative reliability platforms, tools, and programs used by infrastructure and product teams across the company. Our load testing platform enables traffic generation and monitoring to ensure products are prepared to handle surges in usage, our chaos platform enables product and infrastructure experimentation to validate system resiliency and operational excellence, and our scale validation game day program ensures Stripe can handle our biggest customers' biggest days.
What you’ll do
We’re looking for an experienced distributed systems engineer with outstanding technical and leadership skills, strong collaboration skills, and a passion for customers to help deliver the foundation of our reliability infrastructure. You’ll work with teams across the entire stack to deliver world-class reliability solutions. In this role, you’ll not only design, implement, and test your own reliability infrastructure components, but also play an influential role in enabling engineering teams to make their services more reliable by identifying, creating, and deploying engineering practices, processes, and solutions.
Responsibilities
- Design, build, test, and operationalize end-to-end distributed systems reliability infrastructure and solutions that will be integrated into various services.
- Liaise with teams using this core infrastructure to ensure it meets their needs and expectations.
- Work cross-functionally to ensure Stripe can scale to meet our biggest customers’ needs.
- Shape the plan for the growth of Stripe’s reliability infrastructure.
- Mentor other engineers in the organization and review code.
- Manage projects, including measuring each project’s impact and success and creating a maintenance and reliability plan for the future.
Who you are
We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.
Minimum requirements
- 9+ years of engineering experience or equivalent combined work experience reflecting domain expertise
- Hands-on experience designing and building large scale distributed systems.
- Demonstrated experience leading initiatives that span multiple teams and leveraging deep domain expertise to influence tech roadmap planning and execution
- Demonstrated ability to effectively collaborate across multiple teams and stakeholders to drive business outcomes
- Experience mentoring and investing in the development of engineers and peers
Preferred qualifications
- Genuine interest and/or experience in debugging and troubleshooting complex distributed systems problems.
- Familiarity with the common patterns and practices for building reliable software.
- Experience with Kubernetes, Golang,
What we do
Stripe is a technology company that builds economic infrastructure for the internet. Businesses of every size—from new startups to public companies like Salesforce and Facebook—use the company’s software to accept online payments and run technically sophisticated financial operations in more than 100 countries. Stripe helps new companies get started and grow their revenues, and established businesses accelerate into new markets and launch new business models. Over the long term, Stripe aims to increase the GDP of the internet.