The Principal Site Reliability Engineer is a critical part of our SaaS strategy. In this role you will work closely with engineering to ensure the reliability and performance of Tricentis SaaS products. You will join the SRE organization, reporting to the Head of SRE.
As the Principal SRE, you will lead architecture councils and design solutions that shape our SaaS products. You will also help teams adopt SRE practices such as error budgets, toil reduction, and SLIs/SLOs/SLAs.
Essential Job Duties and Responsibilities:
- Exercise autonomy in determining business priorities and technical focus areas.
- Synthesize vague problem statements into strategic technology solutions, especially in specialized domains.
- Contribute actively to core SRE areas, platform, and shared infrastructure initiatives.
- Lead and influence technical decision-making to align with business objectives.
- Set and promote best practices in the SRE domain, system designs, and development processes.
- Drive platform and infrastructure development to enhance system reliability and scalability.
- Build tools, libraries, and services that reduce engineering costs across common problem areas.
- Improve developer experience through documentation, mentoring, and teaching.
- Coordinate cross-departmental efforts for infrastructure implementation programs.
- Advocate for business priorities and share knowledge on how technical work aligns with the bigger picture.
Qualifications:
- Strong track record of sound technical judgment and autonomy in decision-making.
- Expertise in the SRE domain, with a deep understanding of system design and SaaS patterns.
- Mastery in debugging complex outages and establishing long-term preventive solutions.
- Strong understanding of DevOps and Infrastructure domains.
- Ability to analyze trade-offs and understand scaling, reliability, and security concerns for new technologies.
- Advanced understanding of security threats and mitigation techniques.
- Excellent communication skills, adaptable for different audiences (engineers, product teams, executives, customers).
- Strong opinions about the current technology landscape, with the ability to advocate for or against specific technologies.
- Ability to influence technical decisions within a group and align them with business needs.
- Track record of creating impactful content such as papers or conference presentations.
- Experience setting a technical "north star" and coordinating its implementation across different teams.
Tech Stack
- AWS/Azure/GCP
- Kubernetes (EKS/AKS), ArgoCD, Helm
- Azure DevOps, GitHub Actions
- PostgreSQL, MongoDB, Redis
- Java/C#/Node.js
- Terraform, Pulumi, Crossplane
- Datadog, PagerDuty, Azure Insights, OTEL
You can look forward to:
- Flexible working schedule (no core hours)
- Learning and career growth opportunities
- 25 days of paid time off
- 3 sick days
- 4 days of paid Volunteering Leave per year to get involved in your local community or in a cause that matters to you
- Hybrid work environment, with home-office allowance
- Meal allowance
- Pension contribution
- Life & disability insurance
- Paid sickness leave
- A team of passionate professionals who are experts in their fields
- Events for employees to learn, celebrate and socialize (training sessions, hackathons, parties, sports events, board game gatherings, BBQs) and much more
Tricentis Core Values:
Knowing what we need to achieve and how to achieve it is important. Tricentis core values define our ways of working and the behaviors we model that create an enjoyable and successful Tricentis life.
- Demonstrate Self-Awareness: Own your strengths and limitations.
- Finish What We Start: Do what we say we are going to do.
- Move Fast: Create momentum and efficiency.
- Run Towards Change: Challenge the status quo.
- Serve Our Customers & Communities: Create a positive experience with each interaction.
- Solve Problems Together: We win or lose as one team.
- Think Big & Believe: Set extraordinary goals and believe you can achieve them.
What We Do
Tricentis is the global leader in enterprise continuous testing, widely credited for reinventing software testing for DevOps, cloud, and enterprise applications. The Tricentis AI-powered, continuous testing platform provides a new and fundamentally different way to perform software testing: an approach that’s totally automated, fully codeless, and intelligently driven by AI. It addresses both agile development and complex enterprise apps, enabling enterprises to accelerate their digital transformation by dramatically increasing software release speed, reducing costs, and improving software quality. Tricentis has been widely recognized as the leader by all major industry analysts, including being named the leader in Gartner’s Magic Quadrant five years in a row. Tricentis has more than 1,800 customers, including some of the largest brands in the world, such as McKesson, Accenture, Nationwide Insurance, Allianz, Telstra, Moet-Hennessy-Louis Vuitton, and Vodafone.