Site Reliability Engineer

Praha, Hlavní město Praha
Entry level
Artificial Intelligence • Cloud • Information Technology • Software
The Role
The Site Reliability Engineer will ensure the observability and performance of Tricentis SaaS Products by designing and maintaining cloud infrastructure, developing monitoring systems, and automating operational processes. Responsibilities include collaborating with product engineers, responding to incidents, and proposing enhancements to system reliability and performance.

Join our SRE Team and Revolutionize Tricentis SaaS Products! 

The Site Reliability Engineer plays a pivotal role in our SaaS strategy. You will work closely with our engineering team to ensure unrivalled observability, availability, and performance of Tricentis SaaS Products.

As a Site Reliability Engineer (SRE), you'll be the driving force behind our user-facing services and production systems. We're seeking individuals with pragmatic operational skills and software craftsmanship who apply engineering principles and operational discipline to elevate our operating environments and codebase to new heights.

At the core of your responsibilities, you'll specialize in systems such as operating systems, storage subsystems, observability and networking while implementing best practices for availability, reliability, and scalability. But that's just the beginning of your thrilling journey with us! 

Your Impact as an SRE

  • Design, build, and maintain the product cloud infrastructure that enables seamless scaling to support hundreds of thousands of concurrent users. 

  • Develop advanced monitoring systems that proactively alert on symptoms, ensuring rapid response to potential issues. 

  • Leverage tools like Terraform, GitHub Actions, and Kubernetes to efficiently manage our AWS or Azure infrastructure.

  • Continuously enhance operational processes, making deployments, upgrades, and other tasks as boring and automated as possible. 

  • Collaborate with product engineers on a daily basis and influence product architecture designs.

  • Be part of an on-call (PagerDuty) rotation to respond swiftly to incidents affecting availability, offering support to product engineers during customer incidents. 

As a valuable member of our SRE team, you'll have the opportunity to:

  • Act as a reliability champion for stable counterpart assignments, ensuring a robust and resilient infrastructure. 

  • Propose innovative ideas and solutions within the SRE organization and engineering.

  • Plan, design, and execute solutions to achieve goals agreed upon by the team. 

  • Lead by example with positive and inclusive leadership, and foster constructive discussions between SRE and engineering.

  • Proactively identify opportunities to enhance system availability and performance by applying insights gained from monitoring and observation. 

  • Share your learnings with the wider community.

  • Be the first responder during emergencies and on-call duties, promptly addressing symptoms and conducting root cause analysis to implement corrective actions and prevent recurring issues. 

Our Tech Stack

Terraform, GitHub Actions, Kubernetes, Datadog, Prometheus, Grafana, AWS, Azure

Our Culture

We don't just preach our values; we embody them in everything we do. We are committed to creating an environment that empowers, supports, and includes individuals, where trust, transparency, creativity, curiosity, and continuous improvement thrive on a daily basis. 

 

About You

  • Proficiency in Terraform syntax and GitHub Actions configuration, including pipelines and job management using GitOps

  • Working knowledge of SaaS architecture concepts and designs. 

  • Understanding of Kubernetes, including CLI usage and service re-provisioning 

  • Ability to provision and set up metrics in Datadog, Prometheus, and Grafana, along with managing alerts and silences.

  • Ability to identify Service Level Indicators (SLIs) that align the team with availability and latency objectives.

  • Experience with Linux operating system configuration, package management, and troubleshooting. 

  • Working experience with cloud environments like AWS, Azure, or Google Cloud, and with provisioning infrastructure there.

If you're ready to make a lasting impact as a Site Reliability Engineer and be at the forefront of revolutionizing Tricentis SaaS Products, don't miss this opportunity.

You can look forward to:

    • Flexible working schedule (no core hours)
    • Learning and career growth opportunities
    • 25 days of paid time off
    • 3 Sick Days
    • 4 days of paid Volunteering Leave per year to get involved in your local community or in a cause that matters to you
    • Hybrid work environment, with home-office allowance
    • Meal allowance
    • Pension Contribution
    • Life & Disability Insurance
    • Paid Sickness leave
    • A team of passionate professionals who are experts in their fields
    • Events for employees to learn, celebrate and socialize (training sessions, hackathons, parties, sports events, board game gatherings, BBQs) and much more

    Tricentis Core Values: 

    Knowing what we need to achieve and how to achieve it is important. Tricentis core values define our ways of working and the behaviors we model that create an enjoyable and successful Tricentis life.

    • Demonstrate Self-Awareness: Own your strengths and limitations.
    • Finish What We Start: Do what we say we are going to do.
    • Move Fast: Create momentum and efficiency.
    • Run Towards Change: Challenge the status quo.
    • Serve Our Customers & Communities: Create a positive experience with each interaction.
    • Solve Problems Together: We win or lose as one team.
    • Think Big & Believe: Set extraordinary goals and believe you can achieve them.

    About Tricentis:

    Tricentis is a software company founded in 2007, with a primary focus on software quality assurance. Whether testing is exploratory or automated, functional or performance, API or UI, and whether it targets mainframes, custom applications, packaged applications, or cloud-native applications, our comprehensive suite of specialized Continuous Testing tools makes DevOps real by giving our clients the confidence to release on demand.

    Tricentis has more than 1500 employees working in over 20 global offices across the US, EMEA, and APAC, serving over 2100 customers. We are currently expanding our R&D centres to two new locations in the Czech Republic, Prague and Brno, with a hybrid office environment.


    Top Skills

    GitHub Actions
    Kubernetes
    Terraform
    The Company
    Atlanta, GA
    1,154 Employees
    On-site Workplace
    Year Founded: 2007

    What We Do

    Tricentis is the global leader in enterprise continuous testing, widely credited for reinventing software testing for DevOps, cloud, and enterprise applications. The Tricentis AI-powered, continuous testing platform provides a new and fundamentally different way to perform software testing: an approach that’s totally automated, fully codeless, and intelligently driven by AI. It addresses both agile development and complex enterprise apps, enabling enterprises to accelerate their digital transformation by dramatically increasing software release speed, reducing costs, and improving software quality. Tricentis has been widely recognized as the leader by all major industry analysts, including being named the leader in Gartner’s Magic Quadrant five years in a row. Tricentis has more than 1,800 customers, including the largest brands in the world, such as McKesson, Accenture, Nationwide Insurance, Allianz, Telstra, Moet-Hennessy-Louis Vuitton, and Vodafone.
