Job Description:
Title: Site Reliability Architect
Job Information
The Site Reliability Architect (SRA) is responsible for designing and implementing scalable, reliable, and efficient systems that support the organization's software applications and services. As a key technical leader, you will work closely with development, operations, and product teams to ensure that systems are designed with reliability, performance, and scalability in mind. You will also play a crucial role in establishing best practices for site reliability engineering (SRE) and fostering a culture of operational excellence.
Essential Duties and Responsibilities
● Design and implement robust, scalable, and high-availability systems that meet business and technical requirements.
● Collaborate with software engineering teams to integrate reliability into the software development lifecycle, ensuring that applications are built with operational excellence in mind.
● Develop and maintain service level objectives (SLOs), service level agreements (SLAs), and service level indicators (SLIs) to measure system performance and reliability.
● Lead incident response efforts, including post-mortem analysis and root cause investigations, to improve system reliability and prevent future incidents.
● Automate operational processes to improve efficiency and reduce manual intervention, leveraging tools and technologies such as Infrastructure as Code (IaC).
● Monitor system performance and reliability using appropriate metrics and monitoring tools, proactively identifying and addressing potential issues.
● Advocate for and implement best practices in site reliability engineering, including capacity planning, disaster recovery, and incident management.
● Train and mentor engineering and operations teams on SRE principles and practices, fostering a culture of continuous improvement.
Qualifications
● Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
● 8+ years of experience in software engineering, systems engineering, or site reliability engineering.
● Strong understanding of cloud computing platforms (e.g., AWS, Azure, Google Cloud) and containerization and orchestration technologies (e.g., Docker, Kubernetes).
● Experience with configuration management and automation tools (e.g., Terraform, Ansible, Puppet).
● Proficiency in programming and scripting languages (e.g., Python, Go, Bash) for automation and tool development.
● Extensive knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) and practices.
● Solid understanding of networking concepts, distributed systems, and microservices architecture.
● Excellent problem-solving skills and the ability to work effectively under pressure.
Required Skills and Abilities
● Leadership Skills: Ability to lead cross-functional teams and drive initiatives that enhance system reliability and performance.
● Interpersonal Skills: Self-motivated team player who builds trust; action- and results-oriented; open and collaborative style; comfortable working in a dynamic environment.
● Communication Skills: Strong written, oral, and presentation skills, with the ability to effectively communicate technical concepts to non-technical stakeholders.
● Attention to Detail: Thoroughness in accomplishing tasks, ensuring accuracy and quality in all aspects of work.
● Analytical Skills: Strong analytical and troubleshooting skills, with the ability to think critically and make data-driven decisions.
Our Core Values
● Data Fanatics: Our edge is always found in the data
● Partner Obsessed: We are obsessed with partner success
● Team of Doers: We have a bias for action
● Gamechangers: We encourage innovation
Pattern is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
What We Do
Pattern operates as a worldwide e-commerce growth, protection, control, and distribution platform for brands.
Pattern® provides a proven blend of marketplace analytics, product distribution, MAP compliance, and brand management to drive e-commerce acceleration for premium brands. We thrive on high energy, professional excellence, and disciplined creativity.