Site Reliability Architect

Posted 7 Days Ago
Pune, Maharashtra
Senior level
eCommerce • Logistics • Software • Analytics
The Role
The Site Reliability Architect designs scalable systems, collaborates with teams to integrate reliability, and implements best practices in reliability engineering.

Job Description:

Title: Site Reliability Architect 

Job Information 

The Site Reliability Architect (SRA) is responsible for designing and implementing scalable, reliable, and efficient systems that support the organization's software applications and services. As a key technical leader, you will work closely with development, operations, and product teams to ensure that systems are designed with reliability, performance, and scalability in mind. You will also play a crucial role in establishing best practices for site reliability engineering (SRE) and fostering a culture of operational excellence. 

Essential Duties and Responsibilities 

Design and implement robust, scalable, and high-availability systems that meet business and technical requirements. 

Collaborate with software engineering teams to integrate reliability into the software development lifecycle, ensuring that applications are built with operational excellence in mind. 

Develop and maintain service level objectives (SLOs), service level agreements (SLAs), and service level indicators (SLIs) to measure system performance and reliability. 

Lead incident response efforts, including post-mortem analysis and root cause investigations, to improve system reliability and prevent future incidents.

Automate operational processes to improve efficiency and reduce manual intervention, leveraging tools and technologies such as Infrastructure as Code (IaC).

Monitor system performance and reliability using appropriate metrics and monitoring tools, proactively identifying and addressing potential issues.

Advocate for and implement best practices in site reliability engineering, including capacity planning, disaster recovery, and incident management.

Train and mentor engineering and operations teams on SRE principles and practices, fostering a culture of continuous improvement.

Qualifications 

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

8+ years of experience in software engineering, systems engineering, or site reliability engineering. 

Strong understanding of cloud computing platforms (e.g., AWS, Azure, Google Cloud) and of container and orchestration technologies (e.g., Docker, Kubernetes).

Experience with infrastructure provisioning, configuration management, and automation tools (e.g., Terraform, Ansible, Puppet).

Proficient in programming and scripting languages (e.g., Python, Go, Bash) for automation and tool development. 

Extensive knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) and practices. 

Solid understanding of networking concepts, distributed systems, and microservices architecture. 

Excellent problem-solving skills and the ability to work effectively under pressure. 

Required Skills and Abilities 

● Leadership Skills: Ability to lead cross-functional teams and drive initiatives that enhance system reliability and performance. 

● Interpersonal Skills: Self-motivated team player who builds trust; action- and results-oriented; open and collaborative in style; comfortable working in a dynamic environment.

● Communication Skills: Strong written, oral, and presentation skills, with the ability to effectively communicate technical concepts to non-technical stakeholders. 

● Attention to Detail: Thoroughness in accomplishing tasks, ensuring accuracy and quality in all aspects of work. 

● Analytical Skills: Strong analytical and troubleshooting skills, with the ability to think critically and make data-driven decisions. 

Our Core Values 

● Data Fanatics: Our edge is always found in the data 

● Partner Obsessed: We are obsessed with partner success 

● Team of Doers: We have a bias for action 

● Gamechangers: We encourage innovation

Pattern is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Top Skills

Ansible
AWS
Azure
Bash
Docker
ELK Stack
Go
GCP
Grafana
Kubernetes
Prometheus
Puppet
Python
Terraform
The Company
HQ: Lehi, UT
501 Employees
On-site Workplace
Year Founded: 2013

What We Do

Pattern operates as a worldwide e-commerce growth, protection, control, and distribution platform for brands.

Pattern® provides a proven blend of marketplace analytics, product distribution, MAP compliance, and brand management to drive ecommerce acceleration for premium brands. We thrive on high energy, professional excellence, and disciplined creativity.
