Team Lead, Site Reliability Engineering

Sorry, this job was removed at 05:24 p.m. (CST) on Wednesday, Feb 19, 2025
Toronto, ON
Marketing Tech
The Role

Overview: 

Guidepoint’s Engineering team thrives on problem-solving and creating happier users. As Guidepoint works to achieve its mission of making individuals, businesses, and the world smarter through personalized knowledge-sharing solutions, the engineering team is taking on challenges to improve our internal application architecture and create new products to optimize the seamless delivery of our services. 

The Site Reliability Engineering Team Lead is responsible for ensuring the reliability, scalability, and performance of a SaaS product running on Azure. The role involves leading a team of SREs to proactively monitor, automate, and optimize system performance while fostering a culture of innovation, continuous improvement, and collaboration with development teams. As the SRE lead, this person will act as the bridge between development and operations, driving best practices in reliability engineering and proactive management of environments through observability. Key areas of focus include maintaining uptime, monitoring performance, resolving incidents, optimizing capacity, managing error budgets, and collaborating with development teams to build resilient and maintainable systems.


This is a hybrid position based in Toronto. 

What You’ll Do:

  • Guide, mentor, and upskill the SRE team, ensuring alignment with organizational priorities
  • Design and implement monitoring strategies to ensure uptime and minimize failures
  • Automate manual processes to improve efficiency and reduce human error
  • Define, manage, and maintain SLOs and SLIs to ensure high availability of systems
  • Manage error budgets and trigger breach actions as per established policies
  • Enhance Datadog automated monitoring and alerting, ensuring critical events are managed through the Status Page
  • Lead incident response alongside engineering leads, support RCA efforts, and drive auto-remediation initiatives
  • Collaborate with Product, Support, Engineering, and Cloud Operations teams to deliver scalable and reliable solutions
  • Actively participate in cost optimization initiatives with Cloud Operations and Engineering
  • Handle escalated customer issues and ensure satisfactory resolution
  • Conduct regular team meetings and training sessions
  • Identify areas for process improvement and implement best practices
  • Provide insights and recommendations to enhance reliability and customer satisfaction

What You Have:

  • 8+ years of experience in software development and Site Reliability Engineering or Production Engineering
  • 3+ years of experience leading an SRE team with expertise in Infrastructure as Code (IaC) using Terraform and Ansible, managing and operating Kubernetes clusters, and implementing monitoring and observability solutions with Datadog
  • Comprehensive understanding of web application security
  • Strong system engineering background with Linux/Windows
  • Proficient in development with Python or Golang
  • Strong understanding of Azure libraries (Client, Management, Asset)
  • In-depth knowledge of web application SaaS platforms and architecture
  • Proficient in SQL and familiar with other database operations
  • Strong communication skills
  • Expertise in technical writing and documentation
  • Ability to rapidly analyze issues, anticipate consequences, make decisions, and take action
  • Ability to work independently and as part of a team
  • Experience in presenting monthly reports and metrics to managers and stakeholders

What We Offer:

  • Paid Time Off
  • Comprehensive benefits plan
  • Company RRSP Match
  • Development opportunities through the LinkedIn Learning platform

About Guidepoint: 

Guidepoint is a leading research enablement platform designed to advance understanding and empower our clients’ decision-making process. Powered by innovative technology, real-time data, and hard-to-source expertise, we help our clients to turn answers into action.

Backed by a network of nearly 1.5 million experts and Guidepoint’s 1,300 employees worldwide, we inform leading organizations’ research by delivering on-demand intelligence and research on request. With Guidepoint, companies and investors can better navigate the abundance of information available today, making it both more useful and more powerful.

At Guidepoint, our success relies on the diversity of our employees, advisors, and client base, which allows us to create connections that offer a wealth of perspectives. We are committed to upholding policies that contribute to an equitable and welcoming environment for our community, regardless of background, identity, or experience.

The Company
HQ: New York, NY
2,882 Employees
On-site Workplace
Year Founded: 2003

What We Do

Guidepoint connects clients with vetted subject matter experts—Advisors—from our global professional network. Our clients leverage the insights and perspectives shared by our Advisors to stay informed and make better business decisions.

Our multinational client list includes nine of the top 10 global consulting firms, hundreds of hedge funds (including five of the largest firms), and many of the largest private equity firms and Fortune-ranked companies. Guidepoint’s fourteen offices on three continents provide 24/7, quick and agile service.
