Manager of Engineering, SRE

Posted 2 Days Ago
3 Locations
Remote
Senior level
Software
The Role
The Site Reliability Engineering (SRE) Manager will lead a team to ensure system reliability and efficiency, coach team members, promote best practices in observability and CI/CD, manage on-call rotations, drive application resilience, and oversee multiple projects aligning with organizational goals.
Summary Generated by Built In

Who We Are

At Platform Science, we’re working to connect everything that moves.

Founded in 2015, we are an open IoT platform that partners with innovative fleets, application developers, vehicle manufacturers, and equipment providers in the transportation industry to deliver revolutionary solutions to supply chain professionals across the globe.

Our employees are an engaging, diverse group of people who believe in the power of great ideas. We hire people with different experiences and perspectives to build a company culture that fuels growth through innovation.

We value thoughtful actions and empathy for others. We approach challenges with resilience and creativity while encouraging transparency because, no matter our backgrounds or responsibilities, we are one team.

About the Role

The Site Reliability Engineering (SRE) Manager will lead a high-performing team that ensures system reliability, scalability, and efficiency while championing SRE principles across the organization. This role involves coaching the team, promoting best practices, and enabling development teams to deliver observable, maintainable, and production-ready applications. The SRE Manager oversees multiple projects, requests, and initiatives while maintaining clear communication and keeping the team aligned and productive.

Essential Responsibilities

  • Recruit, train, and mentor a team of Site Reliability Engineers to deliver operational excellence.
  • Foster a culture of innovation, collaboration, and adherence to SRE principles like SLOs, error budgets, and production readiness.
  • Standardize and train development teams on observability tools such as Prometheus, Grafana, and Datadog.
  • Enhance developer and release workflows using CI/CD best practices, GitOps methodologies, and tools like Jenkins, ArgoCD, and Docker.
  • Drive application and system resilience through chaos engineering, load testing, and automation.
  • Collaborate with teams to define SLIs and SLOs and to manage error budgets.
  • Manage on-call rotation schedules, optimize alerting processes, and ensure 24/7 production application support.
  • Serve as the escalation point for incident resolution, providing guidance and technical expertise.
  • Build tools, dashboards, and processes to improve incident response, production health, and system reliability.
  • Conduct quarterly "State of the Service" reviews to assess performance, sustainability, and risks.
  • Track and prioritize multiple initiatives while ensuring the team stays focused and aligned with organizational goals.
  • Maintain detailed documentation on team projects, requests, policies, and best practices.
  • Communicate effectively across teams, departments, and stakeholders to ensure alignment and a clear understanding of SRE initiatives.
  • Evangelize SRE practices across the organization and ensure consistent adoption of reliability-focused processes.

Education and Experience

  • 5+ years of experience in software engineering or SRE roles.
  • 2+ years in a leadership or management position.
  • Proven expertise with Kubernetes, ArgoCD, AWS, Prometheus, Grafana, Datadog, Fluentd, Jenkins, and Docker.
  • Strong knowledge of CI/CD and GitOps practices.
  • Excellent verbal and written communication skills.
  • Demonstrated ability to track and prioritize multiple projects, requests, and initiatives effectively.
  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.

Platform Science Benefits Highlights

The company offers various benefits to regular, full-time employees including: 

  • Medical, dental, and vision insurance
  • Short-term and long-term disability insurance
  • AD&D and life insurance
  • 401k plan
  • Paid vacation, sick leave and holidays
  • Six weeks of paid parental leave

For more information, see the Benefits Highlights brochure for regular, full-time employees: https://www.platformscience.com/benefits.

This is an exempt role. Our job titles for each posting may span more than one job level. The estimated base salary for this role is between $134,550 and $200,000. The range displayed on each job posting reflects the minimum and maximum target range for new hire base salaries across all US locations. Compensation packages are based on many factors unique to each candidate, including but not limited to skill set, work experience, relevant training and certifications, business needs, market demands, and specific geographical location. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, and benefits.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

Platform Science collects your personal information to support its business operations, including for human resources, employment, benefits administration, health and safety, and other business-related purposes, as well as for legal compliance. You can review further details of such collection and use in our Privacy Policy (link for browser: https://www.platformscience.com/privacy-notice).

Qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.

At this time we only consider candidates in these states: AL, AR, AZ, CA, CO, FL, GA, ID, IL, KY, MA, MD, MI, MN, MO, NC, NH, NV, NY, OH, OK, OR, PA, SC, TN, TX, UT, VA, WA, and WI. In the future we plan to add more states.

Beware job scams! Our recruiters use @platformscience.com emails only. We don’t interview via text/message. We don't ask for software downloads (except Zoom) or sensitive info (like SSN/bank). Suspect fraud? Report it to law enforcement & [email protected].

Top Skills

ArgoCD
AWS
Datadog
Docker
Fluentd
Grafana
Jenkins
Kubernetes
Prometheus
The Company
La Jolla, CA
180 Employees
On-site Workplace
Year Founded: 2015

What We Do

Platform Science is an innovative, enterprise-grade IoT fleet management platform for the transportation and logistics industry. By bringing 30+ years of telematics expertise together with the innovative thinking of IoT technology experts, Platform Science is the first open platform solution that lets enterprise fleets better configure their fleet management solution and meet the changing demands of the regulatory landscape, all while preparing for a future of IoT-connected trucks and freight and the digital supply chain.
