Senior Site Reliability Engineer (Senior Resilience Engineer)

Austin, TX
133K-164K Annually
Senior level
Insurance
The Role
As a Senior Site Reliability Engineer, you will enhance system resilience and automate processes for Texas Mutual. You'll lead incident management, collaborate on system strategies, mentor team members, and advocate for Site Reliability Engineering best practices, shaping the organization's culture and operational excellence.
Summary Generated by Built In

We’re excited you’re considering joining a great place to work! 

Texas Mutual is deeply committed to creating and maintaining an environment of mutual respect and is proud to be an equal opportunity employer. All qualified applicants are encouraged to apply and will receive consideration for employment without regard to age, race, color, national origin, religion, sex, gender identity, sexual orientation, genetic information, veteran status, or any other basis protected by local, state, or federal law.


About this Position
Join Texas Mutual as a Senior Site Reliability Engineer and guide a dynamic team that ensures the smooth operation of a billion-dollar company. You'll drive efforts to automate manual processes, improve system resilience, and deliver excellence in incident response.
As a key player in a company known for its growth and innovation, this is your chance to make a meaningful impact. Located in Austin’s thriving Mueller district, our modern office offers on-site fitness and a variety of amenities. Texas Mutual is consistently recognized as one of the best places to work in the state.
In this position, you will be a vital member of our Site Reliability Engineering (SRE) team, responsible for improving incident response, advancing problem management, identifying automation opportunities, and managing observability tools. You'll work closely with Platform and Value Stream teams to strengthen system resiliency, champion a culture of Site Reliability Engineering, and support our transition from on-premises to cloud infrastructure.

Responsibilities & Qualifications

Ideal candidates will: 

  • Lead positive change with clear, collaborative leadership and measurable project outcomes. 
  • Solve challenges independently while offering solutions-focused guidance to peers. 
  • Empower team growth by sharing knowledge transparently and providing constructive feedback. 
  • Foster a culture of diversity of thought, mutual trust, and accountability.

What you’ll do: 

  • Take ownership of key projects, driving efforts to improve efficiency, enable self-service, and automate manual processes. 

  • Manage initiatives from discovery through planning, scheduling, and execution using Agile Scrum methodologies. 

  • Lead high-stakes production incidents as a Senior Incident Commander, ensuring rapid resolution, clear communication, and poise under pressure. 

  • Facilitate post-incident retrospectives, transforming technical learnings into actionable improvements. 

  • Architect, implement, and maintain cutting-edge observability systems to ensure proactive incident detection and resolution. 

  • Build and manage integrations across systems to streamline monitoring, alerting, and health reporting. 

  • Define and execute strategies for system availability, performance, and reliability, aligning with organizational goals. 

  • Collaborate with stakeholders to establish Service Level Objectives (SLOs) and design strategies for managing breaches. 

  • Mentor and guide team members, setting high standards for technical excellence and operational discipline. 

  • Offer candid, constructive feedback to improve processes, systems, and team performance. 

  • Serve as a trusted advisor, advocating for best practices in reliability engineering and driving cultural change across the organization.

You must have:

  • Bachelor’s degree in a related field or equivalent education, training, or experience. 

  • At least 4 years of experience in site reliability engineering, DevOps, or a related engineering discipline (or equivalent education, training, or experience).

  • Strong leadership skills in incident management and operational excellence. 

  • Demonstrated initiative, the ability to work independently, and a track record of delivering results.

  • Expertise in building and optimizing complex systems.

It would be great to also have:  

  • Expertise in ITIL practices and their application in modern IT environments. 

  • Extensive experience in operations and engineering with distributed systems. 

  • Proficiency with Git and modern CI/CD pipelines. 

  • Advanced skills in programming (Java, C#) and scripting (Python, PowerShell, Bash). 

  • Hands-on experience with automation tools (Terraform, Ansible) and infrastructure as code. 

  • Proven success in implementing monitoring, logging, and alerting solutions. 

  • Exceptional collaboration, negotiation, and presentation skills, with the ability to inspire and influence. 

  • Experience providing constructive feedback and fostering continuous improvement. 

  • A passion for achieving results, with a strong sense of accountability and teamwork. 

Texas Mutual Pay Transparency

The base pay range is based on the market evaluation of the job and may include pay for multiple levels. Individual base pay within the range is determined by a variety of factors, including experience, performance, education, and demonstration of skills and competencies required for each role. Your recruiter can discuss the full value of our total compensation package with you, including our generous bonus plans and flex-hybrid work model.

Base Pay Range: $133,081.10 - $164,394.30 Per Year

Flex-Hybrid Work Environment:

Texas Mutual’s flex-hybrid schedule allows you to bring your best self to work by working remotely and collaborating in the office based on business needs. All Texas Mutual employees are required to have Texas residency and travel to their designated office as needed.

Our Benefits:

  • Annual performance bonus and merit-based pay increase

  • Lifestyle Savings Account ($1,000 per year)

  • Automatic 4% employer contribution to retirement plan

  • 401k plan with 100% employer match up to 6%

  • Student loan repayment matching in 401k plan

  • Three weeks’ time off for vacation

  • Nine paid holidays and two personal days each year

  • Day-one health, Rx, vision, and dental insurance

  • Life and disability insurance

  • Flexible spending account

  • Pet insurance and pet Rx discounts

  • Free on-site gym, fitness classes, and health and wellness resources

  • Free identity theft protection

  • Free student loan repayment and refinancing consultation

  • Professional development and tuition reimbursement

  • Employee referral bonus

  • Free on-site snacks

Top Skills

Bash
C#
Java
PowerShell
Python
The Company
Austin, Texas
1,057 Employees
On-site Workplace
Year Founded: 1991

What We Do

Texas Mutual Insurance Company, the state's leading workers' compensation provider, has a strong foundation of success. We insure 42% of the Texas workers' compensation market, which means that more than 72,000 business owners rely on us to keep their 1.5 million workers safe on the job every day.

Our exceptional employees answer the call by providing strategic safety services, top-notch medical care coordination if an accident occurs, claims handling designed to get workers well and back on the job, a zero-tolerance fraud policy and more. We provide our employees with the training, technology, tools and support needed to deliver top-tier service.

Our competitive salaries, award-winning benefits package, and employee recognition programs are just a few of the reasons our average employee tenure is 10 years. We empower career growth with our professional development opportunities, and support our employees' efforts to make a difference in the communities in which they live.

Texas Mutual is always seeking talented, dedicated people to join its team. Read more about why so many choose to work with us at texasmutual.com/careers
