Service Reliability Engineer

Posted 4 Days Ago
Austin, TX
Hybrid
Senior level
Fintech • Information Technology • Mobile • Payments • Software • Financial Services
As passionate about our people as we are about our mission!
The Role
The Service Reliability Engineer will enhance service reliability through monitoring, automation, and incident response. Responsibilities include defining system reliability metrics, collaborating on hosting solutions, automating recovery processes, and performing forensic analysis of incidents to improve system resiliency and user experience.
Summary Generated by Built In

What We're All About:
Q2 is proud of delivering our mobile banking platform and technology solutions, globally, to more than 22 million end users across our 1,300 financial institutions and fintech clients. At Q2, our mission is simple: Build strong, diverse communities by strengthening their financial institutions. We accomplish that by investing in the communities where both our customers and employees serve and live.
What Makes Q2 Special?
Being as passionate about our people as we are about our mission. We celebrate our employees in many ways, including our "Circle of Awesomeness" award ceremony and a day of employee celebration, among others! We invest in the growth and development of our team members through ongoing learning opportunities, mentorship programs, internal mobility, and meaningful leadership relationships. We also know that nothing builds trust and collaboration like having fun. We hold an annual Dodgeball for Charity event at our Q2 Stadium in Austin, inviting other local companies to play and the community organizations we support to raise money and awareness together.
The Job At-A-Glance:
This role combines operational expertise and technical proficiency to drive service reliability, proactive monitoring, and incident response. As a Service Reliability Engineer, you'll work closely with cross-functional teams to maintain and improve system resiliency, automate recovery processes, and enhance overall user experience. You'll contribute to a culture that values continuous improvement, automation, and collaboration.
A Typical Day:

  • Define and measure system reliability through SLAs, SLOs, and SLIs.
  • Consult on hosting solutions to identify the best fit for specific services and optimize internal services' interactions with hosting platforms.
  • Collaborate with Observability and Incident Response teams to implement monitoring and early-warning systems.
  • Automate recovery processes such as failure remediations, auto-rollbacks, and alerting mechanisms.
  • Support incident management processes, post-incident reviews (PIRs), and root cause analysis (RCA).
  • Perform forensic analysis to isolate issues (e.g., hosting platforms, configuration, or service).
  • Partner with developers to drive performance improvements, establish standards, and optimize changes while maintaining the right balance of reliability, speed of innovation, and cost.
  • Optimize capacity planning and resource performance for seamless scalability under high demand.
  • Foster a reliability-focused culture by partnering across engineering, operations, product, and support teams.
  • Engage in performance and chaos engineering practices to strengthen system resilience.


Bring Your Passion, Do What You Love. Here's What We're Looking For:

  • Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred)
  • 5-8 years of experience in Service Reliability Engineering, Infrastructure Engineering, Software Engineering, Implementations, or Service Optimization.
  • Proven ability to consult on hosting solutions and optimize internal services for hosting platforms and global capabilities.
  • Track record of implementing SRE principles in complex technical systems and environments.
  • Technical Proficiency: Expertise in system architecture, hosting platform performance, high availability, load balancing, and distributed systems.
  • Tooling Experience: Familiarity with tools like HashiCorp Nomad, Consul, Vault, Confluent Cloud (Kafka), Prometheus, Grafana, and Splunk.
  • Optimization Expertise: Ability to improve service health and narrow the focus of troubleshooting across hosting, configuration, and services.
  • Automation Skills: Proficient in scripting (Python, Go, etc.), orchestration, and infrastructure-as-code (Ansible, Terraform).
  • Incident Management: Experience driving monitoring strategies, root cause analysis, and recovery optimizations.
  • Performance & Chaos Engineering: Capability to implement solutions for failure testing and system improvements.
  • Service-Level Understanding: Knowledge of SLIs, SLOs, and error budget calculations.
  • Strategic Thinking: Ability to balance reliability, innovation speed, and service costs while consulting cross-functionally.
  • Analytical Problem-Solving: Strong forensic analysis skills with natural curiosity for identifying root issues.
  • Collaboration: Proven ability to partner with developers to implement standards, performance enhancements, and engineering changes.
  • Communication: Excellent written and verbal skills to simplify technical concepts for stakeholders.
  • Familiarity with Google's SRE principles, Agile methodologies, and DevOps practices.
  • Strong belief in automation, risk assessment, and resilience as cornerstones of system reliability.


This position requires fluent written and oral communication in English.
Applicants must be authorized to work for any employer in the U.S. We are unable to sponsor or take over sponsorship of an employment visa at this time.
Health & Wellness

  • Hybrid Work Opportunities
  • Flexible Time Off
  • Career Development & Mentoring Programs
  • Health & Wellness Benefits, including competitive health insurance offerings and generous paid parental leave for eligible new parents
  • Community Volunteering & Company Philanthropy Programs
  • Employee Peer Recognition Programs - "You Earned it"


How We Give Back to the Community:
You can learn more about our Q2 Spark Program, Q2 Philanthropy fund, and our employee volunteering programs on our Q2 Community page. Q2 supports dozens of wide-reaching organizations, such as the African American Leadership Institute, and The Trevor Project, promoting diversity and success in leadership and technology. Other deserving beneficiaries include Resource Center helping LGBTQ communities, JDRF, and Homes for our Troops, a group helping veterans rebuild their lives with specially adapted homes.
At Q2, our goal is to be a diverse and inclusive workforce that fosters mutual respect for our employees and the communities we serve. Q2 is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Top Skills

Go
Python

The Company
HQ: Austin, TX
2,700 Employees
Hybrid Workplace
Year Founded: 2004

What We Do

Want to feel truly valued at work? Check out Q2! Our unique company culture and superhero employees are what set us apart. We know how to get it done and still have fun! Q2 builds the leading mobile banking software platform serving Credit Unions, Banks (large and small), Community Banks and Financial Institutions. Our mission is to build strong, diverse communities by strengthening their financial institutions. Q2 prioritizes innovation, collaboration, and celebrating the employees who make our mission successful. Q2 is a national "Best Place to Work" Award winner 3 years running! Join our "Circle of Awesomeness"! #Q2Peeps

Why Work With Us

Q2 is a "Top Place to Work" Award winner for 3 years! Nothing builds trusting, collaborative relationships like a fun atmosphere and a shared sense of purpose. Q2 is known for our collaborative, friendly, and mission-driven culture. We prioritize career development and employee recognition. We value our customer relationships and our global impact.

Q2 Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Our hybrid work environment allows us to work where we work best! Employees can choose when to work from home and when to work in person. Q2 also has a few specific days a month when functional groups collaborate together in the office.

Typical time on-site: Flexible
HQ: Austin, TX
Mexico
Bengaluru, Karnataka
Cary, NC
Charlotte, NC
Des Moines, IA
Lincoln, NE
London, GB
Minneapolis, MN
Sydney, New South Wales
