Site Reliability Engineer - West Coast

Posted 18 Days Ago
Hiring Remotely in United States
Remote
Mid level
Software
The Role
As a Site Reliability Engineer, you will manage production infrastructure on AWS and Azure, ensuring high availability and performance. You'll automate alerts, collaborate with R&D for scalable solutions, and document processes for repeatability. Your responsibilities include troubleshooting incidents, monitoring system observability, and conducting on-call duties.
Summary Generated by Built In

The world of digital assets is accelerating in speed, magnitude, and complexity, opening the door to new ways to leverage the blockchain. Fireblocks’ platform and network provide the simplest and most secure way for companies to work with digital assets, and it is trusted by some of the largest financial institutions, banks, globally recognized brands, and Web3 companies in the world, including BNY Mellon, BNP Paribas, ANZ Bank, Revolut, and thousands more.

Want to build something new? Be one of the first members of our SRE team: define processes and best practices, and work closely with R&D to design scalable solutions. We're establishing Fireblocks' first-ever Site Reliability Engineering (SRE) team, dedicated to enhancing the reliability and performance of our high-throughput SaaS platform.

 

Responsibilities:

  • Own the production infrastructure on AWS and Azure. Implement sustainable and scalable solutions aimed at improving availability and performance
  • Help identify the root cause of every incident and prevent it from happening again
  • Alert on symptoms, not on outages. Ensure all infrastructure and application alerts are actionable and/or backed by self-healing automation
  • Work closely with R&D and Support, offering education and guidance on integration, support, and monitoring across the toolset
  • Everything-as-code approach: run our infrastructure with Ansible, Terraform, and Kubernetes
  • Document every action, turn it into a repeatable procedure, and then automate it
  • Focus on the system's observability, availability, reliability, performance/latency, and monitoring
  • Conduct periodic on-call duties and emergency response

Minimum Requirements:

  • At least 3 years of experience as a DevOps engineer or SRE in a SaaS environment
  • Experience with coding languages such as Python, JavaScript, Bash, or similar
  • At least 3 years of experience with alerting and monitoring systems such as Datadog / Splunk / New Relic / Prometheus, or similar
  • Experience working with Linux systems from kernel to shell and beyond
  • Experience with cloud platforms such as AWS / Google Cloud / Azure
  • Experience with configuration management tools such as Ansible / Chef / Puppet
  • Experience with Docker, Kubernetes and Helm
  • SCM: Git / Bitbucket / GitLab / Phabricator / Gerrit
  • Strong analytical and troubleshooting skills, with the ability to solve complex problems
  • Strong verbal and written communication skills and a collaborative mindset
  • Ability to dive into detail while understanding the big picture

Nice-to-have: 

  • Extensive Datadog experience; monitoring/dashboard expert
  • Participation in Kubernetes migration projects
  • Previous experience as a C++ or Node.js developer
  • BSc in Computer Science or related technical certifications
  • Previous experience with cryptocurrencies / blockchains is a big advantage



Fireblocks' mission is to enable every business to easily and securely access digital assets and cryptocurrencies. In order to do that, we strongly believe our workforce should be as diverse as our clients, and this is why we embrace diversity and inclusion in all its forms. 

Please see our candidate privacy policy here.

Top Skills

Bash
JavaScript
Python
The Company
HQ: New York, NY
410 Employees
On-site Workplace
Year Founded: 2018

What We Do

For institutions that need to store and move digital assets without the operational or security headache.

Fireblocks streamlines operations by bringing all your exchanges, OTCs, counterparties, hot wallets, and custodians into one platform. Wallets, deposit addresses, and API credentials are secured using patent-pending chip isolation technology and the newest breakthrough in cryptography (MPC). Institutions are using Fireblocks to move funds securely in seconds – instead of hours.
