Senior Site Reliability Engineer (SRE)

North Carolina
Senior level
Artificial Intelligence • Software
The Role
As a Senior Site Reliability Engineer, you will ensure the reliability and scalability of cloud-based infrastructure, manage monitoring systems, lead incident response, optimize system performance, and implement automation solutions. You will also design disaster recovery plans and adhere to security best practices.

What we do 

At Blankfactor, we are dedicated to engineering impact. We are passionate about creating value by building best-in-class tech solutions for companies looking to transform, innovate, and scale. In every project, we aim to deliver work that moves the needle and drives measurable outcomes for our partners and clients. Our full-stack development, data engineering, digital product, and enterprise AI solutions cater to a range of industries, including payments, banking, capital markets, and life sciences.

We are headquartered in Miami, Florida, have offices in Bulgaria, Colombia, and Romania, and are rapidly expanding our global footprint. Our culture of engineering excellence, technical expertise, and care for both our clients and our talented workforce has made us one of the fastest-growing companies in America.

We only hire the best and brightest. If you have talent and ambition, join us and be part of an environment that fosters innovation, collaboration, and growth. Welcome to Blankfactor!


 

What to expect in this role

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team and ensure the reliability, performance, and scalability of our cloud-based infrastructure. The ideal candidate will possess strong expertise in AWS architecture, observability tools, event-driven systems, and infrastructure as code, with a proven ability to automate processes and enhance system resilience.

  • Monitoring & Observability: Design, implement, and manage robust monitoring systems using CloudWatch, CloudTrail, Splunk, and other observability tools.

  • Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term fixes to prevent future occurrences.

  • System Performance: Optimize system reliability and performance by identifying bottlenecks and proactively implementing solutions.

  • Alerting & Logging: Develop and manage alerting mechanisms to ensure early detection and resolution of system anomalies.

  • Disaster Recovery: Create and maintain disaster recovery plans and conduct regular testing to ensure system availability in case of failures.

  • Event-Driven Architectures: Design and manage event-driven architectures using AWS services such as AWS Events, Custom Events, and other cloud-native tools.

  • Infrastructure as Code (IaC): Utilize Terraform (HCL) and Ansible for provisioning and maintaining cloud infrastructure.

  • Security & Compliance: Implement security best practices using tools like KMS and Access Analyzer, and ensure systems adhere to compliance standards.

  • Automation: Build and maintain automation scripts using Python, Go, and Rego to streamline workflows and reduce manual intervention.

Qualifications and Tech Proficiency

  • Cloud Expertise: Deep knowledge of AWS architecture, including observability tools like CloudWatch and CloudTrail, and security services such as KMS and Access Analyzer.

  • Event-Driven Systems: Proficient in managing event-driven systems (EDS) and tools like AWS Events and Custom Events.

  • Observability & Analysis: Hands-on experience with monitoring and log analysis tools such as Splunk, and with designing custom metrics and events.

  • IaC & Automation: Proficiency in writing and managing infrastructure using Terraform (HCL) and Ansible.

  • Programming Languages: Advanced skills in Python, Go, and Rego for automation and tool development.

  • Problem-Solving: Strong troubleshooting skills with a focus on system performance optimization and root cause analysis.

Preferred Qualifications

  • Experience with MRF and DAP systems in AWS.

  • Knowledge of compliance and audit processes in cloud environments.

  • Strong interpersonal skills for collaboration with cross-functional teams.


 

What We Offer

  • Competitive salary with an attractive benefits package.

  • Work across wide-ranging and interesting areas with highly skilled professionals, international clients, and the latest technologies.

  • A culture of collaboration and continuous learning, in a work environment invested in both nurturing your skills and shaping our common technological future.

We believe that diversity of experience and background contributes to more robust ideas and a stronger team. All qualified applicants will receive consideration for employment without regard to religion, race, sex, sexual orientation, gender identity, national origin, or disability.


 

Top Skills

AWS
Go
Python
The Company
HQ: Miami, Florida
360 Employees
On-site Workplace
Year Founded: 2015

We work with the greatest talent based in Colombia, Bulgaria, Costa Rica, and around the world to deliver innovative products.


Follow us:
Twitter: @_Blankfactor
Instagram: @blank.factor
Facebook: Blankfactor
