Sr. Site Reliability Engineer (Remote, Mexico)

Posted 10 Days Ago
Hiring Remotely in México
Remote
Senior level
Cloud • Information Technology • Consulting
The Role
The Sr. Site Reliability Engineer will design, build, and maintain production services across multiple data centers, ensuring reliability and scalability. Responsibilities include automating infrastructure provisioning, collaborating with development teams, troubleshooting incidents, and writing documentation. The role requires proactive identification of issues and implementation of solutions.
Summary Generated by Built In

About IO Connect Services:


IO Connect Services is an AWS Advanced Tier Services Partner and Datadog Partner with a commitment to delivering complex and well-architected technical solutions worldwide. Founded in 2016, our professionals are dedicated to establishing and maintaining trust with our clients and business partners for long-term relationships. 


Position Overview:

As we expand customer deployments, we’re seeking an experienced SRE. Specifically, we’re searching for someone who has fresh ideas and a unique viewpoint, and who enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences for every interaction.


Responsibilities


Design, build, maintain, and scale production services and server farms across multiple data centers for complex, data-intensive cloud services.

Design and enhance software architecture to improve scalability, service reliability, capacity, and performance.

Write automation code for provisioning and operating infrastructure at massive scale. You are not an operator; you’re an experienced software engineer focused on operations.

Work with development teams to make sure applications fit well within the infrastructure and that scalability and reliability are designed and implemented from the ground up. You will work with QA to build pipelines and automation for delivering and deploying applications to production.

Roll up your sleeves to troubleshoot incidents: formulate theories, test your hypotheses, and narrow down possibilities to find the root cause.

Write postmortem reviews and remediation recommendations.

Identify bad trends before they become problems; respond to automated system alerts, troubleshoot system errors effectively, and work incidents to return systems to normal operating conditions.

Author and update high-quality documentation of all relevant specifications, systems, and procedures.

Support and comply with the company’s Quality Management System policies and procedures.



Required skills and qualifications

Bachelor’s degree (or equivalent) in computer science or related discipline

Knowledge of IaC technologies such as Terraform, Ansible, Puppet, Chef.

Knowledge of cluster creation and management with Kubernetes

Knowledge of Microsoft Azure, AWS, and Google Cloud, including Azure services such as virtual machines and virtual network configuration.

Knowledge of cloud service models such as IaaS, PaaS, and SaaS

Knowledge of CI/CD

Scripting knowledge with PowerShell

Knowledge of IP addressing and subnet masks

Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript

Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, YARN)

Proactive approach to identifying problems, performance bottlenecks, and areas for improvement


What we offer:


Base Salary and permanent contract directly with the company

Continuous training plan with paid certifications

Career plan according to your development and knowledge

Benefits above the legal minimum: 12 days of paid time off, 30-day Christmas bonus, medical insurance, life insurance, savings fund, groceries bonus

Quarterly Performance Bonus

Computer equipment for your work

Optional 100% Home Office

Top Skills

C/C++
Java
JavaScript
Python
Ruby
The Company
Newark, New Jersey
84 Employees
On-site Workplace
Year Founded: 2016

What We Do

IO Connect Services is an AWS Advanced Tier Services Partner, a certified MuleSoft® System Integrator Partner, a Salesforce Commerce Cloud Consulting Partner, and a member of the Datadog Partner Network. Our professionals have over 20 years of experience delivering complex technical solutions worldwide. If there is one thing you must know about us, it is this: we work relentlessly to establish and maintain trust with our clients and all business partners for long-term relationships.
