Senior Site Reliability Engineer

Posted 2 Days Ago
Argentina
Mid level
Artificial Intelligence
The Role
As a Senior Site Reliability Engineer at Clarifai, you will ensure the high availability and smooth operation of core services, develop Kubernetes resources for deployments, and implement infrastructure solutions while partnering with teams to solve engineering challenges. You will work with cloud environments and be responsible for monitoring system performance and optimizing reliability.
Summary Generated by Built In

About the Company

Clarifai is a leading, full-lifecycle deep-learning AI platform for computer vision, natural language processing, LLMs, and audio recognition. We help organizations transform unstructured image, video, text, and audio data into structured data significantly faster and more accurately than humans could on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow, with employees based remotely throughout the United States, Canada, Argentina, India, and Estonia.

We have raised $100M in funding to date, with $60M coming from our most recent Series C, and are backed by industry leaders like Menlo Ventures, Union Square Ventures, Lux Capital, New Enterprise Associates, LDV Capital, Corazon Capital, Google Ventures, NVIDIA, Qualcomm and Osage.

Clarifai is proud to be an equal opportunity workplace dedicated to pursuing, hiring, and retaining a diverse workforce.

Your Impact

Clarifai’s platform is a Kubernetes-native distributed system that requires the orchestration of many components. Efficiently serving and training large neural networks presents unique design and infrastructure challenges.

You will be critical to solving these challenges both in the cloud and in on-premises environments. Additionally, you will be responsible for our broader cloud infrastructure and for our development tools and environments.

The Opportunity

  • Ensure the smooth operation and high availability of Clarifai's core services
  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
  • Develop Kubernetes resources and custom tooling for seamless cloud and on-premises deployments
  • Design and implement scalable, secure, and cost-effective infrastructure solutions
  • Partner with teams across the organization to identify and solve engineering challenges

Requirements

  • BS/BA in Computer Science or related degree
  • Good knowledge of cloud providers (AWS, GCP or similar)
  • Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform and Helm
  • Solid understanding of web and networking fundamentals (HTTP, TLS, DNS, certificates, etc.)
  • Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
  • Strong interpersonal skills working with teams across different time zones and regions

Great to Have

  • Knowledge of basic microservice architecture principles
  • Familiarity with security best practices for cloud-based systems
  • Experience with relational databases, message queues, and key-value stores
  • Experience writing Python, Go, or any other popular programming language
  • Familiarity with any RPC framework
  • Experience developing and building custom Kubernetes operators

Top Skills

Go
Python
The Company
San Francisco, CA
100 Employees
On-site Workplace
Year Founded: 2013

What We Do

We help organizations transform unstructured images, video and text data into structured data, significantly faster and more accurately than humans would be able to do on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow with more than 90 employees and offices in Wilmington, Delaware, San Francisco, and Tallinn, Estonia.
