DevOps Engineer

Posted 23 Days Ago
Palo Alto, CA
Hybrid
120K-215K Annually
Mid level
Artificial Intelligence • Machine Learning • Software
Lightning empowers everyone to build AI.
The Role
The DevOps Engineer will design, build, and maintain cloud infrastructure, automate deployment processes, monitor system performance, and troubleshoot issues. Responsibilities include managing CI/CD pipelines and collaborating with development teams to ensure seamless integration and delivery of new features while adhering to security best practices.
Summary Generated by Built In
Who We Are

Lightning AI is the company reimagining the way AI is built. After creating and releasing PyTorch Lightning in 2019, Lightning AI was launched to reshape the development of artificial intelligence products for commercial and academic use.

We are on a mission to simplify AI development, making it accessible to everyone—from solo researchers to large enterprises. By removing the complexity of building and deploying AI tools, we empower innovators to focus on solving real-world problems. Our platform is built to scale with the latest AI advancements while staying intuitive and adaptable, so you can bring your ideas to life.

We have offices in New York City, Palo Alto, and London and are backed by investors such as Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.

Our Values

  • Move Fast: We act with speed and precision, breaking down big challenges into achievable steps.

  • Focus: We complete one goal at a time with care, collaborating as a team to deliver features with precision.

  • Balance: Sustained performance comes from rest and recovery. We ensure a healthy work-life balance to keep you at your best.

  • Craftsmanship: Innovation through excellence. Every detail matters, and we take pride in mastering our craft.

  • Minimal: Simplicity drives our innovation. We eliminate complexity through discipline and focus on what truly matters.

What We're Looking For

We are looking for an experienced DevOps Engineer to design, build, and maintain our cloud infrastructure and scale CI/CD pipelines, ensuring reliability and stability for our enterprise customers. With a primary focus on Golang, you'll play a key role in automating our deployment processes, monitoring system performance, and troubleshooting infrastructure issues.

As part of the Red Squad Team, you’ll work closely with our development teams and report to the Director of Product. This hybrid role is based in Palo Alto, with a two-day in-office requirement. The salary range is $120,000 - $215,000.

What You’ll Do

  • Design, build, and maintain scalable infrastructure for deploying, monitoring, and automating our cloud environments.
  • Collaborate closely with development teams to ensure seamless integration and delivery of new features.
  • Implement and manage CI/CD pipelines to improve deployment frequency and reduce manual intervention.
  • Monitor system performance, identify bottlenecks, and develop strategies to improve reliability and performance.
  • Ensure security best practices are followed across infrastructure and deployment processes.
  • Troubleshoot and resolve infrastructure-related issues in a timely manner.
  • Stay up to date with the latest industry trends and tools to drive innovation in DevOps practices.

What You’ll Need

  • Proven experience as a DevOps Engineer or in a similar role, with a deep understanding of cloud infrastructure (AWS, GCP, or Azure).
  • Expertise in CI/CD tools such as Jenkins, CircleCI, GitHub, or GitLab.
  • Ability to code in Go (Golang).
  • Experience with infrastructure as code tools like Terraform, Ansible, or CloudFormation.
  • Familiarity with containerization technologies like Docker and Kubernetes.
  • Knowledge of monitoring and logging tools such as Prometheus, Grafana, or ELK stack.
  • A strong security mindset with experience in managing secure cloud environments.
  • Excellent problem-solving skills, attention to detail, and ability to work in a fast-paced, collaborative environment.
Benefits and Perks

We offer competitive base salaries and stock options with a 25% one-year cliff and monthly vesting thereafter. For our international employees, we work with Velocity Global to pay you in your local currency and provide equitable benefits across the globe.

In the US, we offer:

  • Medical, dental and vision
  • Life and AD&D insurance 
  • Flexible paid time off plus 1 week of winter closure
  • Generous paid family leave benefits
  • $500 monthly meal reimbursement, including groceries & food delivery services
  • $1,000 home office stipend
  • $1,000 annual learning & development stipend 
  • 100% Citi Bike membership (NYC only)
  • $45/month gym membership 
  • Various additional medical and mental health services

At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.

Top Skills

Go
The Company
HQ: New York, NY
50 Employees
Hybrid Workplace
Year Founded: 2019

What We Do

Our platform provides intuitive open source tools, powerful cloud infrastructure and expertise to help you build AI securely.

Why Work With Us

Our team has shaped groundbreaking AI projects like PyTorch and PyTorch Lightning. We’re educators, innovators, and collaborators, united by a mission to democratize AI. As a hybrid company headquartered in New York City, we value in-person time for collaboration while still allowing flexibility.
