Machine Learning Engineer, Distributed Training Infrastructure

Posted 4 Days Ago
San Francisco, CA
60K-120K
Senior level
Software
The Role
As a Machine Learning Engineer, you'll optimize compute performance, manage infrastructure, and collaborate with teams to enhance distributed training processes.

Who We Are

At Twelve Labs, we are pioneering the development of frontier multimodal foundation models that can see, hear and understand the world as humans do. Our models have redefined the standards in video-language modeling, allowing developers to build programs with state-of-the-art semantic search, summarization and analysis capabilities. 

Twelve Labs has raised $107 million in Seed + Series A funding from world-class VC & corporate partners: NVIDIA, NEA, Radical Ventures, Index Ventures, Snowflake and Databricks. Our advisory team features AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

About The Role

As Machine Learning Engineer, Distributed Training Infrastructure, you will be responsible for ensuring that compute performance and ease of use never delay our research timeline. You will contribute to all aspects of compute and training infrastructure: optimization, observability, scaling, and orchestration. You will collaborate closely with other engineers and scientists to define and implement your chosen roadmap. This role is a perfect fit for research-minded compute specialists who want to build SOTA video, vision, and video-language modeling systems!

In This Role, You Will:

  • Partner with researchers to understand our future research roadmap and to identify the scaling limitations that will most imminently block us from achieving our goals

  • Be a hands-on leader who is excited to debug perplexing node failures at odd hours

  • Collaborate with ML engineers and researchers, and hold a high bar for code quality and engineering best practices

  • Work across teams to understand and manage project priorities and product deliverables, evaluate trade-offs, and drive technical initiatives from ideation to execution to shipment

You May Be A Good Fit If You Have:

  • 6+ years of industry experience

  • Contributions to large-scale distributed training efforts across thousands of accelerators

  • Experience with a wide range of HPC-related tools, and strong opinions about how we should build our stack

  • A passion for solving the most pressing technical challenges, as opposed to the most intellectually satisfying ones

  • Strong Python and infrastructure-as-code expertise

Interview Process

1) Recruiter Phone Screen

2) Initial Technical Assessment

3) Hiring Manager Interview

4) Onsite Technical Interview [In-Person]

5) Final Interview: Culture

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at Twelve Labs.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

Benefits and Perks

🤝 An open and inclusive culture and work environment.

🧑‍💻 Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

🦷 Full health, dental, and vision benefits.

✈️ Flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

🛂 Visa support (such as H-1B and OPT transfer for US employees).

Top Skills

HPC Tools
Infrastructure-As-Code
Python

The Company
HQ: San Francisco, California
94 Employees
On-site Workplace
Year Founded: 2021

What We Do

The world's most powerful video intelligence platform for enterprises.
