ML Research Engineer

Posted 20 Days Ago
San Francisco, CA
Hybrid
Expert/Leader
Software
The Role
As an ML Research Engineer at Twelve Labs, you will focus on applied research in video embedding, multimodal language modeling, and intelligent agents. Responsibilities include building scalable models, collaborating on core systems, mentoring, and enhancing enterprise video solutions.
Summary Generated by Built In

Who we are


At Twelve Labs, we are pioneering the development of cutting-edge multimodal foundation models that have the ability to comprehend videos just like humans do. Our models have redefined the standards in video-language modeling, empowering us with more intuitive and far-reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.


With a remarkable $107 million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.


We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.



About the role


As an ML Research Engineer at Twelve Labs, you will drive our applied efforts in video embedding and retrieval, multimodal language modeling, and intelligent agents. You will collaborate closely with other engineers and scientists to build the next generation of Twelve Labs models, services, and infrastructure. The essence of the role is scaling our models, data, and training and inference platform while improving the reliability of our core systems. This role is a perfect fit for research-minded engineers who want to build SOTA video, vision, and video-language modeling systems!

In this role, you will

  • Deliver top-notch applied research solutions to problems like VLM finetuning, auto-labeling of video-text datasets, and model-based filtering of said datasets to optimize (end-)model performance
  • Collaborate with our science org to optimize our core model stack, for example its training and inference performance
  • Define a systematic prompt generation and selection strategy for our flagship VLM
  • Mentor junior engineers/researchers, and hold a high bar around code quality / engineering best practices
  • Lead by example in interviewing, hiring, and onboarding passionate and empathetic engineers
  • Work across teams to understand and manage project priorities and product deliverables, evaluate trade-offs, and drive technical initiatives from ideation to execution to shipment
  • Advance our industry-leading enterprise video solutions by incorporating already-great research into fault-tolerant, low-latency end-to-end systems

You may be a good fit if you have:

  • 10+ years of industry experience (or 7+ with a PhD in a related technical domain)
  • A PhD, or a Master's degree, in machine learning or a closely related discipline
  • Led teams of 5+ engineers as a technical lead, or formally managed engineering teams composed of both junior and senior engineers
  • Published research/engineering work on LLMs, VLMs, video models, or contrastive multimodal models in top-tier AI conferences such as NeurIPS, ICML, ICLR, etc., or have scaled distributed foundation model data acquisition, training, inference, evaluation, etc.
  • Expertise optimizing model inference with TensorRT, ONNX, Triton Inference Server, or directly related technologies
  • Built Kubernetes-based systems for distributed data/ML workflows or worked extensively with HPC tools such as Slurm
  • A passion for, and experience in, both ML modeling and ML/AI systems software engineering
  • Strong Python expertise and considerable prior work history with at least one statically typed language (we use Golang)
  • An applied bent / are not a pure theoretician: we are an applied science and engineering group at an applied science and engineering company!
  • Strong communication skills in written and spoken English

Interview and Onboarding Process:


1) Recruiter Phone Screen

2) Initial Technical Assessment

3) Final Round Technical Assessment & Culture Interview

4) Reference Checks


We're also excited to share that we'll do global onboarding in Seoul for all new hires (paid company travel!).


Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0-to-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at Twelve Labs.


We welcome applicants from all walks of life and are committed to equal-opportunity employment. We cherish and celebrate diversity not just because it is the right thing to do, but because it makes our company much stronger.



Benefits and Perks

🤝 An open and inclusive culture and work environment.

🧑‍💻 Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

✈️ Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

🏙 Remote-flexible, with offices in San Francisco and Seoul and a coworking stipend.

Top Skills

Python
The Company
HQ: San Francisco, CA
20 Employees
On-site Workplace
Year Founded: 2021

What We Do

Helping developers build programs that can see, hear, and understand the world as we do by giving them the world's most powerful video-understanding infrastructure.

