Software Engineer, Pre-Training/AI

Posted 12 Days Ago
San Francisco, CA
Expert/Leader
Artificial Intelligence • Machine Learning • Software
The Role
As a Research Engineer in AI Pretraining at Eventual, you will develop advanced model training techniques, optimize data pipelines, and collaborate with teams to enhance the Daft data engine for modern AI workloads.
Summary Generated by Built In

About Eventual

Eventual is a data platform that helps data scientists and engineers build data applications across ETL, analytics and ML/AI.

OUR PRODUCT IS OPEN-SOURCE AND USED AT ENTERPRISE SCALE
Our distributed data engine Daft is open-sourced and runs on 800k CPU cores daily. This is more compute than Frontier, the world's largest supercomputer!

Today's data tooling (Spark, Presto, Snowflake) was built for a world of tabular data analytics, but does not generalize to the needs of modern ML/AI such as multimodal data, heterogeneous compute and user-defined Python algorithms.
Eventual and Daft bridge that gap, making ML/AI workloads easy to run alongside traditional tabular workloads.



About the Role

At Eventual, we’re pushing the boundaries of artificial intelligence and large-scale distributed data systems. As a Research Engineer focused on AI Pretraining, you will operate at the intersection of cutting-edge AI research and scalable system development. Your work will involve implementing advanced dataset and model training techniques—including multimodal learning, synthetic data generation, and reinforcement learning from human feedback (RLHF)—to drive innovation and performance on the Daft data engine.

In this role, you will collaborate closely with the Daft data engine team to build and optimize a high-performance data engine, ensuring it meets the scale and demands of modern AI workloads.

Key Responsibilities:

  • Build model pretraining pipelines: Build training and data pipelines in a principled and observable manner using state-of-the-art data techniques

  • Develop a set of benchmarks: Design and define the benchmarks for AI data workloads

  • Data Engineering for AI: Collaborate with our data engine team to design and optimize the Daft data engine for these targeted AI data workloads on massive 100TB+ datasets

  • AI Research: Stay at the forefront of AI research, incorporating the latest advancements into our data engine and platform capabilities


What we look for:

  • Strong programming skills in Python, with experience in deep learning frameworks such as PyTorch or TensorFlow.

  • Deep understanding of transformer architectures, self-supervised learning, and AI model training techniques.

  • Experience with distributed training frameworks (e.g., DeepSpeed, FSDP, Horovod) and efficient model parallelism.

  • Expertise in data pipelines and large-scale dataset management.

  • Familiarity with ML compilers, kernel optimizations, and GPU acceleration is a plus.

  • Familiarity with systems programming (Rust, C++) is a plus.

  • PhD or equivalent research experience in Machine Learning, Computer Science, or related fields is preferred.

Why Join Eventual?

  • Work alongside world-class experts in distributed computing and AI research.

  • Build the next generation of scalable AI infrastructure and training techniques.

  • Competitive salary, equity, and top-tier benefits.

  • A collaborative, engineering-driven environment where innovation thrives.


Benefits and Remote Work

We believe in both the flexibility of remote work and the importance of in-person work, especially at the earliest stages of a startup. We take a flexible hybrid approach, with at least 3 days of in-person work per week, typically Monday through Wednesday, at our office in San Francisco.

We believe in providing employees with best-in-class compensation and benefits, including meal allowances and comprehensive health coverage (medical, dental, vision and more).


About the interview

INTRODUCTORY CALL [15M]

A short video call with one of our co-founders to get acquainted, understand your aspirations and evaluate whether the type of role you are looking for is a good fit.

TECHNICAL PHONE SCREEN [1 HR]

A technical question over video call to assess your technical abilities.

TECHNICAL INTERVIEW PANEL [4 HR]

Technical interviews with the rest of the Eventual team with questions to further understand your technical strengths, weaknesses and experiences.


MEET THE TEAM

As many chats as necessary to get to know us - come have a coffee with our co-founders and existing team members to understand who we are and our goals, motivations and ambitions.

We look forward to meeting you!


WE'RE GROWING - COME GROW WITH US!

We are well funded by investors such as Y Combinator, Caffeinated Capital, Array.vc and top angels in the valley from Databricks, Meta and Lyft.

Our team has deep expertise in high performance computing, big data technologies, cloud infrastructure and machine learning. Our team members have previously worked at top technology companies such as Amazon, Databricks, Tesla and Lyft.

We are looking for exceptional individuals with a passion for technology and a strong sense of intellectual curiosity.

If that sounds like you, please reach out even if you don't see a specific role listed that matches your skill set - we'd love to chat!

Top Skills

C++
DeepSpeed
FSDP
Horovod
Python
PyTorch
Rust
TensorFlow
The Company
HQ: San Francisco, California
20 Employees
On-site Workplace

What We Do

Eventual is building a Data Warehouse from the ground up that is designed to tackle the challenges of working with traditional data engineering and analytics alongside modern ML/AI workloads.

Eventual has raised over $2.5M from investors including Y Combinator, Array VC, Caffeinated Capital and top Silicon Valley executives and founders at companies such as Meta, Lyft and Databricks.
