Senior Research Engineer, ML Data Pipelines

Posted 8 Days Ago
Santa Clara, CA
Expert/Leader
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Design and optimize ML data pipelines for large-scale models, collaborate with researchers on data management, and develop tools for data processing and quality monitoring.
Summary Generated by Built In

NVIDIA is searching for a senior or principal engineer who specializes in designing and optimizing large-scale machine learning data pipelines to join the Generalist Embodied Agent Research (GEAR) group. Our team is leading Project GR00T, NVIDIA’s moonshot initiative to build foundation models and full-stack technology for humanoid robots.

You will work with an amazing and collaborative research team that consistently produces influential work on multimodal foundation models, large-scale robot learning, embodied AI, and physics simulation. Our past projects include Eureka, VIMA, Voyager, MineDojo, MimicPlay, Prismer, and more. Your contributions will have a significant impact on our research projects and product roadmaps.

What you will be doing:

  • Design, implement, and optimize scalable ML data pipelines for training multimodal foundation models;

  • Collaborate closely with researchers to preprocess, transform, and manage large datasets for robot model training and evaluation;

  • Develop tools for labeling and curating multiple streams of sensor data;

  • Continuously monitor the robot data collection process and evaluate data quality;

  • Implement and optimize PyTorch data loading modules for video processing and robot learning on large GPU clusters.

What we need to see:

  • Bachelor's Degree in Computer Science, Robotics, Engineering, or a related field;

  • 10+ years of full-time industry experience working with large-scale machine learning data pipelines;

  • Proficiency in JavaScript/TypeScript and React for building frontend applications, such as dashboards, visualization, and analysis tools;

  • Proficiency in Python data processing libraries. Hands-on model training experience in PyTorch, JAX, or TensorFlow;

  • Strong experience with large-scale GPU clusters, HPC environments, and job scheduling/orchestration tools (e.g., SLURM, Kubernetes).

Ways to stand out from the crowd:

  • Master’s or PhD degree in Computer Science, Robotics, Engineering, or a related field;

  • Strong experience with cloud infrastructure management (AWS, Azure, GCP) and data stores (Postgres, MySQL, Elasticsearch, Redis);

  • Experience at autonomous driving or robotics companies training machine learning models on massive datasets;

  • Demonstrated Tech Lead experience, coordinating a team of engineers and driving projects from conception to deployment;

  • Contributions to popular open-source frameworks.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and productive people in the world. Please join us and be at the forefront of developing general-purpose robots and large-scale foundation models!

The base salary range is 224,000 USD - 425,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

AWS
Azure
Elasticsearch
GCP
JavaScript
JAX
Kubernetes
MySQL
Postgres
Python
PyTorch
React
Redis
Slurm
TensorFlow
TypeScript

The Company
HQ: Santa Clara, CA
21,960 Employees
On-site Workplace
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”

