Machine Learning Performance Engineer

Sunnyvale, CA
Senior level
Artificial Intelligence • Transportation

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition  (including breastfeeding) or any other basis as protected by applicable law.  

About us   

Founded in 2017, Wayve is the leading developer of Embodied AI technology.  Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Our vision is to create autonomy that propels the world forward.  Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.  

At Wayve, big problems ignite us—we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter.  We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.  

Make Wayve the experience that defines your career!  

The Role

We are seeking skilled engineers to join our Machine Learning Platform team to optimise large-scale training jobs as we scale our models through the next order of magnitude. The Machine Learning Platform team owns our GPU training infrastructure and the software abstractions around it, and you will focus specifically on improving training efficiency.

Challenges you will own

  • Maximising the model FLOPs utilisation (MFU) of our large-scale training jobs.
  • Profiling and identifying bottlenecks in training code.
  • Implementing GPU kernels to improve training throughput.
  • Working closely with Research teams to integrate and test training efficiency improvements.
  • Owning and improving our GPU training clusters.

About You

Essential:

  • 5+ years of experience in performance optimization or ML engineering.
  • Experience optimising large-scale training jobs on GPU compute clusters (e.g. with PyTorch, CUDA).
  • Experience working in platform teams and collaborating with research teams.
  • Experience reporting benchmarked performance and tracking it over time in an open and accessible way.
  • Ability to write high-quality, well-structured and tested Python code.
  • BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience.

Desirable:

  • Solid experience working with concurrent, parallel and distributed computing.
  • Experience using NVIDIA Nsight Systems.
  • Experience implementing GPU kernels.
  • Knowledge of computing fundamentals - what makes code fast, secure and reliable.


This is a full-time role based in our office in Sunnyvale, California. At Wayve we want the best of all worlds, so we operate a hybrid working policy that combines time together in our offices and workshops, to fuel innovation, culture, relationships and learning, with time spent working from home. We operate core working hours so you can determine the schedule that works best for you and your team.


We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

For more information visit Careers at Wayve. 

To learn more about what drives us, visit Values at Wayve 

DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form, to help us identify areas for improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.



Top Skills

GPU Compute Clusters
Machine Learning
NVIDIA Nsight Systems
Performance Optimization
Python

The Company
HQ: London
200 Employees
On-site Workplace
Year Founded: 2017

What We Do

We're Wayve, a leading developer of embodied intelligence for autonomous vehicles. We use AI to pioneer a next-generation approach to self-driving: AV2.0, which enables fleet operators to unlock the benefits of AV technology at scale.

Founded in 2017, Wayve is made up of a diverse team of experts in machine learning and robotics. We were the first to deploy AVs on public roads with end-to-end deep learning. Today, our teams are based in London and California, and we're testing AVs in cities across the UK.

Inspired by our vision for a smarter, safer, more sustainable world, we're looking for people who are passionate about building breakthrough solutions to some of the world’s most important challenges. If you're looking for an exciting opportunity with a dynamic team, get in touch!
