Principal Software Engineer - AI Inference NVIDIA NIM

Posted 4 Days Ago
Santa Clara, CA
Expert/Leader
Artificial Intelligence • Hardware • Robotics • Software • Metaverse
The Role
The Principal Engineer will design and improve NVIDIA's Inference Microservices architecture, focusing on scalable microservices, efficient APIs, and high-performance inference solutions. The role involves collaboration with AI model teams, mentoring engineers, and ensuring customer satisfaction through exceptional software design and implementation practices.
Summary Generated by Built In

NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a Principal Engineer for NVIDIA Inference Microservices (NIMs). The right person for this role is technical, creative, and driven to change the way NVIDIA NIMs are designed. Our NIM offerings are easy to use, provide the best performance, and are tested in all deployment scenarios: in the cloud, on self-hosted infrastructure, and locally on all NVIDIA GPUs. You will apply your deep technical expertise in AI inference and containerized software to design an efficient and scalable software architecture that works with inference solutions in a cloud-native ecosystem through well-defined APIs.

NVIDIA is building a new category of products by intersecting our prowess in deep learning and computing with industry-leading technology. You will harness groundbreaking inference acceleration from NVIDIA and design inference microservices that span multimodal models, protein folding, and weather prediction. You will influence and drive advances across many NVIDIA teams and external partners to solve the hardest problems in AI inference. In this role, you will design and improve our NIM architecture using the latest advances in AI inference, specifying and refining the underlying libraries, optimization techniques, containerization details, and easy-to-use APIs that capture metrics and traceability.

What you'll be doing:

  • You are the engineer's engineer, demonstrating good engineering practice and mentoring others to follow suit. You will architect and build NIM software to support modularity and high performance using the latest GPU-enhanced technology.

  • Design scalable and maintainable microservices and define APIs for different inference use cases, covering both local and distributed processing, while providing observability. You deliver software through rapid iterations and support many different teams in reusing scalable architecture and components.

  • This role requires collaboration with multiple AI model teams to build an efficient architecture that improves inference performance and re-usability. You will define metrics and drive improvements based on user feedback and industry expectations.

  • You are a great communicator! Through your partnership with our NIM leadership, you will deliver a cohesive and enticing architecture to our engineers that is reflected in customer satisfaction.

What we need to see:

  • Experience with large language models or generative AI, delivering high-performance inference, and deep expertise in microservices, PyTorch, ONNX, REST and gRPC APIs, and multiple inference backends.

  • Technical leadership experience providing designs and building scalable microservices in an agile software development environment. You demonstrate the ability to lead multi-functional efforts, working effectively with teams, principals, and architects across organizational boundaries.

  • You are a mentor and coach in your interactions with all your colleagues.

  • BS or MS in Computer Science, Computer Engineering or related field (or equivalent experience).

  • 12+ years of experience building, debugging, analyzing and optimizing runtime performance of distributed services.

Ways to stand out from the crowd:

  • Experience building inference systems.

  • Prior experience providing full-stack development that scales to large numbers of nodes.

  • Prior MLOps experience and experience using ML and AI technologies.

  • CUDA experience and an ability to use GPU-optimized libraries for improved performance.

We are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and talented people in the world working for us. If you're creative and autonomous with a real passion for technology, we want to hear from you.

The base salary range is 272,000 USD - 419,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

ONNX
PyTorch
The Company
HQ: Santa Clara, CA
21,960 Employees
On-site Workplace
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
