Senior Software Developer, HPC Cluster Management

Posted 4 Days Ago
Santa Clara, CA
Senior level
Artificial Intelligence • Hardware • Robotics • Software • Metaverse
The Role
The Senior Software Developer will focus on developing and managing Linux-based cluster software, particularly in hardware integration, installation, and provisioning. Responsibilities include ensuring efficient performance of NVIDIA's Bright Cluster Manager, enhancing scalability, and supporting new hardware and Linux distributions, while also assisting customers with the software.
Summary Generated by Built In

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you will be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Join the team and see how you can make a lasting impact on the world!

We have positions available for enthusiastic, hardworking, and experienced software developers to work on hardware integration and bare-metal provisioning functionality in our Linux-based cluster management software. NVIDIA's Bright Cluster Manager powers thousands of Linux clusters around the world, ranging from a few nodes to several thousand nodes. Bright clusters can run on-premises, completely in the cloud, or in a hybrid environment.

What you’ll be doing:

  • Development of the head node and compute node installation and provisioning processes.

  • Work on functionality in the area of edge site deployment.

  • Integrating our product with the latest hardware (e.g., GPUs, DPUs, accelerators, high-speed interconnects such as InfiniBand).

  • Work on features related to composable infrastructure management.

  • Develop new features for our BIOS and firmware upgrade management.

  • Develop functionality that makes Bright clusters usable for a wider range of workloads and increases scalability so that clusters can grow to a very large number of nodes.

  • Adding support for new Linux distributions.

  • Improving support for alternative CPU architectures such as ARM.

  • Work on adding features to our Ansible collections for Cluster Installation and Management.

  • Assist our support team with customer support requests related to the above-mentioned features, and help our customers use our product more efficiently.


What we need to see:

  • Degree in Computer Science or related field (or equivalent experience).

  • 7+ years of experience in software development and/or related roles.

  • Our software is based on Linux. You should be very familiar with the Linux operating system, and in particular with networking concepts in Linux. In addition, good practical knowledge of the most common software installed as part of a typical Linux installation is required.

  • You are proficient in Python and intimately familiar with object-oriented software design, design patterns, and concurrent programming techniques.

  • Emphasis on high-quality work and on producing clean code.

  • Eager to learn and use new technologies.

Ways to stand out from the crowd:

  • Experience with Ansible.

  • Experience with high-performance computing and system administration.

  • Knowledge of Kubernetes, AWS, Azure, GCE, OpenStack, Jenkins and distributed programming.

  • Proficiency in C++.

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

C++
Python
The Company
HQ: Santa Clara, CA
21,960 Employees
On-site Workplace
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”

