Storage Engineer

Posted 3 Days Ago
Hiring Remotely in USA
Remote
150K-180K Annually
Senior level
Artificial Intelligence • Cloud • Hardware • Machine Learning • Other • Software • Infrastructure as a Service (IaaS)
We build infrastructure for machine learning
The Role
The Storage Engineer will manage and optimize a customer-facing, multi-petabyte VAST storage system, including performance tuning, troubleshooting, and collaboration with other engineers and teams.
Summary Generated by Built In

Voltage Park is on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities to seed-stage startups and nonprofits. We believe that seamless access to compute, with pricing and inventory transparency, is the future of access to GPUs.

As part of this effort, we’re hiring a Storage Engineer to be responsible for the buildout, maintenance, and day-to-day operations of our customer-facing storage system. To succeed in this role, you will need to be comfortable owning break/fix, performance tuning, firmware/OS updates, RMA tracking, coordination with datacenter operations, logging, analytics, automation, testing, and SOPs. The ideal candidate will have a strong background in HPC storage systems; experience with VAST storage systems is required.

This is a fully remote role, but you must be located in the continental US and available to work PST hours. We are not able to provide sponsorship for this position.

Responsibilities

  • Own the full lifecycle of a multi-petabyte, multi-datacenter VAST storage system.

  • Define SOPs and runbooks for handling storage system events.

  • Work on performance tuning, client optimization, and troubleshooting of speed and reliability issues.

  • Optimize storage performance and scalability for large-scale GPU infrastructure.

  • Collaborate with other engineers and teams to integrate storage solutions.

  • Stay updated with the latest storage technologies and best practices in HPC.

  • Be on-call for urgent system incidents.

Qualifications

  • Proven experience in deploying and managing storage solutions for large-scale HPC infrastructures.

  • Experience with VAST storage systems.

  • Expertise in NFS, high-performance parallel file systems, and related storage networking technologies.

  • Strong understanding of HPC architectures and storage performance optimization techniques.

  • Experience with bare metal servers in a datacenter environment.

  • Experience with Linux, Terraform, and Ansible.

  • Strong communication skills and the ability to collaborate effectively with technical and non-technical stakeholders.

  • Experience architecting, building, and delivering complex systems from 0 to 1.

  • Ability to balance pragmatic development with ideal architectures.

  • Effectiveness at navigating tradeoffs between design, risk, cost, and outcomes.

Culture

  • You enjoy working with a small group of friendly, highly motivated, execution-focused colleagues.

  • You’re comfortable with a high degree of autonomy. We expect you to independently prioritize your work and understand how it maps to the overall needs and goals of the company.

  • You’re knowledgeable in your domain but also enjoy wearing multiple hats and venturing outside of your comfort zone when the need arises.

  • You value the ability to write well and understand the importance of good documentation.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $150K - $180K



Top Skills

Ansible
HPC Storage Systems
Linux
NFS
Terraform
VAST Storage Systems

The Company
HQ: San Francisco, CA
51 Employees
Hybrid Workplace
Year Founded: 2023

What We Do

The market for cutting-edge ML compute is broken. Startups, researchers and even big AI labs are scrambling to buy or rent access to the latest chips for ML training. But demand far outstrips supply, and what’s available is only accessible to the well-resourced, placing an artificial damper on innovation.

To solve this challenge, we've launched Voltage Park, and we’re on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities to seed-stage startups and nonprofits.

With around 24,000 NVIDIA H100 GPUs, the Voltage Park cloud is one of the most powerful collections of cutting-edge ML compute in the world. Our clusters consist of 80GB H100 SXM5 GPUs fully interconnected with 3.2 Tb/s InfiniBand.

Why Work With Us

You’ll play a pivotal role as a member of the founding team that will change the face of machine learning infrastructure. As an early hire, you’ll have outsize influence in defining the company’s culture and ensuring mission success.

Voltage Park Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
HQ: San Francisco, CA
