Staff Software Engineer, Metrics - US (Remote)

Posted 3 Days Ago
Hiring Remotely in San Francisco, CA
Remote
Senior level
Machine Learning • Software
The Role
Lead the scaling of metrics and storage systems, design infrastructure, mentor junior engineers, and enhance customer-facing APIs for AI developers.

At Weights & Biases, our mission is to build the best tools for AI developers. We founded our company on the insight that while there were excellent tools for developers to build better code, there were no similarly great tools to help ML practitioners build better models. Starting with our first experiment tracking product, we have since expanded our solution into a comprehensive AI developer platform for organizations focused on building their own deep learning models and generative AI applications.


Weights & Biases is a Series C company with $250M in funding and over 200 employees. We proudly serve over 1,000 customers and more than 30 foundation model builders including customers such as OpenAI, NVIDIA, Microsoft, and Toyota.


As a Staff Engineer, you'll lead the effort to scale our metrics and storage systems, ensuring they meet the complex demands of our most advanced customers. You’ll play an instrumental role in the evolution of our platform as we grow our capability to ingest and query petabytes of data, making critical technical decisions that optimize the performance, reliability, and cost-effectiveness of our systems.


You will set the technical direction for the team, guiding the organization to balance short-term deliverables with strategic, long-term architectural improvements. You'll partner closely with product management, revenue teams, and other engineering groups to shape and deliver the future of W&B’s flagship Models product, supporting experiment tracking and analytics utilized by over 2,500 leading machine learning and AI teams worldwide.

Responsibilities:

  • Design and implement infrastructure that is scalable, efficient, and tailored to customer needs.
  • Lead the maintenance and monitoring of existing services, identifying and executing necessary improvements to ensure ongoing performance and reliability.
  • Participate in team-wide rotations to respond to customer support issues and site outages.
  • Communicate and collaborate effectively with internal and external stakeholders to achieve optimal outcomes.
  • Lead and mentor junior engineers, supporting their professional growth and development within the company.

Requirements:

  • 8+ years of experience in software engineering, with a focus on data platforms and/or distributed systems.
  • Strong software engineering fundamentals and proficiency in at least one modern programming language (e.g., Python, Go, TypeScript).
  • Extensive experience designing and scaling customer-facing APIs in production environments, ideally leveraging systems like MySQL, Postgres, ClickHouse, Bigtable, Pub/Sub, Kafka, etc.
  • Hands-on experience with Kubernetes, Terraform, and major cloud providers (e.g., GCP, AWS, Azure).

Our benefits:

  • 🏝️ Flexible time off
  • 🩺 Medical, dental, and vision coverage for employees and their families
  • 🏠 Remote first culture with in-office flexibility in San Francisco
  • 💵 Home office budget with a new high-powered laptop
  • 🥇 Truly competitive salary and equity
  • 🚼 12 weeks of parental leave (U.S. specific)
  • 📈 401(k) (U.S. specific)
  • Supplemental benefits may be available depending on your location

We encourage you to apply even if your experience doesn't perfectly align with the job description as we seek out diverse and creative perspectives. Team members who love to learn and collaborate in an inclusive environment will flourish with us. We are an equal opportunity employer and do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. If you need additional accommodations to feel comfortable during your interview process, reach out at [email protected].



Top Skills

AWS
Azure
Bigtable
ClickHouse
GCP
Go
Kafka
Kubernetes
MySQL
Postgres
Pub/Sub
Python
Terraform
TypeScript

The Company
HQ: San Francisco, CA
132 Employees
On-site Workplace
Year Founded: 2017

What We Do

Weights & Biases helps machine learning teams build better models faster. With a few lines of code, practitioners can instantly debug, compare and reproduce their models — architecture, hyperparameters, git commits, model weights, GPU usage, and even datasets and predictions — and collaborate with their teammates.

