Senior Site Reliability Engineer (SRE) - LATAM (Hybrid)

Posted 3 Days Ago
Buenos Aires, Ciudad Autónoma de Buenos Aires
Hybrid
Senior level
Machine Learning • Software
The Role
The Senior Site Reliability Engineer will manage and maintain infrastructure standards, develop internal and external tools, troubleshoot operating issues across various infrastructures, and provide high-quality solutions to enterprise customers, while fostering strong partnerships for customer satisfaction.

At Weights & Biases, our mission is to build the best tools for AI developers. We founded our company on the insight that while there were excellent tools for developers to build better code, there were no similarly great tools to help ML practitioners build better models. Starting with our first experiment tracking product, we have since expanded our solution into a comprehensive AI developer platform for organizations focused on building their own deep learning models and generative AI applications.


Weights & Biases is a Series C company with $250M in funding and over 200 employees. We proudly serve over 1,000 customers and more than 30 foundation model builders including customers such as OpenAI, NVIDIA, Microsoft, and Toyota.


The Senior Site Reliability Engineer (SRE) will report to the Enterprise Engineering Manager. This person will be responsible for setting up and maintaining infrastructure standards.


In addition, this role is pivotal to tool development, both external and internal. You'll help make it possible to deploy our software to our enterprise customers, establishing a strong foundation of technical excellence for our diverse customer base.


This role will also establish firm partnerships with our enterprise customers, which will boost customer satisfaction and lead to more comprehensive solutions.


This team manages the variance across infrastructure types and implements suitable solutions that cater to the unique needs of each enterprise customer. You will leverage your skills, knowledge, and adaptability to navigate this complex landscape, consistently providing high-quality solutions to our customers.

What you’ll achieve (Responsibilities)

  • Set up and maintain infrastructure standards to ensure stable and smooth operations, supporting efficient functioning and laying the groundwork for meaningful improvements over time.
  • Develop tools for both external and internal purposes. This involves building mechanisms to deploy our software to enterprise customers effectively, fostering a foundation of technical excellence.
  • Troubleshoot and resolve issues related to operating Weights & Biases across different types of infrastructure.
  • Understand the nuances of various infrastructure types and implement suitable solutions catering to the unique needs of each enterprise customer.
  • Leverage technical skills, knowledge, and adaptability to effectively navigate the complex landscape of different infrastructures.

What we’re looking for (Requirements)

  • 5+ years of software development experience in an enterprise software environment.
  • Proficiency in Go programming.
  • Deep understanding of distributed systems.
  • Experience with Kubernetes (required).
  • Knowledge of monitoring and scaling services for distributed systems (including Datadog, New Relic, OpenTelemetry, Prometheus, etc.).
  • Familiarity with at least one major cloud provider (AWS, Azure, Google Cloud).
  • Active collaboration with team members to align and achieve departmental goals.

Our Benefits

  • 🏝️ Flexible time off
  • 🩺 Medical, dental, and vision coverage for employees and their families
  • 🏠 Remote first culture with in-office flexibility in San Francisco
  • 💵 Home office budget with a new high-powered laptop
  • 🥇 Truly competitive salary and equity
  • Supplemental benefits may be available depending on your location

We encourage you to apply even if your experience doesn't perfectly align with the job description as we seek out diverse and creative perspectives. Team members who love to learn and collaborate in an inclusive environment will flourish with us. We are an equal opportunity employer and do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. If you need additional accommodations to feel comfortable during your interview process, reach out at [email protected].



Top Skills

Go
The Company
HQ: San Francisco, CA
132 Employees
On-site Workplace
Year Founded: 2017

What We Do

Weights & Biases helps machine learning teams build better models faster. With a few lines of code, practitioners can instantly debug, compare and reproduce their models — architecture, hyperparameters, git commits, model weights, GPU usage, and even datasets and predictions — and collaborate with their teammates.
