Staff Site Reliability Engineer, Cloud Platform

Posted 2 Days Ago
Redwood City, CA
Mid level
Artificial Intelligence • Machine Learning • Database
The Role
As a Staff Site Reliability Engineer at Zilliz, you will enhance the reliability and performance of distributed database systems, develop SRE tools, automate operations, improve CI/CD pipelines, and support infrastructure, while collaborating with software engineers and contributing to the Milvus community.

About Zilliz

Zilliz is a fast-growing startup developing the industry’s leading vector database for enterprise-grade AI. Founded by the engineers behind Milvus, the world’s most popular open-source vector database, the company builds next-generation database technologies to help organizations quickly create AI applications. On a mission to democratize AI, Zilliz is committed to simplifying data management for AI applications and making vector databases accessible to every organization.

 

What you will do:

  • Work at the intersection of development and site reliability, creating SRE tools and systems and supporting existing infrastructure and platforms.
  • Ensure the reliability, availability, and performance of Zilliz’s distributed database systems.
  • Develop and implement strategies for monitoring, incident management, and disaster recovery.
  • Automate system operations and maintenance tasks to improve efficiency and reduce manual intervention.
  • Design and build tools to manage and monitor infrastructure, ensuring scalability and robustness.
  • Collaborate with software engineers to enhance system reliability, scalability, and performance.
  • Maintain and improve the CI/CD pipeline to ensure smooth and rapid deployment of changes.
  • Actively contribute to the Milvus open-source community, focusing on improving reliability and operational efficiency.

What we are looking for:

  • 4+ years of experience in site reliability engineering or similar roles with a focus on cloud-native systems.
  • Proficiency in programming or scripting languages such as Python, Go, or Java.
  • Strong knowledge of container and orchestration technologies such as Docker and Kubernetes.
  • Expertise with cloud platforms such as AWS, GCP, or Azure, and their respective monitoring and management tools.
  • Experience with infrastructure as code tools such as Terraform or Ansible.
  • Familiarity with CI/CD tools such as Jenkins, GitLab CI, or Argo.
  • Proven ability to troubleshoot complex distributed systems and resolve issues promptly.
  • Bachelor’s degree or above in computer science, software engineering, or other relevant disciplines.
  • Ability to thrive in a fast-paced startup environment and handle multiple projects simultaneously.

Benefits:

  • Competitive compensation (cash + equity)
  • Regular bonus and equity refresh opportunities
  • Medical, dental, and vision insurance
  • Paid time off, including vacation, sick leave, and global reset/wellbeing days
  • Generous 401(k) and regional retirement plans

Compensation Range

$160,000 - $230,000 USD

Zilliz is committed to building an inclusive and diverse workforce. We are an Equal Opportunity Employer and welcome people from all backgrounds, experiences, abilities, and perspectives. All qualified applicants will receive consideration for employment regardless of race, color, national origin, religion, sexual orientation, gender, gender identity, age, physical disability, or length of time spent unemployed.

Top Skills

Go
Java
Python
The Company
HQ: San Francisco, CA
75 Employees
On-site Workplace
Year Founded: 2017

What We Do

Zilliz is a leading vector database company for production-ready AI. Built by the engineers who created Milvus, the world's most popular open-source vector database, Zilliz is on a mission to unleash data insights with AI. The company builds next-generation database technologies to help organizations rapidly create AI/ML applications and unlock the potential of unstructured data. By taking the burden of complex data infrastructure management off its users, Zilliz is committed to bringing the power of AI to every corporation, every organization, and every individual.

Headquartered in San Francisco, Zilliz is backed by a number of prestigious investors, including Aramco's Prosperity7 Ventures, Temasek's Pavilion Capital, Hillhouse Capital, 5Y Capital, Yunqi Partners, Trustbridge Partners, and others. Zilliz's technologies and products help over 1,000 organizations worldwide easily create AI applications in various scenarios, including computer vision, image retrieval, video analysis, NLP, recommendation engines, targeted ads, customized search, smart chatbots, fraud detection, network security, new drug discovery, and much more. Learn more at zilliz.com or follow @zilliz_universe.
