Site Reliability Engineer (Pinot)

Posted 22 Days Ago
Hiring Remotely in US
Remote
Senior level
Software
The Role
As a Site Reliability Engineer at StarTree, you will manage and tune large-scale distributed systems, specifically Apache Pinot. Your responsibilities include monitoring system performance, collaborating with customers to resolve incidents, executing disaster recovery strategies, and influencing system roadmaps based on troubleshooting experiences.
Summary Generated by Built In

At StarTree, we're a group of passionate individuals who want to improve the lives of many by developing tools and technologies that support availability and speed in the world of real-time analytics.

Our aim is to make it simple for every company to delight their users - external and internal - and create new revenue streams from their data, by building the world’s most comprehensive and accessible cloud analytics system.

About the role:

StarTree is seeking exceptional Site Reliability Engineers for Pinot (SRE-Pinot) to manage, tune, and debug large-scale, highly available distributed systems. You will work with a team of passionate and talented engineers on the automation, tuning, and troubleshooting of Apache Pinot. We are looking for motivated, hardworking, and focused individuals who have a real passion for operational excellence, data systems, and automation.

Responsibilities:

  • Leverage various monitoring and alerting services to solve intricate programming problems at scale
  • Manage and tune multiple critical customer-facing Apache Pinot clusters
  • Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
  • Build rapport and work closely with customers to mitigate and resolve incidents
  • Execute disaster recovery strategies with minimal downtime
  • Collaborate with other engineers to understand and troubleshoot systems, and use the experience gained to influence the roadmap of other teams
  • Debug Pinot queries and ingestion when incidents occur

Requirements:

  • 5+ years of experience as an engineer (SRE, SDET, or development)
  • Experience managing highly available production-facing distributed systems and in-depth knowledge of Java are a plus
  • Exposure to cloud platforms such as AWS, GCP, or Azure is a plus
  • Experience with Kubernetes and container orchestration is a plus
  • Familiarity with streaming systems, such as Kafka, Pulsar, Flume, Flink, Spark, or similar
  • Strong troubleshooting and critical thinking skills
  • Experience managing Apache Pinot is preferred
  • Experience building Java apps is required
  • Experience with ZooKeeper is a huge plus

The base salary range for this US full-time position is $140,000 - $190,000, subject to standard withholding and applicable taxes. Additionally, new hires receive competitive and compelling equity grants and access to a comprehensive benefits offering. The base salary range reflects the minimum and maximum target for candidates. The salary and equity compensation offered may vary depending on factors including location, skills, experience, and the assessment process.

About StarTree:  

StarTree is a cloud-based software company that enables business customers to derive advanced insights from real-time and historical data. StarTree was founded by the core software engineering team and inventors of Apache Pinot, which currently powers hundreds of user-facing applications at companies across industries, including LinkedIn, Uber, Target, 7Eleven, Etsy, Walmart, WePay, Factual, Weibo, and more. StarTree Cloud has enabled even more companies to deploy and operate real-time analytics at scale, including Stripe, Sovrn, Roadie, Just Eat Takeaway.com, Dialpad, Guitar Center, Blinkit, and more.

We recently announced our Series B funding with investment from GGV Capital, Sapphire Ventures, Bain Capital Ventures, and CRV. We have been named one of The Information's 50 Most Promising Startups and one of CRN's 10 Coolest Cloud Computing Startup Companies of 2022!

Top Skills

Java

The Company
HQ: Mountain View, CA
60 Employees
On-site Workplace
Year Founded: 2019

What We Do

When you hear “decision-maker,” it's natural to think, “C-suite,” or “executive.” But these days, we’re all decision-makers. So, yes, the CEO is a crucial decider, but the franchise owner, the student, or the big box shopper all have important choices to make, too. The difference is, those traditional decision-makers have access to relevant data to guide their thinking. All the others? Flying blind.

Our vision is to change that. We believe every decision-maker should have access to fast, fresh, actionable insights. We’re motivated to unlock what’s possible when you put the right information in the right hands at the right time. Like the restaurant owner who’s able to call in an extra cook because she can see the surge in orders coming. Or the holiday shopper who has plenty of time to pick a different gift for his loved one because he sees his original order is delayed at the moment the supplier registers a procurement delay. Or the use-case we can’t even envision yet, because it is sitting in your data, waiting to be set free.

Exposing timely information to real decision-makers in intuitive apps that not only inform, but also allow you to act on the information being shared. That’s how we’re unleashing the power of user-facing analytics.
