Sr Site Reliability Engineer, Platform Engineering

Posted 21 Days Ago
Hiring Remotely in United States
Remote
186K-251K Annually
Senior level
Cloud • Information Technology
The Role
As a Senior Site Reliability Engineer, you'll ensure the efficiency and scalability of the real-time microservices-based infrastructure, collaborate with various engineering teams for operational solutions, and contribute to code and documentation for system improvements.
Summary Generated by Built In
Who we are

Kentik is the network observability company. Our platform is a must-have for the network front line, whether digital business, corporate IT, or service provider. Network professionals turn to the Kentik Network Observability Cloud to plan, run, and fix any network, relying on our infinite granularity, AI-driven insights, and insanely fast search.

Kentik makes sense of network, cloud, host, and container flow, Internet routing, performance tests, and network metrics. We show network pros what they need to know about their network performance, health, and security to make their business-critical services shine. Networks power the world’s most valuable companies, and those companies trust Kentik. Market leaders like IBM, Box, and Zoom rely on Kentik for network observability. Visit us at kentik.com and follow us at @kentikinc.

What we do

Kentik's Platform Engineering group is responsible for storing, enriching, and querying traffic metadata and metrics from the world's largest networks. Our platform actively monitors infrastructure, triggers automated responses to outages and attacks, and plays a critical role in delivering complete network observability to our customers. Our platform ingests trillions of records and serves hundreds of thousands of queries for our users each day. The scale this group supports is tremendous.

Platform Engineering consists of four teams: (1) Ingest and Storage, (2) Query, (3) Network Data, and (4) SRE. As a senior engineer in Platform Engineering, you will have the opportunity to co-own, design, and implement state-of-the-art reliability engineering to help ensure our data-intensive platform continues to play a critical role for some of the most influential companies on the internet.
We have built a team of world-class engineers, network experts, and technology thought leaders in a remote-friendly culture from day one. While prior experience in a remote environment is not required, we highly value strong collaboration and communication skills, and a high level of independence and autonomy.

What you'll do

  • Ensure our real-time, scalable, microservices-based infrastructure is set up for growth and working efficiently. Our infrastructure runs on our own hardware across multiple locations, as well as on all major cloud vendors
  • Work on tools and processes to better monitor our platform as well as ensure its stability through our rapid growth
  • Deep-dive into diverse topics, from NetFlow and IP routing to database replication strategies and HTTP optimization
  • Collaborate with engineering and infrastructure teams on finding solutions from an operational perspective
  • Contribute code, code reviews and tools or patches to all kinds of existing code
  • Write design documents or collaborate on colleagues’ docs to introduce new features or changes into our infrastructure
  • Provide valuable feedback on team goals, projects, and processes. We believe in continuously improving our team

What you'll bring

Studies have shown that some candidates tend to apply to jobs only if they meet 100% of the qualifications. We encourage you to apply if you meet most of the criteria - even if you don’t match all of the qualifications, your skills and experience could be valuable in this role!

  • 5+ years of experience in Systems Administration, Datacenter/IT and/or SRE related projects
  • Experience working with *nix system command line (e.g. ssh, grep, awk)
  • Detailed understanding of how major internet protocols work (TCP/IP, DNS, HTTP, TLS)
  • Experience with or desire to learn about microservices, containers and orchestration
  • Networking administration experience: concepts such as routing, firewalls (iptables), and peering sound familiar
  • A passion for documenting code, processes, and infrastructure in runbooks and wikis
  • Strong collaboration and communication skills. Kentik is a fully remote, global company, so we are looking for someone who can work well in an asynchronous environment using tools such as Slack, Zoom, Google Docs, Git, etc.
  • Experience with a configuration management (infrastructure as code) platform such as Ansible, Puppet, Chef, SaltStack, or CFEngine
  • Experience with metrics monitoring solutions such as Grafana, Prometheus, and OpenTelemetry
  • A strong preference for automation, coding in Bash, Python, Ruby, or Go
  • Experience with public cloud (AWS, GCP, Azure, etc.) architectures and with managing cloud technologies using Terraform

Our tech stack

  • Our core data engine and platform are primarily written in Go
  • We use Node.js + Express for application serving, and React as our primary UI framework
  • We also use some JS and Python for tooling/scripting
  • In addition to our own database, we use Postgres, Kafka, MySQL, and Redis
  • Internal and public APIs expose both REST/JSON and gRPC endpoints
  • HAProxy and Envoy for API traffic routing and balancing
  • GitHub for source control, PRs, issues
  • Jenkins for automated builds

What we offer

Kentik is a fully remote company that operates globally. We seek professionals who will help us thrive as an organization, and in turn we aim to broaden and enhance your career. We’re very thorough in the interview process to understand your skills and how they will relate to your successful growth here at Kentik. Our compensation philosophy encompasses a fair program for all in order to attract, engage, and retain talented individuals who will drive our business and wow our customers.

The compensation range for this position is: $186,000 - $251,000. This range reflects the low and high end of the U.S. compensation range Kentik reasonably and generally expects to pay the hired candidate in this role. The actual compensation offered may be lower or higher than the stated range depending on various factors, including but not limited to:

  • Experience with the skill sets required for success
  • Demonstrated competencies and potential 
  • A geographic market-based approach

In addition to a great career opportunity, Kentik offers stellar benefits for our employees, which include:

  • 100% of premiums are paid by the company for health, vision, and dental coverage for you and your dependents
  • Additionally, an annual Health Reimbursement Account (HRA) of $3,000 for an individual or $4,500 for a family
  • Paid family & medical leave 
  • Open PTO, a quarterly Wellness Day, and a minimum of 10 paid holidays
  • 401(k) retirement account
  • Home office reimbursement 
  • Stock options

Note: Benefits are as listed for all US full-time employees. For compensation, international applicants will be treated equitably in relation to the laws applicable within the countries in which we operate.

Come work with us

The true meaning of Kentik is visibility. We’re committed to making sure everyone feels empowered to use their voice, has a sense of belonging, and is represented at Kentik. 

We don’t look for individuals who fit the culture, but those who will continue to add to the culture.
We encourage everyone to apply, especially those individuals who are underrepresented in the industry: people of color, members of the LGBTQI+ community, women, individuals with disabilities (both seen and unseen), veterans, and people of any age or family status.

Kentik is committed to creating an inclusive interview process. If you require a reasonable accommodation during the application or interview process, please reach out to [email protected].

Come as you are!
You will be working at a fast-growing, well-funded startup alongside industry thought leaders and network aficionados as we build the future of observability and set the high bar for how network operations and digital businesses should run. With a competitive salary and amazing benefits on top of the meaningful and challenging projects you’ll take on, we’re sure you’ll enjoy joining the Kentik team.


The Company
HQ: San Francisco, CA
155 Employees
On-site Workplace
Year Founded: 2014


