Sr. Site Reliability Engineer
Location: New York, NY - 3 days per week on-site (Required)
Who We Are
DoubleVerify (DV) is a leading independent provider of marketing measurement software, data, and analytics. We authenticate the quality and effectiveness of digital media for the world’s largest brands and media platforms, ensuring media transparency and accountability. Since 2008, DV has empowered hundreds of Fortune 500 companies to maximize their media investments by delivering best-in-class solutions across the digital ecosystem, contributing to a stronger, safer, and more secure digital advertising industry. Learn more at www.doubleverify.com.
Position Overview
As a Senior Site Reliability Engineer (SRE) at DoubleVerify, you will play a critical role in building and scaling our SRE team. This dual-role position requires both hands-on technical expertise and a passion for mentoring and educating team members. You will be responsible for implementing and promoting SRE best practices, including the development and monitoring of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs). Your contributions will ensure the reliability, scalability, and performance of our digital media measurement platforms, directly impacting our mission of delivering media transparency and accountability.
Responsibilities
- Team Development and Mentorship: Build and grow the SRE team by recruiting new engineers and by mentoring and educating team members on SRE principles, promoting a culture of reliability and automation.
- Technical Contributor: Contribute directly to the design, implementation, and maintenance of highly available infrastructure and services, with a focus on automation to minimize manual intervention.
- SLA/SLO/SLI Management: Define, monitor, and report on SLIs, SLOs, and SLAs to ensure alignment with business objectives and user expectations. Use these metrics to drive reliability improvements and guide decision-making.
- Incident Management and Response: Develop and implement robust incident response processes, including on-call rotations and post-incident reviews, to minimize downtime and prevent recurrence.
- Collaboration and Communication: Partner closely with development, operations, and product teams to integrate reliability into the software development lifecycle, promoting cross-functional collaboration.
- Continuous Improvement: Analyze system performance data to identify areas for improvement, implementing solutions to enhance reliability, scalability, and efficiency.
Requirements
- Experience: 5+ years in site reliability engineering, DevOps, or a related field, with experience mentoring and educating other engineers.
- Technical Proficiency: Expertise in Linux/Unix systems administration, cloud platforms (AWS, GCP, or Azure), and container orchestration tools like Kubernetes.
- Programming Skills: Proficiency in scripting and programming languages such as Python, Go, or Bash for automation and tool development.
- Monitoring and Observability: Experience with monitoring and logging tools such as Prometheus, Grafana, Splunk, or Nagios. Proven ability to develop and track SLIs, SLOs, and SLAs.
- Automation and Infrastructure as Code: Hands-on experience automating infrastructure and deployments using tools like Terraform, Ansible, or Chef.
- Communication and Mentorship: Strong verbal and written communication skills, with a passion for mentoring and educating team members on technical concepts and SRE best practices.
- Problem-Solving Aptitude: Exceptional analytical skills with a proactive approach to identifying and resolving system issues.
- Team Collaboration: Ability to work both independently and collaboratively within a team environment.
Preferred Qualifications
- Advanced Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Certifications: Relevant industry certifications such as AWS Certified DevOps Engineer, Google Professional Cloud DevOps Engineer, or Certified Kubernetes Administrator (CKA).
- Security Awareness: Familiarity with security best practices in cloud and containerized environments.
- Configuration Management: Experience with infrastructure as code and configuration management tools like Terraform, Ansible, or Chef.
Why Join Us
At DoubleVerify, we are committed to fostering an inclusive and dynamic workplace where employees can bring their authentic selves to work. We value passion, accountability, collaboration, and innovation, believing that diverse perspectives drive better business outcomes. Join us to contribute to a mission-driven company dedicated to enhancing the digital advertising ecosystem.
DoubleVerify is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
The successful candidate’s starting salary will be determined based on a number of non-discriminatory factors, including qualifications for the role, level, skills, experience, location, and balancing internal equity relative to peers at DV.
The estimated salary range for this role, based on the qualifications set forth in the job description, is $107,000 to $231,000. This role will also be eligible for bonus/commission (as applicable), equity, and benefits.
The range above reflects the expectations laid out in the job description; however, we are often open to a wide variety of profiles and recognize that the person we hire may be more or less experienced than the job description as posted indicates.
Not-so-fun fact: Research shows that while men apply to jobs when they meet an average of 60% of job criteria, women and other marginalized groups tend to only apply when they check every box. So if you think you have what it takes but you’re not sure that you check every box, apply anyway!
What We Do
DV is powering the new standard of marketing performance, giving advertisers clarity and confidence in their digital investment. Built on best practices, DV solutions create value for media buyers and sellers by bringing transparency and accountability to the market. They ensure ad viewability, brand safety, fraud protection, accurate impression delivery, and audience quality across campaigns to drive performance. Since 2008, DV has helped hundreds of Fortune 500 companies gain the most value from their media spend by delivering best-in-class solutions across the digital ecosystem that help build a better industry.
Learn more at doubleverify.com.