Site Reliability Engineer - Big Data

Posted 2 Days Ago
Reston, VA
109K-147K
Mid level
Information Technology • Software
The Role
The Site Reliability Engineer will ensure stability and performance of big data platforms, automate processes, and collaborate with various teams to improve data systems.
Summary Generated by Built In

Verisign helps enable the security, stability, and resiliency of the internet. We are a trusted provider of internet infrastructure services for the networked world and deliver unmatched performance in domain name system (DNS) services. 

We are a mission focused, values driven company where each individual can contribute to building a stronger, more secure internet.  We offer a dynamic and flexible work environment with competitive benefits and the ability to grow your career.

Within Verisign, our team is responsible for building and managing the Verisign Data Platform, which enables the creation of large-scale, high-throughput (millions of requests per second) data products and services that deliver actionable operational and business intelligence. To help us advance the platform, we are looking for a highly skilled mid-level Site Reliability Engineer (SRE). This role will play a critical part in ensuring the stability, performance, and security of our data platforms.

The ideal candidate cares deeply about big data systems and automation, is fluent in Infrastructure-as-Code and CI/CD, and is eager to learn as needed. The successful candidate should understand the fundamentals, including core Computer Science concepts, operating systems, networking, file systems, and databases, and have hands-on experience managing large-scale distributed systems. Acquiring these competencies typically requires the equivalent of a bachelor’s degree and 6 or more years of practical work experience. We are also open to other career paths.

The candidate will be involved in all aspects of the data platform, including ideation, design, implementation, deployment, customer onboarding, and support. This implies regular cross-team collaboration with Data Engineering, Infrastructure, Engineering, Security, and Operations teams. We expect the candidate, as part of the team, to take ownership of the data platform by regularly interacting with internal customers and proactively identifying, prioritizing, and delivering on their common data platform needs.

Key Responsibilities:

  • Architect, design, deploy, monitor, and operate large-scale data platforms such as Hadoop, Kafka, Spark, and Druid, running both on physical servers and on top of Kubernetes
  • Participate in technical designs and proofs of concept for software solutions that combine open-source components, COTS (commercial off-the-shelf) components, and custom-developed components
  • Deploy and manage Production releases with minimum supervision
  • Automate cluster provisioning (CI/CD, Infrastructure-as-Code), scaling, and monitoring using Ansible, Python, Jenkins, Terraform and other relevant tools
  • Build and deploy containerized applications using Docker and Kubernetes
  • Troubleshoot complex issues in large, distributed environments
  • Upgrade large-scale data platforms (including patching and deploying releases), improving system capabilities and security while ensuring minimal customer impact
  • Perform occasional operations support functions, including problem isolation and resolution
  • Participate in the on-call rotation to monitor the health of the production systems and respond to incidents or customer needs
  • Ensure platform SLOs are met by collecting, visualizing, and alerting on relevant telemetry
  • Support data platform customers and continuously improve the monitoring, performance, and functionality of the clusters
  • Stay up to date with industry best practices and standards for data platforms, focusing on hybrid cloud environments

The candidate must have:

  • Bachelor’s degree in computer science or a related technical field, or equivalent combination of education and experience
  • 5+ years of experience managing big data platforms (Hadoop, Spark, Kafka, Druid)
  • Excellent understanding of Linux configuration and administration
  • Strong automation experience: not just developing automation, but knowing why we automate and what to automate
  • Strong understanding of infrastructure-as-code
  • Strong written and verbal communication skills – able to clearly and succinctly describe complex issues
  • Familiarity with networking protocols and systems

Desired Skills, Experience, and Attributes:

  • Experience with a high-level scripting language such as Python
  • Experience with Red Hat Enterprise Linux and/or FreeBSD
  • Experience with network troubleshooting using tools such as ping, traceroute, and dig
  • Deployment automation experience using tools such as Ansible
  • Experience working with teams using Kanban and/or Scrum
  • Experience with Docker or Kubernetes in a production environment
  • Experience with OpenStack in a production environment
  • Experience administering Unix systems in a large-scale environment
  • Experience using Jenkins in a continuous delivery and integration environment

This position is based in our Reston, VA office and offers a flexible, hybrid work schedule.

The pay range is $108,900 - $147,300. 

The anticipated annual base salary range for this position is noted above; however, base pay offered may vary depending on job-related knowledge, skills, and experience. Verisign offers a discretionary bonus, which is based on individual and company performance, and certain roles may be eligible for discretionary stock awards.

Verisign is an equal opportunity employer. That means we recruit, hire, compensate, train, promote, transfer, and administer all terms and conditions of employment without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, age, protected veteran status, disability, or other protected categories under applicable law.

Additional Information:
Our Careers Page
Our Benefits Summary
Verisign in the Community
Our EEO Statement
Our Privacy Notice for Job Applicants/Candidates
Reasonable Accommodations

Staffing agency policy: No fees will be paid for unsolicited resumes submitted to Verisign or our employees by third parties.

Top Skills

Ansible
Docker
Druid
Hadoop
Jenkins
Kafka
Kubernetes
Linux
Python
Spark
Terraform

The Company
HQ: Reston, VA
1,286 Employees
On-site Workplace
Year Founded: 1985

What We Do

Verisign, a global provider of domain name registry services and internet infrastructure, enables internet navigation for many of the world’s most recognized domain names. Verisign enables the security, stability, and resiliency of key internet infrastructure and services, including providing root zone maintainer services, operating two of the 13 global internet root servers, and providing registration services and authoritative resolution for the .com and .net top-level domains, which support the majority of global e-commerce. To learn more about what it means to be Powered by Verisign, please visit Verisign.com.
