Site Reliability Engineer (5667)

Hanscom Air Force Base, MA
Mid level
Information Technology • Consulting
The Role
As a Site Reliability Engineer, you will design and implement scalable systems, optimize performance, manage incidents, and oversee automation initiatives. You'll work with cross-functional teams to troubleshoot issues and ensure service availability while developing documentation and participating in on-call rotations.

As a Site Reliability Engineer, you’ll lead the design, implementation, and management of highly available and scalable systems, applying industry best practices and reliability engineering principles.

We know that you can’t have great technology services without amazing people. At MetroStar, we are obsessed with our people and have led a two-decade legacy of building the best and brightest teams. Because we know our future relies on our deep understanding and relentless focus on our people, we live by our mission: A passion for our people. Value for our customers.

If you think you can see yourself delivering our mission and pursuing our goals with us, then check out the job description below!

What you’ll do:

  • Collaborate with cross-functional teams to identify performance bottlenecks, troubleshoot complex issues, and optimize system performance to meet defined service level objectives.
  • Design and implement monitoring, alerting, and incident response strategies to proactively identify and mitigate potential issues, ensuring uninterrupted service availability.
  • Drive automation initiatives to streamline deployment, configuration management, and infrastructure provisioning processes.
  • Develop and maintain comprehensive documentation for system configurations, processes, and procedures.
  • Participate in on-call rotations and respond to incidents, working diligently to resolve issues and prevent recurrence.

What you’ll need to succeed:

  • An active U.S. Government security clearance at the Secret level or higher.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • Minimum of 3 years of professional experience in a Site Reliability Engineering role or similar capacity.
  • Strong experience with cloud technologies (e.g., AWS, Azure, GCP) and infrastructure as code (e.g., Terraform, Ansible).
  • Proficiency in managing, leading, and engineering incident and outage response.
  • Strong engineering experience with network protocols and load balancing (e.g., TCP/IP, DNS, HTTP/HTTPS).
  • Proficiency in programming and scripting languages (e.g., Python, Go, Bash) and RPA tools (e.g., Blue Prism, UiPath) to automate tasks and develop tools.
  • Deep understanding of containerization and orchestration technologies (e.g., Kubernetes, Docker).
  • Expertise in implementing and managing monitoring and logging solutions (e.g., Splunk, Prometheus, Grafana, ELK stack).
  • Familiarity with CI/CD pipeline development and management (e.g., GitLab CI, Azure DevOps, AWS Lambda, Jenkins).
  • Proven track record of designing, building, and maintaining highly available and scalable systems.
  • Expert proficiency in developing automated functional, regression, and performance tests, and in defining automated testing standards for development teams.
  • Experience facilitating change and configuration management processes to drive reliability.
  • Strong problem-solving skills, with the ability to diagnose complex issues and implement effective solutions.
  • Excellent communication skills, with the ability to collaborate effectively across diverse teams.

Like we said, we are big fans of our people. That’s why we offer a generous benefits package, professional growth, and valuable time to recharge. Learn more about our company culture code and benefits. Plus, check out our accolades.

Don’t meet every single requirement?

Studies have shown that women, people of color, and members of the LGBTQ+ community are less likely to apply to jobs unless they meet every single qualification. At MetroStar, we are dedicated to building a diverse, inclusive, and authentic culture, so if you’re excited about this role but your previous experience doesn’t align perfectly with every qualification in the job description, we encourage you to go ahead and apply. We pride ourselves on making great matches, and you may be the perfect match for this role or another one we have. Best of luck! – The MetroStar People & Culture Team

What we want you to know:

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

MetroStar Systems is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. The statements herein are intended to describe the general nature and level of work being performed by employees and are not to be construed as an exhaustive list of responsibilities, duties, and skills required of personnel so classified. Furthermore, they do not establish a contract for employment and are subject to change at the discretion of MetroStar Systems.

"EEO IS THE LAW MetroStar Systems, LLC (MetroStar) invites any employee and/or applicant to review the Company’s Affirmative Action Plan. This plan is available for inspection upon request by emailing [email protected]."

Top Skills

Bash
Go
Python
The Company
HQ: Reston, VA
250 Employees
On-site Workplace
Year Founded: 1999

What We Do

MetroStar is a digital services and management consulting company specializing in emerging technologies within the public sector. MetroStar is a mission accelerator - we embrace disruptions in tech to propel progress. Through our user-centric capabilities, we create new paths to government innovation and shape thoughtful outcomes for the people.
