We are looking for a talented Site Reliability Engineer (SRE) with a deep interest in distributed systems, cloud computing and the architecture of large-scale systems. The Senior SRE will ensure our InsightIDR services have the ultra-high reliability and uptime necessary to meet our customers' needs.
About the Team
Our InsightIDR product helps identify and address key cybersecurity risks to our customers. We apply AI, ML, threat intelligence, and BI to event sources, including desktops, servers, network switches, firewalls, cloud services, directory servers, DHCP servers, and SIEMs, in order to distill hundreds or thousands of daily events per customer into the few real, high-priority threats that need attention. Our systems ingest large amounts of data and need to be highly available and performant at all times.
Some of the technologies we use include:
Java, Python, Cassandra, MySQL/RDS, Redis, Elasticsearch, Kafka, AWS (EC2, S3, CloudFormation, etc.), ZooKeeper, Terraform, Jenkins, Artifactory, Chef, Puppet, Ansible, Kubernetes, and more.
About the Role
As an SRE, you will work closely with our engineering team and partner teams throughout Rapid7 to help solve extremely challenging problems at a massive scale.
In this role, you will:
- Support services before they go live through activities such as design, deployment, migration strategy, monitoring, and playbook reviews
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
- Scale systems through automation and by driving service and infrastructure improvements
- Troubleshoot production issues and liaise with relevant Engineering or Infrastructure teams to find a resolution
- Participate in on-call support and incident response follow-ups such as post-mortems
- Work closely with Engineering, Architecture, Infrastructure, and Product teams to improve the lifecycle of the InsightIDR services, from inception and design through deployment, operations, monitoring, security, upgrades, and maintenance
- Mentor and coach team members
- Continuously develop and refine your own skill set
The skills you'll bring include:
- Bachelor's degree in Computer Science or a STEM-related field, or 3+ years of industry experience
- 3+ years of experience with Unix/Linux systems, IP networking, performance and application issues, RESTful architectures, and database operation and optimization
- 3+ years of experience programming in one or more of the following languages: Java, Python, C, C++, Go, Rust, Ruby
- Knowledge of Public Cloud Providers (AWS, Azure, GCP)
- Strong written and verbal communication skills
Nice-to-have:
- 3+ years of experience in SRE or DevOps
- Knowledge of AWS services, including EC2, RDS, VPC, networking, S3, MSK, etc.
We know that the best ideas and solutions come from multi-dimensional teams. That's because these teams reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don't be shy - apply today.
About Rapid7
At Rapid7, we are on a mission to create a secure digital world for our customers, our industry, and our communities. We do this by embracing tenacity, passion, and collaboration to challenge what's possible and drive extraordinary impact.
Here, we're building a dynamic workplace where everyone can have the career experience of a lifetime. We challenge ourselves to grow to our full potential. We learn from our missteps and celebrate our victories. We come to work every day to push boundaries in cybersecurity and keep our 10,000 global customers ahead of whatever's next.
Join us and bring your unique experiences and perspectives to tackle some of the world's biggest security challenges.
Why Work With Us
What makes us unique is how we embrace, model, and celebrate our core values. By challenging convention, being an advocate, creating impact together, always bringing our full selves, and recognizing that our work is never done, we are able to make an extraordinary impact on our business, our industry, and our own career growth.
Rapid7 Offices
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
Our default working model is hybrid, with employees working three days per week in the office. This approach underpins our commitment to flexibility and adaptability while supporting our dedication to development, teamwork and customer purpose.