Senior Site Reliability Engineer - DevOps

Posted 7 Hours Ago
London, Greater London, England
Hybrid
Senior level
Artificial Intelligence • Cloud • Information Technology • Machine Learning • Software
Hybrid Observability powered by AI
The Role
The Senior Site Reliability Engineer will maintain uptime, implement resilient applications, deploy production apps, monitor performance, ensure security, automate disaster recovery, and drive operational improvements. Responsibilities also include collaborating with engineers on architectural changes and participating in recruitment efforts.
Summary Generated by Built In

About Us:

We love going to work and think you should too. Our team is dedicated to trust, customer obsession, agility, and striving to be better every day. These values serve as the foundation of our culture, guiding our actions and driving us towards excellence. We foster a culture of performance and recognition, allowing us to transform growth as we enable our employees to do the best work of their careers.

This position is located in London, England. Our office is situated in a core location near Waterloo and Blackfriars on the Southbank. Across the globe, our Centres of Energy serve as hubs where we accelerate productivity and collaboration, inspire creativity, and cultivate a culture of connection and celebration. Our teams coordinate their time in Centres of Energy to reflect how they work best.

What You'll Do:

LM Envision, LogicMonitor's leading hybrid observability platform powered by AI, helps modern enterprises gain operational visibility into and predictability across their IT stacks, so they can continue to deliver extraordinary employee and customer experiences. LogicMonitor has a layered approach to intelligence, where AI and machine learning are baked into every facet of the LM Envision platform to help IT teams improve efficiency, minimise alert fatigue, proactively predict trends, and maximise enterprise growth and transformation.

Our customers love LogicMonitor's ability to bring cloud and traditional IT together into one view, as seen in minimal churn rates, expansion business, and exciting new customer references. In fact, LogicMonitor has received the highest Net Promoter Score of any IT Infrastructure Management provider. LogicMonitor also boasts high employee satisfaction. We have been certified as a Great Place To Work®, and named one of BuiltIn's Best Places to Work for the sixth year in a row! 

This role will take the lead in the operational uptime and continued expansion of the LM Edwin AI infrastructure by serving as a facilitator of operational excellence. Responsibilities include designing and implementing new production deployments of SOA-based software across cloud datacentres, as well as providing guidance on organising, securing, and automating existing infrastructure and deployments. This position involves working with developers and providing feedback to drive operational performance improvements within the LM platform and operations infrastructure.

Here's a closer look at this key role:

  • Maintain uptime of LogicMonitor's (Edwin AI) SaaS-based service and drive technical/process enhancements to improve uptime.
  • Lead efforts to design and implement resilient IT applications using DevOps and SRE principles.
  • Deploy production applications and drive improvements to the deployment process.
  • Monitor system performance and troubleshoot issues to ensure high availability and reliability.
  • Design and deploy new application components.
  • Design and deploy new infrastructure components and integrations.
  • Ensure security of the production environment.
  • Develop and implement automated disaster recovery processes to minimise system downtime.
  • Identify opportunities for improvement in system performance, deployment speed, and scalability.
  • Write high-quality code to automate various aspects of infrastructure maintenance and deployment.
  • Support engineering and work closely with engineers to drive operational and architectural/design changes.
  • Own, manage, and execute multiple large and technically complex projects across teams.
  • Provide alignment between business objectives and the team's pursuit of technology improvements.
  • Contribute to remediation actions relating to service disruptions and outages.
  • Provide direct technical guidance to help team members achieve goals and improve their productivity.
  • Participate in the recruitment and hiring of new engineers.

What You'll Need:

  • 5+ years as a DevOps Engineer or SRE, designing and implementing resilient IT applications using DevOps and SRE principles.
  • Good understanding of Linux system administration and 3+ years of hands-on experience.
  • Good understanding of networking technologies.
  • Experience building IaC automations using Terraform.
  • Production experience of containers and container orchestration tools (Docker/Kubernetes).
  • Good understanding of Amazon Web Services.
  • Experience of designing/implementing CI/CD pipelines including production deployments.
  • Experience building and working with logging and metrics solutions such as Prometheus.
  • Experience programming with RESTful web services.
  • Proficient Python developer.
  • Well-versed in security principles, both systems and network.
  • Excellent written and verbal communication skills with a track record of improving documentation and processes.
  • Experience in carrying out complex problem determination and root cause analysis across distributed systems.


LogicMonitor is an Equal Opportunity Employer
At LogicMonitor, we believe that innovation thrives when every voice is heard and each individual is empowered to bring their unique perspective. We’re committed to creating a workplace where diversity is celebrated, and all employees feel inspired and supported to contribute their best.

For us, equal opportunity means fostering a truly inclusive culture where everyone has the chance to grow and succeed. We don’t just open doors; we invite you to step through and be part of something bigger. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.


Top Skills

Amazon Web Services
CI/CD
DevOps
Docker
Kubernetes
Linux
Prometheus
Python
RESTful
SRE
Terraform

The Company
HQ: Santa Barbara, CA
1,100 Employees
Hybrid Workplace
Year Founded: 2007

What We Do

LogicMonitor® offers hybrid observability powered by AI. The company’s SaaS-based platform, LM Envision, enables observability across on-prem and multi-cloud environments. We provide IT and business teams operational visibility and predictability across their technologies and applications to focus less on troubleshooting and more on delivering extraordinary employee and customer experiences. For more information, visit www.logicmonitor.com.

Why Work With Us

We love going to work and think you should too. We are customer-obsessed, work as one agile team, and strive to be better every day while building trust. These are our core values. So it's no surprise that we work hard and genuinely have fun working with each other as we expand our global presence and achieve record-breaking success.

LogicMonitor Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We call our offices Centers of Energy, because they’re where we accelerate work, spark creativity, and ignite our culture of connection and celebration. Our teams coordinate their time in Centers of Energy to reflect how they work best.

Typical time on-site: Flexible
HQ: Santa Barbara, CA
Singapore
Austin, TX
Boston, MA
London, UK
Pune, IN
San Francisco
Sydney, Australia
