The Role
The Observability Engineer will design, deploy, and maintain observability systems, manage monitoring and alerting, implement logging and tracing solutions, analyze system performance, and automate observability tasks. The role requires collaboration with development and operations teams and involves documentation and reporting.
Roles & Responsibilities
Key Responsibilities
- Observability Systems Management:
- Design, deploy, and maintain observability tools and platforms, including monitoring, logging, and tracing systems.
- Ensure optimal configuration and performance of observability tools such as Prometheus, Grafana, the ELK stack (Elasticsearch, Logstash, Kibana), Jaeger, and cloud-native observability tools (AWS, GCP, Azure).
- Monitoring and Alerting:
- Develop and manage dashboards and alerts to monitor the health and performance of applications and infrastructure.
- Implement robust alerting mechanisms to detect anomalies, outages, and system performance issues and send notifications in real time.
- Logging and Tracing:
- Implement centralized logging solutions to aggregate logs from various systems and applications.
- Develop and maintain distributed tracing solutions to provide end-to-end visibility into system transactions.
- Performance Analysis and Optimization:
- Analyze system performance metrics to identify bottlenecks and performance degradation; an understanding of SLOs and SLIs is required.
- Work with development and operations teams to remediate performance issues and optimize system performance.
- Automation and Scripting:
- Create automation scripts to streamline observability tasks and processes.
- Develop self-healing mechanisms through automated incident response.
- Collaboration and Communication:
- Work closely with development, operations, and SRE teams to align observability solutions with business and technical requirements.
- Provide guidance and training on observability tools and best practices to other team members.
- Documentation and Reporting:
- Create and maintain detailed documentation for observability systems, processes, and procedures.
- Generate periodic reports and dashboards to provide insights into system performance and reliability.
Qualifications and Experience
- Education: Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degree preferred.
- Experience:
- Minimum of 7 years of experience in IT infrastructure, with at least 3 years in an observability or monitoring role.
- Proven experience in observability engineering, including deploying and managing observability solutions.
- Experience with monitoring tools (e.g., Prometheus, Grafana), logging tools (e.g., ELK stack), and tracing tools (e.g., Jaeger, OpenTelemetry).
- Experience with cloud platforms such as AWS, Azure, or Google Cloud, and with databases such as MySQL.
- Technical Skills:
- Strong understanding of observability concepts including metrics, logging, and tracing.
- Proficiency in scripting languages such as Bash, Python, Perl, or Go.
- Familiarity with containerization (e.g., Docker), orchestration tools (e.g., Kubernetes), and CI/CD pipelines.
- Understanding of IP networking and of monitoring network devices (e.g., routers, firewalls).
- Experience with infrastructure as code tools (e.g., Terraform, Ansible).
- Soft Skills:
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
- Ability to work independently and in a team-oriented environment.
- Preferred Qualifications:
- Experience with APM tools like New Relic, Datadog, or Dynatrace.
- Knowledge of service mesh technologies (e.g., Istio).
- Open-source contributions or relevant certifications in observability tools and methodologies.
What is in it for you?
- The opportunity to build the next leading-edge connected vehicle platform and Internet of Things platform
- The ability to collaborate with our highly skilled groups who work with cutting-edge technologies
- High visibility as you support the systems that drive our public-facing services
- Career growth opportunities
Aeris walks the walk on diversity. We’re a brilliant mix of varying ethnicities, religions, cultures, sexual orientations, gender identities, ages and professional/personal/military experiences – and that’s by design. Diverse perspectives are essential to our culture, innovative process and competitive edge. Aeris is proud to be an equal opportunity employer.
Top Skills
Bash
Go
Perl
Python
The Company
What We Do
Global organizations and service providers rely on Aeris IoT technology every day to deliver mission-critical IoT programs.