Site Reliability Engineer - Observability

Posted 14 Days Ago
Hiring Remotely in Tokyo
Remote
Mid level
Fintech • Payments • Software • Financial Services
The Role
The Site Reliability Engineer (SRE) will design and maintain the observability stack, ensuring system reliability and performance while collaborating with engineering teams to implement monitoring best practices.
Summary Generated by Built In

Description
About KOMOJU

KOMOJU (by Degica) is the leading cross-border payment gateway for Japan. We power payments for companies like the video game distribution platform Steam and the popular mobile app TikTok. Today we help thousands of merchants by providing the payment infrastructure they need, from developer-friendly APIs to integrations on popular platforms like Shopify and Wix, and we help them grow in every market they expand into.


About the position

As our systems grow in complexity, scale, and traffic, maintaining their reliability and availability becomes increasingly challenging, and increasingly critical. We're looking for a Site Reliability Engineer (SRE) with a focus on observability to help us meet these demands.

In this role, you'll be at the forefront of ensuring that our infrastructure is not just running, but understandable and measurable. Observability is a core pillar of our reliability strategy: it's how we detect issues before they impact our merchants and users, quickly understand the root causes of incidents, and continuously improve our systems' performance and reliability.

You’ll design and evolve our observability platform, including metrics, logging, tracing, and alerting, and partner with development teams to embed observability into every stage of the software lifecycle. Your work will directly impact our ability to scale confidently and respond to incidents swiftly.

This is a key role for someone who wants to build resilient systems, empower teams with actionable insights, and make a real difference in how we operate at scale.

While we are a remote-first company, this position is based in Tokyo, and we expect candidates to be willing to relocate to Japan.

Responsibilities

  • Design, implement, and maintain our observability stack (metrics, logging, tracing, dashboards).
  • Define and monitor SLIs/SLOs to ensure service health and reliability.
  • Collaborate with engineering teams to instrument applications for better visibility.
  • Build and maintain dashboards and alerts that provide actionable insights and minimize alert fatigue.
  • Troubleshoot system performance and reliability issues using observability data.
  • Educate and guide engineering teams on best practices in monitoring, alerting, and incident response.
  • Contribute to postmortems and continuously improve system transparency and resiliency.

Requirements

  • 3+ years in SRE roles.
  • Hands-on experience with observability tools, preferably Datadog.
  • Proficiency in Terraform.
  • Background in software development.
  • Proficiency in at least one scripting or programming language (Ruby/Rails, Python, Go, Shell Script, etc.).
  • Experience working with AWS.
  • Familiarity with monitoring design principles: RED, USE, SLI/SLO, alert tuning.
  • Ability to analyze logs, metrics, and traces to diagnose issues and identify trends.

Nice to have

  • Knowledge of CI/CD pipelines and integrating observability into build and deploy processes.
  • Familiarity with incident response, on-call rotations, and post-incident reviews.
  • Business-level Japanese.

Benefits

  • At Degica, we embrace remote work while also offering office space for those who prefer in-person collaboration
  • 10 days of regular vacation, plus an additional 5 days of summer vacation and 5 days of winter vacation
  • Paid birthday holiday
  • Self-learning allowance to keep our employees’ skills current
  • Language training for Japanese

Top Skills

AWS
CI/CD
Datadog
Go
Python
Ruby
Terraform

The Company
HQ: Musashino
75 Employees
On-site Workplace
Year Founded: 2005

What We Do

Overview
Headquartered in Tokyo, Japan, Degica is a leading payment service provider of international digital commerce solutions. Our ePayment platform "KOMOJU" gives global businesses and developers the access they need to grow and succeed in the Japanese market and in other markets such as Asia and Europe.

We build and manage online sales for Retail, Digital, and Gaming companies that are looking to establish and expand their business presence in Japan and South Korea.

Japan has a complex digital ecosystem, with multiple payment methods and platforms that are often incomprehensible to outsiders. Degica offers a gateway to the market through its customizable tools and services, whether you are launching a completely new product or trying to enhance your current reach with consumers.

Company Mission
Degica is your “digital cart” in Japan. We are the link you need to connect your product or service with local customers, integrating digital payment and platforms with partners across our wide network.

Company culture
Everyone at Degica loves Japan and cares about how it interacts with the rest of the world. We believe in a borderless world, and want to play a lead role in bringing Japan to the world and the world to Japan. All of us love games and all things digital, and it is in this space that we excel.

Visit our corporate pages for more information:
www.degica.com
www.komoju.com
konbini.co.jp
