HPC Linux Operations Engineer

Posted 12 Days Ago
2 Locations
125K-175K Annually
Junior
Angel or VC Firm
The Role
The HPC Linux Operations Engineer provides support for Linux HPC compute environments, resolves problems reported by users, and manages the entire problem lifecycle. Responsibilities include operational support, writing diagnostic code, participating in maintenance operations, and collaborating across teams.
Summary Generated by Built In

Jump Trading Group is committed to world-class research. We empower exceptional talents in Mathematics, Physics, and Computer Science to seek scientific boundaries, push through them, and apply cutting-edge research to global financial markets.

Our culture is unique. Constant innovation requires fearlessness, creativity, intellectual honesty, and a relentless competitive streak. We believe in winning together and unlocking unique individual talent by incenting collaboration and mutual respect.

At Jump, research outcomes drive more than superior risk-adjusted returns. We design, develop, and deploy technologies that change our world, fund start-ups across industries, and partner with leading global research organizations and universities to solve problems.

Core Development is a global team of technologists who architect, build, and maintain our world-class trading platform. From optimizing our core trading engine to building custom hardware, we leverage software and hardware engineering, data science, and research to deliver the infrastructure and tools that drive our trading and business needs.

We are looking for an adaptable, hands-on individual who is passionate about the details and nuances of managing Linux HPC environments at scale and eager to tackle complex and unpredictable operational work as their primary job function.


What You'll Do:

  • Provide front-line operational support for 24/7 Linux HPC compute, storage, and interconnects. The environment includes RDMA fabrics, parallel filesystems, HPC batch schedulers, FUSE filesystems, internal Jump software, and multi-vendor hardware, along with cybersecurity requirements, a challenging and unpredictable client workload, and high user expectations.
  • Resolve problem reports and answer questions from members of Jump's research community, escalating as needed and managing the entire problem lifecycle.
  • Respond to alerts in a timely fashion.
  • Participate in large, coordinated maintenance operations, including during evenings and weekends.
  • Work on global projects across a wide range of infrastructure.
  • Write code for diagnosing, resolving, and triaging difficult problems and automating frequently performed tasks.
  • Collaborate with team members and across teams to write code and testing infrastructures spanning both new and existing codebases in multiple programming languages.
  • Manage relationships with outside vendors, including traveling both domestically and internationally to meet with current and potential vendors.
  • Implement and support performance monitoring and fault monitoring systems.
  • Develop and improve systems and user documentation.
  • Develop and monitor the tools used to maintain a production computing environment.
  • Provide operational support as primary job function.
  • Adhere to all company cybersecurity and IT policies, including performing all work using only approved hardware and software.
  • Participate in an on-call rotation.
  • Other tasks as assigned or needed.
  • Work from company office an average of 5 days a week.
  • Must be willing to work a maintenance window of either Friday evening or Saturday morning.

Skills You'll Need:

  • A desire for operational work as primary job function.
  • At least 2 years of professional experience with Linux systems.
  • Experience with high-performance computing (HPC), including parallel filesystems (e.g., Lustre, GPFS), batch systems (e.g., Slurm, Grid Engine), and high-performance network interconnects, is a plus but not required.
  • High proficiency with at least one programming/scripting language (e.g., Go, Python, C) and ability to learn additional languages quickly.
  • Ability to perform root cause analysis.
  • Strong verbal and written communication skills, including the ability to communicate effectively and efficiently with both coworkers and third-party vendors.
  • Strong collaboration skills with a willingness to take on tasks spanning a range of technologies and levels of complexity.
  • Ability to independently manage complex projects and multiple workstreams.
  • Strong sense of urgency.
  • Willingness to perform regular operational maintenance work during evenings and weekends and as needed.
  • Ability to work effectively in a busy, open floor plan office environment.
  • Reliable and predictable availability.


Benefits

   - Discretionary bonus eligibility
   - Medical, dental, and vision insurance
   - HSA, FSA, and Dependent Care options
   - Employer Paid Group Term Life and AD&D Insurance
   - Voluntary Life & AD&D insurance
   - Paid vacation plus paid holidays
   - Retirement plan with employer match
   - Paid parental leave
   - Wellness Programs

Annual Base Salary Range

$125,000 - $175,000 USD

Top Skills

C
Go
Python
The Company
HQ: Chicago, IL
1,089 Employees
On-site Workplace
Year Founded: 1999


